I just thought of a neat way to avoid git conflicts

January 14th, 2024

I think git is a technological marvel. It very well suits the problem it was designed to solve: lots of random people contributing random junk and trying to get it into the Linux kernel, and for that it is great.

But everybody else uses git too, apparently even private companies, which never quite made sense to me, because there everybody is [supposed to be] on the same team.

You're not going to reject somebody's pull request because they're trying to put insecure code into your codebase, or for whatever other reasons PRs against the Linux kernel get rejected. You work for the same company, probably on the same team, trying to make the product better. Why would you compete? You don't; that's not how companies work.

It seems to me that git is an overengineered, heavy-handed way to do source control among a group of people who are all striving for roughly the same goal.

Yet it seems to be the most popular hotness so everybody uses it. It is what it is.

One of my larger beefs with git is the merge conflict problem it creates.

Now I honestly can't remember how we dealt with this problem, which must have existed, back in the days of CVS and Subversion. Somehow we managed, but I don't remember being as frustrated with CVS as I am with git conflicts.

Firstly, all git conflicts could be avoided with a little application of intelligent logistics. If you just order the work so that two people don't work on the same piece of code at the same time, there will be no conflict. All a conflict is, is a waste of time caused by poor time management. I've never heard of anybody enjoying resolving merge conflicts, and they're easy to avoid, but… nobody does.

But this morning, I had an idea, a stupid easy way to avoid git conflicts and help with the time management problem at the same time.

Now, the technology to make the idea viable didn't exist until fairly recently, so it wouldn't have been workable before. But it exists now, and it's easy to use. And nobody's going to do it, because nobody likes change, people have been using git forever, git is good and right, and nobody really wants to make things better. If they did, they'd follow this easy solution: use live shared editing.

Google Docs lets two people edit the same file at the same time, and I believe VS Code Live Share does something similar, so the technology exists.

If two people needed to work on the same code at the same time, they could. Forking/branching your own copy is what causes the conflict problem so… don’t do it. Everybody works on the live repo in main.

There are other things that can be done to make the experience better, like warning all parties when people start working too close to each other in the same code, and you could still use branches to segment units of work. I'm sure other tools could be invented to make this style of code editing more useful, but the basic concept is that everybody actually edits the same file at the same time, sees what the other editors are doing, and naturally stays away when somebody else is working there. When you walk down the hallway and see somebody coming right at you, do you keep walking right at them? No, you make way. And so do they. Real simple.

As a result, one person may very well back off and go work on something else until the first person is done, thus avoiding creating conflicting code. Look at that: self driven logistics and time management.

Yes, it would make writing tests a little harder, because you'd have to wait for the other person to finish before you could run your test, but… again, time management: find something else to do so there's no conflict. As annoying as it may seem, it's way less annoying than always having to deal with conflicts, or fearing the git pull because you don't know how much work you're in for just to make your code work again.

The shiny new editor tools can put up little marker notifications annotating the editor, saying things like: "so-and-so touched this a few minutes ago, they might not be done."
Or the programmer can mark it complete while they go work on tests or move on to something else, indicating to the next person that they're free to work on this stuff.
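Just to make the idea concrete, here's a rough sketch of what such a marker could look like. Everything here is hypothetical (the names, the fields, the time windows); a real version would live inside the editor or the live-share tooling, not in a standalone script:

    import time
    from dataclasses import dataclass

    @dataclass
    class RecentEdit:
        author: str
        file: str
        start_line: int
        end_line: int
        timestamp: float          # seconds since epoch
        marked_done: bool = False # author can flag "I'm done here"

    def nearby_warnings(edits, file, line, window_lines=10, window_secs=15 * 60):
        """Return warnings for other people's recent, unfinished edits near this line."""
        now = time.time()
        warnings = []
        for e in edits:
            if e.file != file or e.marked_done:
                continue
            if now - e.timestamp > window_secs:
                continue
            if e.start_line - window_lines <= line <= e.end_line + window_lines:
                minutes = int((now - e.timestamp) // 60)
                warnings.append(f"{e.author} touched lines {e.start_line}-{e.end_line} "
                                f"{minutes} minutes ago; they might not be done.")
        return warnings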

Will it solve all the problems? No, but it will solve the git conflict problem, which is unquestionably a waste of resources and a time suck for all involved. There has to be a better way. This is one possible option.

The earth is a giant battery. With one charge.

December 3rd, 2023

Before the humans showed up, the earth was here soaking up sunlight. For hundreds of millions of years, the sun sent its energy to the earth where it turned into things like plants and trees and eventually animals.

All the trees and plants died and mushed into the ground and this went on for an absurdly long amount of time.

And then the people showed up, and for the past 150 years or so we've been digging up all the oil that stored the energy from the sun, like a battery.

We are consuming the energy stored by the planet at a fairly quick rate, and the noise about “peak oil” has been around for a while. But there’s a much simpler way of thinking about energy consumption from the great earth battery.

If we are to go all in on green and get all of our daily energy needs from solar (directly from the sun), wind (driven by weather changes caused by the sun), and hydroelectric (where energy from the sun evaporates water from the oceans and the ground and carries it up to the clouds, so we can draw power from the falling water; noticing a theme here?), we have to consider that on any given day, our draw of energy from these sources has to be less than what the sun provides in a day.

My point is we've built up cities and cars and infrastructure by consuming the energy stored in the planet, and while there is still charge in the battery, that will continue to work. But for the system to function long term, it's not a matter of where we draw the energy from; it's that the source of the energy (the sun, in most cases) has to provide more energy on an average day than we draw down in that same day.

If we draw down more, we will be consuming more of the stored energy from the great earthen battery, and by plain and simple math, that is not sustainable. The battery will eventually run out.

So the question is: how much energy is provided by solar, wind, oxford comma, and hydroelectric in a day, and how much do all the people consume in that same day? And if consumption is more, long term, we're in trouble.
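For scale, here's a back-of-envelope comparison using rough, widely quoted ballpark figures (the numbers are assumptions for illustration, not measurements). The raw sunlight hitting the planet dwarfs what humans use; the hard part, and the real question above, is how much of it we can actually capture with solar, wind, and hydro on any given day:

    # Rough ballpark figures, for scale only
    SOLAR_POWER_HITTING_EARTH_TW = 173_000  # sunlight intercepted by the whole planet, roughly
    HUMAN_PRIMARY_POWER_TW = 19             # average rate of human primary energy use, roughly

    daily_solar_twh = SOLAR_POWER_HITTING_EARTH_TW * 24
    daily_human_twh = HUMAN_PRIMARY_POWER_TW * 24

    print(f"Sun delivers roughly {daily_solar_twh:,.0f} TWh per day")
    print(f"Humans consume roughly {daily_human_twh:,.0f} TWh per day")
    print(f"That's about 1/{daily_solar_twh / daily_human_twh:,.0f} of the incoming sunlight")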

An interesting idea for why time travel doesn’t exist.

April 30th, 2023

I don’t usually forward videos because I don’t like being inundated with “oh you HAVE to watch this!” so I don’t want to do it to other people. But if you’re interested in the moon landing hoax, this is worth a watch.

I love the moon landing hoax. I think it’s a great testament to how people think and the conspiracies they’re willing to adopt. It’s great amusement.

But of all the anti-hoaxers and their arguments, this one is my favorite. It has nothing to do with any of the other arguments people make to explain why the moon landing wasn't a hoax. This one is technical, and was made by an apparent film geek (I can appreciate geeks because I am one too, just not a film geek).

In short, he explains that the moon landings can't be fake because the film footage that came back couldn't possibly have been produced, with the technology of the time, anywhere but in space. The only way the video could have been made was from the moon. It's quite interesting.

The reason I bring it up is because I thought of a way to explain the lack of any apparent progress in the field of time travel, and it’s similar in thinking to the above video.

I do believe in time travel, but I think you can only go forward in time, and it has to do with perception more than anything else. You experience time travel every time you go to sleep.

Anyway, I am a storage geek. I like disk. And one day it occurred to me that if traveling backwards in time were possible, that would mean that somewhere, somehow, the state of every atom in the universe would have to be stored for every instant in time. That way somebody would be able to replay that stored state in a way that could be observed or interacted with. Which means you'd have to build some machine that would let you retrieve that stored state and pull each instant's state (remember, we're talking every atom in the universe) into your viewer/replayer to be observed.

That’s a lot of bandwidth. That’s a lot of disk.

For the same reason that the moon landing video could only have been produced on the moon (the limits of technology), I think we will never be able to produce time travel, because even if the state of all of the atoms in the universe is stored somewhere, we'd never be able to retrieve it in any usable fashion. Even if you only wanted to grab a one-foot-square block of it, that's an insane amount of data to move over some type of information transfer. And remember, there's that whole speed of light thing to deal with.
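To put a rough (and heavily hedged) number on "that's a lot of disk", here's a back-of-envelope comparison using common ballpark estimates, purely for scale:

    ATOMS_IN_OBSERVABLE_UNIVERSE = 1e80  # common rough estimate
    BYTES_PER_ATOM_STATE = 1             # absurdly generous: one byte per atom per instant
    ALL_STORAGE_HUMANITY_HAS = 1e23      # ballpark: on the order of all data stored worldwide today

    one_snapshot = ATOMS_IN_OBSERVABLE_UNIVERSE * BYTES_PER_ATOM_STATE
    print(f"One instant of the universe: ~{one_snapshot:.0e} bytes")
    print(f"That's ~{one_snapshot / ALL_STORAGE_HUMANITY_HAS:.0e} times all the storage we've ever built,")
    print("for a single instant, before you even start storing every instant of history.")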

So, no time travel, sorry.

So you'll notice that I'm talking about bringing the history data, wherever it is, to us now, in some machine, while most time travel stories have to do with taking the person to the information, not bringing the information to the person. Well, as hard as it is to move a lot of data around in real time nowadays, we don't have ANYTHING that remotely hints at being able to bring a person to the data.

So again, I’m not seeing it.

Doctors and programmers

June 11th, 2022

I went to the doctor yesterday, and he's a good guy: he's smart, very knowledgeable, calm, very robot-like in his questioning as he diagnoses whatever ailments you might have. I'd say he's a very good doctor.

But his job is to file paperwork. To be a doctor nowadays, there’s so much regulation and so much fighting for pennies with the insurance company that you spend 5 minutes with a patient and hours dealing with paperwork.

This is a sad situation, you might think.

But reflect on the programmer's lot in life. They have the same problem: they write software for 5 minutes and then fight with tools, build systems, broken libraries, patch updates, security vulnerabilities, and all the other stupid annoying shit nobody wants to do, none of which has much to do with actually designing or writing software.

The difference is: the doctor has this crap foisted on him by external parties, whereas programmers do it to themselves.

SREs.

September 4th, 2021

So I was thinking about this recently. You've heard of SREs? Site Reliability Engineers.

I like to joke: “SREs exist because we (developers) are SO BAD AT OUR JOBS that even an entire team of QA people isn’t enough to stop us from releasing shitty broken code, so now there’s an entire NEW layer of people to protect the end users from developers.”


I joke, but it's not entirely untrue. But I thought about it some more, and what I realized was this: when Charles Babbage invented the first computer (had he actually been able to build it), he was the inventor, designer, programmer, tester and user. All in one guy.


Then as time went on, we split out the work so that the hardware guys designed and built the hardware, and the software guys wrote and ran the software.


Then different people, other than the people who wrote the software, started using it.


Then there were QA people, to separate the guys who designed and wrote the software from the guys who tested it. Then there were architects who designed the big-picture project (as systems got larger and larger) and handed down the bits of coding work to developers. And then there were sysadmins who managed the running of the computers; the software guys just wrote the software, and the sysadmins ran it.


And what I realized is that this is just the industry maturing. Same thing with cars. The first car was designed and built by one guy; now it's farms of teams each doing a little bit. Same thing with the hardware: Babbage designed the whole thing soup to nuts, and now there are teams of people designing just the machine in the fab that prints the silicon.


And the SRE role is just a newly split-out part of the bigger picture of the software development life cycle. The process has gotten bigger and more complicated, and there's a gap between QA and sysadmin, so now there are SREs to fill that gap.

So it's not exactly what I joke about, but it's interesting to see that the field is still growing. And I'm sure it's not done growing yet.

The genius of the Itanium

August 13th, 2021

The final shipment of Itanium chips went out a few weeks ago now, and that got me thinking about it again.

So I've spent most of the past 20 years championing the Itanium, because it is truly brilliant. The basic design idea of taking the try-to-optimize-the-parallelism-of-the-code work out of the processor and putting it in the compiler was brilliant. No question. Sure, the compilers were hard to write, but so were plenty of things at some point; that would eventually have gotten worked out. Really, I think the reason it didn't take off was that the first Itaniums didn't run Windows very well. Oh well, too bad, that's history now.

But recently, when I started thinking about it again, now knowing how the future turned out, it occurred to me that the Itanium, though brilliant at the time, was really just a stopgap.

They eventually would have squeezed all the performance possible out of the super-wide instructions on the Itanium, and maybe they'd have found ways to expand the wide instructions and make them extra super wide or mega wide, but at the end of the day, they'd still be left with the 4GHz problem. One core can only go so fast.

Now that we can see the future, we know that the world moved to parallel processing, which any processor can do, and all the added complexity of the Itanium would just have been a big technical burden to carry forever. So maybe the lack of adoption was for the best after all. Sure, there are crazy optimizations on the x86 chips, which are now also biting us in the ass (Spectre, etc.), so maybe it would have turned out the same way in either case.

But my point is, I spent years marvelling at the wonders of this novel chip design and in the end, it wouldn’t really have bought us much, because like a lot of intended futures, things actually end up going off in a wildly different direction than anybody could have anticipated.

Same thing with ZFS. I love ZFS, it's amazing, but it was designed in a time when we thought we'd keep adding more and more disk to a single machine. The future didn't quite turn out that way, did it? So ZFS is amazing for what it is, but it just can't compete with an entire cloud of storage.

I think I have invented a new data structure, the stree.

September 5th, 2020

It's hard to imagine there is a data structure that hasn't been invented yet. Wikipedia around and you'll find dozens of oddly named trees and graphs and heaps and so on. All the great ones were invented or discovered (however you see it) in the 60s, and here we are 50-60 years later; you'd think that'd all be done by now.

 

So here, on 9/5/2020 I offer up the Stree.

It’s rather simple which is why I’m guessing somebody’s already done it at least in some form.

It combines a binary search tree with J. W. J. Williams’ binary heap.

The binary heap is a really neat data structure because it is tightly packed (‘complete binary tree’) and sorts pretty quickly, and you can find the child or parent of a node with simple math, if it is stored in a linear array of memory. Genius and brilliant at the same time.
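That "simple math" is the standard heap bookkeeping for a 0-indexed array (nothing stree-specific, just the textbook version):

    # For a complete binary tree packed into a flat, 0-indexed array:
    def parent(i):      return (i - 1) // 2
    def left_child(i):  return 2 * i + 1
    def right_child(i): return 2 * i + 2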

As with all old things, people have improved on it (Floyd came up with a faster way to build the initial tree), and I hope somebody can improve upon my idea, if in fact there isn't already one like it.

I was looking for a tightly packed binary search tree and I couldn't find one. So I thought about it for a bit, kinda wondering why it didn't exist, and came up with this:

The binary heap keeps the tree tightly packed by only ever adding a new node at the end of the array of memory, then swapping nodes around until the tree is 'correct'. In the case of the binary heap, correct means every parent's key is greater than its children's keys (for a max heap). Pull values off the top one at a time and they come out in sorted order.
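That add-at-the-end-and-swap-up step is just the textbook max-heap insert; a minimal sketch:

    def heap_insert(heap, value):
        """Append at the end of the array, then swap upward until every
        parent is greater than or equal to its child again."""
        heap.append(value)
        i = len(heap) - 1
        while i > 0:
            p = (i - 1) // 2
            if heap[p] >= heap[i]:
                break
            heap[p], heap[i] = heap[i], heap[p]
            i = p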

The problem is that you can't binary search a heap, or at least I can't, because the ordering is sorta turned 90 degrees (it only runs parent to child), and you can't search sideways.

So how do we make it a binary search tree?

Kinda the same thing the binary heap does, but instead of swapping nodes on insert until the tree is max-heap correct, you swap nodes until it is binary search tree correct.

This is not as simple as making a max heap correct tree, but it’s not that much more complicated.

Since a binary heap is lopsided to favor bigger numbers as you go up the tree, you only have to go up once to find the final position of whatever number you are inserting.

A binary search tree has one of its middle values at the top, so when inserting a new value into an stree, you have to swap values up toward the top of the tree and then swap values back down (sometimes re-swapping something you've already swapped) until you find the correct final resting place for the newly inserted value.

There are some other problems, and there are optimizations to be had, I haven’t worked out every single case yet, but I think it’s workable.

What happens is, like a binary heap, you insert by adding the new value in the next available space at the bottom/end leaf of the tree (or start a new row if the bottom row is full), and then start swapping nodes up toward the top. If you've found the value's final resting place (by comparing to the root node, based on which side you started on), you're done; if not, you start working your way back down the tree until you've basically binary searched your way to where the new node is supposed to be.

It turns out that along the way, some of those swaps cause localized parts of the tree to become invalid, and then you have to go back and fix those too. But that only happens in the area of nodes you've swapped, and you can keep track of what needs fixing as you do the initial swapping.

Worst case insert ends up being roughly 2 log n swaps (up and then back down), plus a little bit more for some possible fixups, so still O(log n).
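Just to show the end goal, here's a naive sketch. This is not the swap-based insert described above, only an illustration of the target shape: it builds the packed, BST-correct array from values that are already sorted, and then binary searches it with the same index math a heap uses.

    def subtree_size(i, n):
        """Number of nodes in the complete tree of n nodes rooted at index i."""
        if i >= n:
            return 0
        return 1 + subtree_size(2 * i + 1, n) + subtree_size(2 * i + 2, n)

    def build_packed_bst(sorted_vals):
        """Place sorted values into a complete-binary-tree array so the BST
        property holds (left subtree < node < right subtree)."""
        n = len(sorted_vals)
        tree = [None] * n

        def fill(lo, hi, i):
            if i >= n or lo >= hi:
                return
            left = subtree_size(2 * i + 1, n)   # how many sorted values go to the left subtree
            tree[i] = sorted_vals[lo + left]
            fill(lo, lo + left, 2 * i + 1)
            fill(lo + left + 1, hi, 2 * i + 2)

        fill(0, n, 0)
        return tree

    def search(tree, key):
        """Binary search straight down the packed array, no pointers needed."""
        i = 0
        while i < len(tree):
            if tree[i] == key:
                return i
            i = 2 * i + 1 if key < tree[i] else 2 * i + 2
        return -1

    print(build_packed_bst([1, 2, 3, 4, 5]))              # [4, 2, 5, 1, 3]
    print(search(build_packed_bst([1, 2, 3, 4, 5]), 3))   # 4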

I haven't worked out node removal yet, but the concept is basically the same, just backwards: you know which node you want to delete, and you know the spot at the end of the tree that has to be freed up, so you just have to swap up and down the tree until you get there.

I'm still working on this, but this is my basic idea. Apparently data structures are not patentable, so I'm marking my stone in the sand here as having come up with a usefully neat, compact binary search tree data structure.

Gotta go, my kid just woke up.

Naming things.

August 25th, 2020

They say the two hardest things in software development are 1) naming things, 2) cache invalidation, and 3) off-by-one bugs.

 

Amazon sells a product: amazon web services.

What is it?

It’s a bunch of software you can pay to use.

 

Google sells a product: google cloud platform.

What is it?

A bunch of software you can pay to use.

 

Microsoft sells a product: azure.

What is it?

A color.

 

I’ve been writing software for 40 years now, and I’ve seen a lot of things come and go. But I’ve noticed a trend towards the need to name things.

In the good old days, we used to write software. We wrote functions that did things, and we strung them together into programs.

 

Now we actually have a name for the fact that we need to name everything: “Patterns.”

A pattern isn’t an algorithm, it’s a word to explain that we categorize our algorithms.

So there's the this pattern, and the that pattern, and because people can't just say "well, that's not a good idea," they instead had to name that concept too and call it an anti-pattern.

I think it’s all a bit silly and a waste of time given how hard it is to name things in the first place.


Open source code reviews

May 23rd, 2020

I don’t have too many good things to say about open source, but where there is something, I’m happy to admit it.

Lots of good comes from the open source movement. It keeps people off the streets for example.

ha ha.

A lot of great software and tools have come out of the open source effort, but an unbelievably massive amount of lousy software, broken systems, and bar-lowering attitudes towards software quality also comes from there. I'm not saying open source is completely to blame, but it certainly isn't helping.

One of the popular arguments for open sourcing software is that it allows interested third parties to code review the software for bugs, security holes and other problems.

I have a lot of things to say about code reviews, but that’s for another book.

I hear the phrase “code review” and I think of all the programming jobs I’ve had where sometimes in a group setting people get to critique your software for various qualities.

My personal view is that code reviews should be for other team members to familiarize themselves with new code you’ve written, and to look for and find bugs. Possibly in trying to understand how new code is supposed to work, and asking questions, a team member might uncover a bug the programmer didn’t spot.

That to me, is the ideal.

But nothing like that actually happens. No two programmers will ever produce the same code to solve the same problem. Everybody has a different style, different handwriting, different experience. The argument over tabs vs spaces is such a common drama that to bring it up is merely to tell a joke so old everybody laughs simply because they know they get to have the tabs vs spaces fight again.

In a business setting, I have found that people's tendencies in code reviews run more towards 'that's not the way I'd do it' than 'tell me if you think this is a bug or not'. I've personally seen code reviews that contribute nothing more than "you should change this variable name to that." I'm trying to imagine any customer that would be willing to pay more for a piece of software because the variable names were changed from one name to another. You could argue that maybe one variable name might be a little more descriptive than another, but not so much that the customer would be willing to pay that much more for it, since it's now eaten up developer time and increased the cost of producing the product.

This endlessly frustrates me, because time that could be spent testing and finding bugs is instead spent dwelling on variable names and source code formatting.

But this is where open source really shines.

It’s hard enough to get anybody to code review anything for free. So if there’s something really worth code reviewing, the reviewer is likely to spend their time looking for bugs or fixing a problem they found, not complaining about formatting or variable names.

This Is Genius. It’s a self solving problem.

Perhaps a lesson can be learned from this.

Software developers get paid to write software; perhaps they should have to volunteer their time to do code reviews.

Now, will they volunteer their time? Probably not, but perhaps there is another way to incentivize people to do code reviews for no compensation. Like making it mandatory. Like suffering a penalty if you don't. Or maybe there's some backhanded-compliment kind of thing that could be done that would still drive the need to do code reviews, but only enough to do them the open source, find-bugs way, not the corporate America, please-change-your-code-to-suit-my-tastes way. Oh, and by the way, after you change code you've invalidated all your testing, so you need to test it all over again; so if you did test anything before, that was a waste of time, because somebody asked you to change a variable name.

I’ll have to think about this some more, but this is clearly something the open source crowd does better than the paid business crowd.

How to make printer ink cheaper.

March 1st, 2020