I think I have invented a new data structure, the stree.

September 5th, 2020

It’s hard to imagine there is a data structure that hasn’t been invented yet. wikipedia around and you’ll find dozens of oddly named trees and graphs and heaps and so on. All the great ones were invented or discovered (however you see it) in the 60s, and here we are 50-60 years later, you think that’d all be done by now.

 

So here, on 9/5/2020 I offer up the Stree.

It’s rather simple which is why I’m guessing somebody’s already done it at least in some form.

It combines a binary search tree with J. W. J. Williams’ binary heap.

The binary heap is a really neat data structure because it is tightly packed (‘complete binary tree’) and sorts pretty quickly, and you can find the child or parent of a node with simple math, if it is stored in a linear array of memory. Genius and brilliant at the same time.

As with all old things people have improved on it (Floyd came up with a faster build-the-initial-tree scheme) and I hope somebody can improve upon my idea, if in fact there isn’t one like it already.

I was looking for a tightly packed binary search tree and I couldn’t find one. So I thought about it for a bit, kinda wondering why it didn’t exist and cameĀ  up with this:

The binary heap makes the tightly packed tree by always only adding a new node at the end of the array of memory, and swapping nodes around until the tree is ‘correct’. In the case of the binary heap, correct is the parent key is greater than the child key. Doing a postorderĀ  traversal of the tree will yield a sorted list (for a max heap tree).

The problem with post order traversal is that you can’t binary search it, or at least I can’t because the tree is sorta turned 90 degrees, and you can’t search sideways.

So how do we make it a binary search tree?

Kinda the same thing the binary heap does, but instead of swapping nodes on insert until the tree is max-heap correct, you swap nodes until it is binary search tree correct.

This is not as simple as making a max heap correct tree, but it’s not that much more complicated.

Since a binary heap is lopsided to favor bigger numbers as you go up the tree, you only have to go up once to find the final position of whatever number you are inserting.

A binary tree has one of its middle values at the top, so when inserting a new value into an stree, you have to swap values up to the top of the tree and then swap values back down (sometimes reswapping something you’ve already swapped) until you find the correct final resting place for the newly inserted value.

There are some other problems, and there are optimizations to be had, I haven’t worked out every single case yet, but I think it’s workable.

What happens, is like a binary heap, you insert by adding a new value at the next available space at the bottom/end leaf of the tree, or make a new row if the bottom row of the tree is full, and then start swapping nodes up until you get to the top. If you’ve found your final resting place (by comparing to the root node, based on what side you started on) you’re done, if not, you start working your way down the tree again until you’ve basically binary searched your way to where the new node is supposed to be.

It turns out that along the way, some of those swaps cause localized parts of the tree to become invalid, and then you have to go back and fix those too, but that only happens in the area of nodes that you’ve swapped, and you just keep track of things that need to be fixed as you’re doing all your initial swapping.

Worst case insert ends up being 2 * O(log n) plus a little bit more for some possible fixups.

I haven’t worked out node removal yet, but the concept is basically the same but backwards, you know which node you want to delete, and you know the spot in the tree that has to be freed up at the end, so you just have to swap up and down the tree until you get there.

I’m still working on this, but this is my basic idea, apparently data structures are not patentable, so I’m marking my stone in the sand here, as having come up with a usefully neat compact, binary search tree data structure.

Gotta go, my kid just woke up.

 

 

 

 

 

 

 

 

 

naming things.

August 25th, 2020

they say the two hardest things in software development is 1) naming things, 2) cache invalidation and 3) off by one bugs.

 

Amazon sells a product: amazon web services.

What is it?

It’s a bunch of software you can pay to use.

 

Google sells a product: google cloud platform.

What is it?

A bunch of software you can pay to use.

 

Microsoft sells a product: azure.

What is it?

A color.

 

I’ve been writing software for 40 years now, and I’ve seen a lot of things come and go. But I’ve noticed a trend towards the need to name things.

In the good old days, we used to write software. We wrote functions that did things, and we strung them together into programs.

 

Now we actually have a name for the fact that we need to name everything: “Patterns.”

A pattern isn’t an algorithm, it’s a word to explain that we categorize our algorithms.

So there’s the this pattern, and the that pattern, and because people can’t just say “well, that’s not a good idea.” they instead had to name that concept and call it an anti-pattern.

I think it’s all a bit silly and a waste of time given how hard it is to name things in the first place.

 

 

open source code reviews

May 23rd, 2020

I don’t have too many good things to say about open source, but where there is something, I’m happy to admit it.

Lots of good comes from the open source movement. It keeps people off the streets for example.

ha ha.

A lot of great software and tools have come out of the open source effort, but an unbelievably massive amount of lousy software, broken systems and bar-lowering attitudes towards software quality also come from there. I’m not saying open source is completely to blame, but it certainly isn’t helping.

One of the popular arguments for open sourcing software is that it allows interested third parties to code review the software for bugs, security holes and other problems.

I have a lot of things to say about code reviews, but that’s for another book.

I hear the phrase “code review” and I think of all the programming jobs I’ve had where sometimes in a group setting people get to critique your software for various qualities.

My personal view is that code reviews should be for other team members to familiarize themselves with new code you’ve written, and to look for and find bugs. Possibly in trying to understand how new code is supposed to work, and asking questions, a team member might uncover a bug the programmer didn’t spot.

That to me, is the ideal.

But nothing like that actually happens. No two programmers will ever produce the same code to solve the same problem. Everybody has a different style, different handwriting, different experience. The argument over tabs vs spaces is such a common drama that to bring it up is merely to tell a joke so old everybody laughs simply because they know they get to have the tabs vs spaces fight again.

In a business setting, I have found people’s tendencies in code reviews tends to be more about ‘that’s not the way I’d do it’ than ‘tell me if you think this is a bug or not’. I’ve personally seen code reviews that contribute nothing more than “you should change this variable name to that.” I’m trying to imagine any customer that would be willing to pay more for a piece of software because the variable names were changed from one name to another. You could argue that maybe one variable name might be a little more descriptive than another, but not so much that the customer would be willing to pay that much more for it, since now it’s eaten up developer time and increased the cost of producing the product.

This endlessly frustrates me because time that could be spent testing and finding bugs it instead spend dwelling on variable names and source code formatting.

But this is where open source really shines.

It’s hard enough to get anybody to code review anything for free. So if there’s something really worth code reviewing, the reviewer is likely to spend their time looking for bugs or fixing a problem they found, not complaining about formatting or variable names.

This Is Genius. It’s a self solving problem.

Perhaps a lesson can be learned from this.

Software developers get paid to write software, perhaps they should have to volunteer their time to do code reviews.

Now will they volunteer their time? Probably not, but perhaps there is another way to incentivize people to do code reviews for no compensation. Like making it mandatory. Like suffering a penalty if you don’t. Or maybe there’s some backhanded complement kind of thing that can be done that would still drive the need to do code reviews, but only enough to do them the open source/find bugs way, not the corporate america/please-change-your-code-to-suit-my-tastes way. Oh and by the way, after you change code, you’ve invalidated all your testing, so you need to test it all over again, so if you did test anything before, that was a waste of time, because somebody asked you to change a variable name.

I’ll have to think about this some more, but this is clearly something the open source crowd does better than the paid business crowd.

 

 

 

How to make printer ink cheaper.

March 1st, 2020

Planning and itches.

October 31st, 2019

I started writing a kernel module for linux recently. I’ve written small proof-of-concept kernel modules before, but nothing of any substance.

This time I wanted it to do something.

I have a friend who’s a linux kernel developer who’s been helping me out. You can’t really appreciate how different a beast it is to write software for the linux kernel, until you actually try it, or spend hours sitting with somebody who is.

I have not written or tried to write a kernel module for windows, but I have seen the source code for one and have code reviewed bits of it and had some of it explained to me by the author.

In some ways these two environments are similar and in some ways very different.

And it dawned on me, that deep down, the reason for this is, because the windows kernel was designed by people with experience in designing operating system kernels and linux is the culmination of thousands of scratched itches.

There is decent tooling for debugging the windows kernel, with an actual debugger, there’s actually two.

There is none for the linux kernel. It’s a big itch to scratch and nobody has yet been up to the task.

I’ve met some linux kernel developers and have heard them give talks. Despite all my experience and generally low opinion of software developers as a group, I for some reason thought that kernel development was a step up. You had to really have your shit together to endeavor to work on the linux kernel.

Turns out I was wrong, and you’ve got the same barrier to entry for writing software for the linux kernel as you do for writing any other kind of software: None.

No license or degree or certification is required. Any moron can write software for the linux kernel and if it scratches a sufficiently annoying itch, you have a chance of getting it accepted. To be fair, there is a mailing list where things like kernel patches are discussed. Actually there’s many mailing lists, because even the kernel developers who lack the sense to use a real messaging or ticketing system appear to acknowledge the fact that it is a good idea to split up discussions into groups.

The fact that it is 2019 and the linux kernel exists, despite being the sum of many dermatological challenges, organized by a messaging system dating from the 1970s, shows the true grit of humanity.

It is this same grit that allowed the space shuttle to come into existence.

It is pathetic to compare the two though.

 

 

 

A theory on why the mess is.

August 6th, 2019

I don’t usually foray into politics, but one thing struck me a few years ago as to why we appear to be sitting in a mess of politics that we haven’t had to deal with in a long time.

It’s only a theory, but its point seems to make sense to me.

The current generation of people in government are unique in the position that their generation has never seen real hardship.

They’ve not suffered a world war, nor have they been drafted to go to vietnam.

They’ve suffered no mass plague, or invasion. The economic recessions they’ve been alive for have largely been borne by their parents. The people who suffered in 2008-2009 maybe haven’t made it to the running-the-world age yet or are a small enough fraction that it doesn’t encompass the body of people who work in government.

But the basic idea is this: when there was a bigger enemy, or a bigger problem to deal with, you had less energy to spend on fighting your enemies to get your way, and maybe were more inclined to see the perspective of others, since you weren’t so dissimilar as you had common experiences (like a world war).

But now we have a whole generation who knows nothing but good times. No wars, or famines or economic depression. They’re so used to having the good times go their way, that they have less of a sense of co-operation and compromise. Why should they submit to anything being but the way they want, as that’s all they’ve known all their lives.

So you end up with factions of people who refuse to give in and let anything happen that isn’t the way they want it to be.

Just a thought.

 

The unnecessary size of the human brain.

July 16th, 2019

This dawned on me a few weeks ago, and I don’t think I’ve really solidified it into a solid idea yet, so this might not quite make sense.

A few years ago I was watching a turkey in my backyard and I had suddenly figured out after decades, why some birds bob their heads forwards and backwards when they walk.

Ever since I was a kid in brooklyn and I saw pigeons always bobbing their heads when they walked, I always assumed that there was some bone linkage in their bodies that in order to move their legs, the structure of their skeleton required that they move their head forward and back with each step. Hey, I was five.

But I was watching this turkey against the backdrop of the woods behind it, and because of that perspective I was able to make this amazing insight: It had nothing to do with bone linkage, it had to do with keeping its head still.

I found this to be a profound insight, mostly because it jarred my 5-year-old’s version of a deeply held understanding, only because I hadn’t really thought about it since I was 5.

So then I thought: well, why does the turkey need to keep its head still?

The first thing that came to mind was that its brain was too small to process all the moving data, and it had to sit still so it could get a grasp in its little head of what was going on around it. I figured the poor turkey only saw the world in snapshots, and didn’t absorb what was going on while its head was moving forward.

So apparently I hadn’t progressed much beyond my 5 year old mentality.

When I was 5, there was no internet, but now there is, so I looked it up.

And I was wrong, but I was close.

It is true that the bird keeps its head still so it can get a non-moving view of the world, but not because its brain is too small, but because it is much easier to spot movement of predators when you’re not moving.

I always wonder how scientists know things like that. Seems like a reasonable answer, but somehow we trust them to be right.

But maybe I wasn’t completely wrong, the reason birds can’t see predators moving against the background while they are moving, is because their brains are too small and can’t process all that information. A computer could do it.

But that got me thinking, it’s pretty amazing what a bird with a brain the size of a pea can pull off.

Which makes you wonder, given the size of the human brain, you’d think we’d be able to do way more stuff than a bird, and we can, but not proportional to the the amount of extra hugeness the human brain is compared to a bird’s brain.

So what does all that extra brain do?

Well, it has feelings, it has memories, it has instincts. It has biases, oxford commas, and prejudices. It has schizophrenia, defense mechanisms, a sense of music and rhythm. It has the ability to perceive moving things while it itself, is moving. It has right-brain control of muscles that allow you to breath, chew gum, tap your head and rub your stomach, all at the same time.

And it has self awareness and intelligence. This is the part where my idea hasn’t completely been flushed out yet.

But the basic idea is that evolution made a lot more brain than needed to be self aware, and have the intelligence it has, and along the way it picked up a lot of baggage that it doesn’t need, but worked well enough to get by until the intelligence part showed up.

There is a lot of unnecessary human brain, doing all sorts of unnecessary things.

And these things may have been necessary to get to where we are, but they don’t always help now.

 

The internet is the digital version of globalization.

March 2nd, 2019

It used to be that you could buy a computer, run software on it an it worked, and it stayed working forever.
I saw a guy at a hertz rental place in the early 2000’s running the truck rental service on an ibm pc xt with a green screen and an epson dot matric printer and everything and it all just worked.

But now everything is on the internet.

So you can no longer be sure that any software you have will continue to work as long as your computer does because it has to talk to other computers that might be changing in some way.

This happened to me (again) today.

I have a machine that has a dynamic ip address and it uses inadyn to update the ddns server at afraid.org so that I can find my machine when I want to.
But the dynamic ip changed at some point and I could no longer get to the machine. I eventually got access to it and found that inadyn was no longer able to talk to the ddns server:
“The response of DYNDNS svr was an error! Aborting.”
Like all good software it told me what the problem was in detail and how to fix it.

After some rummaging around I found that the server inadyn was talking to was no longer supporting the old version of inadyn I was using, there was a new protocol and my old old version didn’t speak it.
Because I’m attached to the global supply chain, I can no longer expect things to remain working, because some other parts of the global supply chain might change and I’ll have to change as well to keep things working.

So I said, okay, I will fix it.

I downloaded the latest version of inadyn and tried to build it.
inadyns requires libconfuse.
So I download that, and try and build it, but it requires gettext.
So I build that. That works.
Then I go to build libconfuse again and it fails with some lexer problem.

I download a version that’s a bit older than the newest and built that, it builds.
Then I go back to inadyn and it builds too.
I install it, and run it, and it says… it requires a newer version of gnutls than I have.

So I download gnutls and try to build it.
It says in requires nettle 3.4. So I download that and build it. It builds.
I try to build gnutls again and it says it still requires nettle 3.4.

I google and there’s a few answers on stack overflow, but none solve my problem.

At this point I stop and I wonder what the purpose of all this is.
Somewhere at the bottom of this chain of rabbit holes I expect there will be a circular dependency making it impossible to get working.

At this point some of you are wondering “what kind of machine is this that you can’t just use the package manager to get the latest package.” It doesn’t matter, that’s not the point. It’s all broken. It’s a pile of houses of cards stacked on top of each other.

One of the more amusing points was when I noticed libconfuse titles itself thusly: “Small configuration file parser library for C.

I am libconfused as to why it takes so long to build something small and why there are so many source files just to parse a config file. Or maybe what they’re saying is that it can only use small configuration files, and large configuration files are beyond it. It still shouldn’t take that long to build. I have a small c++ class I use to read config files. It’s about 100 lines long. I can compile it faster than I can hit the period at the end of this sentence.

If it’s 2019 and we can’t make this a simpler process, then maybe it’s not worth doing at all.
But it doesn’t matter whether I like it or not or whether it works or not, because we are all part of this global interoperable supply chain that now requires you keep up to date or no promises that it will continue to work.

For really important things (ie, systems where money is traded) apparently there’s some notification system to alert you to impending breaking changes, but for anything that isn’t about transferring money, you just better keep up all the time, or suffer unexpected compatibility failures when somebody else decides to break something you set up years ago and left running because it worked.

 

SFINAEIBP

February 5th, 2019

Substitution failure is not an error is bad programming, in my opinion.

It seems to me that if you are making special cases for different classes in a template, then you’ve clearly missed the point of templates and are using them incorrectly. Templates are supposed to apply a concept or algorithm uniformly to a class. A vector, or a hash, work on objects of any type, uniformly.

If you’re SFINAEing, then what you really want to be doing is make a base class and derive other classes from it, each having traits specific to that class. That’s the very definition of what object oriented programming is for.

By taking advantage of a hack to cover a language flaw that serves no purpose but to supply entries for ‘the most heinous error message to come out of a c++ compiler’ contest, you’re being cool, but you’re not being a good programmer.

 

Google Shark Jumping

December 17th, 2018

In 2003, google says: “Seth Godin Says Google Has Officially Jumped the Shark”

I think that’s kind of a personal decision.

I think google only recently jumped the shark for me.

Google, having amassed vast amounts of information about every or at least lots of individuals can be said to jump the shark for different people at different times depending on the amount and type of data they have for a particular person and how they use it and the results that gathering that information for a particular person has had.

For me, google just jumped the shark.

A few weeks ago, probably months ago now, I forget when it was, google stopped updating the news headlines on their “google news and weather” app.

This is kinda funny because I remember them doing that once before as well, forcing me to abandon my favorite news app for something ‘better’.

Well this is the second time they’ve done that, maybe third time’s the charm.

But it wasn’t, it was the time they jumped the shark.

I replaced the “google news and weather” app with the “google news” app like a good little sheep, just like they told me to. Funny how removing weather from the app somehow was supposed to make it better.

Anyway.

A friend of mine just asked me about something related to politics and I pointed out how I don’t read too much about politics, but it made me realize that this new app shows me lots more in the way of news articles and most of them are political.

There’s the “just for you” page, and the “latest” page which are nearly identical and filled with lots of the latest political hoo-ha. I will admit to reading some of it, but not very much.

But I realize I don’t read many non-political articles, because it just doesn’t show me very many.

I have to look a number of pages in to get an article that is just a current news story about something that isn’t politics.

Then I realized, that this app never shows me sports news. That’s fine, I never follow sports, and I never click on articles, so google got that one right.

But then I thought about it some more and realized, that I do read a relatively large percentage of ‘technology’ articles, relative to all the news I read. And I realized that lately, most of what I’ve seen show up in the technology section of the news app is about games.

I’m not a gamer, I really don’t care about fortnight or why I should click A-B-B-A or this or that game. I have clicked through many more technology articles than political articles, and none of them were about games, yet that’s all google shows me now.

And I realized… google has jumped the shark. They have so fined-tuned their understanding of my interests in news articles that they can no longer show me news articles I’m actually interested in.

Congratulations google, you’ve peaked, you’ve surpassed maximum, you’re on the downside of the hill.

I can’t wait to see who’s going to replace them with a small shell script.