Peak ZFS

October 30th, 2017

I went to the open zfs developers conference last week and I learned a lot.

I’ve gone for a few years now, and I thought this year was the most interesting. Maybe it was only 2 years, I forget. Anyway, all the talks were technical and were about useful features and things that could make zfs better. I didn’t notice how quickly the whole day went by. It was a lot to take in, but it was all zfs all the time, and if you’re into that sorta thing, it was a lot of good.

First of all I have to mention one very noteworthy moment. Every speaker got well deserved applause after their talk, but there was one guy, lundman, who got applause just for saying what he did, and what he said was this: “I ported zfs to windows.” And he showed it and it worked. Truly a moment to behold. It was amazing.

But after listening to a few of the talks I started to notice something, and that something is that I think we’re approaching Peak ZFS.

We, as in not me, but the we of all the actual zfs developers, of which I am just a wannabe, so my opinion has no merit. But the web being what it is, and thinking of my favorite line from the movie Dark Star, “A concept is valid regardless of its origin,” you can choose to disagree with me, but you can’t tell me what I’m saying is wrong just because I’m not a zfs developer.

ZFS has had a good run so far, it is 10-15 years old depending on how you count and it has come a long way and it does a great many amazing things, but like all software, if you keep adding to it, you’re eventually going to end up with exception upon exception upon exception that wasn’t in the original design, that has to be taken into consideration when adding new features. And any new feature you add will be an encumbrance to any future features added, so you should be careful with what you add and how you add it so as to minimize the future pain everybody’s going to have to suffer.

But that’s not what’s going on.

There are currently 3 prefetchers in zfs. Nothing wrong with prefetchers, nothing wrong with three of them either, but it is noteworthy that there are three and not one.

There are currently 2 log writing systems, one for the zil and one for the spacemaps. Matt suggested adding another one to make dedup faster (a welcome feature if there is one) and Sarah from delphix suggested another log to optimize clone deletes. Possibly a less popular use case, but valid nonetheless.

Yet this will yield 4 separate and different log implementations. Is the zil anything like dedup? No, but a log is a log, and maybe it would be in somebody’s interest to save the future from the present and consolidate the logging concept into a subsystem that can be shared by all of the things that need to log things to disk for optimization purposes. All 4 of the logs (with the possible exception of spacemaps) are optimizations, and now there are 3 or 4 of them.

But the icing on the cake was this one:

George was suggesting a feature to compensate for the performance hit caused by 512 byte sector emulation on 4k sector drives. If you know anything about the world of storage, you know this is a noticeable performance problem, and not just to zfs.

But in my opinion, it’s also a problem that’s going to go away by itself, and it got me wondering if it’s really worth adding another bit of code that will probably be in zfs forever, to compensate for a temporary problem. And then I realized there were a few other features that fall into this category.

Gang blocks exist to solve the problem of zfs not dealing well when it’s running low on space.

Seems to me if you’re running a data storage system large enough to justify a filesystem that can store a zettabyte, it’s hard for me to imagine you running out of disk, and if you are, you’re probably not doing your job very well. But from now on and forever more every zfs developer has to work around gang blocks because it seemed like a worthy goal at some point to sub allocate blocks of storage to deal with low-availability situations.

Somebody pointed out that the 512 emulation problem may go away, but someday it will be replaced by a 4k->16k emulation problem. Fair enough, but I say again, if you’re running an important enough system that you require zfs, you should be able to make sure your pool is filled with disks of the same type. And if you need to move to 16k sector disks, then you make a new pool of them and figure out a way to migrate your data, not make every future zfs installation suffer the cruft of dealing with this one edge case that most of the time, nobody will experience.

Hacking more and more exceptions into zfs isn’t going to help anybody in the long term, but that’s what’s happening, and there’s no SUN in control to keep it from getting out of hand, which it seems to me, it already has.

I love zfs and probably always will, and it’s hard to imagine something cooler coming along anytime soon. But I’ve been doing this software stuff for 30+ years and I’ve seen this kind of cruft pile up over and over and over. It’s inevitable, and there’s nothing you can do to stop it, but you can slow it down by taking a step back and thinking about what’s really worthwhile and what can be lived without to make it last as long as possible.

Now the real answer is to “write one to throw away,” which means starting over with the knowledge of all the lessons learned and leaving out things you no longer need (sendmail being able to send mail via carrier pigeon comes to mind; by the way, last time I looked sendmail’s main() function was 3000 lines long).

But that can’t happen. It’s called btrfs and it didn’t fly, or is slowly heading towards a landing or something. You can’t replace zfs. If anything, I predict somebody will come along and fork zfs, remove all the stuff they don’t need, and maybe some other people will pick up on it and it will become the new popular zfs. But it seems unlikely a new upstart will come out of nowhere and win.

If you see how open source projects come together, it’s easy to see why zfs was awesome when managed by sun and eclipse was awesome when managed by IBM.

No disrespect to any of the current zfs developers, they’re probably the most brilliant collection of developers there is, but being open source there’s nobody running the ship with an iron fist like a company could, and in my opinion, it’s starting to show.


The third level of opposable thumb

September 23rd, 2017

What sets man apart is the opposable thumb. Man can grasp or pinch.
But that’s not all of it. With your hand you can grasp something. But you also need two arms, or more specifically two sets of pincers so that you can operate on something while holding it in place.
For example unscrewing the top off a jar.

But that’s it?

Why isn’t there a third level of pincer that allows you to hold a more complex thing in place while holding a simpler thing in place while operating on a third single part.

The answer I assume would be, “because two is good enough.”
It just struck me as odd that there are only two.

How many cool tools could we make and use if we had a third level of pincers.

What to do about identity theft.

September 20th, 2017

I’m not a tax guy but at this point it seems to me that we are forever more in our lives going to be the subject of id theft as soon as a thief gets around to us.

The information is available, it’s just that there are lots of people to steal from.
Eventually the smarter thieves will work out the most efficient ways of taking advantage of all of the information available to them, so the people with better credit scores will be targets sooner than people with bad credit scores.
So I suppose one idea is to have a lousy credit score.
Anyway. We talked about this at work and account freezes are a good idea. I’ve had freezes on my credit score accounts for a while now. I’d recommend this for what it’s worth, which turns out to be little.
The security that keeps a thief from unfreezing your account is a pin you get at freeze time.
You better not lose that pin or…. you’ll have to ask them to send it to you.
And all they require for them to send it to you, is the same information the thieves already have stolen. And to boot, somebody at work found the page and for experian at least, they’ll send the pin to any email address.
We’re all doomed.
So anyway, the point is to make it as hard as possible for the thieves so they pick on somebody else first, so freeze your accounts.
But back to my original idea.
One of the things thieves can do is steal your tax refund by filing before you do.
One simple way around this is to avoid having a refund.
Up your deductions, and start a savings plan with this newfound money so you can afford your tax bill. It may be annoying, but at least the thieves can’t get it.
And as the CPAs always say “You shouldn’t be giving the government an interest free loan anyway.”
Thoughts? Other suggestions?

Different base number set conversions.

September 3rd, 2017

My other kid is fascinated with numbers. So he asked me to make him a page that would convert numbers between any bases.

So I picked the reasonable base 2 to base 36 range.
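
For the curious, here is roughly how such a converter can work. This is a minimal C++ sketch, not the code behind the page: the names `parse_in_base`, `format_in_base`, and `convert_base` are made up for illustration. It assumes unsigned values that fit in 64 bits and an ASCII character set, and it does not check for overflow. The idea is to parse the digits into a plain integer, then peel digits off in the target base.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Digit alphabet for bases 2..36: 0-9, then a-z for the values 10..35.
static const char kDigits[] = "0123456789abcdefghijklmnopqrstuvwxyz";

// Parse `text` as an unsigned number written in `base` (2..36).
// Assumption for this sketch: the value fits in 64 bits (no overflow check).
std::uint64_t parse_in_base(const std::string& text, int base) {
    if (base < 2 || base > 36) throw std::invalid_argument("base must be 2..36");
    if (text.empty()) throw std::invalid_argument("empty input");
    std::uint64_t value = 0;
    for (char ch : text) {
        int digit;
        if (ch >= '0' && ch <= '9') digit = ch - '0';
        else if (ch >= 'a' && ch <= 'z') digit = ch - 'a' + 10;
        else if (ch >= 'A' && ch <= 'Z') digit = ch - 'A' + 10;
        else throw std::invalid_argument("not a digit");
        if (digit >= base) throw std::invalid_argument("digit too large for base");
        value = value * base + digit;  // shift left one place, add the new digit
    }
    return value;
}

// Write `value` in `base` (2..36), using lowercase letters above 9.
std::string format_in_base(std::uint64_t value, int base) {
    assert(base >= 2 && base <= 36);
    if (value == 0) return "0";
    std::string out;
    while (value > 0) {
        out.push_back(kDigits[value % base]);  // least significant digit first
        value /= base;
    }
    std::reverse(out.begin(), out.end());
    return out;
}

// Convert the digits in `text` from base `from` to base `to`.
std::string convert_base(const std::string& text, int from, int to) {
    return format_in_base(parse_in_base(text, from), to);
}
```

Base 36 is a natural upper limit because ten digits plus twenty-six letters gives exactly 36 symbols. With this sketch, `convert_base("255", 10, 16)` returns `"ff"`.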



September 2nd, 2017

If you dig a big hole, and then you take all the dirt you displaced and put it back in the hole, you’ve done a lot of work.

But have you accomplished anything? Have you improved anything?


Now let’s say you dig a big hole in somebody’s flower garden. Something pretty with rows of cutesy little flowers.

Then you take all the dirt you displaced and put it back in the hole.

You’ve done a lot of work. You can even say you’ve accomplished something. You’ve taken things a step back.

This isn’t progress, and it isn’t helpful.



Random letters

August 27th, 2017

My kid liked this page so much I thought it would be easier to run it on my server so it would load more quickly.


Progress in 2017

August 19th, 2017

So it seems I’m not the only one who has this perspective on the software development industry and it is even more rare that he also calls it “progress” just like I do. Although I use the term sarcastically and he says we’re not really making progress, which is the same thing.

So as proof that there are other people who also see the constant churn of the same stuff over and over as a giant waste of time, I give you this guy, I guess his name is uncle bob.

The smackdown:

The follow up:


Unlike me however, he offers some good examples of ways to improve the lot.

I’m starting to think that we’ve reached peak-smart. Basically there’s some bar at which most programmers top out, which is what makes them go redo everything over again in a similar but different shiny sort of way. And I think they do that, because that’s it. They’ve topped out. Getting into machine learning is HARD. Whereas learning yet another silly language syntax to regurgitate the same software is easy and comforting.

Obviously there are some above-the-bar guys because machine learning exists, but I expect these are the same guys who are working on the self driving cars and the voice recognition stuff.

So it makes sense that as problems get harder, fewer and fewer people are able to solve them, and the software developers who can’t, just fall back and find a new javascript library to draw neato webpages in.

Speaking of, I’m really out of date with the web world thank The Great CPU and I’ve been doing back end block device stuff which I love. So I was surprised to find out about AMP.

I think it stands for “we’ve finally made web rendering so bad it’s unusable, so we’re hacking in yet another layer of crap to try and make it perform tolerably well.”

Go ahead everybody, go learn your next javascript AMP compliant library. Have a good time.



It’s a good thing the roman empire died out when it did.

May 8th, 2017

The greeks too, now that I think about it. There are a number of times in human history when people were on the brink of major technological progress. Maybe the timing was wrong, or the wrong people were in the wrong place at the wrong time, or maybe there were just too many stupid people.

But for whatever reason, the industrial revolution didn’t happen until fairly recently, and it’s a good thing.

If the romans had discovered electricity and figured out they could dig up oil for fuel for generators, and for running combustion engines, there wouldn’t have been any left for me now.


C++ should have been called ++C

May 8th, 2017

Except for the odd side-effect, in C, c++ is basically equivalent to ++c.

But in C++ the magnitude of those side-effects is far greater, so much so that in a bunch of stl cases where it matters, it is far more efficient to use ++c than c++. So it seems to me it would have made more sense to recognize this by calling the language ++C instead of C++.
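
To make the difference concrete, here is a small C++ sketch. `CountingIter` is a made-up iterator-like type, not anything from the STL; it only counts how often it gets copied. Post-increment has to hand back the old value, so it copies before advancing, while pre-increment just advances in place. For trivial iterators an optimizing compiler will often remove the difference, so treat this as an illustration of the semantics rather than a benchmark.

```cpp
#include <cassert>

// Hypothetical iterator-like type that counts how many times it is copied.
struct CountingIter {
    int pos = 0;
    static int copies;

    CountingIter() = default;
    CountingIter(const CountingIter& other) : pos(other.pos) { ++copies; }
    CountingIter& operator=(const CountingIter&) = default;

    // Pre-increment (++c): advance in place and return a reference. No copy.
    CountingIter& operator++() {
        ++pos;
        return *this;
    }

    // Post-increment (c++): the old value must be returned, so copy it first.
    CountingIter operator++(int) {
        CountingIter old(*this);
        ++pos;
        return old;
    }
};

int CountingIter::copies = 0;

// Advance `steps` times with ++it and report how many copies were made.
int copies_with_pre(int steps) {
    CountingIter::copies = 0;
    CountingIter it;
    for (int i = 0; i < steps; ++i) ++it;
    assert(it.pos == steps);
    return CountingIter::copies;
}

// Advance `steps` times with it++ and report how many copies were made.
int copies_with_post(int steps) {
    CountingIter::copies = 0;
    CountingIter it;
    for (int i = 0; i < steps; ++i) it++;
    assert(it.pos == steps);
    return CountingIter::copies;
}
```

In this sketch `copies_with_pre(1000)` makes no copies, while `copies_with_post(1000)` makes at least one per step (possibly more if the compiler does not elide the copy on return). When the thing being copied is a heavyweight iterator rather than an int, that is where the cost shows up.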



October 15th, 2016

I don’t get into politics much, but this election season isn’t really about politics so I thought I’d say something.
All the trash about hillary is pretty normal dirty-politician stuff. The difference is that hillary got caught.
At first I just thought she was more incompetent than the rest of the politicians who lie cheat and steal, and just don’t get caught.
Then I realized, she’s just unlucky.
It doesn’t matter who was running for president now, they would have suffered similar humiliations as hillary. With the internet solidifying its place in our society as the source of all dirt, every presidential election from now on will be made more open and public because no server is safe from hacking and computers aren’t going away any time soon.