Archive for December, 2011

A solution to the Wikipedia problem.

Saturday, December 17th, 2011

I just came up with a solution to the Wikipedia problem. Every year Wikipedia goes on about the millions of dollars it needs to keep running. Wikipedia is a volunteer effort, just like the local volunteer fire department: the labor is free, but the equipment and resources are what cost money. Most people don’t have fire equipment to donate to the local fire department, but they do have computers that sit idle most of the time.

Wikipedia is the perfect system to run as a world-wide distributed application. People volunteer content, and they can volunteer CPU and disk too. How big could all the Wikipedia data possibly be? It’s mostly text. You know there’ll be some geeks out there more than happy to host copies of the entire thing, and everybody else who contributes disk and CPU (just by running a little application on their PC) would host caches of sections of the whole database. Not outrageous to imagine, and given the state of peer-to-peer systems nowadays, not that hard to do. If Wikipedia started building that system and transferring all their current data to it, they’d never have to ask for money again.


— later comments —


Light O’Matic  –  Well, my first thought is that they would have to have some way of protecting the content from just anyone making their own version of it. For example, JavaScript that checksums the page against a per-page hash that is either fetched from a trusted server or derived cryptographically from a master key held by a trusted server. My second thought was that they’d either have to make the whole wiki editing system work distributed, or keep editing centralized. Then I realized there are actually a lot of systems out there that at least partly solve these problems, and maybe one of them solves it completely…
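The per-page checksum idea is simple enough to sketch. A minimal version, assuming a hypothetical trusted server publishes a manifest mapping page titles to SHA-256 digests (the manifest and page contents here are made up for illustration):

```python
import hashlib

# Hypothetical manifest, fetched from a trusted Wikipedia server:
# maps page title -> expected SHA-256 hex digest of the page content.
trusted_manifest = {
    "Volunteer_fire_department": hashlib.sha256(b"...page text...").hexdigest(),
}

def verify_page(title: str, content: bytes, manifest: dict) -> bool:
    """Return True if a peer's cached copy of a page matches the trusted hash."""
    return manifest.get(title) == hashlib.sha256(content).hexdigest()

# A peer serving an unmodified copy passes; a tainted copy fails.
assert verify_page("Volunteer_fire_department", b"...page text...", trusted_manifest)
assert not verify_page("Volunteer_fire_department", b"tampered text", trusted_manifest)
```

The catch, of course, is that you still need one trusted place to serve the small manifest — the scheme distributes the bulk of the data, not the trust.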

Stu M  –  Well, firstly, realize that there wouldn’t be much point in putting up fake copies of your section of the database, because… you can just edit the real thing. The effect is the same. But yeah, you could make it easier with trusted servers. What happens now? There are people who scour the change-history list and edit, validate, remove, and stop flame wars. The same thing would happen here, but the changes would have to propagate around instead of all being in one place. Not trivial, but I think in the case of Wikipedia it’s a lot easier than, say, bank records.

Light O’Matic  –  They could distribute it with git… But maybe it would be simpler to just distribute reads and keep writes centralized. More of a caching scenario. The problem with people being able to modify their copies of pages is that, I’m assuming, any given page can be served from a lot of different places, so if one or some of them have tainted versions, it might take a while to even notice. Then you’d need a system to remove that bad data. Whereas now, if you edit a page, everyone sees it; it’s very clear what happened. If I can serve any data I want and pretend it’s from Wikipedia, I could serve a worm or a virus in otherwise totally legit-looking pages. So there has to be protection.

Stu M  –  I suppose you could go with the ‘signed by one of the trusted authorities’ type of thing, which would mean certificate-like data included with all changes. The trust would come from a top-down delegated authority: the root ‘certificate’ would be signed by Mr. Wikipedia himself, and everybody in the chain would be trusted by him or by the person above them in the chain.
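The delegation chain can be sketched with a toy signature (a keyed hash standing in for a real asymmetric signature like RSA or Ed25519 — a real deployment would need public-key certificates so verifiers don’t hold secrets; all keys and names here are invented for illustration):

```python
import hashlib

def toy_sign(secret_key: bytes, message: bytes) -> str:
    """Toy stand-in for a digital signature: hash of key + message.
    In reality you'd use an asymmetric scheme so peers can verify
    without knowing the signing secret."""
    return hashlib.sha256(secret_key + message).hexdigest()

def toy_verify(secret_key: bytes, message: bytes, signature: str) -> bool:
    return toy_sign(secret_key, message) == signature

# Root ("Mr. Wikipedia") delegates authority by signing an editor's
# identity; the editor then signs each change she makes.
root_key = b"root-secret"
editor_key = b"editor-secret"

delegation = toy_sign(root_key, b"editor:alice")   # root vouches for alice
change = b"page=Example rev=42"
change_sig = toy_sign(editor_key, change)          # alice signs her change

# A peer accepts the change only if both links in the chain check out.
assert toy_verify(root_key, b"editor:alice", delegation)
assert toy_verify(editor_key, change, change_sig)
```

The point of the chain is that peers only need to trust the root key; everything else — editors, and editors of editors — is vouched for link by link.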


I have invented the fastest computer in the world.

Thursday, December 1st, 2011

The super-zippy, multi-core, crazy-fast microprocessor in your computer spends well over 99% of its lifetime doing absolutely nothing.

On the rare occasion when you manage to keep it a little busy, you might hear the fan in your PC or laptop spin a little faster, but by and large your processor is idle most of the time.

What a waste. Most of the time the computer just sits there and hums, waiting for you to read a web page or your email and click the next button.

The problem, though, is that when you DO click something, you want it to respond quickly. So you have this incredible amount of processing capacity at your fingertips that dances like crazy for a few fractions of a second every few minutes and sits there useless the rest of the time.

But I have a solution. “What’s the problem?” you’re probably asking yourself…

I have designed a processor that takes all that idle processing capacity, stores it up, and then blasts through it when you want the computer to do something. This way you can buy a lower-capacity processor that performs much better than the current top-of-the-line screamer. It can be had for a lot less money, and you can expand it to store more idle capacity for far less than the cost of a new processor or a new computer.

If your processor fills up its capacity cache, you can sell the excess to big-company server farms, which are always wanting for more capacity, or even “push” it over to your iPhone or Android machine. The market for this cache trade will be astronomical in size as more and more systems come online and Intel and AMD become less capable of enacting Moore’s Law.

You read it here first.