Conventional wisdom in corporate computing is to invest in sturdy, well-performing hardware. Database servers are a good example. As the business grows, so do the storage pool and the computing demands. Every two or three years, a new, sturdier, more powerful server is purchased. As the database becomes mission critical, efforts are made to make it fault tolerant, and to some degree it works.
Unfortunately, that model doesn’t reach Internet scale in either availability or performance. To reach Internet scale, you do almost the opposite of what is instinctual in corporate computing. Much better availability and performance are achieved by purchasing a bunch of commodity servers and running them as parallel but entirely independent nodes.
It might be easier to understand this idea in terms of something more tangible than computing. Let’s translate Internet scale computing to the world of cars.
Reliability
(1 × Lexus LS) = (4 × Toyota Corolla)
The most reliable car according to Forbes is the Lexus LS, which costs about $60,000. How reliable is it? I don’t know, but it’s slightly less than 100%. Flat tire. Battery drained. Out of gas. Any of the above and the car won’t run. Is the Lexus LS more reliable than a Toyota Corolla? Certainly. But what about four Corollas? At $15,000 each, you could buy four Corollas for the price of the Lexus. What’s the probability that when you leave for work in the morning, none of the four will start? Probably almost zero. So, four inexpensive cars combined are more reliable than one super reliable car.
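To put rough numbers on that intuition, here is a minimal Python sketch. The failure rates are made up for illustration (nobody at Forbes gave us these), and the math assumes each car fails independently of the others. The function names are ours, too.

```python
def prob_all_fail(p_fail: float, n_cars: int) -> float:
    """Chance that every one of n_cars fails to start, assuming independent failures."""
    return p_fail ** n_cars


def availability(p_fail: float, n_cars: int) -> float:
    """Chance that at least one of n_cars starts."""
    return 1.0 - prob_all_fail(p_fail, n_cars)


# Hypothetical morning failure rates, chosen only for illustration.
LEXUS_FAIL = 0.01    # assumed: the Lexus fails to start 1 morning in 100
COROLLA_FAIL = 0.05  # assumed: each Corolla fails to start 5 mornings in 100

lexus_availability = availability(LEXUS_FAIL, 1)
fleet_availability = availability(COROLLA_FAIL, 4)

print(f"One Lexus LS:  {lexus_availability:.6f}")
print(f"Four Corollas: {fleet_availability:.6f}")
```

Even with each Corolla assumed to be five times flakier than the Lexus, the fleet comes out ahead, because all four have to fail on the same morning. The independence assumption is doing a lot of work here: a blizzard that buries the whole driveway takes out all four cars at once, much as a power outage takes out every server in one rack.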
Performance
(1 × Porsche GT3) = (7 × Toyota Corolla)
What about performance? Very few cars are faster than the $105,000 Porsche GT3. But in the real world, you seldom drive a 1/4 mile in a straight line. More realistic is going to the grocery store, picking up the dry cleaning, dropping off a library book, driving Bryce to soccer practice, picking up Rachel from her piano lesson, dropping the cat off at the vet and going to the post office. Which would do those tasks faster, a GT3 or our Toyota Corolla from before? For sure the GT3. But what if you had seven Corollas with seven drivers? Which would complete the errands faster? I’m picking the seven Corollas. So by working in parallel, seven inexpensive cars are faster than one super fast car.
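The errand race can be sketched in a few lines of Python. The durations and the GT3's 1.5x speed advantage are invented for illustration, and the scheduler is a simple greedy one (hand the longest remaining errand to the least busy car), not anything a real cluster would necessarily use.

```python
# Hypothetical errand durations in minutes when driven in a Corolla.
ERRANDS = {
    "grocery store": 30,
    "dry cleaning": 15,
    "library book": 10,
    "soccer practice": 25,
    "piano lesson": 25,
    "vet": 20,
    "post office": 15,
}

GT3_SPEEDUP = 1.5  # assumed: the GT3 finishes every errand 1.5x faster


def sequential_time(durations, speedup=1.0):
    """One car runs every errand back to back."""
    return sum(durations) / speedup


def parallel_time(durations, n_cars):
    """Greedy schedule: give the longest remaining errand to the least busy car."""
    loads = [0.0] * n_cars
    for minutes in sorted(durations, reverse=True):
        least_busy = loads.index(min(loads))
        loads[least_busy] += minutes
    return max(loads)  # everyone is done when the busiest car is done


durations = list(ERRANDS.values())
gt3_time = sequential_time(durations, GT3_SPEEDUP)
fleet_time = parallel_time(durations, n_cars=7)

print(f"One GT3:        {gt3_time:.1f} minutes")
print(f"Seven Corollas: {fleet_time:.1f} minutes")
```

With seven cars and seven errands, the fleet finishes as soon as the longest single errand is done, while the GT3 has to pay for all seven in a row. The catch is that this only works because the errands don't depend on each other; if Rachel's piano lesson had to wait for the grocery run, the extra Corollas would sit idle.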
That’s roughly how computing at Internet scale works. Buy servers that are less expensive, but buy more of them, which boosts both reliability and performance. Of course you need a fabric of software to make these independent nodes work together. Google has its proprietary triumvirate of GFS, MapReduce and BigTable, to which the open source community has responded with Hadoop. This stuff really excites us at blist. And you thought we were only interested in great user interfaces.