What Happens When Web-Scale Computing Becomes a Commodity?

17 December, 2007

With the announcement of Amazon's much-anticipated SimpleDB service this week, we now officially live in a world where the kind of enormous systems run by Google, Yahoo, eBay, et al. (systems that power huge portions of the web, where 500+ million users is totally mundane) are available on demand in small doses and at reasonable prices to anyone who needs them. Amazon Web Services now provides all the necessary infrastructure to run applications that host millions of files for download, persist hundreds of millions of database records, and run thousands of processes, all without building or maintaining any physical infrastructure.

On this infrastructure, the only real difference between running a small application (a custom CMS for a medium-sized non-profit, for example) and a large one (say, Digg) is the size of your monthly bill. And as companies besides Amazon enter this market, seeing it as a relatively simple way to monetize the huge infrastructure they have already paid for, the size of that bill will fall as well.

So, what are we talking about here? Within a few years, a scale of computation that is currently only available to a handful of multi-billion dollar companies will be available to any pair of dorm room-bound hacker kids with $30/mo. and a pair of MacBooks.

Just as the rise of commodity server hardware and open source software revolutionized the web over the last ten years, it would be reasonable to expect changes of similarly breathtaking scope as we undergo the commoditization of web-scale computing over the next ten.

For a few clues as to what happens when the current, if obscure, state of the art becomes an industry-standard lowest common denominator, it helps to look at some history. This has happened at least twice in the last thirty years: once when industry standardization around x86 hardware led to the collapse in prices for DOS- and then Windows-compatible PCs that made them ubiquitous around the world; and again when these same PCs became powerful enough, and the open source software written to run on them affordable and reliable enough, that together they displaced expensive, proprietary server systems and radically lowered the barrier to entry for web development, leading, as Tim O'Reilly has clearly outlined, to Web 2.0.

Each of these transitions had the same two high-level effects: they made it cheaper to produce professional-caliber work and they increased the value of openness.

The rise of the ubiquitous, cheap, and powerful PC has created near-universal access to the best digital tools available. Professional accountants, graphic designers, record producers, photographers, and countless others do their work on exactly the same relatively cheap hardware available to the average consumer playing games and writing email.

Similarly, the precipitous drop in the price of web application development and deployment environments, caused by the birth of the LAMP stack and the commodity servers on which it runs, made it possible for startups like del.icio.us and Flickr to launch without first raising millions of dollars in venture funding to buy Sun workstations. (And this doesn't even take into account the second-order revolution in the content industry caused by the cheap-to-free hosted publishing tools created by these very startups and run on commodity web hosting.)

The openness story is, if anything, even more shocking. Like fruit flies spontaneously generating out of garbage, Linux grew out of the universally available commodity PCs with their high levels of hardware compatibility. The same process that filled the PC landscape with identical gray boxes running Windows made it possible for a few OS geeks from obscure countries to build, in their spare time, an operating system that could feasibly run, for free, on all of them.

Similarly, the commoditization provided by the triumph of the LAMP stack made it possible for a handful of people to build non-profit web applications like Wikipedia and Craigslist whose only mission is to make useful information universally available to anyone who wants it.

Now. What form will these two kinds of changes take with the use of web-scale computing?

Let's start out with the ability to produce professional-caliber work more cheaply. Until now, it was only feasible to build web-scale applications if the market for them was also web-scale. That is, you only got to use resource-intensive technologies like full web spidering or massive file caching if you were building a mainstream service with a potential audience of 500+ million daily users. This meant the basics: search, ads, maybe games.

But when doing these things costs only a couple hundred bucks a month, a great many smaller markets suddenly become lucrative. Could you build value on top of a dynamically updated list of every mention made of every stock ticker symbol anywhere on the web? How about every mention of every trademark? Or every mention of every mp3? Since the overhead for extracting that value no longer includes building and maintaining enormous data centers, it might actually become feasible to build services with such requirements.

(As a side note, one of the corollary effects of such a change is that it's no longer necessary to take tens of millions of dollars in venture funding in order to run a business that requires web-scale computing. This means even more leverage for startups when negotiating for what little funding they do need and even shakier times ahead for the VCs out there looking to invest their billions in only a small handful of huge deals.)

And what about openness? What new prospects for collaborative, networked, volunteer-driven, world-improving projects might we see?

How about a Web OS that runs on top of a peer-to-peer network of commodity machines and is available to anyone who contributes some spare cycles to the cause, like a Google-scale Linux install running on top of SETI@home? Or what about a world-wide effort to federate the tracking of all manufactured objects via their RFID tags in order to maximize the efficiency of their recycling, discover any of their toxic effects, and roll back global warming?

These ideas may seem silly or grandiose, but so did Google when Larry and Sergey were still students, or Linux when it was just an excuse for mailing list flame wars. This is one of those times. A lot of new things just became possible.
