Webscale Databases: Is Open Source Really Necessary?

When it comes to deploying databases (or any piece of infrastructure, really) at web scale, many large sites opt to “go cheap, go custom or go home.” Given their unique needs, this credo makes sense, but I wonder whether the companies following it are making more work for themselves than necessary. Would the resources spent developing open-source projects or building tools from scratch still be needed if companies could buy solutions that work just fine?

Isn’t it plausible that a proprietary vendor (Oracle, let’s say) could launch a webscale database or analytics solution that would do the trick for a company like Facebook? If there’s one thing Larry Ellison knows better than relational databases, it’s how to make a buck. Hypothetically speaking, Oracle could offer database and data-analysis solutions that would save a company like Facebook from having to act like a software company itself. Oracle certainly hasn’t hesitated to buy its way into new markets in the past.

Another consideration is where web companies draw the line regarding commercial solutions: Is an open-source but subscription-based vendor like Red Hat out of the question? What about any of the emerging startups tackling file systems, memcached and other infrastructure pieces?

I’m not suggesting that Facebook et al. are being led down the garden path by their current approaches, or that there’s a glut of suitable proprietary products on the market, only that it’s not inconceivable that commercial vendors could meet the needs of these companies. You can read my full column over at GigaOM Pro (subscription required). What do you think? Are open-source and DIY solutions really the best bet for webscale companies?