The new databases, part two (of three): optimization for every need

In part one, we looked at the forces driving a proliferation of new database solutions, loosely ordered within an emerging Hadoop ecosystem. Examples of these specialized analytical engines include:

1)    Databases optimized for cloud scaling. In the Gigaom Research report What to know when choosing database as a service, analyst George Gilbert looks at database solutions (such as VoltDB, Clustrix, and NuoDB) that are bringing a SQL interface to scalable database clusters.

2)    Databases optimized for archiving. As he describes in the Gigaom Research report, How to manage big data without breaking the bank, databases such as RainStor are able to leverage up to 40-times compression of archive data to bring new cost effectiveness and accessibility to the storage of data records.

3)    Open source NoSQL databases, such as MongoDB and CouchDB. These databases, which George expects to migrate more fully to the Hadoop ecosystem, are optimized for the frequent product updating required for mobile and web environments.

4)    Graph databases, such as Neo Technology’s Neo4j, that specialize in tracking and optimizing the multipoint networks found in shipping, transportation, telecommunications, computer networks, and similar environments.

5)    The GNU Project’s statistical language and environment, R. This is a preexisting language for statistical analysis that will be used for stats-oriented databases within Hadoop.

6)    Splunk, whose machine log data collection and analysis currently provides two-way integration with Hadoop and other data environments.

7)    Microsoft’s massively parallel data warehouse and Hadapt’s implementation of SQL on Hadoop. These products provide alternative routes to Hadoop database access, combining a SQL interface with much lower cost and higher performance than the traditional data warehouse.

Not all of these types of products are presently operative or fully functional within Hadoop. But Gigaom Research analyst George Gilbert expects them to become options within a larger Hadoop ecosystem as the IT industry enters a period of growing database choice and complexity under an increasingly unifying Hadoop umbrella.

In part three, we will look at how this market of largely startup and open source alternatives will mature—and be made practical for the average enterprise organization.

Teradata Aster now does graph processing

Teradata has upped the capabilities of its Teradata Aster big data platform by adding a native graph-processing engine called SQL-GR. Not a bad idea considering the increased attention around graph processing lately, as well as the need for an aging Teradata to keep up with (or ahead of) the Joneses in the big data space. And Teradata’s SNAP Framework, which ingests a query and then decides the right processing engines and data stores to invoke, is pretty sweet in theory.

Facebook builds a database benchmark for a graph-powered world

Facebook has built a new open source tool for benchmarking graph databases, called LinkBench. And although the chances are your infrastructure and workloads look nothing like Facebook’s, the good news is LinkBench was built with configurability in mind.

Biotech startup Syapse wants to be for our genomes

A startup called Syapse is trying to bring the world of “omics” — the study of all our genomes, biomes, proteomes and other “omes” — under control with a new data management platform based on some of the general techniques that also power Facebook’s Graph Search.

It pays to know you: Interest graph master Gravity gets $10.6M

Interest graph specialist Gravity has raised $10.6 million to expand its business of personalizing the web for consumers. Thanks to a semantic engine that associates the content site visitors read with related topics, Gravity says it can show readers just what they want to see.