Citus Data open sources tool for scalable, transactional Postgres

Database startup Citus Data has open sourced a tool called pg_shard that lets users scale their PostgreSQL deployments across many machines while maintaining performance for operational workloads. As the name suggests, pg_shard is a Postgres extension that evenly distributes, or shards, the database across machines as new ones are added to the cluster.
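For a feel of what that looks like in practice, here is a minimal sketch in Python (using the psycopg2 driver) of distributing a table with pg_shard's shard-management functions as described in the project's documentation. The table, columns, shard count, replication factor and connection settings are illustrative assumptions, not details from the article.

```python
# Minimal sketch: distributing a table with pg_shard from Python via psycopg2.
# Table name, columns, shard count, replication factor and connection settings
# are illustrative assumptions -- adjust for a real cluster.
import psycopg2

conn = psycopg2.connect(host="master-node", dbname="appdb",
                        user="postgres", password="secret")
conn.autocommit = True
cur = conn.cursor()

# Load the extension on the master node (it must already be installed there).
cur.execute("CREATE EXTENSION IF NOT EXISTS pg_shard;")

# An ordinary table definition; pg_shard distributes it by a chosen column.
cur.execute("""
    CREATE TABLE IF NOT EXISTS customer_reviews (
        customer_id bigint,
        review_date date,
        rating      int
    );
""")

# Mark the table as distributed on customer_id, then create the shards across
# the worker nodes (16 shards with 2 replicas each, purely as an example).
cur.execute("SELECT master_create_distributed_table('customer_reviews', 'customer_id');")
cur.execute("SELECT master_create_worker_shards('customer_reviews', 16, 2);")

# From here on, ordinary INSERT/SELECT statements against customer_reviews
# are routed to the appropriate shards by the extension.
cur.execute("INSERT INTO customer_reviews VALUES (42, '2014-12-01', 5);")

cur.close()
conn.close()
```

The point of the design is that the application keeps speaking plain Postgres SQL; the extension handles routing statements to the right worker nodes.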

Earlier this year, Citus developed and open sourced an extension called Cstore that lets users add a columnar data store to their Postgres databases, making them more suitable for interactive analytic queries.
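As a rough illustration of what adding a columnar store to Postgres involves, the sketch below (again Python/psycopg2) creates a columnar foreign table the way the extension's documentation describes it; the extension is commonly packaged as cstore_fdw, and all object names, connection settings and options here are assumptions for illustration.

```python
# Rough sketch: adding a columnar store to Postgres with Citus's cstore
# extension (commonly packaged as cstore_fdw). Object names, connection
# settings and the compression option are illustrative assumptions.
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="analytics",
                        user="postgres", password="secret")
conn.autocommit = True
cur = conn.cursor()

# cstore is delivered as a foreign-data-wrapper extension.
cur.execute("CREATE EXTENSION IF NOT EXISTS cstore_fdw;")
cur.execute("CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;")

# A columnar foreign table: data is stored column-by-column and compressed,
# which suits the wide scans typical of interactive analytic queries.
cur.execute("""
    CREATE FOREIGN TABLE page_views (
        view_time timestamp,
        page_id   bigint,
        user_id   bigint
    )
    SERVER cstore_server
    OPTIONS (compression 'pglz');
""")

cur.close()
conn.close()
```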

It’s all part of a move to transition Citus Data from being just another analytic database company into a company that’s helping drive advanced uses of Postgres, co-founder and CEO Umur Cubukcu said. Citus launched in early 2013 promising to let Postgres users use the same SQL to query Hadoop, MongoDB and other NoSQL data stores, but has come to realize that its customers aren’t as excited about those capabilities as they are enamored with Postgres.

[Chart: Indeed job posting trends for PostgreSQL vs. MySQL]

As Postgres undergoes something of a renaissance among web startups (it’s also the database foundation of PaaS pioneer Heroku and its managed database service), Cubukcu thinks there’s a big opportunity to provide tooling that lets developers take advantage of everything they love about Postgres without having to worry about whether they’ll outgrow it or need to bring on another database to handle their analytic workloads.

The NoSQL connectivity is still there, but Cubukcu acknowledges that running analytics on those workloads might be a job best left for the technologies (e.g., Spark) focused on that world of data.

And whether or not pg_shard or Citus Data is the ultimate answer for scale-out Postgres, Cubukcu is definitely onto something when he talks about how the narrative around SQL and scalability has changed over the past few years. His company’s work, along with that of startups such as MemSQL and Tokutek, and open-source projects such as WebScaleSQL and Postgres-XL, has shown that SQL can scale. The tradeoff for developers is no longer between relational capabilities and the scale of NoSQL.

Rather, Cubukcu thinks the new tradeoff is between open-source ecosystems and proprietary software as companies try to scale out their relational databases. At least when it comes to Postgres, he said, “Our take is, ‘You don’t have to do this.'”

Calvin: A fast, cheap database that isn’t a database at all

Yale researchers Daniel Abadi and Alexander Thomson think they have developed the cure for Oracle and IBM dominance in the world of database performance, and it isn’t even technically a database. The two have created a system, called Calvin, that they think can level the playing field.

Nutanix gets $25M to help you scale like Google

Nutanix just closed a $25 million Series B round for its system that puts computing and storage on the same node, allowing companies to scale their storage layer without investing in a SAN. Khosla Ventures led the round, along with Lightspeed Venture Partners and Blumberg Capital.

The curious case of Hadoop in HPC

SGI and Cloudera have entered into a reseller agreement, but the most interesting part of the deal is that it’s yet another example of a vendor pushing Hadoop products at mainstream customers while keeping the custom stuff targeted at HPC.

EMC throws lots of hardware at Hadoop

Storage giant EMC is adding more muscle to its Hadoop strategy with a 1,000-node cluster for testing new Apache Hadoop releases and a new analytics appliance combining EMC’s Hadoop distribution with the EMC Greenplum Database.

Dell tunes Crowbar tool to Cloud Foundry

Dell’s Crowbar installation-and-configuration tool now works with VMware’s Cloud Foundry. With servers fast becoming low-margin commodities thanks to the push toward micro servers, Dell is doing its best to make deploying the software that inspired the new generation of servers a breeze.

Why open source tools might need a hardware hook

The great thing about open source software stacks is that they’re free and they work. The not-so-great thing is that — like many open source projects — they can be difficult to configure and manage. Luckily, hardware vendors are stepping in to fill the void.

Why Google spent almost a billion on infrastructure in Q2

Google spent $917 million on infrastructure during the second quarter, continuing an upward trend that helps ensure new services like Google+ keep running. It’s the eighth consecutive quarter of increased capital expenditures for Google, which is now spending at near-record levels.

Big data on micro servers? You bet.

Online dating service eHarmony is using SeaMicro’s specialized Intel Atom-powered servers as the foundation of its Hadoop infrastructure, demonstrating that big data applications such as Hadoop might be a killer app for low-powered micro servers.