Rosslyn Analytics, Microsoft Finding Value in Data Aggregation

Data marketplaces, like those I discussed last week, become more valuable as they move beyond offering catalogs of individual sets to combine data from different sources. Separate conversations I had with Rosslyn Analytics CEO Charles Clark and Microsoft Windows Azure’s Product Unit Manager for the data market business, Moe Khosravy, illustrate that such plans are being considered. They also show that a critical factor needs to be addressed in building these new business opportunities: trust.

Rosslyn Analytics provides cloud-based tools for management of corporate spend data; these tools enable customers to extract data from various financial systems across the enterprise and combine them for analysis. Customers upload data to Rosslyn’s secure servers in the U.S. and the UK, raising the possibility that further insights might be gained by aggregating anonymized trends across data from more than one company. With explicit permission to use data, Rosslyn could offer customers valuable insight into their performance relative to the broader market. They might, for example, enrich one customer’s view of its own carbon footprint statistics or travel expenditure by comparing those items with similar figures from across the same industry, companies of the same size, etc.

Technically, it would be straightforward to aggregate these figures from the 50 GB of financial data uploaded to Rosslyn each month. And the value of gaining insight into the $5 trillion of expenditure by customers’ anonymous peers and competitors is obvious. Far more complicated, predictably, is building the confidence in customers that their own data can be contributed to the pool without risking privacy or loss of competitive advantage. While it’s interesting to hear Rosslyn CEO Charles Clark considering the above ideas, any moves to provide such a service would need to be made slowly, and in consultation with customers who are today buying a very different product.
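To make the "technically straightforward" part concrete, here is a minimal Python sketch of an anonymized peer benchmark. It is purely illustrative and not a description of how Rosslyn works: the records, the `peer_benchmark` function and the `MIN_PEERS` threshold are all hypothetical. The one design choice worth noting is that the benchmark is suppressed when too few peers contribute, since a median drawn from one or two companies would effectively reveal their figures.

```python
from statistics import median

# Hypothetical records: (company, industry, monthly travel spend in USD).
SPEND = [
    ("A", "retail", 120_000),
    ("B", "retail", 95_000),
    ("C", "retail", 143_000),
    ("D", "retail", 101_000),
    ("E", "mining", 310_000),
    ("F", "mining", 280_000),
]

# Assumed threshold: withhold any benchmark built from fewer peers than this.
MIN_PEERS = 3


def peer_benchmark(records, company, min_peers=MIN_PEERS):
    """Return (own_spend, peer_median); peer_median is None if too few peers."""
    own = next((r for r in records if r[0] == company), None)
    if own is None:
        raise KeyError(company)
    _, industry, own_spend = own
    # Peers are other companies in the same industry; names never leave this function.
    peers = [spend for name, ind, spend in records
             if ind == industry and name != company]
    if len(peers) < min_peers:
        return own_spend, None
    return own_spend, median(peers)
```

In this toy data, company A would see its own spend alongside a median taken from three retail peers, while company E would get no benchmark at all because only one other mining company is in the pool. Code like this is the easy part; persuading customers to accept the pooling in the first place is the harder problem.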

If Rosslyn Analytics’ proposition is one where customers upload and analyze their own data, Microsoft’s Azure DataMarket explicitly provides access to data collected for the purpose of being shared or sold to anyone who needs it. Customers visiting the DataMarket are searching for quality data from trusted names such as the United Nations, Dun & Bradstreet and Esri.

Microsoft’s Moe Khosravy certainly recognizes the importance of quality data, and agrees that simply creating a marketplace where those existing data sets can be bought and sold is to miss a massive opportunity. Khosravy suggests that the team at Microsoft is hard at work on features that will simplify the process of combining data from different sets to create new insights and knowledge.

But while this aggregation is something of a Holy Grail for many data marketplaces, it raises a host of complex issues. If data quality or the trustworthiness of its creator are key selling points for individual data sets, how do the metrics change once data is aggregated? While the reputation of the World Bank or the United Nations plays an important role in assessing the value of their own data, it is far harder to judge the quality of a new data set created by combining contributions from both. Were the data scientists who converted the data sufficiently knowledgeable? Did the two sets record information in a comparable fashion? If both data sets record details of unemployment statistics, for example, did the agencies define an unemployed person in the same way? For data marketplaces, the challenge is to answer these questions and communicate those answers clearly and succinctly to their customers. Increasingly, perhaps, the customer relationship will be with the data marketplace rather than with the creators of the individual data sets to which it provides enriched access.

For companies like Rosslyn Analytics, and others sitting on pools of private customer data, trust will be key in mining the hidden value. Despite a business model apparently more amenable to aggregating and adding value, the data marketplaces need to nurture that same trust if they are to take their business to the next level.

Question of the week

How can companies build trust with their customers when aggregating data sets?