The Hurdles for Moving Big Data ‘Round the World

Michelle Munson from Aspera at Structure Big Data 2011

Underlying all the useful and inspiring applications, like Hadoop, that have emerged out of the Big Data ecosystem is a fundamental assumption: the data that companies want will be accessible when they want and need it. That requires the ability to transfer files at the speeds people expect, and it is one of the constraints of the big data world, explained Michelle Munson, CEO and co-founder of Aspera.

Aspera has built a proprietary high-speed file-transport technology, fasp, that helps data move across networks with issues like over-burdened WANs. Aspera's customers are primarily large organizations dealing with big data, including digital media companies sending content among supply-chain partners, life sciences researchers sending genome-sequencing data among institutes, and government intelligence customers sending video files between agencies.

Munson said current Internet infrastructure lacks three qualities:

  1. availability
  2. geographic independence
  3. security

While all these issues need to be addressed in the fundamental architecture itself, these constraints have created an opportunity for Aspera’s transfer product. The reliability of Internet services is going up, which creates an expectation that this data will be available quickly, said Ammar Hanafi, general partner with Alloy Ventures.

While consumer web services can easily meet customer expectations, Aspera’s customers are a different story. “Our customers are moving many gigabytes and larger [quantities] of data that has to be chunked up and then distributed,” said Munson. But beyond making sure delivery is as fast as on the consumer web, the company has learned it can provide something else: predictability. “After solving the bottleneck, then you can offer customers predictability,” which helps manage their expectations, Munson said.

At the end of the day, it’s a physics problem, both Munson and Hanafi said. TCP, the transport protocol that carries most traffic on IP networks, just doesn’t perform all that well for moving big data long distances. That’s a big opportunity both for startups like Aspera and for big data infrastructure companies.
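To make the "physics problem" concrete, here is a rough back-of-the-envelope sketch of why distance hurts TCP. It uses two textbook bounds: a sender can have at most one window of data in flight per round trip, and under random packet loss steady-state throughput is commonly approximated by the Mathis et al. formula. The round-trip times, loss rate, window size and 100 GB file size below are illustrative assumptions, not figures from Munson or Aspera, and the function names are made up for this example.

```python
import math


def window_limited_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound when at most one window of data is in flight per round trip."""
    return window_bytes * 8 / rtt_s


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Mathis et al. approximation of steady-state TCP throughput under random loss.

    throughput ~ (MSS / RTT) * sqrt(3/2) / sqrt(p)
    """
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)


def transfer_hours(size_bytes: float, throughput_bps: float) -> float:
    """Time to move size_bytes at a sustained rate of throughput_bps."""
    return size_bytes * 8 / throughput_bps / 3600


if __name__ == "__main__":
    # All values below are hypothetical, chosen only to show the trend.
    file_size = 100e9      # a 100 GB data set
    mss = 1460             # typical Ethernet-sized TCP segment payload
    loss = 0.001           # 0.1% packet loss
    window = 65535         # classic TCP window without window scaling

    for label, rtt in [("same metro", 0.010),
                       ("cross-continent", 0.080),
                       ("intercontinental", 0.200)]:
        by_window = window_limited_throughput_bps(window, rtt)
        by_loss = mathis_throughput_bps(mss, rtt, loss)
        rate = min(by_window, by_loss)
        print(f"{label:17s} RTT {rtt * 1000:5.0f} ms: "
              f"~{rate / 1e6:6.1f} Mbit/s, "
              f"~{transfer_hours(file_size, rate):7.1f} h for 100 GB")
```

Both bounds have the round-trip time in the denominator, so doubling the distance roughly halves the achievable rate no matter how much raw bandwidth the link has. That is the gap that alternative transport approaches such as Aspera's try to close.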
