How big data will change networking

Boundary CEO Gary Read

What if you could know everything about your network? What if, instead of getting snapshots, albeit very rapid ones, you could see the path of every packet and run basic analytics on that stream of data in real time? It’s the difference between watching a Pixar cartoon and viewing a flip book. And that changes things.

That’s why I was so excited last year to learn about Boundary, a startup that has raised $4.1 million and now has 21 paying customers about six weeks after making its subscription network monitoring service generally available. The company lets customers see their network operating in real time, down to every packet and every flow. It takes in about 200,000 inbound records per second and generates about a terabyte of data each day, all processed through its proprietary data store, which is built using a combination of Scala and Erlang.
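To put those two figures side by side, here is a back-of-the-envelope sketch in Python. It assumes the 200,000 records per second rate holds around the clock, that a terabyte means 10^12 bytes, and that the daily terabyte is made up entirely of those records. None of that was confirmed by Boundary, so treat the result as a rough order of magnitude only.

```python
# Rough arithmetic on the figures quoted above.
# Assumptions (not from Boundary): the rate is sustained for a full day,
# 1 TB = 10**12 bytes, and the daily terabyte consists only of these records.
RECORDS_PER_SECOND = 200_000
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_DAY = 10**12

records_per_day = RECORDS_PER_SECOND * SECONDS_PER_DAY
bytes_per_record = BYTES_PER_DAY / records_per_day

print(f"records per day: {records_per_day:,}")
print(f"implied bytes per record: {bytes_per_record:.1f}")
```

Under those assumptions the service handles roughly 17 billion records a day, at a little under 60 bytes apiece.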

The startup is cool, but what stood out in a chat with Gary Read, the CEO of Boundary, was what customers have done with the platform, and how monitoring in real time and embracing all of the data instead of some of it has allowed them to see more, see faster and save money.

Seeing more at DNSimple. Anthony Eden, the CEO of DNSimple, is using Boundary’s service to monitor traffic flowing into his DNS provider. He had tried other services such as those from New Relic, but they didn’t give him the detail he needed to understand the traffic hitting his servers.

The new visibility let him spot interesting traffic patterns, especially related to requests coming from China. Eden explained that he didn’t know if the traffic was malicious; it just looked different. “Crafted” is the word he used. The traffic patterns were subtly adjusted, something other DNS providers had seen as well. So now Eden is faced with a new traffic pattern and plans to keep an eye on it. Luckily, by watching everything in real time, he can detect problems earlier.

Seeing faster. Another customer (who preferred to be unnamed) says the service helps them detect network problems and attacks about five minutes ahead of what other software allows, thanks to the all-encompassing view of the packets. Given how quickly a network problem can go from anomaly to all-out failure, five minutes could be the difference between a site slowdown and something like Amazon’s massive outage from 2011.

Saving money. Having access to more data can also help developers quickly update their apps and save money. Another unnamed customer who had moved applications from Google Apps to Amazon Web Services realized immediately after the move that their app wasn’t performing well. After hooking it into Boundary, they realized that the way the app was accessing an external DNS provider was costing the developer more money. So the developer switched to Amazon’s in-house DNS lookup service and rebuilt the application to optimize it for Amazon. In the process they estimate they have saved $15,000 per month and seen performance speed up within a few hours.

To be clear, other network monitoring applications would likely have helped in each of these cases, but the key for customers appears to be the speed and the amount of information Boundary can parse. Of course, not everyone thinks it’s necessary to look at everything, and for some apps it may never be worth it. They won’t ever need a Pixar-style animation.

Plus, there are plenty of questions about how well Boundary can scale. If 21 customers generate 1 terabyte of data each day, imagine what happens when it has 100 or 1,000 customers. For now, Boundary starts culling data at the one-day and one-year marks, so at one month you might have minute-by-minute data and after a year you only have hourly data. But Read says customers could pay more for more storage. It’s early days for the company, but given the movement to a real-time Web and the speed at which things can change online, Boundary appears to be a necessary service for those who want to keep up.
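For a feel of the scaling question and the culling approach, here is a minimal Python sketch. It assumes data volume grows linearly with customer count and that older minute-level samples are rolled up into hourly averages. Both are illustrative assumptions; Boundary did not describe how its volume scales or how it aggregates old data, and the function names here are hypothetical.

```python
from collections import defaultdict

def projected_tb_per_day(customers, baseline_customers=21, baseline_tb=1.0):
    """Linear extrapolation from the article's 21 customers -> 1 TB/day.
    Assumption: every customer generates the same average volume."""
    return baseline_tb * customers / baseline_customers

def rollup_to_hourly(minute_samples):
    """Collapse (epoch_seconds, value) samples into one mean value per hour.
    Averaging is an assumed aggregation, not Boundary's documented method."""
    buckets = defaultdict(list)
    for ts, value in minute_samples:
        hour_start = ts - ts % 3600  # truncate timestamp to the start of its hour
        buckets[hour_start].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

for n in (100, 1000):
    print(f"{n} customers: about {projected_tb_per_day(n):.1f} TB/day")

# Two hours of minute-by-minute data: 10.0 in the first hour, 20.0 in the second.
samples = [(m * 60, 10.0 if m < 60 else 20.0) for m in range(120)]
hourly = rollup_to_hourly(samples)
print(hourly)
```

On the linear assumption, 100 customers would mean nearly 5 terabytes a day and 1,000 customers close to 48, which is why rolling 60 minute-level points into a single hourly one matters for long-term storage.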