Black box software: a problem for science that extends to big data

You probably don’t need to know how a calculator makes two plus two equal four, or how your favorite smartphone app works, but the way the background software is implemented can make a big difference to the output. Slight rounding errors or slow load times in these cases might be annoying, but when you scale up to big data modeling, for instance, you might want to take a closer look at the software running your calculations before you click go.

Blind trust in black box, or click-and-run, software is a growing problem in science, according to a commentary published Thursday in the journal Science, and the concern extends beyond formal research to other domains that rely on high-performance computing.

The researchers who addressed the “troubling trend in scientific software use” were motivated by a growing unease that the abundance of powerful software is letting scientists derive answers without a thorough understanding of what the software is doing. Software snafus have been responsible for some high-profile data misinterpretations and retractions.

This wouldn’t normally cause a blip on the average citizen’s radar, but many of these scientific conclusions now have real-world implications, from climate modeling and weather forecasting to high-volume financial trading. In any domain using big data, misplaced trust in the power of software can be problematic, particularly when the decision makers don’t know what their software is actually doing, said lead author Lucas Joppa of Microsoft Research.

So what does ecology have to do with any of this? Joppa is an ecologist by training, and works on computational techniques in that field that may also have applications for big data more broadly. He and his colleagues surveyed scientists in a sub-field of ecology — species distribution modeling (SDM) — to find out how they choose software and how well they understand its inner workings.

“Lots of SDM techniques are only available as computational methods, but there is a lot of discourse going on in the literature about whether the methods themselves are correct,” said Joppa. Scientists use SDM to forecast where plants and animals will be in the future given current numbers, known habitats, and climate change. It’s a niche area of research, but the disquieting survey results should be noted in any domain where forecasting is done by plugging data into software.

Only 8 percent of the more than 400 scientists who responded had validated their modeling software against other methods. “The number speaks for itself,” said Joppa. “The real crux of the problem is the results from software being published in a peer-reviewed journal, versus the software itself having been peer-reviewed,” which is rare. Software packages, whether proprietary or not, are often black box systems that can’t be opened and inspected. Even if you can get under the proverbial hood, as with open-source software, said Joppa, most people will still have no idea what they are looking at, or how to judge its quality.
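To make "validating against other methods" concrete, here is a minimal sketch in Python. It is not drawn from the commentary or from any SDM package: it treats the standard library's `statistics.stdev` as the "black box" and compares it with a hand-written reference on a few inputs. The helper names `independent_stdev` and `cross_check`, the test inputs, and the tolerance are all illustrative assumptions.

```python
import math
import statistics


def independent_stdev(values):
    """Sample standard deviation written out by hand (hypothetical reference)."""
    n = len(values)
    mean = math.fsum(values) / n
    squared_devs = math.fsum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_devs / (n - 1))


def cross_check(package_fn, reference_fn, cases, rel_tol=1e-9):
    """Return every case on which the two methods disagree beyond rel_tol."""
    disagreements = []
    for case in cases:
        a = package_fn(case)
        b = reference_fn(case)
        if not math.isclose(a, b, rel_tol=rel_tol):
            disagreements.append((case, a, b))
    return disagreements


# Illustrative inputs only; a real check would use data typical of the study.
test_cases = [
    [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0],
    [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16],
    [0.1, 0.2, 0.3],
]

mismatches = cross_check(statistics.stdev, independent_stdev, test_cases)
print("disagreements:", mismatches)
```

An empty list does not prove the package is correct. It only shows that two independently written methods agree on these inputs, which is the kind of check the survey suggests most respondents had skipped.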


To top it all off, trying to gain confidence in what your software is doing leads to a massive computational catch-22: how do you know the software is giving you the right answer, if you can’t get the answer without running the software? The level of confusion over what algorithms are doing in the SDM field is illustrated by a debate over which of two statistical techniques is superior. It turns out, Joppa explained, that the two techniques were mathematically equivalent, but the ways they were implemented in software resulted in big predictive differences.
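How can two mathematically equivalent methods disagree once they are turned into code? The article does not say which SDM techniques were involved, so the Python sketch below uses a textbook stand-in instead: two algebraically identical formulas for sample variance. The function names and the data are illustrative assumptions, chosen so that the values are large relative to their spread.

```python
def variance_one_pass(values):
    """Sample variance via (sum of squares - (sum)^2 / n) / (n - 1)."""
    n = len(values)
    total = 0.0
    total_sq = 0.0
    for v in values:
        total += v
        total_sq += v * v
    return (total_sq - total * total / n) / (n - 1)


def variance_two_pass(values):
    """Sample variance via the mean first, then squared deviations from it."""
    n = len(values)
    total = 0.0
    for v in values:
        total += v
    mean = total / n
    squared_devs = 0.0
    for v in values:
        squared_devs += (v - mean) ** 2
    return squared_devs / (n - 1)


# Deviations from the mean are -6, -3, 3, 6, so the exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print("one-pass:", variance_one_pass(data))
print("two-pass:", variance_two_pass(data))
```

On paper the two formulas are the same. In double-precision floating point, the one-pass version subtracts two nearly equal numbers on the order of 10^18 and loses the small difference that matters, so its result can be far from 30, while the two-pass version recovers it. A user who sees only the final number has no way to tell which of the two implementations the package chose.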

This sort of mix-up isn’t surprising given the messy nature of software development (if you can even call it that) in research environments. Joppa lauded efforts like Software Carpentry that teach scientists basic software fundamentals for better programming, and said the days of getting a doctorate by merely pushing a button are over.

“Scientists themselves can learn a bare minimum of software engineering,” said Joppa. On the flip side, he said computer science students should have more exposure to scientific methods. “People with traditional software engineering training become uncomfortable with the way scientists want to work with software, where the design and specs are constantly changing. The way that scientific software is built is fundamentally different from consumer apps.”

Developers of scientific software, like MathWorks or SAS, may want to watch this space. If Joppa’s suggestions are implemented, journals may start requiring that even proprietary software be opened up for inspection and peer review. Nearly half of the surveyed ecologists report using the free statistical language R as their primary software, so maybe there is hope yet, both for open, inspectable code and for computational science becoming more accessible while yielding trustworthy, high-impact results.