Mice live for roughly two years. Send them to the International Space Station for six months, and you can potentially observe the effects of space on a mammal over a quarter of its life, information that could inform future missions to Mars and other distant planets. NASA is currently working on housing for the “mousetronauts,” as Elon Musk likes to call them, and plans to send them to the ISS later this year on a SpaceX Falcon 9 rocket.
The SPARK Competition challenged teams to create a chemistry kit suited to the modern day. The winning design crams a whole kit onto a single computer chip.
Science magazine last week published an analysis of the errors and limitations in Google Flu Trends (GFT), the flu trend analysis that Google generates from its search engine traffic, and news stories this week, from The New York Times to the Financial Times, have offered their own spins on the findings. In the case of The New York Times, the story updated a blog critique, published a year earlier, of reports that GFT was significantly missing the mark, particularly in comparison with the analysis from the Centers for Disease Control and Prevention (CDC), which has a three-week data lag.
The limits to many big data sources
The article in Science highlights the limitations of much search- and social media-based data. For example, the algorithms may not be transparent (e.g., Google has never released the 45 search terms used for GFT), and they may be tweaked regularly (e.g., 86 reported changes to Google search in June and July 2012 alone), which makes initial analysis, and especially attempts at replication, fraught. Another concern is ‘red team’ attacks, whereby research subjects attempt to influence their own data for analysis (e.g., political campaigns and companies endeavoring to hit Twitter trending targets). Finally, ‘big data hubris’ reflects a tendency to think larger data sets can replace more controlled, traditional data collection and analysis, despite frequent issues with measurement and construct validity, reliability, and dependencies among data.
Lessons for moving forward
The Science article offers the following conclusions:
- Transparency and replicability are central to science. Opaque and ever-changing search algorithms sharply limit the traditional scientific process of repeat and additive studies building on earlier findings.
- Use big data to understand the unknown. GFT data can be combined with CDC data to marginally improve CDC findings. However, the potential for GFT to provide more localized data than is practical for the CDC is likely of greater complementary value.
- Study the algorithm. What is being asked and how it is being asked is, of course, central to results—and to worthwhile subsequent analysis.
- It’s not just about the size of the data. Big data offers great potential for new and more expansive research and analysis, but the Internet is also improving the potential for traditional data collection and analysis. Both methods are undergoing a revolution of sorts.
A further critique of found data
The FT article goes much further than the New York Times piece, critiquing ‘found data’ more broadly, a category that includes much corporate big data as well as Internet sources. The FT piece addresses four common big data claims that it describes as ‘at best optimistic oversimplifications’:
- Data produces uncannily accurate results;
- Every single data point can be captured, making old statistical sampling techniques obsolete;
- It is passé to fret over what causes what, because statistical correlation tells us all we need to know; and,
- With enough data, scientific and statistical models aren’t needed.
Indeed, the FT quotes one professor as calling the claims “complete bollocks. Absolute nonsense.”
More big data concerns
None of the researchers in these studies and articles find big data to be without significant commercial and societal value. They simply offer cautions and caveats. But the speed with which big data collection and analysis is permeating society can hardly be overstated. The AP has a story this week on farmers voicing concerns that agribusiness data collection at their operations, including real-time, GPS-informed data feeds, could be problematic. Seed companies such as Monsanto may be taking the lead on this, but entities from government agencies to commodity market traders see use in the data as well.
Among the conclusions to draw from these analyses are the following:
- New norms for the accuracy and transparent communication of big data sources, and new societal standards for their privacy permissions and reach, need to be developed.
- Enterprises can minimize the adverse consequences of early use of the technology by preemptively establishing suitably conservative standards of their own.
- The adoption of big data, though still in the early stages, is already pervasive.
- With new uses found daily, we are all in for a bumpy, but fast and sometimes thrilling ride.
The UK-based Wellcome Trust, the world’s second-largest funder of medical research behind the Gates Foundation, has launched a free online magazine called Mosaic that is dedicated to longform science writing. The site will be run by former Times science editor Mark Henderson — who was involved with a monthly science magazine published by the Times called Eureka, which was shut down in 2012 — and will publish a new 3,000-word piece on a scientific topic every Tuesday. In an unusual twist, the content will be free for anyone to use under a Creative Commons license, provided they include attribution.
Swiss researchers tested the prosthetic on a Danish man who lost his hand nine years ago. He was able to feel objects without worrying about crushing them.
Are you ready for bionic super-strength? Scientists have made a breakthrough in artificial muscles.
Researchers say the pen could be used to treat severe damage from a car accident or to remodel defects in a bone.
Circuit Stickers light up, sense and even twinkle when placed on conductive material.
When cockroaches run, they avoid bumping into things by sensing via a different source of information: their antennae. Researchers believe they can develop antennae for robots to help them respond better to their surroundings.
The team’s lead said he expects the open-source invention to kickstart the creation of more metal printers. While he’s worried about the implications of printing metal, he believes it will do more good than harm.