A Stanford professor, Russ Altman, working with Microsoft, has created a new and faster way to discover drug side effects by analyzing search queries made in web browsers. This is a great example of using the crowd to find new needles in the information haystack.
Using data drawn from queries entered into Google, Microsoft and Yahoo search engines, scientists at Microsoft, Stanford and Columbia University have for the first time been able to detect evidence of unreported prescription drug side effects before they were found by the Food and Drug Administration’s warning system.
Using automated software tools to examine queries by six million Internet users taken from Web search logs in 2010, the researchers looked for searches relating to an antidepressant, paroxetine, and a cholesterol lowering drug, pravastatin. They were able to find evidence that the combination of the two drugs caused high blood sugar.
The study, which was reported in the Journal of the American Medical Informatics Association on Wednesday, is based on data-mining techniques similar to those employed by services like Google Flu Trends, which has been used to give early warning of the prevalence of the sickness to the public.
The new approach is a refinement of work done by the laboratory of Russ B. Altman, the chairman of the Stanford bioengineering department. The group had explored whether it was possible to automate the process of discovering “drug-drug” interactions by using software to hunt through the data found in F.D.A. reports.
The group reported in May 2011 that it was able to detect the interaction between paroxetine and pravastatin in this way. Its research determined that the patient’s risk of developing hyperglycemia was increased compared with taking either drug individually.
Altman wondered whether other techniques might be employed to discover drug side effects, so he approached Microsoft. Scientists there analyzed anonymized data from users who had opted in to allowing their search histories to be used. They pored through 82 million searches and found searches for the drugs in question coinciding with searches for ‘hyperglycemia’ or any of its common symptoms, like ‘blurry vision’ or ‘high blood sugar’.
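To make the idea of coinciding searches concrete, here is a minimal sketch in Python. It is not the researchers' actual pipeline, which the article does not describe in detail. The log format (pairs of user ID and query), the function name `symptom_rates`, the simple substring matching, and the toy data are all assumptions made for illustration. The sketch groups users by which of the two drugs they searched for, then counts how many in each group also searched for a symptom term.

```python
from collections import defaultdict

# Hypothetical term lists; the study's real vocabulary is not given in the text.
DRUG_A = "paroxetine"
DRUG_B = "pravastatin"
SYMPTOM_TERMS = ("hyperglycemia", "high blood sugar", "blurry vision")


def symptom_rates(log):
    """Return {group: (users_with_symptom_search, total_users)}.

    `log` is an iterable of (user_id, query) pairs (an assumed format).
    Groups are "both", "a_only" and "b_only", by which drugs a user searched.
    """
    # Per-user flags: did this user ever search for drug A, drug B, a symptom?
    flags = defaultdict(lambda: {"a": False, "b": False, "symptom": False})
    for user_id, query in log:
        q = query.lower()
        f = flags[user_id]
        f["a"] = f["a"] or DRUG_A in q
        f["b"] = f["b"] or DRUG_B in q
        f["symptom"] = f["symptom"] or any(t in q for t in SYMPTOM_TERMS)

    counts = {"both": [0, 0], "a_only": [0, 0], "b_only": [0, 0]}
    for f in flags.values():
        if f["a"] and f["b"]:
            group = "both"
        elif f["a"]:
            group = "a_only"
        elif f["b"]:
            group = "b_only"
        else:
            continue  # user never searched for either drug
        counts[group][1] += 1
        counts[group][0] += int(f["symptom"])
    return {g: tuple(c) for g, c in counts.items()}


# Invented toy data, only to show the shape of the input.
toy_log = [
    ("u1", "paroxetine dosage"),
    ("u1", "pravastatin and grapefruit"),
    ("u1", "blurry vision causes"),
    ("u2", "paroxetine withdrawal"),
    ("u3", "pravastatin side effects"),
    ("u3", "high blood sugar symptoms"),
    ("u4", "Paroxetine Pravastatin together"),
    ("u5", "weather tomorrow"),
]
rates = symptom_rates(toy_log)
```

On real logs the interesting comparison would be whether the "both" group searches for symptom terms at a noticeably higher rate than either single-drug group. A serious analysis would also need proper statistics and far more careful query matching than the substring test used here.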
This looks set to become an important medical research tool, and the general approach has enormous potential well beyond medicine, which is why we hear so much about big data these days. Similar techniques are possible for delving into Twitter and Facebook feeds, and are likely to be exploited in commercial and non-commercial ways.
Consider an automobile manufacturer interested in the future urban transportation market. It could search through Twitter logs for people discussing its cars, its competitors, and alternatives like Zipcar, bicycles, and mass transit. That sort of analysis will yield better and more immediate results than old-school surveys.
Perhaps more importantly in this urban transport case, unlike in the drug side effects case, the car company can identify social psychographics based on what people are discussing: What groups are considering giving up their cars? Are women riding bicycles to work? Are electric urban vehicles, like Lit Motors’ C-1 and Toyota’s i-ROAD concept, being talked about as serious alternatives? And it can find the influencers who are spreading these ideas, and follow them. This is going to be one of the biggest opportunities: filtering through sparse big data sets to find socially-scaled influence networks, bringing the dark matter of influence to light.