Using 4.4 million GPS-tagged tweets from more than 630,000 users in New York City, Sadilek and his team were able to predict when an individual would get sick with the flu and tweet about it, up to eight days before that person’s first symptoms. The researchers found that their predictions were 90 percent accurate.
Similar to Google Flu Trends, which uses “flu” search trends to pinpoint where and how outbreaks are spreading, Sadilek’s system uses an algorithm to distinguish between different senses of the word “sick.” For example, “My stomach is in revolt. Knew I shouldn’t have licked that door knob. Think I’m sick,” is different from “I’m so sick of ESPN’s constant coverage of Tim Tebow.”
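To make the idea of telling the two senses of “sick” apart more concrete, here is a minimal sketch in Python. It is not Sadilek’s actual algorithm, which the article does not describe. It swaps in a tiny bag-of-words Naive Bayes classifier, and the training tweets, the labels "ill" and "other", and the function names are all invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical training tweets, invented for this sketch (not the study's data).
TRAINING = [
    ("my stomach hurts and i have a fever", "ill"),
    ("stuck in bed with the flu feeling sick", "ill"),
    ("think i caught a cold sore throat and cough", "ill"),
    ("so sick of this traffic", "other"),
    ("sick of the constant rain", "other"),
    ("that guitar solo was sick", "other"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label and collect the overall vocabulary."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    # Words never seen in training carry no evidence here, so skip them.
    tokens = [t for t in tokenize(text) if t in vocab]
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        for token in tokens:
            score += math.log((counts[token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


MODEL = train(TRAINING)

print(classify("My stomach is in revolt. Knew I shouldn't have licked "
               "that door knob. Think I'm sick", MODEL))
print(classify("I'm so sick of ESPN's constant coverage of Tim Tebow.", MODEL))
```

With this toy training set, the door knob tweet comes out as "ill" and the Tim Tebow tweet as "other", because words like “stomach” and “think” appear only in the illness examples while “of” and “constant” appear only in the others. A real system would need far more labeled tweets and richer features than six made-up sentences.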
Of course, Sadilek’s system isn’t a perfect crystal ball. Not everyone tweets about their symptoms and not everyone is on Twitter. But considering New York City has more Twitter users than any other city in the world, the Big Apple is as good a place as any for this study.
While one could look at this and marvel at the power of Twitter, I think the real story here is about two things: (1) the power of big data and (2) the power of social networks that Twitter harnesses. If you have people volunteering information about their lives, access to the data, and information about who users are connected to, you can do things that would have been very difficult even ten years ago.
It is interesting that this study was conducted in New York City, where a high percentage of people use Twitter. How good are the predictions in cities with lower usage rates? Are we headed toward a world where public health requires people to report on their health so that outbreaks can be contained or quelled?