Big data is not just about letting researchers look at very large samples or lots of information at once. It also requires using theory and asking new kinds of questions:
Like many other researchers, sociologist and Microsoft researcher Duncan Watts performs experiments using Mechanical Turk, an online marketplace that allows users to pay others to complete tasks. Used largely to fill in gaps in applications where human intelligence is required, social scientists are increasingly turning to the platform to test their hypotheses…
This is a point political forecaster and author Nate Silver discusses in his recent book The Signal and the Noise. After discussing economic forecasters who simply gather as much data as possible and then make inferences without respect for theory, he writes:
This kind of statement is becoming more common in the age of Big Data. Who needs theory when you have so much information? But this is categorically the wrong attitude to take toward forecasting, especially in a field like economics, where the data is so noisy. Statistical inferences are much stronger when backed up by theory or at least some deeper thinking about their root causes…
The value of big data isn’t simply in the answers it provides, but rather in the questions it suggests that we ask.
This follows a similar recent argument made on the Harvard Business Review website.
I like the emphasis here on the new kinds of questions that might be possible with big data. There are a few ways this could happen:
1. Uniquely large datasets might allow for different comparisons, particularly among smaller groups, that are more difficult to look at even with nationally representative samples.
2. Experiments run through platforms like Amazon’s Mechanical Turk can be conducted quickly, so more of them can be done in less time. Additionally, I wonder if this could help alleviate some of the replication issues that pop up in scientific research.
3. Instead of being constrained by data limitations, researchers working with big data might have the creative space to think on a larger scale and further outside the box.
Of course, lots of topics are not well suited to study through big data, but such data does offer unique opportunities for researchers and theories.