Another call for theory when working with big data

Big data is not just about letting researchers examine very large samples or lots of information at once. It also requires using theory and asking new kinds of questions:

Like many other researchers, sociologist and Microsoft researcher Duncan Watts performs experiments using Mechanical Turk, an online marketplace that allows users to pay others to complete tasks. Used largely to fill in gaps in applications where human intelligence is required, social scientists are increasingly turning to the platform to test their hypotheses…

This is a point political forecaster and author Nate Silver discusses in his recent book The Signal and the Noise. After discussing economic forecasters who simply gather as much data as possible and then make inferences without respect for theory, he writes:

This kind of statement is becoming more common in the age of Big Data. Who needs theory when you have so much information? But this is categorically the wrong attitude to take toward forecasting, especially in a field like economics, where the data is so noisy. Statistical inferences are much stronger when backed up by theory or at least some deeper thinking about their root causes…

The value of big data isn’t simply in the answers it provides, but rather in the questions it suggests that we ask.

This follows a similar recent argument made on the Harvard Business Review website.

I like the emphasis here on the new kinds of questions that might be possible with big data. There are a couple of ways these could happen:

1. Uniquely large datasets might allow for different comparisons, particularly among smaller groups, that are more difficult to look at even with nationally representative samples.

2. Experiments run through platforms like Amazon’s Mechanical Turk can be conducted quickly, so more of them can be done in less time. Additionally, I wonder if this could help alleviate some of the replication issues that pop up with scientific research.

3. Instead of being constrained by data limitations, researchers might find that big data gives them creative space to think on a larger scale and further outside the box.

Of course, many topics are not well suited to study through big data, but such information does offer unique opportunities for researchers and theories.
