One commentator suggests Big Data can’t quite capture what makes humans human:
I have been browsing in the literature on “sentiment analysis,” a branch of digital analytics that—in the words of a scientific paper—“seeks to identify the viewpoint(s) underlying a text span.” This is accomplished by mechanically identifying the words in a proposition that originate in “subjectivity,” and thereby obtaining an accurate understanding of the feelings and the preferences that animate the utterance. This finding can then be tabulated and integrated with similar findings, with millions of them, so that a vast repository of information about inwardness can be created: the Big Data of the Heart. The purpose of this accumulated information is to detect patterns that will enable prediction: a world with uncertainty steadily decreasing to zero, as if that is a dream and not a nightmare. I found a scientific paper that even provided a mathematical model for grief, which it bizarrely defined as “dissatisfaction.” It called its discovery the Good Grief Algorithm.
The mathematization of subjectivity will founder upon the resplendent fact that we are ambiguous beings. We frequently have mixed feelings, and are divided against ourselves. We use different words to communicate similar thoughts, but those words are not synonyms. Though we dream of exactitude and transparency, our meanings are often approximate and obscure. What algorithm will capture “the feel of not to feel it, / when there is none to heal it,” or “half in love with easeful Death”? How will the sentiment analysis of those words advance the comprehension of bleak emotions? (In my safari into sentiment analysis I found some recognition of the problem of ambiguity, but it was treated as merely a technical obstacle.) We are also self-interpreting beings—that is, we deceive ourselves and each other. We even lie. It is true that we make choices, and translate our feelings into actions; but a choice is often a coarse and inadequate translation of a feeling, and a full picture of our inner states cannot always be inferred from it. I have never voted wholeheartedly in a general election.
For the purpose of the outcome of an election, of course, it does not matter that I vote complicatedly. All that matters is that I vote. The same is true of what I buy. A business does not want my heart; it wants my money. Its interest in my heart is owed to its interest in my money. (For business, dissatisfaction is grief.) It will come as no surprise that the most common application of the datafication of subjectivity is to commerce, in which I include politics. Again and again in the scholarly papers on sentiment analysis the examples given are restaurant reviews and movie reviews. This is fine: the study of the consumer is one of capitalism’s oldest techniques. But it is not fine that the consumer is mistaken for the entirety of the person. Mayer-Schönberger and Cukier exult that “datafication is a mental outlook that may penetrate all areas of life.” This is the revolution: the Rotten Tomatoes view of life. “Datafication represents an essential enrichment in human comprehension.” It is this inflated claim that gives offense. It would be more proper to say that datafication represents an essential enrichment in human marketing. But marketing is hardly the supreme or most consequential human activity. Subjectivity is not most fully achieved in shopping. Or is it, in our wired consumerist satyricon?
“With the help of big data,” Mayer-Schönberger and Cukier continue, “we will no longer regard our world as a string of happenings that we explain as natural and social phenomena, but as a universe comprised essentially of information.” An improvement! Can anyone seriously accept that information is the essence of the world? Of our world, perhaps; but we are making this world, and acquiescing in its making. The religion of information is another superstition, another distorting totalism, another counterfeit deliverance. In some ways the technology is transforming us into brilliant fools. In the riot of words and numbers in which we live so smartly and so articulately, in the comprehensively quantified existence in which we presume to believe that eventually we will know everything, in the expanding universe of prediction in which hope and longing will come to seem obsolete and merely ignorant, we are renouncing some of the primary human experiences. We are certainly renouncing the inexpressible. The other day I was listening to Mahler in my library. When I caught sight of the computer on the table, it looked small.
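The mechanical procedure described in the passage above, identifying "subjective" words in a text span and tabulating them, can be sketched in a few lines. This is a minimal lexicon-based sketch, not any particular system's implementation; the tiny lexicon is invented purely for illustration. It also makes the ambiguity objection concrete: a naive word tally scores Keats's bleak "half in love with easeful Death" as mildly positive, because "love" and "easeful" outvote "Death".

```python
# A toy lexicon mapping "subjective" words to polarity scores.
# Invented for illustration; real systems use lexicons of thousands
# of entries, or learned models.
LEXICON = {
    "love": 1, "easeful": 1, "good": 1, "heal": 1,
    "grief": -1, "death": -1, "dissatisfaction": -1,
}

def sentiment_score(text: str) -> int:
    """Sum the polarity of every lexicon word found in the text."""
    words = text.lower().replace(",", " ").split()
    return sum(LEXICON.get(w, 0) for w in words)

# "half in love with easeful Death": +1 (love) +1 (easeful) -1 (death)
print(sentiment_score("half in love with easeful Death"))  # 1
```

The tally comes out at +1, a mildly positive "sentiment" for a line about longing for death, which is exactly the kind of failure the passage predicts for purely mechanical approaches.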
I think a couple of different arguments are possible about the limitations of big data, and Wieseltier is making a particular one. He does not appear to be saying that big data can't predict or model human complexity. Fans of big data would probably say the biggest issue is that we simply don't have enough data yet and that our models keep improving; in other words, our abilities and data will eventually catch up to the problem of complexity. But Wieseltier is arguing something else: he, along with many others, does not want humans to be reduced to information. Even with the best models, it is one thing to see people as complex individuals and quite another to say they are simply another piece of information. The latter takes away people's dignity. Reducing people to data means we stop seeing them as people who can change their minds, be creative, and confound predictions.
It will be interesting to see how this plays out in the coming years. I think this is the same fear many people have about statistics: particularly in a modern world where we see ourselves as sovereign individuals, describing statistical trends to people strikes them as reducing their agency and negating their experiences. Of course, this is not what statistics is about, and it is something more training in statistics could help change. But how we talk about data and its uses might go a long way toward determining how big data is viewed in the future.