Using analytics and statistics in sports and society: a ways to go

Truehoop has been doing a fine job covering the 2013 MIT Sloan Sports Analytics Conference. One post from last Saturday highlighted five quotes “On how far people have delved into the potential of analytics”:

“We are nowhere yet.”
— Daryl Morey, Houston Rockets general manager

“There is a human element in sports that is not quantifiable. These players bleed for you, give you everything they have, and there’s a bond there.”
— Bill Polian, ESPN NFL analyst

“When visualizing data, it’s not about how much can I put in but how much can I take out.”
— Joe Ward, The New York Times sports graphics editor

“If you are not becoming a digital CMO (Chief Marketing Officer), you are becoming extinct.”
— Tim McDermott, Philadelphia Eagles CMO

“Even if God came down and said this model is correct … there is still randomness, and you can be wrong.”
— Phil Birnbaum, By The Numbers editor

In other words, there is a lot of potential in these statistics and models but we have a long way to go in deploying them correctly. I think this is a good reminder when thinking about big data as well: simply having the numbers and recognizing they might mean something is a long way from making sense of the numbers and improving lives because of our new knowledge.

Argument: statistics can help us understand and enjoy baseball

An editor and writer for Baseball Prospectus argues that we need science and statistics to understand baseball:

Fight it if you like, but baseball has become too complicated to solve without science. Every rotation of every pitch is measured now. Every inch that a baseball travels is measured now. Teams that used to get mocked for using spreadsheets now rely on databases packed with precise location and movement of every player on every play — and those teams are the norm, not the film-inspiring exceptions. This is exciting and it’s terrifying…

I’m not a mathematician and I’m not a scientist. I’m a guy who tries to understand baseball with common sense. In this era, that means embracing advanced metrics that I don’t really understand. That should make me a little uncomfortable, and it does. WAR is a crisscrossed mess of routes leading toward something that, basically, I have to take on faith…

Yet baseball’s front offices, the people in charge of $100 million payrolls and all your hope for the 2013 season, side overwhelmingly with data. For team executives, the basic framework of WAR — measuring players’ total performance against a consistent baseline — is commonplace, used by nearly every front office, according to insiders. The writers who helped guide the creation of WAR over the decades — including Bill James, Sean Smith and Keith Woolner — work for teams now. As James told me, the war over WAR has ceased where it matters. “There’s a practical necessity for measurements like that in a front office that make it irrelevant whether you like them or you don’t.”

Whether you do is up to you and ultimately matters only to you. In the larger perspective, the debate is over, and data won. So fight it if you’d like. But at a certain point, the question in any debate against science is: What are you really fighting and why?

As someone who likes data, I would say statistics is just another tool that can help us understand baseball better. It doesn’t have to be an either/or argument, baseball with advanced statistics versus baseball without. Baseball with advanced statistics is more complete, getting at some of the underlying mechanics of the game rather than relying on visual cues or culturally accepted statistics.
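To make the “consistent baseline” idea from the excerpt concrete, here is a toy calculation. All of the numbers are invented and real WAR implementations are far more involved; the point is only that the metric compares a player’s production to what a freely available replacement would have produced over the same playing time.

```python
# Toy sketch of the "performance vs. a consistent baseline" idea behind WAR.
# All numbers are invented for illustration; real WAR formulas are far more involved.

RUNS_PER_WIN = 10  # rough sabermetric rule of thumb: ~10 runs is worth about 1 win

def toy_war(player_runs_created, replacement_runs_created):
    """Wins above a replacement-level baseline, given runs created."""
    runs_above_replacement = player_runs_created - replacement_runs_created
    return runs_above_replacement / RUNS_PER_WIN

# A hypothetical player who created 95 runs where a replacement-level player
# would have created 45 over the same playing time:
print(round(toy_war(95, 45), 1))  # -> 5.0 wins above replacement
```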

While this story is specifically about baseball, I think it also mirrors larger conversations in American society about the use of statistics. Why interrupt people’s common sense understandings of the world with abstract data? Aren’t these new statistics difficult to understand and can’t they also be manipulated? Some of this is true: looking at data can involve seeing things in new ways and there are disagreements about how to define concepts as well as how to collect and interpret data. But, in the end, these statistics can help us better understand the world.

A growing interest in science cafes in America?

Reuters reports on a supposedly growing trend: science cafes.

Science cafes have sprouted in almost every state including a tapas restaurant near downtown Orlando where Sean Walsh, 27, a graphic designer, describes himself and his friends as some of the laymen in the crowd…

But the typical participant brings at least some college-level education or at least a lively curiosity, said Edward Haddad, executive director of the Florida Academy of Sciences, which helped start up Orlando’s original cafe and organizes the events…

Haddad said the current national push to increase the number of U.S. graduates in science, technology, engineering and math, or the STEM fields, is driving up the number of science cafes…

The U.S. science cafe movement grew out of Cafe Scientifique in the United Kingdom. The first Cafe Scientifique popped up in Leeds in 1998 as a regularly scheduled event where all interested parties could participate in informal forums about the latest in science and technology.

I’m dubious that this is that big of a movement just because “almost every state” now has a science cafe. This is similar to journalists claiming that something is popular because there is a Facebook group devoted to it.

But, this sounds like a fascinating example of a “third place” where Americans can gather between home and work, learn, and interact with others interested in similar topics. In fact, it sounds more like a Parisian salon of the 1800s. However, the article also mentions these cafes are probably more attractive to the NPR crowd and I imagine many Americans would not want to go discuss science in a cafe.

I wonder if the news coverage would be different if Americans were gathering in cafes to talk about other topics. How about The Bachelor? The tea party? Religion? The tone of the article is that it is more unusual for Americans to want to hear about and discuss science when they are not being forced to.

h/t Instapundit

Confessions of researchers: #overlyhonestmethods

Here is a collection of 17 posts under the Twitter hashtag #overlyhonestmethods. My favorite: “We assume 50 Ivy League kids represent the general population, b/c ‘real people’ can be sketchy or expensive.” This doesn’t surprise me considering the number of undergraduates used in psychology studies.

I wonder how many researchers could tell similar stories about research methods. These admissions don’t necessarily invalidate any of the findings but rather hint at the very human dimension present in conducting research studies.

(Disclaimer: of course it is difficult to know how many of these research method confessions are true.)

Science more about consensus than proven facts

A new book titled The Half-Life of Facts looks at how science is more about consensus than canon. A book review in the Wall Street Journal summarizes the argument:

Knowledge, then, is less a canon than a consensus in a state of constant disruption. Part of the disruption has to do with error and its correction, but another part with simple newness—outright discoveries or new modes of classification and analysis, often enabled by technology. A single chapter in “The Half-Life of Facts” looking at the velocity of knowledge growth starts with the author’s first long computer download—a document containing Plato’s “Republic”—journeys through the rapid rise of the “@” symbol, introduces Moore’s Law describing the growth rate of computing power, and discusses the relevance of Clayton Christensen’s theory of disruptive innovation. Mr. Arbesman illustrates the speed of technological advancement with examples ranging from the magnetic properties of iron—it has become twice as magnetic every five years as purification techniques have improved—to the average distance of daily travel in France, which has exponentially increased over the past two centuries.

To cover so much ground in a scant 200 pages, Mr. Arbesman inevitably sacrifices detail and resolution. And to persuade us that facts change in mathematically predictable ways, he seems to overstate the predictive power of mathematical extrapolation. Still, he does show us convincingly that knowledge changes and that scientific facts are rarely as solid as they appear…

More commonly, however, changes in scientific facts reflect the way that science is done. Mr. Arbesman describes the “Decline Effect”—the tendency of an original scientific publication to present results that seem far more compelling than those of later studies. Such a tendency has been documented in the medical literature over the past decade by John Ioannidis, a researcher at Stanford, in areas as diverse as HIV therapy, angioplasty and stroke treatment. The cause of the decline may well be a potent combination of random chance (generating an excessively impressive result) and publication bias (leading positive results to get preferentially published)…

Science, Mr. Arbesman observes, is a “terribly human endeavor.” Knowledge grows but carries with it uncertainty and error; today’s scientific doctrine may become tomorrow’s cautionary tale. What is to be done? The right response, according to Mr. Arbesman, is to embrace change rather than fight it. “Far better than learning facts is learning how to adapt to changing facts,” he says. “Stop memorizing things . . . memories can be outsourced to the cloud.” In other words: In a world of information flux, it isn’t what you know that counts—it is how efficiently you can refresh.
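The “Decline Effect” mechanism the review describes is easy to see in a quick simulation: if only estimates that clear a significance-style bar get published, the published results will overstate a small true effect, and later studies will appear to show the effect shrinking. This is a minimal sketch with made-up numbers, not a model of any particular literature.

```python
import random
import statistics

random.seed(1)

TRUE_EFFECT = 0.2      # the real (small) effect size
NOISE_SD = 1.0         # noise in each individual measurement
N_PER_STUDY = 30       # small early studies
PUBLICATION_BAR = 0.5  # only estimates this large look "significant" and get published

def run_study(n):
    """Return one study's estimated effect: the truth plus sampling noise."""
    return TRUE_EFFECT + random.gauss(0, NOISE_SD / n ** 0.5)

estimates = [run_study(N_PER_STUDY) for _ in range(10_000)]
published = [e for e in estimates if e > PUBLICATION_BAR]

print(f"True effect:               {TRUE_EFFECT:.2f}")
print(f"Average of all studies:    {statistics.mean(estimates):.2f}")
print(f"Average of published ones: {statistics.mean(published):.2f}")
# The published average sits well above the true effect, so follow-up studies
# drawn from the full distribution look like the effect has "declined."
```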

To add to the conclusion of this review as cited above, it is less about the specific content of the scientific facts and more about the scientific method one uses to arrive at scientific conclusions. There is a reason the scientific process is taught starting in grade school: the process is supposed to help observers get around their own biases and truly observe reality in a reliable and valid way. Of course, whether our biases can actually be eliminated and how we go about observing both matter for our results, but it is the process itself that remains intact.

This also gets to an issue some colleagues and I have noticed where college students talk about “proving” things about the world (natural or social). The language of “proof” implies that data collection and analysis can yield unchanging facts which cannot be disputed. But, as this book points out, this is not how science works. When a researcher finds something interesting, they report on their finding and then others go about retesting it or applying it to new areas. Over time, knowledge accumulates. To put it in the terms of this review, a consensus is eventually reached. But, new information can counteract this consensus and the paradigm building process starts over again (a la Thomas Kuhn in The Structure of Scientific Revolutions). This doesn’t mean science can’t tell us anything but it does mean that the theories and findings of science can change over time (and here is another interesting discussion point: what exactly distinguishes a law, a theory, and a finding?).

In the end, science requires a longer view. As I’ve noted before, the media tends to play up new scientific findings but we are better served looking at the big picture of scientific findings and waiting for a consensus to emerge.

Sociologist defends statistical predictions for elections and other important information

Political polling has come under a lot of recent fire but a sociologist defends these predictions and reminds us that we rely on many such predictions:

We rely on statistical models for many decisions every single day, including, crucially: weather, medicine, and pretty much any complex system in which there’s an element of uncertainty to the outcome. In fact, these are the same methods by which scientists could tell Hurricane Sandy was about to hit the United States many days in advance…

This isn’t wizardry, this is the sound science of complex systems. Uncertainty is an integral part of it. But that uncertainty shouldn’t suggest that we don’t know anything, that we’re completely in the dark, that everything’s a toss-up.

Polls tell you the likely outcome with some uncertainty and some sources of (both known and unknown) error. Statistical models take a bunch of factors and run lots of simulations of elections by varying those outcomes according to what we know (such as other polls, structural factors like the economy, what we know about turnout, demographics, etc.) and what we can reasonably infer about the range of uncertainty (given historical precedents and our logical models). These models then produce probability distributions…

Refusing to run statistical models simply because they produce probability distributions rather than absolute certainty is irresponsible. For many important issues (climate change!), statistical models are all we have and all we can have. We still need to take them seriously and act on them (well, if you care about life on Earth as we know it, blah, blah, blah).

A key point here: statistical models have uncertainty (we are making inferences about larger populations or systems from samples that we can collect) but that doesn’t necessarily mean they are flawed.
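As a minimal sketch of what “run lots of simulations and produce a probability distribution” looks like in code: the states, poll numbers, and error sizes below are invented for illustration and are not drawn from any actual forecast.

```python
import random

random.seed(42)

# Hypothetical polling averages (candidate A's share of the two-party vote)
# and electoral votes for three made-up swing states.
states = {
    "State X": {"poll": 0.51, "ev": 20},
    "State Y": {"poll": 0.49, "ev": 15},
    "State Z": {"poll": 0.52, "ev": 10},
}
POLL_ERROR_SD = 0.03  # assumed polling error (~3 points), partly shared across states
EV_NEEDED = 23        # invented winning threshold for this toy map

def simulate_once():
    """One simulated election: shift all polls by shared and state-level errors."""
    national_shift = random.gauss(0, POLL_ERROR_SD / 2)  # error common to all states
    ev = 0
    for info in states.values():
        share = info["poll"] + national_shift + random.gauss(0, POLL_ERROR_SD)
        if share > 0.5:
            ev += info["ev"]
    return ev

wins = sum(simulate_once() >= EV_NEEDED for _ in range(20_000))
print(f"Candidate A wins in {wins / 20_000:.1%} of simulations")
```

The output is not a guarantee of the outcome; it is a probability that summarizes both what the polls say and how uncertain they are.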

A second key point: because of what I stated above, we should expect that some statistical predictions will be wrong. But this is how science works: you tweak models, take in more information, perhaps change your data collection, perhaps use different methods of analysis, and hope to get better. While it may not be exciting, even learning what we don’t know helps move us toward better answers.

I’ve become more convinced in recent years that one of the reasons polls are not used effectively in reporting is that many in the media don’t know exactly how they work. Journalists need to be trained in how to read, interpret, and report on data. This could also be a time issue; how much time do those in the media have to pore over the details of research findings, or do they simply have to scan for new findings? Scientists can pump out study after study but part of the dissemination of this information to the public requires a media that understands how scientific research and the scientific process work. This includes understanding how models are consistently refined, collecting the right data to answer the questions we want to answer, and looking at the accumulated scientific research rather than just grabbing the latest attention-getting finding.

An alternative to this idea about media statistical illiteracy is presented in the article: perhaps the media knows how polls work but likes a political horse race. This may also be true, but there is a lot of reporting on statistics and data outside of political elections that also needs work.

How one woman helped make preventable injuries an American public health issue

The epidemiologist Susan P. Baker devoted her career to making preventable injuries a public health issue. Here is part of the story:

She embarked on an independent research project — a comparison of drivers who were not responsible for their fatal crashes with drivers who were — and in 1968 she sent Haddon a letter seeking federal financing for her study. He came through with $10,000 and continued to finance her research after he became president of the Insurance Institute for Highway Safety a year later…

Among Baker’s most important legacies is the widespread use of the infant car seat. By examining data from car crashes, she demonstrated that the passengers most likely to die were those younger than 6 months. They were killed at double the rate of 1-year-olds and triple the rate for ages 6 to 12. Why? Because babies rested in their mothers’ arms or laps, often in the front passenger seat, and because their still-fragile bodies were more susceptible to fatal injury than those of older children. Baker published her study in the journal Pediatrics in 1979, making headlines in newspapers across the country…

Around that time, Baker was one of the main authors of a report calling for the creation of a federal injury-prevention agency. Today the National Center for Injury Prevention and Control coordinates with state programs and underwrites research projects aimed at preventing injury, ranging from the intentional (rape, homicide, suicide) to the unintentional (falls, residential fires, drownings)…

Of course, Baker knows that we can’t make the world completely injury-proof. But her decades of research show how fairly simple preventive measures — fences around swimming pools, bike helmets, childproof caps on medicine containers — can save thousands of lives.

I couldn’t help thinking while reading this story that it demonstrates the interplay between science, culture, and government. The first paragraph of the article argues that in the 1960s few people worried about preventable injuries but this has clearly changed since. Aiding this process were new scientific findings about injuries as well as clearly presented statistics that captured people’s attention. This reminds me of sociologist Joel Best’s explanation in Damned Lies and Statistics that the use of statistics emerged in the mid 1800s because reformers wanted to attach numbers and science to social problems they cared about. But for these numbers to matter and the science to be taken seriously, you need a culture as well as institutions that see science as a viable way of knowing about the world. Similarly, the numbers themselves are not enough to immediately lead to change; social problems such as automobile deaths go through a process by which the public becomes aware, a critical mass starts pressing the issue, and leaders respond by changing regulations. Is it a coincidence that these concerns about public health began to emerge in the 1960s at the same time as American ascendancy in the scientific realm, the growth of the welfare state, the continued development of the mass media as well as mass consumption, and an era of more movements calling for human rights and governmental protections? Probably not.

h/t Instapundit

Another call for the need for theory when working with big data

Big data is not just about allowing researchers to look at really large samples or lots of information at once. It also requires the use of theory and asking new kinds of questions:

Like many other researchers, sociologist and Microsoft researcher Duncan Watts performs experiments using Mechanical Turk, an online marketplace that allows users to pay others to complete tasks. Used largely to fill in gaps in applications where human intelligence is required, social scientists are increasingly turning to the platform to test their hypotheses…

This is a point political forecaster and author Nate Silver discusses in his recent book The Signal and the Noise. After discussing economic forecasters who simply gather as much data as possible and then make inferences without respect for theory, he writes:

This kind of statement is becoming more common in the age of Big Data. Who needs theory when you have so much information? But this is categorically the wrong attitude to take toward forecasting, especially in a field like economics, where the data is so noisy. Statistical inferences are much stronger when backed up by theory or at least some deeper thinking about their root causes…

The value of big data isn’t simply in the answers it provides, but rather in the questions it suggests that we ask.

This follows a similar recent argument made on the Harvard Business Review website.

I like the emphasis here on the new kinds of questions that might be possible with big data. There are a few ways this could happen:

1. Uniquely large datasets might allow for different comparisons, particularly among smaller groups, that are difficult to examine even with nationally representative samples (see the sketch after this list).

2. Platforms like Amazon’s Mechanical Turk allow experiments to be conducted much faster, so more can be done in less time. Additionally, I wonder if this could help alleviate some of the replication issues that pop up in scientific research.

3. Instead of having to be constrained by data limitations, big data might give researchers creative space to think on a larger scale and more outside of the box.
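To put rough numbers on the first point, here is a back-of-the-envelope comparison of the margin of error for a small subgroup (assumed, for illustration, to make up 2 percent of cases) in a typical survey versus a very large dataset, using the standard approximation for a proportion.

```python
from math import sqrt

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n cases."""
    return z * sqrt(p * (1 - p) / n)

p = 0.5  # worst-case proportion for the width of the interval
# Suppose the subgroup of interest is 2% of all cases.
for total_n in (1_500, 1_000_000):
    subgroup_n = int(total_n * 0.02)
    print(f"total n={total_n:>9,}: subgroup n={subgroup_n:>6,}, "
          f"margin of error about ±{margin_of_error(p, subgroup_n):.1%}")
# n=1,500     -> subgroup of 30:     margin of error around ±18 points
# n=1,000,000 -> subgroup of 20,000: margin of error under ±1 point
```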

Of course, lots of topics are not well-suited for looking at through big data but such information does offer unique opportunities for researchers and theories.

A company offers to replicate research study findings

A company formed in 2011 is offering a new way to validate the findings of research studies:

A year-old Palo Alto, California, company, Science Exchange, announced on Tuesday its “Reproducibility Initiative,” aimed at improving the trustworthiness of published papers. Scientists who want to validate their findings will be able to apply to the initiative, which will choose a lab to redo the study and determine whether the results match.

The project sprang from the growing realization that the scientific literature – from social psychology to basic cancer biology – is riddled with false findings and erroneous conclusions, raising questions about whether such studies can be trusted. Not only are erroneous studies a waste of money, often taxpayers’, but they also can cause companies to misspend time and resources as they try to invent drugs based on false discoveries.

This addresses a larger concern about how many research studies found their results by chance alone:

Typically, scientists must show that results have only a 5 percent chance of having occurred randomly. By that measure, one in 20 studies will make a claim about reality that actually occurred by chance alone, said John Ioannidis of Stanford University, who has long criticized the profusion of false results.

With some 1.5 million scientific studies published each year, by chance alone some 75,000 are probably wrong.
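The arithmetic behind that 75,000 figure is simply the 5 percent false-positive threshold applied to 1.5 million studies; it is a rough estimate that treats every study as a single test at that threshold.

```python
# Rough arithmetic behind the quote: a 5 percent false-positive threshold
# applied to roughly 1.5 million studies published each year.
studies_per_year = 1_500_000
alpha = 0.05  # conventional significance threshold

expected_chance_findings = studies_per_year * alpha
print(f"{expected_chance_findings:,.0f} studies per year")  # -> 75,000 studies per year
```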

I’m intrigued by the idea of having an independent company assess research results. This could work in conjunction with other methods of verifying research results:

1. The original researchers could run multiple studies. This works better with smaller studies but it could be difficult when the N is larger and more resources are needed.

2. Researchers could also make their data available as they publish their paper. This would allow other researchers to take a look and see if things were done correctly and if the results could be replicated.

3. The larger scientific community should endeavor to replicate studies. This is the way science is supposed to work: if someone finds something new, other researchers should adopt a similar protocol and test it with similar and new populations. Unfortunately, replicating studies is not seen as being very glamorous and it tends not to receive the same kind of press attention.

The primary focus of this article seems to be on medical research. Perhaps this is because it can affect the lives of many and involves big money. But it would be interesting to apply this to more social science studies as well.

Argument: still need thinking even with big data

Justin Fox argues that the rise of big data doesn’t mean we can abandon thinking about data and relationships between variables:

Big data, it has been said, is making science obsolete. No longer do we need theories of genetics or linguistics or sociology, Wired editor Chris Anderson wrote in a manifesto four years ago: “With enough data, the numbers speak for themselves.”…

There are echoes here of a centuries-old debate, unleashed in the 1600s by protoscientist Sir Francis Bacon, over whether deduction from first principles or induction from observed reality is the best way to get at truth. In the 1930s, philosopher Karl Popper proposed a synthesis, in which the only scientific approach was to formulate hypotheses (using deduction, induction, or both) that were falsifiable. That is, they generated predictions that — if they failed to pan out — disproved the hypothesis.

Actual scientific practice is more complicated than that. But the element of hypothesis/prediction remains important, not just to science but to the pursuit of knowledge in general. We humans are quite capable of coming up with stories to explain just about anything after the fact. It’s only by trying to come up with our stories beforehand, then testing them, that we can reliably learn the lessons of our experiences — and our data. In the big-data era, those hypotheses can often be bare-bones and fleeting, but they’re still always there, whether we acknowledge them or not.

“The numbers have no way of speaking for themselves,” political forecaster Nate Silver writes, in response to Chris Anderson, near the beginning of his wonderful new doorstopper of a book, The Signal and the Noise: Why So Many Predictions Fail — But Some Don’t. “We speak for them.”

These days, finding and examining data is much easier than before but it is still necessary to interpret what these numbers mean. Observing relationships between variables doesn’t necessarily tell us something valuable. We also want to know why variables are related, and this is where hypotheses come in. Careful hypothesis testing means we can rule out spurious associations and other variables that may be producing the observed relationship, look for the influence of one variable on another while controlling for other factors (the essence of regression), or examine more complex models where a variety of variables affect each other at the same time.
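A small simulation shows what “controlling for other factors” buys us: below, a single confounder drives both x and y, so the raw association looks substantial but shrinks toward zero once the confounder enters the model. The data and the simple least-squares setup are made up for illustration, not taken from any particular study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

confounder = rng.normal(size=n)               # some background factor
x = 0.8 * confounder + rng.normal(size=n)     # x is driven partly by the confounder
y = 0.8 * confounder + rng.normal(size=n)     # y is too, but not by x at all

def coefficient_on_x(*controls):
    """Coefficient on x from a least-squares fit of y on an intercept, x, and controls."""
    X = np.column_stack([np.ones(n), x, *controls])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs[1]

print(f"x coefficient, no controls:                 {coefficient_on_x():.2f}")          # around 0.4
print(f"x coefficient, controlling for confounder:  {coefficient_on_x(confounder):.2f}")  # near 0
```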

At the opposite end of the scientific process from the hypotheses, utilizing findings when creating and implementing policies will also require thinking. Once we have established that relationships likely exist, it takes even more work to respond to them in useful and effective ways.