Using a GRIM method to find unlikely published results

Discovering which published studies may be incorrect or fraudulent takes some work, and here is a newer tool for the job: GRIM.

GRIM is the acronym for Granularity-Related Inconsistency of Means, a mathematical method that determines whether an average reported in a scientific paper is consistent with the reported sample size and number of items. Here’s a less-technical answer: GRIM is a B.S. detector. The method is based on the simple insight that only certain averages are possible given certain sets of numbers. So if a researcher reports an average that isn’t possible, given the relevant data, then that researcher either (a) made a mistake or (b) is making things up.
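To make the granularity insight concrete, here is a minimal sketch of the check in Python. This is my own illustration, not Brown and Heathers’ actual code, and the function name grim_test and its interface are assumptions. With n integer-scored responses, every achievable mean is a whole-number sum divided by n, so a reported mean that no whole-number sum can reproduce at the published precision gets flagged:

```python
import math

def grim_test(reported_mean: str, n: int, items: int = 1) -> bool:
    """Return True if the reported mean is achievable given the sample
    size, False if it is mathematically impossible (a GRIM flag).

    reported_mean: the mean exactly as published, as a string ("3.27"),
                   so the number of reported decimal places is preserved
    n:             the reported sample size
    items:         number of integer-scored items averaged together
    """
    decimals = len(reported_mean.partition(".")[2])
    mean = float(reported_mean)
    grains = n * items  # every possible mean is (integer sum) / grains

    # The two integer sums whose means bracket the reported value are the
    # only candidates; if neither rounds back to the published figure, no
    # set of integer responses could have produced that mean.
    for total in (math.floor(mean * grains), math.ceil(mean * grains)):
        if f"{total / grains:.{decimals}f}" == reported_mean:
            return True
    return False
```

For example, with 21 people answering a single integer-scored question, grim_test("3.27", 21) returns False (the nearest achievable means are 68/21 ≈ 3.24 and 69/21 ≈ 3.29), while grim_test("3.29", 21) returns True. A fuller implementation would also account for the different rounding conventions papers use (round-half-up versus Python’s round-half-even).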

GRIM is the brainchild of Nick Brown and James Heathers, who published a paper last year in Social Psychological and Personality Science explaining the method. Using GRIM, they examined 260 psychology papers that appeared in well-regarded journals and found that, of the ones that provided enough data to check, half contained at least one mathematical inconsistency. One in five had multiple inconsistencies. The majority of those, Brown points out, are “honest errors or slightly sloppy reporting.”…

After spotting the Wansink post, Anaya took the numbers in the papers and — to coin a verb — GRIMMED them. The program found that the four papers based on the Italian buffet data were shot through with impossible math. If GRIM were an actual machine, rather than a humble piece of code, its alarms would have been blaring. “This lights up like a Christmas tree,” Brown said after highlighting on his computer screen the errors Anaya had identified…

Anaya, along with Brown and Tim van der Zee, a graduate student at Leiden University in the Netherlands, wrote a paper pointing out the 150 or so GRIM inconsistencies in those four Italian-restaurant papers that Wansink co-authored. They found discrepancies between the papers, even though they’re obviously drawn from the same dataset, and discrepancies within the individual papers. It didn’t look good. They drafted the paper using Twitter direct messages and titled it, memorably, “Statistical heartburn: An attempt to digest four pizza publications from the Cornell Food and Brand Lab.”

I wonder how long it will be before journals employ such methods on submitted manuscripts. Imagine Turnitin for academic studies. Then what would happen to authors if problems were found?

It also sounds like such a program could make it easy to do mass analysis of published studies, helping to answer questions such as how many findings are fraudulent.
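As a rough illustration of that kind of mass screening, one could run the grim_test sketch above over a table of reported means and sample sizes pulled from papers. The (mean, N) pairs below are invented placeholders, not real published values:

```python
# Hypothetical batch screen built on the grim_test sketch above.
# Every entry here is a made-up placeholder, not a real study.
reported_stats = [
    ("Study A", "3.27", 21),
    ("Study B", "4.50", 28),
    ("Study C", "2.66", 45),
]

for label, mean, n in reported_stats:
    verdict = "consistent" if grim_test(mean, n) else "impossible mean"
    print(f"{label}: M={mean}, N={n} -> {verdict}")
```

Of course, a tally of GRIM flags would only show inconsistency, not fraud; as Brown notes above, most flagged results turn out to be honest errors or sloppy reporting.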

Perhaps it is too easy to ask whether GRIM itself has been vetted by outside researchers…

Chicago innovation #14: Consumer preference research

A cousin of social science research, consumer preference research, got its start in Chicago:

It was 1928. Benton was working at Chicago’s Lord & Thomas advertising agency when owner Albert Lasker told him to land Colgate-Palmolive by impressing the outsized toiletry powerhouse with market research. Benton worked night and day for two months to record housewives’ preferences for the products of each company.

The firm used the pioneering survey in its initial Colgate-Palmolive campaign and landed the account before the survey was completed.

This drew criticism from an early sociologist:

Sociologist Gordon Hancock hated the idea. It was tantamount to cheating.

In a statement that must have brought grins to the faces of that up-and-coming generation of ad men, Hancock decried in 1926: “Excessive scientific advertising takes undue advantage of the public.”

This was, of course, the point.

This tension between marketing and sociology still exists today. The two areas use similar methods of collecting data, such as surveys, focus groups, interviews, and ethnographies or participant observation. However, they have very different purposes: marketing aims to sell products, while sociology tries to uncover how social life works. The tension reminds me of Morgan Spurlock’s documentary The Greatest Movie Ever Sold, which questions the marketing complex.

This comes in at #14 on a list of the top 20 innovations from Chicago; I highlighted the #5 innovation, balloon frame housing, in an earlier post.


More on limits of Census measures of race and ethnicity

Here is some more information about the limitations of measuring race with the current questions in the United States Census:

When the 2010 census asked people to classify themselves by race, more than 21.7 million — at least 1 in 14 — went beyond the standard labels and wrote in such terms as “Arab,” “Haitian,” “Mexican” and “multiracial.”

The unpublished data, the broadest tally to date of such write-in responses, are a sign of a diversifying America that’s wrestling with changing notions of race…

“It’s a continual problem to measure such a personal concept using a check box,” said Carolyn Liebler, a sociology professor at the University of Minnesota who specializes in demography, identity and race. “The world is changing, and more people today feel free to identify themselves however they want — whether it’s black-white, biracial, Scottish-Nigerian or American. It can create challenges whenever a set of people feel the boxes don’t fit them.”

In an interview, Census Bureau officials said they have been looking at ways to improve responses to the race question based on focus group discussions during the 2010 census. The research, some of which is scheduled to be released later this year, examines whether to include new write-in lines for whites and blacks who wish to specify ancestry or nationality; whether to drop use of the word “Negro” from the census form as antiquated; and whether to treat Hispanics as a group mutually exclusive of the four main race categories.

This highlights some of the recurring issues in social science research:

1. Social science categories change as people’s own understanding of the terms changes. Keeping up with these understandings can be difficult, and there is always a lag. For example, a sizable group of respondents in the 2010 Census didn’t like the categories, but the problem can’t be fixed until a future Census.

2. Adding write-in options or more questions means that the Census becomes longer, requiring more time to take and analyze. With all of the Census forms that are returned, this is no small matter.

3. Comparing results of repeated surveys like the Census can become quite difficult when the definitions change.

4. The Census Bureau is going to change things based on focus groups? I assume officials will also test permutations of the questions and possible categories in smaller-scale surveys before settling on what to do.

Ethics and social science: grad student gets 6-month sentence for studying animal rights groups

This is an update of a story I have been tracking for a while: a sociology graduate student who had studied animal rights groups was sentenced to six months in jail. Here is a brief summary of where the case now stands:

Scott DeMuth, a sociology graduate student at the University of Minnesota, was sentenced yesterday to 6 months in federal prison for his role in a 2006 raid on a Minnesota ferret farm. A judge in Davenport, Iowa, ordered that DeMuth be taken into custody immediately.

In 2009, DeMuth was charged with felony conspiracy in connection with a separate incident, a 2004 lab break-in at the University of Iowa that caused more than $400,000 in damage. DeMuth argued that anything he might know about the Iowa incident had been collected as part of his research on radical activist groups and was therefore protected by confidentiality agreements with his research subjects. A petition started by DeMuth’s graduate advisor, David Pellow, argued that the charges violated DeMuth’s academic freedom.

Last year, prosecutors offered to drop all charges related to the Iowa break-in if DeMuth would plead guilty to a lesser misdemeanor charge related to the ferret farm incident. DeMuth took the deal. No one has been convicted in the Iowa break-in.

This has been an interesting case to introduce when teaching ethics to sociology and anthropology majors in a research class. Just how far should participant observation go? Couple this with another story, like Venkatesh knowing about possible crimes in Gang Leader for a Day, and a good conversation typically ensues.

However, this case does bring up some larger questions about how protected researchers and their subjects should be when carrying out their research. Should researchers have shield laws? How exactly do courts define “academic freedom” in cases like this?

The trolley problem, race, and making decisions

The trolley problem is a classic vignette used in research studies: it asks under what conditions it is permissible to sacrifice one life for the lives of others (see an explanation here). Psychologist David Pizarro tweaked the trolley problem to include racial dimensions by using characters named Chip and Tyrone. Pizarro found that people’s opinions about race influenced which character they were more willing to sacrifice:

What did this say about people’s morals? Not that they don’t have any. It suggests that they had more than one set of morals, one more consequentialist than another, and chose the one that fit the situation…

Or as Pizarro told me on the phone, “The idea is not that people are or are not utilitarian; it’s that they will cite being utilitarian when it behooves them. People aren’t using these principles and then applying them. They arrive at a judgment and seek a principle.”

So we’ll tell a child on one day, as Pizarro’s parents told him, that ends should never justify means, then explain the next day that while it was horrible to bomb Hiroshima, it was morally acceptable because it shortened the war. We act — and then cite whichever moral system fits best, the relative or the absolute.

Some interesting findings from a different take on a classic research tool. It is always a worthwhile question to ask regarding many social issues: when does the end justify the means and when does it not?