Early in statistics courses, students learn about the different ways that variables can be measured. These are often broken down into three categories: nominal variables (unordered, unranked), ordinal variables (ranked, but with uneven spacing between categories), and interval-ratio variables (ranked, with consistent spacing between categories). Decisions about how to measure variables can significantly influence what can be done with the data later. For example, here is a study that received a lot of attention when it was published, but in which the researchers miscoded a nominal variable:
In 2015, a paper by Jean Decety and co-authors reported that children who were brought up religiously were less generous. The paper received a great deal of attention, and was covered by over 80 media outlets including The Economist, the Boston Globe, the Los Angeles Times, and Scientific American. As it turned out, however, the paper by Decety was wrong. Another scholar, Azim Shariff, a leading expert on religion and pro-social behavior, was surprised by the results, as his own research and meta-analysis (combining evidence across studies from many authors) indicated that religious participation, in most settings, increased generosity. Shariff requested the data to try to understand more clearly what might explain the discrepancy.
To Decety’s credit, he released the data. And upon re-analysis, Shariff discovered that the results were due to a coding error. The data had been collected across numerous countries, e.g. United States, Canada, Turkey, etc. and the country information had been coded as “1, 2, 3…” Although Decety’s paper had reported that they had controlled for country, they had accidentally not controlled for each country, but just treated it as a single continuous variable so that, for example “Canada” (coded as 2) was twice the “United States” (coded as 1). Regardless of what one might think about the relative merits and rankings of countries, this is obviously not the right way to analyze data. When it was correctly analyzed, using separate indicators for each country, Decety’s “findings” disappeared. Shariff’s re-analysis and correction was published in the same journal, Current Biology, in 2016. The media, however, did not follow along. While it covered extensively the initial incorrect results, only four media outlets picked up the correction.
In fact, Decety’s paper has continued to be cited in media articles on religion. Just last month two such articles appeared (one on Buzzworthy and one on TruthTheory) citing Decety’s paper that religious children were less generous. The paper’s influence seems to continue even after it has been shown to be wrong.
Last month, however, the journal, Current Biology, at last formally retracted the paper. If one looks for the paper on the journal’s website, it gives notice of the retraction by the authors. Correction mechanisms in science can sometimes work slowly, but they did, in the end, seem to be effective here. More work still needs to be done as to how this might translate into corrections in media reporting as well: The two articles above were both published after the formal retraction of the paper.
To reiterate, the researchers treated country, a nominal variable in this case since the countries were not ranked or ordered in any particular way, as if it were a single continuous number, which threw off the overall results. Once country was handled correctly (from the description above, it sounds like this meant a set of dummy variables, one per country, each coded 1 or 0), the findings that received all the attention disappeared.
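To make the difference concrete, here is a minimal sketch in Python of the two ways of handling country. The codes, the handful of rows, and the `dummy_code` helper are all made up for illustration; this is not the study's data or the authors' actual analysis code, only an example of what numeric coding versus indicator coding looks like.

```python
# Hypothetical country codes in the style described above
# (1 = United States, 2 = Canada, 3 = Turkey). Illustrative rows only.
COUNTRY_LABELS = {1: "United States", 2: "Canada", 3: "Turkey"}
country_codes = [1, 2, 3, 2, 1, 3]


def dummy_code(codes, reference):
    """Return one 0/1 indicator column per category, omitting the reference.

    Leaving out one category is the usual way to avoid perfect
    collinearity with a model's intercept.
    """
    categories = sorted(set(codes) - {reference})
    return {
        category: [1 if code == category else 0 for code in codes]
        for category in categories
    }


# Inappropriate for a nominal variable: one numeric column, which tells a
# model that Canada (2) is "twice" the United States (1) on some scale.
as_number = [float(code) for code in country_codes]

# Appropriate for a nominal variable: separate indicators, here with the
# United States (code 1) as the reference category.
indicators = dummy_code(country_codes, reference=1)

for category, column in indicators.items():
    print(COUNTRY_LABELS[category], column)
```

With three countries, the indicator version produces two columns (Canada and Turkey), and each row's country is identified by which column, if any, holds a 1. A model given these columns estimates a separate adjustment for each country instead of a single slope across arbitrary code numbers.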
The other issue at play here is whether corrections and retractions of academic studies receive anything like the attention given to the original findings. It is hard to notify readers that a previously published study had flaws and that its results have changed.
All that to say, paying attention to levels of measurement early in the process helps avoid problems down the road.