The retraction of a study provides a reminder of the importance of levels of measurement

Early in statistics courses, students learn about the different ways variables can be measured. This is often broken down into three categories: nominal variables (unordered, unranked), ordinal variables (ranked, but with categories of varying width), and interval-ratio variables (ranked, with consistent spacing between categories). Decisions about how to measure variables can have a significant influence on what can be done with the data later. For example, here is a study that received a lot of attention when published, but in which the researchers miscoded a nominal variable:
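To make the distinction concrete, here is a minimal sketch of how the three levels might be represented in Python with pandas; the data and variable names are made up purely for illustration:

```python
import pandas as pd

# Made-up survey data, for illustration only.
df = pd.DataFrame({
    "country": ["US", "Canada", "Turkey", "Canada"],         # nominal
    "agreement": ["disagree", "neutral", "agree", "agree"],  # ordinal
    "age": [9, 11, 10, 12],                                  # interval-ratio
})

# Nominal: unordered categories; only counting and comparison make sense.
df["country"] = pd.Categorical(df["country"])

# Ordinal: ranked categories, but the distance between ranks is undefined.
df["agreement"] = pd.Categorical(
    df["agreement"],
    categories=["disagree", "neutral", "agree"],
    ordered=True,
)

# Interval-ratio: numeric, so differences and means are meaningful.
print(df["age"].mean())       # fine for interval-ratio data
print(df["agreement"].max())  # ordering is defined for ordinal data
# df["country"].mean()        # meaningless for nominal data; pandas raises an error
```

Which operations are legitimate depends on the level: means for interval-ratio, ordering for ordinal, and only counts for nominal. The coding error below is exactly a violation of that last rule.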

In 2015, a paper by Jean Decety and co-authors reported that children who were brought up religiously were less generous. The paper received a great deal of attention, and was covered by over 80 media outlets including The Economist, the Boston Globe, the Los Angeles Times, and Scientific American. As it turned out, however, the paper by Decety was wrong. Another scholar, Azim Shariff, a leading expert on religion and pro-social behavior, was surprised by the results, as his own research and meta-analysis (combining evidence across studies from many authors) indicated that religious participation, in most settings, increased generosity. Shariff requested the data to try to understand more clearly what might explain the discrepancy.

To Decety’s credit, he released the data. And upon re-analysis, Shariff discovered that the results were due to a coding error. The data had been collected across numerous countries (e.g., the United States, Canada, and Turkey), and the country information had been coded as 1, 2, 3, and so on. Although Decety’s paper reported that the analysis controlled for country, it had not included a separate indicator for each country; instead, the arbitrary numeric code was treated as a single continuous variable, so that, for example, “Canada” (coded as 2) was twice the “United States” (coded as 1). Regardless of what one might think about the relative merits and rankings of countries, this is obviously not the right way to analyze the data. When the data were correctly analyzed, using separate indicators for each country, Decety’s “findings” disappeared. Shariff’s re-analysis and correction were published in the same journal, Current Biology, in 2016. The media, however, did not follow along: while the initial, incorrect results were covered extensively, only four media outlets picked up the correction.

In fact, Decety’s paper has continued to be cited in media articles on religion. Just last month, two such articles appeared (one on Buzzworthy and one on TruthTheory) citing Decety’s finding that religious children were less generous. The paper’s influence seems to continue even after it has been shown to be wrong.

Last month, however, the journal Current Biology at last formally retracted the paper. If one looks for the paper on the journal’s website, it now gives notice of the retraction by the authors. Correction mechanisms in science can sometimes work slowly, but they did, in the end, prove effective here. More work remains to be done on how such corrections might translate into media reporting as well: the two articles above were both published after the formal retraction of the paper.

To reiterate: the researchers treated country, a nominal variable in this case since the countries were not ranked or ordered in any particular way, as if it were a continuous measure, which threw off the overall results. When country was used correctly (from the description above, it sounds like a set of dummy variables, one per country, each coded 0 or 1), the findings that received all the attention disappeared.
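As a rough illustration of the mechanism (a sketch on simulated data, not a reproduction of the original analysis; the countries, variable names, and effect sizes are all invented), here is how the two approaches can diverge in statsmodels:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 3000

# Simulated data: generosity differs by country, religiosity also differs
# by country, but religiosity itself has no effect on generosity.
countries = rng.choice(["US", "Canada", "Turkey"], size=n)
country_generosity = {"US": 5.0, "Canada": 1.0, "Turkey": 3.0}  # not linear in 1, 2, 3
p_religious = {"US": 0.7, "Canada": 0.3, "Turkey": 0.8}

religious = np.array([rng.random() < p_religious[c] for c in countries]).astype(int)
generosity = np.array([country_generosity[c] for c in countries]) + rng.normal(0, 1, n)

df = pd.DataFrame({
    "generosity": generosity,
    "religious": religious,
    "country": countries,
    # The mistake: an arbitrary numeric code stored as a quantity.
    "country_code": pd.Series(countries).map({"US": 1, "Canada": 2, "Turkey": 3}).values,
})

# Wrong: the 1/2/3 code enters as one continuous predictor, so "Canada" (2)
# is implicitly twice the "US" (1); country differences the line cannot
# capture leak into the religiosity coefficient.
wrong = smf.ols("generosity ~ religious + country_code", data=df).fit()

# Right: C() expands country into separate 0/1 dummy indicators.
right = smf.ols("generosity ~ religious + C(country)", data=df).fit()

print(wrong.params["religious"], right.params["religious"])
```

In this simulation the “wrong” model reports a sizeable religiosity coefficient that shrinks toward zero once country enters as dummies, which is the same general failure mode Shariff identified.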

The other issue at play here is whether corrections and retractions of academic studies are treated as such. It is hard to notify readers that a previously published study had flaws and that its results have changed.

All that to say: paying attention to level of measurement early in the process helps avoid problems down the road.

Strong spurious correlations enhanced in appearance with mismatched dual axes

I stumbled across a potentially fascinating website titled Spurious Correlations that looks at relationships between odd variables. Here are two examples:

According to the site, both of these pairs have correlations higher than 0.94. In other words, very strong.

One issue: using dual axes can throw things off. The bottom chart above appears to show a negative relationship, but only because the two axes run over different ranges. The top chart makes it look like the lines really go together, but its axes are far apart, with the left side ranging from 29 to 34 and the right side from 300 to 900. Overall, the charts make the strong correlations between the variables look even more dramatic, but using dual axes in this way can be misleading.
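Here is a minimal matplotlib sketch of the effect, with made-up numbers in roughly the ranges mentioned above. Because each y-axis is scaled independently, almost any two trending series can be made to overlap visually:

```python
import matplotlib.pyplot as plt

# Made-up series roughly matching the ranges mentioned above (29-34 vs. 300-900).
years = list(range(2000, 2010))
series_a = [29, 30, 30.5, 31, 31.5, 32, 32.5, 33, 33.5, 34]   # left axis
series_b = [300, 360, 420, 480, 540, 600, 660, 720, 780, 900]  # right axis

fig, ax1 = plt.subplots()
ax1.plot(years, series_a, color="tab:blue", label="Series A")
ax1.set_ylabel("Series A (29-34)")

# twinx() creates a second y-axis sharing the same x-axis. The independent
# scaling is what lets the two lines be stretched to sit on top of each
# other, exaggerating how "together" they move.
ax2 = ax1.twinx()
ax2.plot(years, series_b, color="tab:red", label="Series B")
ax2.set_ylabel("Series B (300-900)")

plt.show()
```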

An emerging portrait of emerging adults in the news, part 1

In recent weeks, a number of studies have been reported on regarding the beliefs and behaviors of the younger generation, those now between high school and age 30 (an age group that could also be labeled “emerging adults”). In a three-part series, I want to highlight three of these studies because they not only suggest what this group is doing but also hint at the consequences.

Almost a week ago, a story ran along the wires about a new study linking “hyper-texting” and excessive use of social networking sites with risky behaviors:

Teens who text 120 times a day or more — and there seems to be a lot of them — are more likely to have had sex or used alcohol and drugs than kids who don’t send as many messages, according to provocative new research.

The study’s authors aren’t suggesting that “hyper-texting” leads to sex, drinking or drugs, but say it’s startling to see an apparent link between excessive messaging and that kind of risky behavior.

The study concludes that a significant number of teens are very susceptible to peer pressure and also have permissive or absent parents, said Dr. Scott Frank, the study’s lead author.

The study was done at 20 public high schools in the Cleveland area last year, and is based on confidential paper surveys of more than 4,200 students.

It found that about one in five students were hyper-texters and about one in nine were hyper-networkers, those who spend three or more hours a day on Facebook and other social networking websites.

About one in 25 fall into both categories.

Hyper-texting and hyper-networking were more common among girls, minorities, kids whose parents have less education and students from a single-mother household, the study found.

Several interesting things to note in this study:

1. It did not look at what exactly is being said or communicated in these texts or in social networking use. The study examined the volume of use, and there are plenty of high school students who are heavily involved with these technologies.

2. One of the best parts of this story is that the second paragraph is careful to suggest that finding an association between these behaviors does not mean that one causes the other. In other words, the study does not establish a direct causal link between excessive texting and drug use; based on this dataset, the variables are simply related. (This is a great example of “correlation without causation.”)

3. What this study calls for is regression analysis, where we can control for other possible factors. That would give us the ability to compare two students with the same family background and the same educational performance and isolate whether texting was really the factor that led to the risky behaviors. If I had to guess, factors like family life and performance in school are more important in predicting these risky behaviors, with excessive texting or SNS use acting as an intervening variable (see the sketch after this list). Why this study did not do this sort of analysis is unclear; perhaps the authors already have a paper in the works.
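To illustrate what such an analysis might look like, here is a sketch on simulated data, not the study’s actual model; the confounder (parental involvement), variable names, and effect sizes are all invented. The point is only that an association can appear in a bivariate model and vanish once a common cause is controlled for:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 4200  # roughly the size of the surveyed sample

# Simulated data: low parental involvement raises both heavy texting and
# risky behavior; texting itself has no effect on risky behavior.
low_involvement = rng.integers(0, 2, size=n)
hyper_texting = (rng.random(n) < 0.15 + 0.25 * low_involvement).astype(int)
risky = (rng.random(n) < 0.10 + 0.30 * low_involvement).astype(int)

df = pd.DataFrame({
    "risky": risky,
    "hyper_texting": hyper_texting,
    "low_involvement": low_involvement,
})

# Bivariate model: texting looks "linked" to risky behavior...
naive = smf.logit("risky ~ hyper_texting", data=df).fit(disp=False)

# ...but controlling for the shared cause shrinks the association toward
# zero, because the two outcomes were driven by the same confounder.
adjusted = smf.logit("risky ~ hyper_texting + low_involvement", data=df).fit(disp=False)

print(naive.params["hyper_texting"], adjusted.params["hyper_texting"])
```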

Overall, we need more research on these associated variables. While it is interesting in itself that large numbers of emerging adults text a lot and use SNS a lot, we ultimately want to know the consequences. Parts two and three of this series will look at a few studies that point to some possible consequences.