What counts as “good science,” happiness studies edition

Looking across studies that examined factors leading to happiness, several researchers concluded that only two of the five factors commonly discussed stood up to scrutiny:

But even these studies failed to confirm that three of the five activities the researchers analyzed reliably made people happy. Studies attempting to establish that spending time in nature, meditating and exercising made people happier had either weak or inconclusive results.

“The evidence just melts away when you actually look at it closely,” Dunn said.

There was better evidence for the two other tasks. The team found “reasonably solid evidence” that expressing gratitude made people happy, and “solid evidence” that talking to strangers improved mood.

How might researchers improve their studies and confidence in the results?

The new findings reflect a reform movement under way in psychology and other scientific disciplines, with scientists setting higher standards for study design to ensure the validity of the results.

To that end, scientists are including more subjects in their studies because small sample sizes can miss a signal or indicate a trend where there isn’t one. They are openly sharing data so others can check or replicate their analyses. And they are committing to their hypotheses before running a study in a practice known as “pre-registering.” 

These seem like helpful steps for quantitative research. Four solutions are suggested above (one is more implicit):

  1. Analyze dozens of previous studies. When researchers study similar questions, are their findings consistent? Do they use similar methods? Is there consensus across a field or across disciplines? This summary work is useful.
  2. Avoid small samples. This helps reduce the risk of a chance finding among a smaller group of participants (see the sketch after this list).
  3. Share data so that others can look at procedures and results.
  4. Test hypotheses set at the beginning rather than fitting hypotheses to statistically significant findings.
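
To make the small-sample point concrete, here is a minimal simulation sketch (my own illustration, not taken from the studies discussed): with no true effect, roughly 5% of studies still come out “significant” by chance, and with a modest real effect, small studies miss it most of the time. The effect sizes and sample sizes below are arbitrary assumptions.

```python
# Minimal sketch of why small samples are risky; all numbers are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def significant_fraction(effect, n_per_group, n_studies=5000, alpha=0.05):
    """Fraction of simulated two-group studies that reach p < alpha."""
    hits = 0
    for _ in range(n_studies):
        control = rng.normal(0.0, 1.0, n_per_group)
        treated = rng.normal(effect, 1.0, n_per_group)
        _, p = stats.ttest_ind(treated, control)
        if p < alpha:
            hits += 1
    return hits / n_studies

# No true effect: about 5% of studies come out "significant" purely by chance,
# whatever the sample size.
print("false positives, n=20 per group :", significant_fraction(0.0, 20))
print("false positives, n=200 per group:", significant_fraction(0.0, 200))

# A modest true effect (0.3 standard deviations): small studies usually miss it.
print("power, n=20 per group :", significant_fraction(0.3, 20))
print("power, n=200 per group:", significant_fraction(0.3, 200))
```

Any single small study that happens to land in that chance 5% can read like a headline-worthy trend, which is exactly the risk that larger samples and pre-registration are meant to contain.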

One thing I have not seen in discussions of these approaches intended to create better science: how much better will results be after following these steps? How much can a field improve with better confidence in the results? 5-10%? 25%? More?
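
One rough way to put a number on it is to ask what share of “significant” findings are actually true, which depends on statistical power, the significance threshold, and how often the hypotheses a field tests are true in the first place. The figures in the sketch below are my own illustrative assumptions, not estimates from the happiness literature, but they show the kind of gain larger samples and pre-registration can plausibly buy.

```python
# Back-of-the-envelope calculation; every number here is an assumption.
def true_positive_share(prior_true, power, alpha):
    """Expected share of 'significant' findings that reflect real effects."""
    true_pos = prior_true * power            # true hypotheses correctly detected
    false_pos = (1 - prior_true) * alpha     # false hypotheses passing by chance
    return true_pos / (true_pos + false_pos)

prior = 0.25  # suppose 1 in 4 tested hypotheses is actually true

# Small samples plus flexible, after-the-fact analysis (effective alpha inflated).
before = true_positive_share(prior, power=0.35, alpha=0.15)
# Larger samples plus pre-registered hypotheses (alpha held at its nominal level).
after = true_positive_share(prior, power=0.80, alpha=0.05)

print(f"share of significant findings that are true: {before:.0%} -> {after:.0%}")
```

Under these made-up numbers the improvement is on the order of tens of percentage points rather than 5-10%, but the honest answer is that the size of the gain depends heavily on a field’s base rates, which is exactly what these discussions rarely spell out.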

Science problem: study says there is not enough information in methods sections of science articles to replicate

A new study suggests the methods sections in science articles are incomplete, making it very difficult to replicate the studies:

Looking at 238 recently published papers, pulled from five fields of biomedicine, a team of scientists found that they could uniquely identify only 54 percent of the research materials, from lab mice to antibodies, used in the work. The rest disappeared into the terse fuzz and clipped descriptions of the methods section, the journal standard that ostensibly allows any scientist to reproduce a study.

“Our hope would be that 100 percent of materials would be identifiable,” said Nicole A. Vasilevsky, a project manager at Oregon Health & Science University, who led the investigation.

The group quantified a finding already well known to scientists: No one seems to know how to write a proper methods section, especially when different journals have such varied requirements. Those flaws, by extension, may make reproducing a study more difficult, a problem that has prompted, most recently, the journal Nature to impose more rigorous standards for reporting research.

“As researchers, we don’t entirely know what to put into our methods section,” said Shreejoy J. Tripathy, a doctoral student in neurobiology at Carnegie Mellon University, whose laboratory served as a case study for the research team. “You’re supposed to write down everything you need to do. But it’s not exactly clear what we need to write down.”

A new standard could be adopted across journals and subfields: enough information has to be given in the methods section for another scientist to replicate the study. Another advantage of this might be that it pushes authors to try to read their paper from the perspective of outsiders who are looking at the study for the first time.

I wonder how well sociology articles would fare in this analysis. Documenting everything needed for replication can get voluminous or technical, depending on the work that went into collecting the data and then getting it ready for analysis. There are a number of choices along the way that add up.

Social psychology can move forward by pursuing more replication

Here is an argument that a renewed emphasis on replicating studies will help the field of social psychology move beyond some public issues:

Things aren’t quite as bad as they seem, though. Although Nature’s report was headlined “Disputed results a fresh blow for social psychology,” it scarcely noted that there have been some replications of experiments modelled on Dijksterhuis’s phenomenon. His finding could still turn out to be right, if weaker than first thought. More broadly, social priming is just one thread in the very rich fabric of social psychology. The field will survive, even if social priming turns out to have been overrated or an unfortunate detour.

Even if this one particular line of work is under a shroud, it is important not to lose sight of the fact that many of the old standbys from social psychology have been endlessly replicated, like the Milgram effect—the old study of obedience in which subjects turned up electrical shocks (or what they thought were electrical shocks) all the way to four hundred and fifty volts, apparently causing great pain to their subjects, simply because they’d been asked to do it. Milgram himself replicated the experiment numerous times, in many different populations, with groups of differing backgrounds. It is still robust (in the hands of other researchers) nearly fifty years later. And even today, people are still extending that result; just last week I read about a study in which intrepid experimenters asked whether people might administer electric shocks to robots, under similar circumstances. (Answer: yes.)

More importantly, there is something positive that has come out of the crisis of replicability—something vitally important for all experimental sciences. For years, it was extremely difficult to publish a direct replication, or a failure to replicate an experiment, in a good journal. Throughout my career, and long before it, journals emphasized that new papers have to publish original results; I completely failed to replicate a particular study a few years ago, but at the time didn’t bother to submit it to a journal because I knew few people would be interested. Now, happily, the scientific culture has changed. Since I first mentioned these issues in late December, several leading researchers in psychology have announced major efforts to replicate previous work, and to change the incentives so that scientists can do the right thing without feeling like they are spending time doing something that might not be valued by tenure committees.

The Reproducibility Project, from the Center for Open Science, is now underway, with its first white paper on the psychology and sociology of replication itself. Thanks to Daniel Simons and Bobbie Spellman, the journal Perspectives on Psychological Science is now accepting submissions for a new section of each issue devoted to replicability. The journal Social Psychology is planning a special issue on replications for important results in social psychology, and has already received forty proposals. Other journals in neuroscience and medicine are engaged in similar efforts: my N.Y.U. colleague Todd Gureckis just used Amazon’s Mechanical Turk to replicate a wide range of basic results in cognitive psychology. And just last week, Uri Simonsohn released a paper on coping with the famous file-drawer problem, in which failed studies have historically been underreported.

It would be a good thing if the social sciences were able to be more sure of their findings. Replication could go a long way toward moving the conversation away from headline-grabbing findings based on small Ns and toward more certain results that a broader swath of an academic field can agree on. The goal is to get it right in the long run with evidence about human behaviors and attitudes, not necessarily in the short term.
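
The file-drawer problem mentioned in the excerpt lends itself to a quick illustration. The sketch below (my own, with made-up numbers) simulates many small-N studies of a modest real effect and “publishes” only the significant ones; the resulting literature overstates the effect, which is the pattern replication efforts keep running into.

```python
# Minimal file-drawer simulation; the effect size and sample size are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_effect, n_per_group = 0.2, 25   # a small real effect, studied with small samples

published = []
for _ in range(2000):
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(true_effect, 1.0, n_per_group)
    _, p = stats.ttest_ind(treated, control)
    if p < 0.05:                      # only the "hits" make it out of the file drawer
        published.append(treated.mean() - control.mean())

print("true effect:", true_effect)
print("average published effect:", round(float(np.mean(published)), 2))
```

In this setup the published average comes out at roughly three times the true effect, which is why journals agreeing to carry direct replications and null results matters as much as any single finding.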

Even with a renewed emphasis on replication, there might still be some issues:

1. The ability to publish more replication studies would certainly help, but is there enough incentive for researchers, particularly those trying to establish themselves, to pursue replication studies over innovative ideas and areas that gain more attention?

2. What about the number of studies that are conducted with WEIRD (Western, educated, industrialized, rich and democratic) populations, primarily US undergraduate students? If studies continue to be replicated with skewed populations, is much gained?