Debate over priming effect illustrates need for replication

A review of the literature regarding the priming effect highlights the need in science for replication:

At the same time, psychology has been beset with scandal and doubt. Formerly high-flying researchers like Diederik Stapel, Marc Hauser, and Dirk Smeesters saw their careers implode after allegations that they had cooked their results and managed to slip them past the supposedly watchful eyes of peer reviewers. Psychology isn’t the only field with fakers, but it has its share. Plus there’s the so-called file-drawer problem, that is, the tendency for researchers to publish their singular successes and ignore their multiple failures, making a fluke look like a breakthrough. Fairly or not, social psychologists are perceived to be less rigorous in their methods, generally not replicating their own or one another’s work, instead pressing on toward the next headline-making outcome.

Much of the criticism has been directed at priming. The definitions get dicey here because the term can refer to a range of phenomena, some of which are grounded in decades of solid evidence—like the “anchoring effect,” which happens, for instance, when a store lists a competitor’s inflated price next to its own to make you think you’re getting a bargain. That works. The studies that raise eyebrows are mostly in an area known as behavioral or goal priming, research that demonstrates how subliminal prompts can make you do all manner of crazy things. A warm mug makes you friendlier. The American flag makes you vote Republican. Fast-food logos make you impatient. A small group of skeptical psychologists—let’s call them the Replicators—have been trying to reproduce some of the most popular priming effects in their own labs.

What have they found? Mostly that they can’t get those results. The studies don’t check out. Something is wrong. And because he is undoubtedly the biggest name in the field, the Replicators have paid special attention to John Bargh and the study that started it all.

While some may find this discouraging, it sounds like the scientific process is being followed. A researcher, Bargh, finds something interesting. Others follow up to see if Bargh was right and to try to extend the idea. Debate ensues once a number of studies have been done. Perhaps there is one stage left to finish off in this process: the research community has to look at the accumulated evidence at some point and decide whether the priming effect exists or not. What does the overall weight of the evidence suggest?

For the replication process to work well, a few things need to happen. Researchers need to be willing to repeat the studies of others as well as their own studies. They need to be willing to report both positive and negative findings, regardless of which side of the debate they are on. Journals need to provide space for positive and negative findings. This incremental process will take time and may not lead to big headlines but its steady approach should pay off in the end.

Even chimpanzees can play the Ultimatum game

One of the most famous experiments of recent decades, the Ultimatum game, was recently extended to chimpanzees:

This modified game, in which two chimps decided how to divide a portion of banana slices, seems to have revealed the primates’ generous side.

The study, published in Proceedings of the National Academy of Sciences, was part of an effort to uncover the evolutionary routes of why we share, even when it does not make economic sense.

Scientists say this innate fairness is an important foundation of co-operative societies like ours…

She added though that it was not clear that the chimps completely understood the design of the game and that, with just six chimps involved in the study, further evidence would be needed to show clearly that chimps had a natural tendency towards fairness.

It sounds like there is more work to be done to demonstrate consistent effects among chimpanzees. The way to do this is to replicate the game with a variety of chimpanzees in a variety of contexts. There may be two obstacles to this. First, it sounds like it took some time to train the chimps to understand the game, especially since the chimps were not directly offered food as a reward, as this had skewed a similar 2007 study. Second, replicating the study elsewhere might lead to different results, much like what happens when an experiment moves from American undergraduates to other populations around the world.

Building “a live test case” city in China

Curbed describes a proposed pop-up city in China that could be used to test a number of planning ideas:

With the amount of architectural phenomena China’s churning out these days, it can be tough for decent renderings to garner any sort of wow factor. The market is just glutted with all manner of wackadoo designs, from car-free “Great Cities” to the world’s next tallest building to alien/pinecone towers. Still, these renderings for an urban oasis in Changsha, Hunan, to be built from scratch by Kohn Pedersen Fox Associates (KPF) stand out. An “experiment in future city planning,” this lakeside city lets the architects play with neighborhood structure, flood prevention systems, and urban agriculture, all the while housing 180,000 residents—that’s 100,000 more people than accounted for in China’s other planned pop-up city. KPF’s press release calls the Meixi Lake project “a live test case”—always a reassuring phrase when talking about urban architecture—designed to integrate nature into densely populated cityscapes. The city—described as “actually happening” by a spokesperson—will be organized by neighborhood pods, each housing about 10,000 people, with a school, shopping center, and other public spaces in each town-like structure. The plan, proposed five years ago, is intriguing, though the verdict’s still out on whether it has enough pie-in-the-sky details to make it into the selective club of most outlandish cities of the future.

I detect some skepticism here. But I’m interested in this phrase describing a city as “a live test case.” Experimenting with cities? While the sociologists of the Chicago School suggested Chicago was a laboratory, I don’t think this is what they had in mind. I suspect this language couldn’t be used openly in the United States even though certain development plans and projects have acted as experiments of sorts over the decades. For example, public housing went through one such experiment with the construction of high-rises in the 1950s and 1960s. However, many of these high-rises (most famously the Pruitt-Igoe project in St. Louis) were later torn down after being deemed untenable. When talking about cities as live test cases, does that mean the development will be evaluated years down the road, continuing as planned if it worked and being changed if it didn’t? Could portions of test cities be torn down to make way for new ones?

A company offers to replicate research study findings

A company formed in 2011 is offering a new way to validate the findings of research studies:

A year-old Palo Alto, California, company, Science Exchange, announced on Tuesday its “Reproducibility Initiative,” aimed at improving the trustworthiness of published papers. Scientists who want to validate their findings will be able to apply to the initiative, which will choose a lab to redo the study and determine whether the results match.

The project sprang from the growing realization that the scientific literature – from social psychology to basic cancer biology – is riddled with false findings and erroneous conclusions, raising questions about whether such studies can be trusted. Not only are erroneous studies a waste of money, often taxpayers’, but they also can cause companies to misspend time and resources as they try to invent drugs based on false discoveries.

This addresses a larger concern about how many research findings are due to chance alone:

Typically, scientists must show that results have only a 5 percent chance of having occurred randomly. By that measure, one in 20 studies will make a claim about reality that actually occurred by chance alone, said John Ioannidis of Stanford University, who has long criticized the profusion of false results.

With some 1.5 million scientific studies published each year, by chance alone some 75,000 are probably wrong.
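The arithmetic behind that figure is easy to check. Here is a minimal Python sketch that simply multiplies the two numbers from the quote. The function name is my own, and treating 5 percent as the share of all published studies is the article's simplification rather than a precise statistical claim.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption: the 5 percent threshold is applied to every published study,
# which is how the quoted article frames it.

STUDIES_PER_YEAR = 1_500_000  # "some 1.5 million scientific studies"
CHANCE_THRESHOLD = 0.05       # "a 5 percent chance of having occurred randomly"


def expected_chance_findings(n_studies: int, threshold: float) -> int:
    """Return the number of studies expected to be flukes under this simple model."""
    return round(n_studies * threshold)


if __name__ == "__main__":
    print(expected_chance_findings(STUDIES_PER_YEAR, CHANCE_THRESHOLD))
```

Under this simple model the result matches the article's 75,000.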

I’m intrigued by the idea of having an independent company assess research results. This could work in conjunction with other methods of verifying research results:

1. The original researchers could run multiple studies. This works better with smaller studies but it could be difficult when the N is larger and more resources are needed.

2. Researchers could also make their data available as they publish their paper. This would allow other researchers to take a look and see if things were done correctly and if the results could be replicated.

3. The larger scientific community should endeavor to replicate studies. This is the way science is supposed to work: if someone finds something new, other researchers should adopt a similar protocol and test it with similar and new populations. Unfortunately, replicating studies is not seen as being very glamorous and it tends not to receive the same kind of press attention.

The primary focus of this article seems to be on medical research. Perhaps this is because it can affect the lives of many and involves big money. But it would be interesting to apply this to more social science studies as well.

Moving poor families to better neighborhoods doesn’t improve jobs, education but does boost happiness

A new study suggests happiness is one of the primary benefits of poor families moving to better neighborhoods:

When thousands of poor families were given federal housing subsidies in the early 1990s to move out of impoverished neighborhoods, social scientists expected the experience of living in more prosperous communities would pay off in better jobs, higher incomes and more education.

That did not happen. But more than 10 years later, the families’ lives had improved in another way: They reported being much happier than a comparison group of poor families who were not offered subsidies to move, a finding that was published on Thursday in the journal Science.

And using the gold standard of social surveys — the General Social Survey, in which researchers have questioned thousands of Americans of all income levels going back to the 1970s — researchers even quantified how much happier the families were. The improvement was equal to the level of life satisfaction of someone whose annual income was $13,000 more a year, said Jens Ludwig, a professor of public policy at the University of Chicago and the lead author of the study…

“Mental health and subjective well-being are very important,” said William Julius Wilson, a sociology professor at Harvard whose 1987 book “The Truly Disadvantaged” pioneered theory about concentrated poverty. “If you are not feeling well, it’s going to affect everything — your employment, relations with your family.”

This seems to fit with findings from other studies looking at programs like the Gautreaux Program in Chicago or the Moving to Opportunity program that took place in a few big cities. The children of these movers/participants may have better jobs, incomes, and educations down the road but there is not much of an immediate payoff in these areas.

It is too bad Wilson doesn’t go further with his comments. What exactly does better well-being translate into? Improved or more stable family life? Better social relations? Could improved well-being translate into better jobs and higher education down the road?

Facebook runs 2010 voting experiment with over 61 million users

Experiments don’t just take place in laboratories; they also happen on Facebook.

On November 2nd, 2010, more than 61 million adults visited Facebook’s website, and every single one of them unwittingly took part in a massive experiment. It was a randomised controlled trial, of the sort used to conclusively test the worth of new medicines. But rather than drugs or vaccines, this trial looked at the effectiveness of political messages, and the influence of our friends, in swaying our actions. And unlike most medical trials, this one had a sample size in the millions.

It was the day of the US congressional elections. The vast majority of the users aged 18 and over (98 percent of them) saw a “social message” at the top of their News Feed, encouraging them to vote. It gave them a link to local polling places, and a clickable button that said “I voted”. They could see how many people had clicked the button on a counter, and which of their friends had done so through a set of randomly selected profile pictures.

But the remaining 2 percent saw something different, thanks to a team of scientists, led by James Fowler from the University of California, San Diego. Half of them saw the same box, wording, button and counter, but without the pictures of their friends—this was the “informational message” group. The other half saw nothing—they were the “no message” group.

By comparing the three groups, Fowler’s team showed that the messages mobilised people to express their desire to vote by clicking the button, and the social ones even spurred some to vote. These effects rippled through the network, affecting not just friends, but friends of friends. By linking the accounts to actual voting records, Fowler estimated that tens of thousands of votes eventually cast during the election were generated by this single Facebook message.
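To get a feel for the design described above, here is a rough Python sketch of a three-way random assignment using the 98/1/1 split from the article. This is only an illustration: the function names, the seed, and the round figure of 61 million users are my assumptions, and it says nothing about how Facebook or Fowler's team actually implemented the assignment.

```python
import random

# Group labels and shares taken from the article's description:
# 98 percent social message, and the remaining 2 percent split in half.
GROUPS = ["social message", "informational message", "no message"]
SHARES = [0.98, 0.01, 0.01]


def assign_group(rng: random.Random) -> str:
    """Randomly place one user in a group according to the shares above."""
    return rng.choices(GROUPS, weights=SHARES, k=1)[0]


def expected_sizes(n_users: int) -> dict:
    """Expected number of users per group for a given total."""
    return {group: round(n_users * share) for group, share in zip(GROUPS, SHARES)}


if __name__ == "__main__":
    rng = random.Random(2010)  # arbitrary seed so the run is repeatable
    sample = [assign_group(rng) for _ in range(10_000)]
    print({group: sample.count(group) for group in GROUPS})
    print(expected_sizes(61_000_000))  # 61 million is a rounded assumption
```

With 61 million users, the two comparison groups would each hold about 610,000 people. That is a small share of the total but still far larger than the samples in most experiments.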

The effects appear to be small but could still be influential when multiplied through large social networks.

I suspect we’ll continue to see more and more of this in the future. Platforms like Facebook or Google or Amazon have access to millions of users and can run experiments that don’t change a user’s experience of the website much.

What television show will assume the role of “sociological experiment of our time”?

MTV’s Jersey Shore will run only one more season. This reminded me that I have seen several sites refer to the show’s sociological nature. Two examples:

1. From Gawker. A number of their recaps have included this claim about the show (including this March 9, 2012 post): “the greatest sociological experiment of our time.” As it is probably meant to be, this is quite hyperbolic.

2. From the New York Post:

We are gathered here this evening to celebrate and memorialize the death of an era in MTV history: The Jersey Shore era. As both a former employee of Lord Viacom MTV Networks (full disclosure: from 2008 – 2011) and a viewer, it feels as though a chapter in its life has come to a close. The pages have turned and the sun is setting on our tanned up guido friends. And for a few years, this sociological experiment defined MTV and defined the audience it cultivated. We all watched in slackjawed horror/glee the day it all began, and now we must lay it to rest. And so with it goes the days of MTV’s most polarizing programming. Let us reflect.

I’m not quite sure why this show was repeatedly tied to sociology. Perhaps some simply couldn’t understand why the show had good ratings considering the content. Perhaps it is because a lot of people wanted to hold up the show as a mirror to make claims about the excesses and ills of our larger society.

But we could also ask which shows might take up this spot in the future. I hear that Honey Boo Boo character is getting a lot of attention but there is no shortage of reality TV shows that portray interesting characters in interesting situations. Was Jersey Shore really more emblematic of American life than other shows?

Evidence: TV shows can lower fertility rates

An article about the cultural power of television discusses several studies that show TV programs can lower fertility rates:

Several years ago, a trio of researchers working for the Inter-American Development Bank set out to help solve a sociological mystery. Brazil had, over the course of four decades, experienced one of the largest drops in average family size in the world, from 6.3 children per woman in 1960 to 2.3 children in 2000. What made the drop so curious is that, unlike the Draconian one-child policy in China, the Brazilian government had in place no policy to limit family size. (It was actually illegal at some point to advertise contraceptives in the overwhelmingly Catholic country.) What could explain such a steep drop? The researchers zeroed in on one factor: television.

Television spread through Brazil in the mid-sixties. But it didn’t arrive everywhere at once in the sprawling country. Brazil’s main station, Globo, expanded slowly and unevenly. The researchers found that areas that gained access to Globo saw larger drops in fertility than those that didn’t (controlling, of course, for other factors that could affect fertility). It was not any kind of news or educational programming that caused this fertility drop but exposure to the massively popular soap operas, or novelas, that most Brazilians watch every night. The paper also found that areas with exposure to television were dramatically more likely to give their children names shared by novela characters.

Novelas almost always center around four or five families, each of which is usually small, so as to limit the number of characters the audience must track. Nearly three quarters of the main female characters of childbearing age in the prime-time novelas had no children, and a fifth had one child. Exposure to this glamorized and unusual (especially by Brazilian standards) family arrangement “led to significantly lower fertility”—an effect equal in impact to adding two years of schooling.

In a 2009 study, economists Robert Jensen and Emily Oster detected a similar pattern in India. A decade ago, cable television started to expand rapidly into the Indian countryside, where deeply patriarchal views had long prevailed. But not all villages got cable television at once, and its random spread created another natural experiment. This one yielded extraordinary results. Not only did women in villages with cable television begin bearing fewer children, as in Brazil, but they were also more able to leave their home without their husbands’ permission and more likely to disapprove of husbands abusing their wives, and the traditional preference for male children declined. The changes happened rapidly, and the magnitude was “quite large”—the gap in gender attitudes separating villages introduced to cable television from urban areas shrunk by between 45 and 70 percent. Television, with its more progressive social model, had changed everything.

Four quick thoughts:

1. Such shows (TV and radio) have been used deliberately by public health organizations to fight AIDS. It is one thing to hold training sessions and open and maintain clinics but it is another to have successful soap operas that promote certain behaviors.

2. These situations provided some fascinating natural experiments. I occasionally ask students this very question: how might you set up a natural experiment to test the effects of television? In the United States, outside of some ultra-controlled environment a la The Truman Show, it is difficult to quickly answer this question.

3. Sociologist Juliet Schor nicely explains the mechanism behind this in The Overspent American. Mass media presents average residents with a new, commonly known reference group to which they can compare themselves. Instead of primarily comparing themselves to neighbors or acquaintances, viewers start seeing what “middle-class” or “normal” looks like on television and then work to emulate that.

4. Media output is not simply entertainment – something is being promoted. Being able to watch and experience this critically is crucial in a world awash with media and information.

“First large scale [lost letter] study” results from London

A lost letter study in London produced some interesting results:

Neighbourhood income deprivation has a strong negative effect on altruistic behaviour when measured by a ‘lost letter’ experiment, according to new UCL research published August 15 in PLoS One. Researchers from UCL Anthropology used the lost letter technique to measure altruism across 20 London neighbourhoods by dropping 300 letters on the pavement and recording whether they arrived at their destination. The stamped letters were addressed by hand to a study author’s home address with a gender neutral name, and were dropped face-up and during rain free weekdays.

The results show a strong negative effect of neighbourhood income deprivation on altruistic behaviour, with an average of 87% of letters dropped in the wealthier neighbourhoods being returned compared to only an average 37% return rate in poorer neighbourhoods.

Co-author Jo Holland said: “This is the first large scale study investigating cooperation in an urban environment using the lost letter technique. This technique, first used in the 1960s by the American social psychologist Stanley Milgram, remains one of the best ways of measuring truly altruistic behaviour, as returning the letter doesn’t benefit that person and actually incurs the small hassle of taking the letter to a post box…

As well as measuring the number of letters returned, the researchers also looked at how other neighbourhood characteristics may help to explain the variation in altruistic behaviour — including ethnic composition and population density — but did not find them to be good predictors of lost letter return.

This is a good example of a field experiment.

I wonder if there is any equivalent to this in the online realm. Perhaps an email mistakenly sent to the wrong address that would require the user to take a small amount of time to forward it to the intended recipient?

The 2020 Census to have different questions about race?

The Los Angeles Times reports that the US Census Bureau is looking into possibly changing the questions about race and ethnicity in the 2020 Census:

The bureau’s new recommendations were based on research findings of a number of experimental questions given to 500,000 households during the 2010 census. The findings showed that many Americans believe the racial and ethnic categories now used by the census are confusing and don’t always jibe with their own views of their identity.

For example, asked to state their race on the 2010 census, more than 19 million people, including millions of Latinos, chose “some other race,” rather than select from the five categories offered on the census form: white, black, Asian, American Indian/Native Alaskan or Hawaiian/Pacific Islander.

One of the changes proposed now would simply ask respondents to choose their race or origin, allowing them to check a single box next to categories that would include white, black or Hispanic.

Another would add write-in categories to allow those of Middle Eastern or Arab origin to specifically identify themselves, officials said.

A third change would end the practice of offering the controversial term “Negro” as an alternative for African-American or black. Some African-Americans in 2010 criticized the government’s continuing use of the word, saying it was outdated and offensive.

As cultural definitions change, so should the Census in order to better match the lived reality. Of course, this attempt to improve the validity of the results makes it more difficult for researchers and others to compare newer Census results with older ones, hurting reliability. And as the article notes, this has political implications, which could play into the definitions as well.

It would be interesting to hear more about the experimental results from the 2010 survey as this is a good example of an experiment that doesn’t require a laboratory. What else did people like or not like? I assume the Census Bureau is not going to cave in to those who don’t want to answer a race or ethnicity question at all and/or those who simply answer “American.”