More on limits of Census measures of race and ethnicity

Here is some more information about the limitations of measuring race with the current questions in the United States Census:

When the 2010 census asked people to classify themselves by race, more than 21.7 million — at least 1 in 14 — went beyond the standard labels and wrote in such terms as “Arab,” “Haitian,” “Mexican” and “multiracial.”

The unpublished data, the broadest tally to date of such write-in responses, are a sign of a diversifying America that’s wrestling with changing notions of race…

“It’s a continual problem to measure such a personal concept using a check box,” said Carolyn Liebler, a sociology professor at the University of Minnesota who specializes in demography, identity and race. “The world is changing, and more people today feel free to identify themselves however they want — whether it’s black-white, biracial, Scottish-Nigerian or American. It can create challenges whenever a set of people feel the boxes don’t fit them.”

In an interview, Census Bureau officials said they have been looking at ways to improve responses to the race question based on focus group discussions during the 2010 census. The research, some of which is scheduled to be released later this year, examines whether to include new write-in lines for whites and blacks who wish to specify ancestry or nationality; whether to drop use of the word “Negro” from the census form as antiquated; and whether to possibly treat Hispanics as a mutually exclusive group to the four main race categories.

This highlights some of the issues of social science research:

1. Social science categories change as people’s own understanding of the terms changes. Keeping up with these understandings can be difficult, and there is always a lag. For example, a sizable group of respondents in the 2010 Census didn’t like the categories, but the problem can’t be fixed until a future Census.

2. Adding write-in options or more questions means that the Census becomes longer, requiring more time to take and analyze. With all of the Census forms that are returned, this is no small matter.

3. Comparing results of repeated surveys like the Census can become quite difficult when the definitions change.

4. The Census is going to change things based on focus groups? I assume they will also test permutations of the questions and possible categories in smaller-scale surveys before settling on what they will do.

Indicators that loyalty among family members is up in America

Even though we supposedly live in a disconnected and fragmented age, there are some indicators that suggest Americans feel more loyal toward their families than in the past:

“There’s been a social and economic change that’s actually made us more dependent on family loyalties,” says Stephanie Coontz, author of “Marriage, A History” (Penguin).

“You don’t know your neighbors. It would be crazy to be loyal to your employer in the same way you used to be because your employer’s not going to be loyal to you. All of those things have simultaneously made us want more loyalty — long for more loyalty — and try, I think, to have more loyalty in our personal lives.”

Loyalty itself is difficult to measure, but likely indicators such as family closeness appear to be on the rise. A 2010 Pew Research Center study found that 40 percent of Americans say their family life is closer now than when they were growing up, and only 14 percent say it is less close. Another Pew study showed that the percentage of adults who talked with a parent every day rose to 42 percent in 2005 from 32 percent in 1989.

The family loyalty picture is complex. Bradford Wilcox, director of the National Marriage Project at the University of Virginia, notes that though couples who marry today are less likely to divorce than couples who married in the 1970s, more people are forgoing marriage or delaying it.

The article suggests several reasons why people would feel more loyal toward their families today: rapid economic and social change, different expectations about family life, and greater caution in entering intimate relationships.

There could also be a few other factors at work:

1. I wonder if there is some social desirability bias in answering a question about family closeness. What adult today would say they are doing a worse job in creating family closeness than their parents did? Also, there is a memory issue here: how many current adults can accurately remember or assess the closeness of their family when they were younger? Their current family status is much more immediate.

2. I’m surprised this wasn’t mentioned in the article: it is relatively easier to communicate within families with the advent of email, cell phones, and text messages. However, I wonder whether these easier methods of connection mean that people are confusing connectedness with closeness or whether the two are indeed one and the same.

Even if loyalty isn’t truly up compared to the “golden era” decades ago (at least in our popular culture we have this image of an era when the nuclear family never let each other down), the perception that loyalty is more important or stronger matters. It is an expectation that many people will bring to relationships, and it will affect their actions.

(A side note: Wilcox and Coontz get interviewed for a ridiculous number of news stories about family life and marriage.)

Just the beginning of using social media to study political and social beliefs and behaviors

As the 2012 election nears, here is an overview of where we stand in using social media to understand people’s political and social beliefs and behaviors:

Marc A. Smith, a sociologist who studies online communities and founded the Silicon Valley-based Social Media Research Foundation, said “we are in the Model T Ford era of information systems” and analyzing their content.

Scott Keeter, the president of the American Assn. of Public Opinion Research, said that members of the professional organization and journalists should “proceed with a degree of humility” in deciding what social media can tell us about political campaigns. “Until we have more experience with real world outcomes, it’s hard to know the meaning of what we have captured from social media,” said Keeter, director of survey research at the Washington-based Pew Research Center for the People & the Press.

Much of the debate followed a Jan. 12 article by Politico, the online news site, which reported that it had partnered with Facebook to examine all “posting, sharing and linking about candidates” from Dec. 12 to Jan. 10. The arrangement was a first not only in that Facebook delved into both public and private messages but also used computer analysis to “identify positive and negative emotion in text.” (The company stressed that while computers draw an aggregate view of user sentiment, human beings do not monitor individual messages.)

Facebook said it employed a “well-validated software tool used frequently in social psychological research.” But Smith said he was “highly skeptical” of some of the precise findings in the Facebook analysis. He added that the intellectual disciplines focused on deciphering texts — natural language processing and computational linguistics — “are very deep and can do remarkable things, but they don’t necessarily have the ability to predict the next president of the United States of America.”

A few thoughts:

1. I like the urge to be cautious: too many news outlets jump on relatively small and meaningless events in the realm of social media and try to draw big conclusions. For example, the size of a Facebook group doesn’t say much. Similarly, I am still surprised by the number of media outlets that show unofficial (and often low-count) poll results (though they now note these are not scientific).

2. While being cautious is good now, it does suggest that this is a burgeoning area with lots of potential. The researchers who develop good methodologies and get access to specific or unique data will get a lot of attention. I wonder how much companies like Facebook really want to contribute to social science research as opposed to using their data to make money.

3. The counts of positive and negative feelings seem fairly unhelpful to me. For example, what does tracking the emotions of the world through tweets really tell us? Another example from the offline realm: “how the Bible feels.” This is where we need more than just descriptive research.

Two different methodologies to measure the US Jewish population

Measuring small populations within the United States can be difficult. Here is an example: even though two separate studies agree the US Jewish population is roughly 6.5 million, they used different methodologies to arrive at this number:

Many federations around the country commission scientific studies to better understand their local Jewish populations. These reports typically rely on random digit dialing, in which researchers come up with a percentage of Jews in the community based on the results of telephone surveys. In other instances, researchers will estimate the number of Jews based on the number of people with Jewish last names.

These reports provided the backbone for Sheskin and Dashefsky’s own annual estimate. But since not every federation studies its own population, the two conducted original research in some localities. In this, they were often aided by knowledgeable community members or by local estimates they found online. Lastly, they used data collected by the U.S. Census of three solidly Hasidic Jewish towns in New York state: Kiryas Joel, Kaser Village and New Square. (Aside from these exceptions, the U.S. Census does not count Jews.)

Adding these figures together, Sheskin and Dashefsky came up with a national estimate — albeit a patchwork one — that far exceeded previous figures. And in some ways exceeded their own expectations. Their national total of 6,588,000 is an overestimate, they contend, because some Jews — such as college students who live in one place and go to school elsewhere, or retirees who live part-time in one city and part-time in another — were likely counted twice…

Saxe came to his national estimate of 6.4 million through very different means.

Daunted by the steep expense and lengthy time required by random digit dialing, Saxe and his team ferreted out data that already existed to reach his conclusion. This included information from more than 150 government surveys on topics completely unrelated to Judaism, such as health care or education. Each study had a sample size of at least 1,000 people, and each study asked the question: What is your religion?

“From this, we are now absolutely confident — and it has been vetted by all sorts of groups and people — that about 1.8% of the adult American population says that their religion is Judaism,” he said.

Saxe adjusted his sample to account for children and came to a total of 6.4 million Jews in America.
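Saxe’s pooling logic can be illustrated with a small simulation. Everything here is an assumption for illustration: the true 1.8% share, the count of 150 surveys of 1,000 respondents each, and a rough 235 million figure for the 2010 US adult population used to scale up; the upward adjustment Saxe made for children is noted but not modeled.

```python
import random

random.seed(42)

# Saxe-style pooling: combine a religion question from many unrelated
# surveys into one large sample.
TRUE_P = 0.018  # assumed true share of adults reporting Judaism
surveys = [(sum(random.random() < TRUE_P for _ in range(1000)), 1000)
           for _ in range(150)]

total_hits = sum(h for h, _ in surveys)
total_n = sum(n for _, n in surveys)
pooled = total_hits / total_n
se = (pooled * (1 - pooled) / total_n) ** 0.5  # standard error of the pooled share

# Scale the share to an adult count; Saxe then adjusted upward to
# account for children (that adjustment is not modeled here).
adult_estimate = pooled * 235_000_000
print(f"pooled share: {pooled:.4f} (95% CI ±{1.96 * se:.4f})")
print(f"implied adult count: {adult_estimate:,.0f}")
```

The point of pooling is visible in the standard error: with roughly 150,000 combined respondents, the margin of error on the share is a small fraction of a percentage point, precision that no single 1,000-person survey could deliver for a 1.8% subgroup.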

In order to count and know more about relatively small populations in the United States (say, Muslims, when asking questions about religion), survey researchers often try to oversample these groups so that they can draw conclusions from a larger N. But as this article notes, finding members of smaller groups through random-digit dialing can take a long time.
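The cost problem is simple screening arithmetic: if a group makes up about 1.8% of the population (the share cited in the excerpt), reaching even a few hundred of its members by random dialing requires tens of thousands of completed screens. The interview target below is hypothetical.

```python
# Screening arithmetic for reaching a small subgroup by random-digit
# dialing. The 1.8% prevalence echoes the figure in the excerpt; the
# completed-interview target is a hypothetical analysis goal.
prevalence = 0.018
target_completed_interviews = 400

households_to_screen = target_completed_interviews / prevalence
print(f"households to screen: {households_to_screen:,.0f}")
```

This is why oversampling designs (screening in high-density areas, or recontacting members identified in earlier surveys) are so attractive for small populations, and why Saxe’s reuse of existing survey data sidesteps the expense entirely.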

Both of these researchers worked with existing data in order to make generalizations: one worked with local figures and the other used a sample of large-scale surveys. In both cases, this is a clever use of existing data, because doing a new large-scale survey would likely have been a lot more costly in terms of time and money.

I would guess both sets of researchers are happy that their figures are close to those of the other study as this enhances the validity of their numbers.

Using sociological surveys as political weapons

One commentator suggests that sociological surveys were used as political weapons recently in Russia:

Long before the State Duma elections of Dec 4, the ultra-rightist and liberal mass media, collaborating with anti-Russian elements in the West, forecast that the ruling United Russia party would suffer a serious defeat.

They organized all sorts of sociological surveys to support this thoroughly planned campaign and to push their “predictions” on the “crisis” facing Russian leaders and “sharply declining rating” of Prime Minister Vladimir Putin and President Dmitry Medvedev. The anti-Putin campaign became really vociferous when the United Russia congress officially and unanimously approved Putin as its nominee for the presidential election in March 2012.

It is true that the election results showed the correlation of political forces and sentiments in Russia, which is experiencing the difficult strategic consequences of the disintegration of the erstwhile Soviet Union and the impact of the global economic crisis.

I’m less interested in dissecting recent events in Russia (which are very interesting to read about) and more interested in thinking about using sociological findings as political weapons. The argument made here is that these surveys are part of a larger, unfair, ideological campaign waged by pundits and the media. Perhaps more importantly, there is a claim that the surveys were “organized,” suggesting they were only undertaken in order to push a particular viewpoint.

I don’t doubt that sociological findings are used in struggles for power. Indeed, sociologists are not value-neutral as they themselves have their own interests and class position within society. However, I tend to think the primary purpose of sociological data is to explain what is happening in society. If sociological surveys in Russia show dissatisfaction with Putin, is it incorrect to report this? Of course, statistics and facts are open to interpretation and need to be approached carefully.

Where is the line between sociological surveys illuminating social structures, practices, and beliefs and having viewpoints and using sociological data to push these perspectives? Max Weber’s writings on value-neutrality are still useful today as we think about the proper use of sociological data.

Why cases of scientific fraud can affect everyone in sociology

The recent case of a Dutch social psychologist admitting to working with fraudulent data can lead some to paint social psychology or the broader discipline of sociology as problematic:

At the Weekly Standard, Andrew Ferguson looks at the “Chump Effect” that prompts reporters to write up dubious studies uncritically:

The silliness of social psychology doesn’t lie in its questionable research practices but in the research practices that no one thinks to question. The most common working premise of social-psychology research is far-fetched all by itself: The behavior of a statistically insignificant, self-selected number of college students or high schoolers filling out questionnaires and role-playing in a psych lab can reveal scientifically valid truths about human behavior.

And when the research reaches beyond the classroom, it becomes sillier still…

Described in this way, it does seem like there could be real journalistic interest in this study – as a human interest story like the three-legged rooster or the world’s largest rubber band collection. It just doesn’t have any value as a study of abstract truths about human behavior. The telling thing is that the dullest part of Stapel’s work – its ideologically motivated and false claims about sociology – got all the attention, while the spectacle of a lunatic digging up paving stones and giving apples to unlucky commuters at a trash-strewn train station was considered normal.

A good moment for reaction from a conservative perspective: two favorite whipping boys, liberal (and fraudulent!) social scientists plus journalists/the media (uncritical and biased!), can be tackled at once.

Seriously, though: the answer here is not to paint entire academic disciplines as problematic because of one case of fraud. Granted, some of the questions raised are good ones that social scientists themselves have raised recently: how much about human activity can you discover through relatively small sample tests of American undergraduates? But good science is not based on one study anyway. An interesting finding should be corroborated by similar studies done in different places at different times with different people. These multiple tests and observations help establish the reliability and validity of findings. This can be a slow process, another issue in a media landscape where new stories are needed all the time.

This reminds me of Joel Best’s recommendations regarding dealing with statistics. One common option is simply to trust all statistics: numbers look authoritative, often come from experts, and can be overwhelming, so just accepting them is easy. At the other pole is the common option of saying that all statistics are simply interpretation and manipulation, so none of them can be trusted. Neither approach is a good option, but both are relatively easy. The better route when dealing with scientific studies is to have the basic skills necessary to judge whether a study is a good one and to understand how the process of science works. In this case, this would be a great time to call for better training among journalists about scientific studies so they can provide better interpretations for the public.

In the end, when one prominent social psychologist admits to massive fraud, the repercussions might be felt by others in the field for quite a while.

Sociologist argues that SATs not the best predictor of college success

In another round of the battles over standardized testing, a Wake Forest sociologist argues that the SAT is not the best predictor of college performance:

His conclusion? SATs don’t tell us much about how well a student will perform in college.

A better predictor of college success lies in a student’s high school grade-point average, class rank and course selection, Soares said…

Soares is editor of a new book, “SAT Wars: The Case for Test-Optional College Admissions,” that takes a critical look at the SAT while calling for a rethinking of the college admissions process…

When it dropped the SAT option, Wake Forest revamped its admissions process, beefing up its written response section and encouraging students to be interviewed by an admissions officer, a move that created a huge logistical challenge for the school.

This is not a small argument: as the article notes, this is a multi-billion dollar industry.

I wouldn’t be surprised if more schools continued to experiment with their admissions processes, both to get around some of the difficulties with particular measures and to gain a competitive advantage in grabbing good students before other schools realize what is going on (the Moneyball approach to admissions?).

Dutch social psychologist commits massive science fraud

This story is a few days old but still interesting: a Dutch social psychologist has admitted to using fraudulent data for years.

Social psychologist Diederik Stapel made a name for himself by pushing his field into new territory. His research papers appeared to demonstrate that exposure to litter and graffiti makes people more likely to commit small crimes and that being in a messy environment encourages people to buy into racial stereotypes, among other things.

But these and other unusual findings are likely to be invalidated. An interim report released last week from an investigative committee at his university in the Netherlands concluded that Stapel blatantly faked data for dozens of papers over several years…

More than 150 papers are being investigated. Though the studies found to contain clearly falsified data have not yet been publicly identified, the journal Science last week published an “editorial expression of concern” regarding Stapel’s paper on stereotyping. Of 21 doctoral theses he supervised, 14 were reportedly compromised. The committee recommends a criminal investigation in connection with “the serious harm inflicted on the reputation and career opportunities of young scientists entrusted to Mr. Stapel,” according to the report…

I think the interesting part of the story here is how this was able to go on for so long. It sounds like because Stapel handled more of the data himself, rather than following the typical practice of handing it off to graduate students, he was able to falsify data for longer.

This also raises questions about how much scientific data might be faked or unethically tampered with. The article references a forthcoming study on the topic:

In a study to be published in a forthcoming edition of the journal Psychological Science, Loewenstein, John, and Drazen Prelec of MIT surveyed more than 2,000 psychologists about questionable research practices. They found that a significant number said they had engaged in 10 types of potentially unsavory practices, including selectively reporting studies that ‘worked’ (50%) and outright falsification of data (1.7%).

Pushing positive results, generally meaning papers that support an alternative hypothesis, is also known to be favored by journals, which don’t value negative results as much. Of course, both sets of results are needed for science to advance, as both help prove and disprove arguments and theories. “Outright falsification” is another story…and perhaps even underreported (given social desirability bias and prevailing norms in scientific fields).

Given these occurrences, I wonder whether scientists of all kinds would push for more regulation (IRBs, review boards, etc.) or for less regulation with scientists policing themselves more (more training in ethics, more routine sharing of data or linking studies to available data so readers could do their own analyses, etc.).

New Census definition of poverty behind the rise of poverty in the US?

While media outlets have spread the recent news from the Census Bureau that poverty has increased in the United States, some conservatives question whether this is a true change or reflects a change in the measurement of poverty:

The new Census measure suggests that the ranks of the poor – at 49 million – are 3 million larger than previously thought. The increase comes in the new way poverty is measured. The new Census report for the first time includes government subsidies and benefits such as food stamps as a part of household income, but it also factors in rising costs, such as health-care expenses. The result creates a new poverty line and a new view of who in the US is poor.

The new threshold for poverty for a family of four, for example, is $24,343, as opposed to $22,113. And the revision reveals greater poverty trends among Asians, Hispanics, whites, and the elderly, and declining poverty for blacks and children, who tend to be greater beneficiaries of food stamps…

Sociologists say the new numbers give greater nuance to the portrait of poverty in the US, highlighting the degree to which government programs are keeping struggling Americans afloat. Critics counter the numbers are engineered precisely to make government assistance appear indispensable and to pave the way for a broader redistribution of American wealth toward the poor…

The Census changes are the first revisions to how the poverty rate is calculated since 1963. Since then, it has been gauged solely by cash income per household. But the new figures give a larger sense of what impact government spending has on poverty, says Timothy Smeeding, an economist at the University of Wisconsin in Madison.

Can’t really say I’m surprised that these figures are politicized. But, then again, the measurement of poverty has been a contentious topic for decades.
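The definitional shift the excerpt describes can be sketched as two resource formulas compared against their respective thresholds. The thresholds are the family-of-four figures quoted above; the family’s cash income, benefits, and expense amounts are hypothetical, chosen to show how a single family can change classification when the measure changes.

```python
OLD_THRESHOLD = 22_113  # old cash-income threshold, family of four
NEW_THRESHOLD = 24_343  # new-measure threshold, family of four

def old_resources(cash_income):
    # Old official measure: cash income per household only
    return cash_income

def new_resources(cash_income, benefits, medical_expenses, other_expenses):
    # New measure: add benefits such as food stamps, subtract rising
    # costs such as health-care expenses (expense categories here are
    # illustrative)
    return cash_income + benefits - medical_expenses - other_expenses

# Hypothetical family of four
cash, benefits, medical, other = 23_000, 3_000, 4_000, 1_500

poor_old = old_resources(cash) < OLD_THRESHOLD
poor_new = new_resources(cash, benefits, medical, other) < NEW_THRESHOLD
print(poor_old, poor_new)  # prints: False True
```

This family is above the old cash-only line but below the new one once expenses are netted out, which is exactly the mechanism behind the 3 million-person increase in the measured ranks of the poor.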

Measuring how much the Internet is worth: $8 trillion?

A recent report by McKinsey puts the value of the Internet at $8 trillion. Here are a few other fun facts:

There is a lot of Internet to measure, with two billion global consumers and $8 trillion in total revenue. So McKinsey’s report limited its scope to the online economy in the G-8 countries plus five more: Brazil, China, India, South Korea and Sweden. It defined Internet activities as private consumption (electronic equipment, e-commerce, broadband subscriptions, mobile Internet, and hardware and software consumption); private investment (from the telecommunications industry and the maintenance of extranet, intranet, and Web sites); public expenditure (spending and buying by government in software, hardware and services); and trade (which accounts for exports of Internet equipment plus business-to-business services with overseas companies)…

As an industry, the Internet contributes more to the typical developed economy than mining, utilities, agriculture, or education. In Sweden, fully one-third of economic growth in the five years leading up to the recession came from Internet activities. For the entire G-8, the average was 21 percent. In an analysis of France since the mid-1990s, McKinsey found that the Internet created more than twice the number of jobs it destroyed.

Much of the Internet’s contribution to our lives is nearly impossible to measure. For example, I use email. How much is that worth to me? I can’t even begin to say. I read hundreds of news sources a day. What is that worth to me, or to the news organizations? Pricing this kind of thing is exhausting to think about. But since analyzing what the rest of us find “exhausting to think about” is McKinsey’s job, their researchers looked at the “consumer surplus” of the Internet, concluding that the total annual benefit to the United States comes out to $64 billion…

The United States is the world leader in the online industry, grabbing 30 percent of global Internet revenues. But the UK is the world leader in online retail. The British spent $2,535 on e-stuff in 2009, more than twice the average of the world’s largest countries and still 1.4 times the amount of the typical U.S. shopper. Sweden leads the world in the Internet’s contribution to GDP. Fully 6.3 percent of the country’s economy is online — twice Germany, France or India. In Russia, the Internet contributes not even one percent of GDP.
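McKinsey’s four-category definition amounts to a simple expenditure-style sum. A minimal sketch of that accounting, with all dollar figures hypothetical (in billions) and an assumed $15 trillion GDP for the share calculation:

```python
def internet_contribution(private_consumption, private_investment,
                          public_expenditure, trade):
    # Sum of the four component categories in the McKinsey definition:
    # consumption, investment, government expenditure, and trade
    # (exports of Internet equipment plus overseas B2B services)
    return private_consumption + private_investment + public_expenditure + trade

# All figures hypothetical, in billions of dollars
total = internet_contribution(private_consumption=400, private_investment=150,
                              public_expenditure=80, trade=60)
share_of_gdp = total / 15_000  # against an assumed $15 trillion GDP
print(total, f"{share_of_gdp:.1%}")  # prints: 690 4.6%
```

The hard part, of course, is not the sum but deciding what counts in each bucket, which is where the measurement difficulties discussed below come in.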

Some interesting stuff here:

1. I appreciate the emphasis on the difficulty of measuring this topic. In addition to simply thinking about the economic benefits, we could spend a lot of time discussing how it has altered social interaction, private practices, and democracy. I wonder what the margin of error is on the estimates.

2. There is some indication of the splits between the Internet haves and have-nots. If the Internet is so valuable, should it be a leading component of aid to poorer countries? It does require a decent investment in infrastructure, but it would allow people to connect easily to first-world countries and industries. For example, what has been the impact of the sub-$100 laptop that was touted for years?

3. With all of this money (and value floating around), it is a reminder why so many states want to get their hands on sales tax revenues from Internet sales. Do European countries like Britain have a similar system? I have bought a few things from Amazon.co.uk in the past and I don’t recall the experience being much different.

4. I would be interested to know the future prospects for the Internet’s growth: how quickly will it grow? How much will it expand? Is most of the growth within developed countries or in opening or expanding newer markets (China and India plus others)?