The best state to live in is North Dakota; will this change anything?

A new set of rankings suggests that North Dakota is the #1 state in which to live. Here are some of the reasons:

Lowest unemployment rate among the 50 states. North Dakota’s 3.8 percent unemployment rate is less than half the national rate.

Statewide GDP growth of 3.9 percent ranked third in the nation in 2009, behind Oklahoma and Wyoming (2010’s figures are not yet available).

Best job growth last year. A Gallup survey reported that North Dakota businesses had the best ratio of hiring to firing among the 50 states.

Stable housing market. Across the nation, nearly 1 in 4 homeowners with a mortgage is underwater. In North Dakota, just 1 in 14 has negative equity, the fourth-lowest negative-equity ratio among all the states. The state also has the third-lowest home foreclosure rate. Affordable homes are a big part of the story here; let’s just say you don’t need to overstretch to own. According to Zillow, the median home price in North Dakota is below $150,000. That’s less than three times the state’s median household income. By comparison, even after sharp post-bubble price declines, the median-priced home in California is still about five times median household income.
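The affordability comparison above is simple arithmetic: divide the median home price by the median household income. A rough Python sketch, using illustrative round numbers rather than the article’s exact figures:

```python
def price_to_income_ratio(median_home_price, median_household_income):
    """Return how many years of median income a median-priced home costs."""
    return median_home_price / median_household_income

# North Dakota-style case: sub-$150,000 homes, roughly $50,000 incomes.
nd_ratio = price_to_income_ratio(145_000, 50_000)   # just under 3x income
# California-style case after the bust: prices near 5x income.
ca_ratio = price_to_income_ratio(300_000, 60_000)   # 5x income
```

Researchers often treat a ratio around 3 or below as affordable, which is why the gap between these two hypothetical markets matters.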

Low violent crime rate. The incidence of violent crime per 100,000 residents in North Dakota in 2008 (latest available data) was the fourth lowest in the country and nearly 60 percent lower than the national average.

Lowest credit card default rate. According to TransUnion, North Dakotans seem to have a handle on spending within their means.

The article goes on to say that Gallup recently found North Dakota to be the 3rd happiest state in the country.

One way of thinking about this ranking is to address the typical questions about such rankings: how dependent is the ranking on which factors were considered and how they were weighted? This plagues rankings of everything from states to colleges to communities to countries’ well-being.

But another way to look at this is to ask whether the ranking will have any impact in the real world. This seems akin to the issue of substantive significance: statistics or data might suggest several variables are related but this doesn’t mean that this relationship or finding makes a big difference in everyday life. If North Dakota really is #1 based on a variety of useful measures, does this mean more people will move to the state? People move for a variety of reasons: jobs, to be by family, for certain climates (warmer weather) or atmospheres (the excitement of creative class cities or more sophisticated places), for education, to escape certain issues (crime, poverty) and benefit from the advantages of certain places (schools, parks, family-friendly, kid-friendly). But would anyone ever move to North Dakota based on this ranking? Will it lead to more businesses taking a second look at locating in North Dakota rather than big cities (or their suburbs) like New York, Chicago, Los Angeles, or elsewhere?

Another possible area of impact is perceptions of the state. Will the state’s status or prestige increase due to this ranking? If the state is seen as successful by other states, they might emulate North Dakota’s policies.

Overall, if North Dakota were #1 for decades, would anything really change?

(A related issue: if people did start moving to North Dakota in large numbers, would the state be able to maintain its top rank on this list?)

The case of the insistent American Community Survey employee

The decennial US Census has its employees try to contact a household six times (see a quick summary of their procedures here). But the Problem Solver in the Chicago Tribune presents a case where a Census employee working on the American Community Survey (ACS) irritated a Chicago couple:

The first few requests were tolerable. A Census Bureau worker would knock on John and Beverly Scott’s door and ask them to fill out an American Community Survey. The McKinley Park couple would politely decline.

But as the days passed, the visits became more frequent and the requests more urgent.

Some evenings, the doorbell would ring at dinnertime, then again at 10 p.m…

Scott said the requests had become so repetitive and annoying, the couple began pulling the old “out-of-candy-on-Halloween trick.”

“I work afternoons, and I’m not home,” Scott said. “My wife has to sit with the lights off because she doesn’t want to be bothered.”

Often, even that doesn’t work.

“They knock and knock and knock and ring and ring and ring,” Beverly Scott said. “Knocking longer is not going to make me answer the door, and it’s not going to help if we’re not here.”

The final straw, John Scott said, was when a Census Bureau employee told him he would be fined $2,000 if he did not fill out the 48-question survey.

When contacted by the Problem Solver, the regional ACS office said the couple would not be fined (though the government could do this) and that they would stop trying to contact the couple (and they did).

Surveys that draw representative samples often work hard to contact the initially selected respondents. If interviewers can’t make contact or get a response, they move on to other people who fit the sampling criteria, adding time and resources that need to be spent on the project.

Interestingly, the couple in question also notes that although they filled out their decennial survey, they are not interested in filling out the American Community Survey because they see it as too intrusive:

But they’re not too keen on the American Community Survey, a more in-depth, ongoing questionnaire the Census Bureau conducts to compile information on area demographics, consumer patterns and economic issues.

In particular, the Scotts did not want to answer questions they found too personal, such as inquiries about their income, when they left for work and their health.

“The new questionnaire has gone way over the line,” Scott said. “We have told the representative that we are not going to answer private questions, but they continue to come to our door at all hours of the day and night.”

There were occasional reports of people who felt the same about the 2010 decennial census, with some suggesting that a census should gather only a head count and no other information. But the US Census Bureau has suggested that the ACS will play an increasingly important role in the future as the government looks to collect more frequent data. As the story suggests, the ACS data is important for determining “the Consumer Price Index and how federal funding is allocated.” Rather than waiting 10 years between comprehensive counts, the ACS provides more up-to-date data that governments (from the federal to the local level), researchers, and the public can utilize.

The problems of classifying Hispanics in the Census

A sociology professor talks about the different ways in which the Census has classified Hispanics:

Professor RUBEN RUMBAUT (University of California at Irvine): Race is one of three questions that has been asked in every census since 1790. So for 220 years, that person’s age, sex and race have been asked in a census. Age and sex have been measured in the same way for 220 years. Race has pretty much never been measured in the same way from one census to the next, suggesting this is not a biological given category but a social and legal and political construction whose meaning changes over time…From census to census, there are slight changes in wording, in instructions, and that end up making a significant difference in the actual responses that people gave.

The sociologist goes on to explain studies he has been a part of that show how immigrant groups differ in identifying themselves as white:

A colleague of mine and I since 1991 have directed the largest study of children of immigrants in the United States over time, looking at 77 different nationalities, including all of the ones from Latin America. And over time we have asked them separate questions about their ethnic identity and also a question about race. We also independently interviewed their parents.

Cuban parents, 93 percent of them, thought that they were white, but only 41 percent of their own children thought they were white; 69 percent of Nicaraguans, Salvadoran and Guatemalan parents thought they were white, but only 19 percent of their own children thought they were white.

These are quite wide differences. The Census is supposed to offer reliable and valid data over time but in this particular category, the Census has had difficulty.

Interestingly, the sociologist suggests there were experiments embedded in the 2010 Census in order to help solve these issues for the next Census:

Already in the year 2010, there were four experiments embedded in the 2010 census looking ahead at how to make changes for the year 2020. One of the things that are being considered, for example, is trying to create a single question that combines both Hispanic ethnicity and race into a single question.

I hadn’t heard anything about these experiments and I guess we’ll have to wait and see how this turns out. Whatever is decided, sociologists and others will have to find ways to put together the various measurements over the decades.

The predictive power of sociology and learning from the past

In recent years, the predictive element of social science has been discussed by a few people: how much can we use data from the past to predict the future? In an interview with Scientific American, a mathematical sociologist who works at Yahoo! Labs talks about our predictive abilities:

A big part of your book deals with the problem of ignoring failures—a selective reading of the past to draw erroneous conclusions, which reminds me of the old story about the skeptic who hears about sailors who survived a shipwreck supposedly because they’d prayed to the gods. The skeptic asked, “What about the people who prayed and perished?”
Right—if you look at successful companies or shipwrecked people, you don’t see the ones who didn’t make it. It’s what sociologists call “selection on the dependent variable,” or what in finance is called survivorship bias. If we collected all the data instead of just some of it, we could learn more from the past than we do. It’s also like Isaiah Berlin’s distinction between hedgehogs and foxes. The famous people in history were hedgehogs, because when those people win they win big, but there are lots of failed hedgehogs out there.

Other scholars have pointed out that ignoring this hidden history of failures can lead us to take bigger risks than we might had we seen the full distribution of past outcomes. What other problems do you see with our excessive focus on the successful end of the distribution?

It causes us to misattribute the causes of success and failure: by ignoring all the nonevents and focusing only on the things that succeed, we don’t just convince ourselves that things are more predictable than they are; we also conclude that these people deserved to succeed—they had to do something right, otherwise why were they successful? The answer is random chance, but that would cause us to look at them in a different light, and changes the nature of reward and punishment.

Interesting material, and Watts’s just-published book (Everything Is Obvious: *Once You Know the Answer) sounds worthwhile. There are also some interesting thoughts later in the interview about how information in digital social networks doesn’t really get passed along through influential people.

I haven’t seen too much discussion within sociology about predictive abilities: how much do we suffer from these blind spots that Watts and others point out?

(As a reminder, Nassim Taleb, in his book The Black Swan, has also written well on this subject.)

How to discover hidden racial profiling in McHenry County police data

McHenry County is located northwest of Chicago, has just over 300,000 residents, and is part of the six-county Chicago region. In recent years, the county has had a growing Hispanic population (2009 Census figures estimate Hispanics make up about 11% of the population) and there was data to suggest that Hispanics might have been racially profiled by local police. Here is how the Chicago Tribune describes the data between 2004 and 2009:

Racial profiling is difficult to prove. That’s why researchers push for data collection, to flag potential problems. In 2004, the first year data were collected, McHenry County’s indicators were high.

Statewide, minorities were 15 percent more likely to be stopped than what would have been expected based on their respective populations.

McHenry County’s disparity rate, however, was 65 percent, more than double that of the Chicago area’s five other sheriff’s departments.

The county’s rate, however, began dropping dramatically in 2007, and by 2009 was average for area sheriff’s departments.

On the surface, this data suggests the problem might have been solved: police were made aware of the issue and McHenry County’s numbers were back in line with regional figures within a few years.
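The “disparity rate” the Tribune cites compares a group’s share of traffic stops to its share of the population. A minimal Python sketch of that comparison, using made-up counts rather than the actual McHenry County data:

```python
def stop_disparity(minority_stops, total_stops, minority_pop, total_pop):
    """Percent by which the minority share of stops exceeds the minority
    share of the population (0 means stops match population shares)."""
    stop_share = minority_stops / total_stops
    pop_share = minority_pop / total_pop
    return (stop_share / pop_share - 1) * 100

# Illustrative only: a group that is 20% of the population receiving
# 23% of stops is stopped about 15 percent more often than expected.
print(stop_disparity(23, 100, 20, 100))  # → roughly 15 (percent)
```

On this kind of measure, a statewide figure of 15 percent versus a county figure of 65 percent is the gap that first flagged McHenry County.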

But the Chicago Tribune goes on to say that a statistical analysis suggests it isn’t that racial profiling actually decreased; rather, McHenry County police simply marked Hispanics as white in their reports:

By 2009, the statistical analysis showed, 1 in 3 Hispanics cited by deputies likely were mislabeled as white or not included in department data reported to the state.

•If mislabeling and underreporting are taken into account, the department’s official rate of minority stops would have towered over its Chicago-area peers rather than appearing average.

•Department brass repeatedly missed warning signs of potential problems, even after a deputy complained that some peers targeted Hispanics.

So how exactly did the Chicago Tribune do this analysis: how does one look between the lines of arrest data to make a claim about current racial profiling? In a sidebar in the print edition and an extra link online, the Tribune describes how it did its analysis:

Drivers’ names from the court and department data were compared with names in the census database to find each driver’s likelihood of Hispanic ethnicity. Mirroring methodology of similar research, drivers were deemed Hispanic only if their last names were 70 percent or more likely to be Hispanic.

The department data were used to analyze accuracy of labeling by deputies — comparing the rate of likely Hispanics with what each deputy logged. But the department database lacked records of all cited drivers, so the Tribune used the court data to determine the extent of mislabeling and incorrect logging departmentwide. The rate of likely Hispanics, as shown by the court data, was compared with the rate of Hispanics that the department told the state it cited.

In doing the departmentwide analysis, the Tribune counted only the labeling of likely Hispanics as white, because such mislabeling artificially improved the state’s rating of the department. Deputies at times also labeled likely Hispanics as other minorities, such as when a driver who looks like Sammy Sosa was labeled African-American. The analysis didn’t count that type of mislabeling because it didn’t affect the state’s rating.

Researchers say the census-based analysis is commonly used in studies but has limitations: It counts non-Hispanic women who marry Hispanics, and misses Hispanic women who marry non-Hispanics. It also misses Hispanics who have nontraditional surnames. With the limitations taken into account, it’s generally considered an undercount of Hispanics.

This is an interesting methodological process involving several moving parts. The analysis used and compared multiple sources of data. This triangulation means the analysis doesn’t just rely on the data the police report – such self-reported data can have issues, as the TV show The Wire illustrated. Surnames from the records were compared to US Census records to estimate the likelihood that each name is Hispanic. This approach won’t catch every case, but the Tribune says other researchers consider it an undercount of Hispanics; if so, McHenry County police may be even more heavily engaged in this practice than the figures show. Also, what counts as a correct label is determined by the state.
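The surname step described above can be sketched roughly in Python. The lookup table and stop records below are hypothetical stand-ins for the Census surname data and the department’s logs; the 70 percent threshold comes from the Tribune’s description of its methodology:

```python
# Hypothetical stand-in for Census surname data: percent of people with
# each surname who identified as Hispanic.
CENSUS_PCT_HISPANIC = {
    "GARCIA": 92.0,
    "MARTINEZ": 92.9,
    "MILLER": 1.6,
    "JOHNSON": 2.3,
}

def likely_hispanic(surname, threshold=70.0):
    """Deem a driver likely Hispanic only if the surname clears the
    70% threshold the Tribune says it borrowed from similar research."""
    return CENSUS_PCT_HISPANIC.get(surname.upper(), 0.0) >= threshold

def mislabel_rate(cited_drivers):
    """Share of likely-Hispanic drivers logged as white.
    `cited_drivers` is a list of (surname, race_logged) pairs."""
    likely = [(s, r) for s, r in cited_drivers if likely_hispanic(s)]
    if not likely:
        return 0.0
    mislabeled = sum(1 for _, r in likely if r == "white")
    return mislabeled / len(likely)

# Invented example records: two of the three likely-Hispanic drivers
# here were logged as white.
stops = [("Garcia", "white"), ("Martinez", "hispanic"),
         ("Miller", "white"), ("Garcia", "white")]
print(mislabel_rate(stops))
```

Note that, as in the Tribune’s analysis, only likely Hispanics logged as white are counted, since that is the mislabeling that improved the department’s state rating.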

A few lessons could be learned from this:

1. “Official data,” such as the self-reported police records here, are not necessarily trustworthy.

2. There are often multiple sources of data one can use to describe or evaluate a situation. Relying only on one source of data gives a part of the story – in this case, the one the police wanted to tell, which is interesting in itself – but having multiple sources can give a more complete picture.

3. If the Chicago Tribune analysis is correct, it is a reminder that “hiding” or “disguising” data can be difficult to do if people are interested or determined enough to look into what the data actually means.

American Sociological Association committee on doctoral program rankings

While the ranking of undergraduate programs is contentious (read about Malcolm Gladwell’s latest thoughts on the subject here), the rankings of doctoral programs can also draw attention. In February, a five-person American Sociological Association (ASA) committee released a report about the 2010 National Research Council (NRC) rankings of doctoral sociological programs (see a summary here).

The ASA committee summarized their concerns about the NRC rankings:

Based on our work, we recommend that the ASA Council issue a resolution criticizing the 2010 NRC rankings for containing both operationalization and implementation problems; discouraging faculty, students, and university administrators from using the core 2010 NRC rankings to evaluate sociology programs; encouraging them to be suspicious of the raw data accompanying the 2010 NRC report; and indicating that alternative rankings, such as those based on surveys of departments’ reputations, have their own sets of biases.

The explanation of these issues is an interesting methodological analysis. Indeed, this document suggests a lot of these rankings have had issues, starting with the 1987 US News & World Report rankings which were primarily based on reputational rankings.

So what did the committee conclude should be done? Here are their final thoughts:

At this time, the committee believes that ASA should encourage prospective students, faculty, university administrators or others evaluating a given program to avoid blind reliance on rankings that claim explicitly or implicitly to list departments from best to worst. The heterogeneity of the discipline suggests that evaluators should first determine what characteristics they value in a program and then employ available sources of information to assess the program’s performance. In addition, the ASA should help facilitate, within available means, the dissemination of such information.

So the final recommendation is to be skeptical about these rankings. This seems to be a fairly common approach for those who find issues with rankings of schools or programs.

How might we get past this kind of conclusion? If the ranking process were done by just sociologists, could we decide on even a fuzzy rank order of graduate programs that most could agree upon?

Number of multiracial Americans grows in 2010 Census

In the 2000 Census, respondents were able to indicate for the first time that they are multiracial. The latest figures from the 2010 Census suggest that the multiracial population is growing at higher than expected rates:

In the first comprehensive accounting of multiracial Americans since statistics were first collected about them in 2000, reporting from the 2010 census, made public in recent days, shows that the nation’s mixed-race population is growing far more quickly than many demographers had estimated, particularly in the South and parts of the Midwest. That conclusion is based on the bureau’s analysis of 42 states; the data from the remaining eight states will be released this week.

In North Carolina, the mixed-race population doubled. In Georgia, it expanded by more than 80 percent, and by nearly as much in Kentucky and Tennessee. In Indiana, Iowa and South Dakota, the multiracial population increased by about 70 percent.

“Anything over 50 percent is impressive,” said William H. Frey, a sociologist and demographer at the Brookings Institution…

Census officials were expecting a national multiracial growth rate of about 35 percent since 2000, when seven million people — 2.4 percent of the population — chose more than one race. Officials have not yet announced a national growth rate, but it seems sure to be closer to 50 percent.

This is interesting data, particularly since these figures exceed expectations. There are several issues to note with the data. First, some of the largest growth is taking place in states like Mississippi, where the percentage increase is large because so few multiracial people were counted in the 2000 Census. Second, we could ask whether this is primarily an increase in multiracial relationships or simply a reflection of changing measurements by the US Census. One sociologist suggests the second option is plausible:
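The first point, about large percentage increases from small bases, is easy to illustrate with made-up counts (these are not actual Census figures):

```python
def pct_growth(count_2000, count_2010):
    """Percentage change between two decennial counts."""
    return (count_2010 - count_2000) / count_2000 * 100

# A small multiracial population that adds 10,000 people doubles...
small_base = pct_growth(10_000, 20_000)    # 100% growth
# ...while a large population adding the same 10,000 barely moves.
large_base = pct_growth(500_000, 510_000)  # about 2% growth
```

The same absolute change produces wildly different percentages, which is why a state with a tiny 2000 base can top the growth rankings.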

“The reality is that there has been a long history of black and white relationships — they just weren’t public,” said Prof. Matthew Snipp, a demographer in the sociology department at Stanford University. Speaking about the mixed-race offspring of some of those relationships, he added: “People have had an entire decade to think about this since it was first a choice in 2000. Some of these figures are not so much changes as corrections. In a sense, they’re rendering a more accurate portrait of their racial heritage that in the past would have been suppressed.”

So then perhaps we shouldn’t be surprised by these large increases in percentages; rather, we have better instruments by which to collect this data.

This Census data does seem to line up with changing attitudes about interracial relationships. In a recent story from Pew Research about what 90% of Americans can agree about, Pew showed how approval of interracial relationships has grown substantially over the last several decades:

It is remarkable how this has jumped from 48% approval in 1987 to 83% in 2009. But if there is more approval for interracial relationships, then there are likely to be more relationships, marriages, and eventually children who identify as multiracial.

Wellbeing among American cities

Gallup surveyed 188 metropolitan areas in the United States in 2010 and then ranked the cities according to their Well-Being Index. Here are the top 5:

1. Boulder, Colorado

2. Lincoln, Nebraska

3. Fort Collins-Loveland, Colorado

4. Provo-Orem, Utah

5. Honolulu, Hawaii

Here is some information on how the index was calculated:

The Gallup-Healthways Well-Being Index score is an average of six sub-indexes, which individually examine life evaluation, emotional health, work environment, physical health, healthy behaviors, and access to basic necessities. The overall score and each of the six sub-index scores are calculated on a scale from 0 to 100, where a score of 100 represents the ideal. Gallup and Healthways have been tracking these measures daily since January 2008.
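The scoring described above is an average of six 0-100 sub-indexes. A small sketch of that calculation; the sub-index names come from Gallup’s description, but the scores below are invented:

```python
SUB_INDEXES = ["life_evaluation", "emotional_health", "work_environment",
               "physical_health", "healthy_behaviors", "basic_access"]

def well_being_score(sub_scores):
    """Average the six sub-index scores (each 0-100, 100 = ideal)."""
    assert set(sub_scores) == set(SUB_INDEXES)
    return sum(sub_scores.values()) / len(sub_scores)

# Invented scores for a hypothetical high-ranking metro area.
example_city = dict(zip(SUB_INDEXES, [78, 81, 52, 80, 75, 88]))
print(well_being_score(example_city))
```

One consequence of a simple average is that a weak sub-index (say, work environment) can be offset by strength elsewhere, which is part of why a composite score can mask very different city profiles.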

In terms of analysis of these findings, Richard Florida has some thoughts. My guess is that Florida will tie these findings to his own ideas about the creative class, a group that tends to live in cities that are college towns and have younger populations, higher levels of innovation, and more cultural opportunities.

(A side note: I’m not sure who came up with the headline for Florida’s thoughts, but calling these “America’s New Happiest Cities” may not be exactly the same thing as measuring “well-being.” The Gallup index goes beyond “life evaluation” and “emotional health” to include other factors like physical health and workplace environment.)

What can 90% of Americans agree on?

The answer: not much. Pew Research has an article about the small number of issues in which 90% of Americans agree:

Yet there are some opinions that 90% of the public, or close to it, shares — including a belief that citizens have a duty to vote, an admiration for those who get rich through hard work, a strong sense of patriotism and a belief that society should give everyone an equal opportunity to succeed. Pew Research’s political values surveys have shown that these attitudes have remained remarkably consistent over time.

The proportion saying they are very patriotic has varied by just four percentage points (between 87% and 91%) across 13 surveys conducted over 22 years. Similarly, in May 1987, 90% agreed with the statement: “Our society should do what is necessary to make sure everyone has an equal opportunity to succeed.” This percentage has remained at about 90% ever since (87% in the most recent political values survey).

Interestingly, these cited figures are about foundational values in American culture. Exactly what some of these things mean could be up for debate: how should one express their “very patriotic” feelings? What exactly should it look like so that “everyone has an equal opportunity to succeed”? But as values, voting, patriotism, and meritocracy are quite powerful. (And it would also be interesting to see who doesn’t agree with these values.)

We could also ask why exactly 90% is a cutoff we should care about. Here is an explanation:

[R]eaching the 90% threshold is a rare occurrence in public opinion surveys. In part, this reflects the tendency of polling organizations to focus on current issues about which there are often considerable differences of opinion. Nonetheless, even on issues where one would expect to find near-total agreement, the public’s views are far from unanimous.

This is why Pew highlights a recent finding: “fully 90% of the public said that they were hearing mostly bad news about gas prices.”

It would be interesting to see more data on this to know just how rare 90% agreement is. How often might we expect to see this out of all survey responses? How different is the 90% occurrence compared to 80% or even 70%? Is this lack of 90% agreement unusual only for the United States or does this apply to other nations as well?

Sorting out the statistics about Christians and divorce

BeliefNet.com has a useful summary of a recent discussion that includes sociologists: do Christians divorce as frequently as other Americans?

1. Data from The Barna Group suggests that born-again Christians divorce at a rate similar to that of the general population. This seems to be tied to Barna’s particular definitions:

Barna’s statistics are tied to its highly specific — and controversial — definitions of born-again Christians and evangelicals.

For instance, Barna labels Christians “born-again” if they have made a personal commitment to Jesus and believe they will go to heaven because they have accepted him as their savior.

Evangelicals, on the other hand, are those who fit the born-again definition but also meet seven other conditions, including sharing their beliefs with non-Christians and agreeing that the Bible is completely accurate.

With these stricter definitions, Barna can claim that Christians and others divorce at similar rates.

2. Several sociologists, including Bradley Wright and Brad Wilcox, suggest there is a different story regarding Christians and divorce. Wright, for example, looked at General Social Survey data and found that higher rates of church attendance were related to lower rates of divorce:

Wright combed through the General Social Survey, a vast demographic study conducted by the National Opinion Research Center at the University of Chicago, and found that Christians, like adherents of other religions, have a divorce rate of about 42 percent. The rate among religiously unaffiliated Americans is 50 percent.

When Wright examined the statistics on evangelicals, he found worship attendance has a big influence on the numbers. Six in 10 evangelicals who never attend had been divorced or separated, compared to just 38 percent of weekly attendees.

Wilcox came to some similar conclusions based on another data source:

“You do hear, both in Christian and non-Christian circles, that Christians are no different from anyone else when it comes to divorce and that is not true if you are focusing on Christians who are regular church attendees,” he said.

Wilcox’s analysis of the National Survey of Families and Households has found that Americans who attend religious services several times a month were about 35 percent less likely to divorce than those with no religious affiliation.

Nominal conservative Protestants, on the other hand, were 20 percent more likely to divorce than the religiously unaffiliated.

If Wright and Wilcox are correct, it is less about whether one calls oneself a Christian or meets a theological definition of being a Christian and more about the Christian practices one undertakes. If we take church attendance as some measure of spiritual commitment or belief, then it appears that going to church more is tied to getting divorced less.
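The “X percent less likely” claims in these studies are relative-rate comparisons. A small sketch using Wright’s reported figures (a 50 percent divorce rate among the religiously unaffiliated versus 38 percent among weekly evangelical attendees):

```python
def relative_difference(group_rate, reference_rate):
    """Percent difference in a group's divorce rate versus a reference
    group (negative means less likely to divorce than the reference)."""
    return (group_rate / reference_rate - 1) * 100

unaffiliated = 0.50       # Wright's figure for the unaffiliated
weekly_attendees = 0.38   # Wright's figure for weekly evangelical attendees
print(relative_difference(weekly_attendees, unaffiliated))  # ≈ -24 percent
```

This comparison yields about 24 percent less likely; Wilcox’s 35 percent figure comes from a different dataset (the National Survey of Families and Households) and different group definitions, which is a reminder that these relative figures depend heavily on who is being compared to whom.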

Another part of this debate seems to be about how to define people as Evangelicals. Barna has a particular method as do others. One standard in the field of sociology of religion is to use RELTRAD, which accounts for both “doctrine and historical changes in religious groups.”

(I explained Wright’s argument in class recently and was asked if we could take Wright’s claims about church attendance as a causal argument: does going to church lead to less divorce? Or is it that people who divorce less feel more comfortable about going to church while those who are already divorced feel less comfortable in church and therefore go less? I’m guessing someone has answered this question.)