Using cell phone data to research social networks

Social network analysis is a growing area within sociology and other disciplines. The Wall Street Journal reports on the advantages of examining cell phone data:

As a tool for field research, the cellphone is unique. Unlike a conventional land-line telephone, a mobile phone usually is used by only one person, and it stays with that person everywhere, throughout the day. Phone companies routinely track a handset’s location (in part to connect it to the nearest cellphone tower) along with the timing and duration of phone calls and the user’s billing address…

Advances in statistics, psychology and the science of social networks are giving researchers the tools to find patterns of human dynamics too subtle to detect by other means. At Northeastern University in Boston, network physicists discovered just how predictable people could be by studying the travel routines of 100,000 European mobile-phone users.

After analyzing more than 16 million records of call date, time and position, the researchers determined that, taken together, people’s movements appeared to follow a mathematical pattern. The scientists said that, with enough information about past movements, they could forecast someone’s future whereabouts with 93.6% accuracy.

The pattern held true whether people stayed close to home or traveled widely, and wasn’t affected by the phone user’s age or gender.

The rest of the article goes on to talk about a lot of interesting research on topics like social contagions (see an example of this research here) and social relationships using this data.

Some may be concerned about privacy, particularly with recent reports about iPhones and iPads containing a file that records the movements of users. I have a few thoughts about this:

1. Compared to other possible data sources (surveys, time diaries, interviews, ethnography), this seems like a treasure trove of information. The article suggests that nearly 75% of people in the world have cell phones – what other data source can compare with that? Could the research potential outweigh individual privacy concerns? In thinking about some of these research questions, it would be very difficult to use more traditional methods to address the same concerns. And just the sheer number of cases a researcher could access and work with is fantastic. In order to build more complex models of human behavior, this is exactly the kind of data one could use.

2. I would be less concerned about researchers using this data than companies. Researchers don’t particularly care about the individual cases in the data but rather are looking for broad patterns. I would also guess that the cell phone data is anonymized so that researchers would have a difficult time pinpointing specific individuals even if they wanted to.

3. How much of a surprise is it that this available data is being used? Don’t cell phone carriers include some sort of statement in their contracts about using data in such ways? One option here would be to not get a smart phone. But if you want a smart phone (and it seems that a lot of Americans do), then this is the tradeoff. This is similar to the tradeoff with Facebook: users willingly give their information to enhance their social lives and then the company can look for ways to profit from this information.

h/t Instapundit

Conclusions about PC vs. Mac users based on an unscientific web survey

Based on the headline, this looks like an interesting story: “Mac vs. PC: The stereotypes may be true.” But there is a problem:

An unscientific survey by Hunch, a site that makes recommendations based on detailed user preferences, found that Mac users tend to be younger, more liberal, more fashion-conscious and more likely to live in cities than people who prefer PCs.

While the first part of this paragraph is treated as a clause that barely affects the rest of the text, it really is the key to the story. Hunch’s survey respondents identify as 52% PC users and 25% Mac users, with the remaining 23% identifying with neither (and what do we do with this category?). This compares to a PC vs. Mac worldwide market share of 89% to 11%. This is evidence that the online sample doesn’t quite match up with what computer users are actually buying. Voluntary web surveys are difficult to work with for this reason: even if there are a lot of respondents, we don’t know whether these respondents are representative of larger populations.
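
The mismatch can be made concrete with a quick sketch. The percentages are the ones quoted above; the comparison is illustrative only (the “neither” category has no market-share analogue and is left out):

```python
# Compare Hunch's self-reported sample composition to worldwide market share.
# Figures are the ones quoted in the story above.
survey = {"PC": 52, "Mac": 25, "Neither": 23}  # % of Hunch respondents
market = {"PC": 89, "Mac": 11}                 # % of worldwide market share

for platform in market:
    gap = survey[platform] - market[platform]
    print(f"{platform}: survey {survey[platform]}% vs. market {market[platform]}% "
          f"(gap of {gap:+d} points)")
```

Mac users are overrepresented by 14 points and PC users underrepresented by 37, which is exactly the kind of skew that makes generalizing from a voluntary web sample risky.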

Perhaps CNN does cover itself. The headline does suggest that these stereotypes “may” be correct and the second paragraph suggests the stereotypes may contain “some truth.” But a more cynical take regarding both CNN and Hunch is that they simply want more web visits from devoted PC or Mac defenders. Perhaps the fact that all of this is based on an unscientific survey is less important than driving visitors to one’s site and asking people to comment at the bottom of both stories.

The lack of variation in ordinal scales: color-coded terror alerts plus employee surveys and ratings

In the last few days, I ran into two stories that are related in an unusual way: both concern a lack of variation in an ordinal scale. First, let’s start with the announcement from the Department of Homeland Security regarding the color-coded terror alert scale:

A government review determined that the five-tiered color-coded system instituted in 2002 had suffered from a lack of credibility and eroded public confidence. The color has not been changed since 2006 and has never gone below yellow, or “elevated,” risk. Setting the risk level to green, or “low,” was never even considered.

In the long run, the problem was that the scale didn’t change. Theoretically, there were five options but the alert was generally in the same place. Since the alert was always “elevated” or above, this was not helpful. (This also seems related to the argument some have made that a multi-decade “war on drugs” or “war on poverty” doesn’t make much sense because wars are supposed to have an end. Always being at war or on alert for terror erodes the sense of urgency.)

I also came across a human resources website that recommended businesses avoid five point scales regarding certain questions asked of employees:

A typical ranking, called a Likert scale, runs from Strongly Agree to Strongly Disagree. And it’s fine for many psychological and sociological surveys. When you’re asking for ratings from 1,000 random people, you’ll get a wide variety of answers.

“But inside an organization, a 5-point scale loses its effectiveness,” Murphy says. “If you ask a group of employees at Acme Inc. to rate the statement, ‘Acme is a good place to work,’ you’re not going to get very many low responses (i.e., 1s and 2s). That’s because if you truly thought Acme was an awful place to work, you probably would have quit already.”…

But as with employee surveys, we don’t think 5-point scales are effective for performance evaluations. Many HR pros tell managers that only a very small percentage of their subordinates, say 10 percent, can be awarded the highest rating. And, managers are understandably reluctant to rate anyone as unsatisfactory—even when that’s the rating he or she deserves.

This is not just a hypothetical situation: I remember reading recently about the extremely high percentage of teachers in a large district who were given satisfactory or higher ratings. (One group suggests that 91% of Chicago teachers in 2007-2008 were rated “superior or excellent”.) If the ratings mean anything and are actually measuring performance, it is difficult to believe that such a high figure is true.

The lesson to be learned from these two cases? Be sure there will be variation in responses when using an ordinal scale. Otherwise, the scale is quite unhelpful.
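
One quick diagnostic for this problem is to check how concentrated responses are in a single category. A minimal sketch, using made-up ratings clustered at the top the way the teacher evaluations described above were:

```python
from collections import Counter

def top_category_share(ratings):
    """Share of responses falling in the single most common category.
    Values near 1.0 mean the scale isn't discriminating much."""
    counts = Counter(ratings)
    return max(counts.values()) / len(ratings)

# Hypothetical 5-point ratings piled up at the top of the scale.
ratings = [5] * 60 + [4] * 31 + [3] * 7 + [2] * 2
print(f"Top category share: {top_category_share(ratings):.0%}")  # 60%
```

When one or two categories absorb nearly all responses, the scale is effectively measuring nothing, whatever its nominal number of points.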

Venkatesh argues Anderson’s recent book highlights sociology’s identity problem

Sudhir Venkatesh reviews Elijah Anderson’s new book The Cosmopolitan Canopy (earlier review here) and argues that the text is emblematic of a larger identity crisis within sociology:

Anderson’s struggle to make sense of the current multicultural situation is not only a function of his own intellectual uncertainty. It is also a symptom of the field in which he is working, which is confused about its direction. Where sociology once gravitated to the most pressing problems, especially the contentious issues that drove Americans apart, it no longer seems so sure of its mission. With no obvious crisis, disaster, or glaring source of inequity as a backdrop demanding public action, a great American intellectual tradition gives every sign of weathering a troubled transition…

Anderson’s fascinating foray and his inability to tie together the seemingly contradictory threads highlight the new challenges that face our field. On the one hand, sociology has moved far away from its origins in thoughtful feet-on-the-ground analysis, using whatever means necessary. A crippling debate now pits the “quants,” who believe in prediction and a hard-nosed mathematical approach, against a less powerful, motley crew—historians, interviewers, cultural analysts— who must defend the scientific rigor and objectivity of any deviation from the strictly quantitative path. In practice, this means everyone retreats to his or her comfort zone. Just as the survey researcher isn’t about to take up with a street gang to gather data, it is tough for an observer to roam free, moving from one place to another as she sees fit, without risking the insult: “She’s just a journalist!” (The use of an impenetrable language doesn’t help: A common refrain paralyzing our field is, “The more people who can understand your writing, the less scientific it must be.”)

For Anderson to give up “fly on the wall” observation, his métier, and put his corporate interviews closer to center-stage would risk the “street cred” he now regularly receives. This is sad because Anderson is on to the fact that we have to re-jigger our sociological methods to keep up with the changes taking place around us. Understanding race, to cite just one example, means no longer simply watching people riding the subway and playing chess in parks. The conflicts are in back rooms, away from the eavesdropper. They are not just interpersonal, but lie within large institutions that employ, police, educate, and govern us. A smart, nimble approach would be to do more of what Anderson does—search for clues, wherever they may lie, whether this means interviewing, observing, counting, or issuing a FOIA request for data.

If you search hard enough, you can find pockets of experimentation, where sociologists stay timely and relevant without losing rigor. It is not accidental they tend to move closer to our media-frenzied world, not away from it, because it’s there that some of the most illuminating social science is being done, free of academic conventions and strictures. At Brown and Harvard, sociologists are using the provocative HBO series, The Wire, to teach students about urban inequality. At Princeton and Michigan, faculty make documentary films and harness narrative-nonfiction approaches to invigorate their research and writing. At Boston University, a model turned sociologist uses her experiences to peek behind the unforgiving world of fashion and celebrity. And the Supreme Court’s decision to grant the plaintiffs a “class” status in the Wal-Mart gender-discrimination case will hinge on an amicus brief submitted by a sociologist of labor. None of this spirited work occurs without risk, as I’ve found out through personal experience. Each time I finish a documentary film, one of my colleagues will invariably ask, “When are you going to stop and get back to doing real sociology?”

I have several thoughts about this:

1. I think it is helpful (and perhaps unusual) to see this piece at Slate.com rather than in an academic journal. At the same time, is this only possible for an academic like Venkatesh who has a best-selling popular book (Gang Leader For a Day) and is also tied to the Freakonomics crowd?

2. Venkatesh seems to be bringing up two issues.

a. The first issue is one of direction: what are the main issues or areas in which sociology could substantially contribute to society? If some of the issues of the early days such as race (still an issue, though Anderson’s data suggests it exists in different forms) and urbanization (generally settled in favor of suburbanization in America) are no longer that noteworthy, what is next? Consumerism? Gender? Inequality between the rich and poor? Exposing the contradictions still present in society (Venkatesh’s conclusion)?

This is not a new issue. Isn’t this what public sociology was supposed to solve? There also has been some talk about fragmentation within the discipline and whether sociology has a core. Additionally, there is occasional conversation about why sociology doesn’t seem to get the same kind of public or policy attention as other fields.

b. The second issue is one of data. While both Anderson and Venkatesh are well-known for practicing urban ethnography (as Venkatesh notes, a tradition going back to the early 20th century work of the Chicago School), Venkatesh notes that even Anderson had to move on to a different technique (interviewing) to find the new story. More broadly, Venkatesh places this change within a larger battle between quantitative and qualitative data where people on each side discuss what is “real” data.

This quantitative vs. qualitative debate has also been around for a while. One effort in recent years to address this moves to mixed methods where researchers use multiple sources and techniques to reach a conclusion. But it also seems that one common way to critique the work of others is to jump right to the methodology and suggest that it is limited to the point that one cannot come to much of a conclusion. Most (if not all) data is not perfect and there are often legitimate questions regarding validity and reliability but researchers are often working with the best available data given time and monetary constraints.

In the end, I’m not sure Venkatesh provides many answers. So, perhaps just like his own conclusions regarding Anderson’s book (“Better to point [these contradictions] out, however speculative and provisional the results may be, than to hide from the truth.”), we should be content just that these issues have been outlined.

(Here is an outsider’s take on this piece: “One thing that’s the matter with sociology is that like economics the discipline’s certitude of conclusion outran its methodological rigor. Being less charitable, sociology is just an ideology which occasionally dons the gown of dispassionate objectivity to maintain a semblance of respectability.” Ouch.)

Interpreting data regarding scientists and religion

In looking at some data regarding what scientists think about religion, a commentator offers this regarding interpreting sociological data:

The point about asking such questions is not because we know the answers but to emphasise that the interpretation of sociological data is a tricky business. From the perspective of science, ants and humans are far more complex than stars and rocks. A discussion of atheism and science in the US context leads us straight to a discussion of the structure of the American educational system, the role of elites, the present polarisation of the political electorate along religious faultlines, and much else besides…

The challenge then is to think hard about the complex data and not be too dogmatic about the interpretations.

When the phrase “tricky business” is used, it sounds like it is referring to the complex nature of the social world. In order to understand the relationship between science and religion, one must account for a variety of possible factors. It is one thing to say that there are multiple possible interpretations of the same data, another to say that some twist data to support their personal interpretations, and another to suggest that we can get to a correct or right interpretation if we properly account for complexity.

While this commentary is ultimately about using caution when interpreting statistics regarding the religious beliefs of scientists, it also is a little summary of social science research regarding the religious beliefs of scientists. The 2010 study Science vs. Religion is discussed as well as a few other works.

The evolving American Dream: more dense but still private

I’ve written about several aspects of the American Dream including unhappiness and how the American Dream might now be about perfection rather than acquiring goods or status. One key aspect of this Dream is housing, often viewed as a single-family house in a suburb. A new report from the National Association of Realtors suggests homebuyers now have some new preferences:

The 2011 Community Preference Survey reveals that, ideally, most Americans would like to live in walkable communities where shops, restaurants, and local businesses are within an easy stroll from their homes and their jobs are a short commute away; as long as those communities can also provide privacy from neighbors and detached, single-family homes. If this ideal is not possible, most prioritize shorter commutes and single-family homes above other considerations.

1. The economy has had a substantial impact on attitudes toward housing and communities…

2. Overall, Americans’ ideal communities have a mix of houses, places to walk, and amenities within an easy walk or close drive…

3. Desire for privacy is a top consideration in deciding where to live…

4. But, having a reasonable commute can temper desire for more space…

5. Community characteristics are more important than size of home…

6. Improving existing communities preferred over building new roads and developments…

7. Major differences in community preferences of various types of Americans…

All of these points are from the executive summary which also has some key percentages for each point.

The results of this survey seem similar to a recent report (see here) earlier this year from the National Association of Home Builders that suggested Generation Y wants more urban settings and more social (and smaller?) homes. In the long run, it remains to be seen whether these changes are broad cultural changes, generational changes (driven by younger generations), or opinions changed primarily by recent economic conditions.

Richard Florida sums up the report this way:

We’ve come to a crossroads that neither dyed-in-the-wool sprawl advocates nor crunchy urbanists dreamed of two decades ago, in which the choice isn’t between urban and suburban but between neighborhood and subdivision. A great neighborhood is a great neighborhood whether it’s in the city or the suburbs. It’s not an either/or, between crowded apartments or Cape Cods on cul de sacs, it’s more of a blend. Developers and planners take note: there is a potentially enormous market in cities for narrow single-family houses on small lots, like you see in places like Santa Monica and Venice. And as I wrote in The Wall Street Journal not too long ago, there are countless ways that our suburbs can be densified and reinvigorated. The American Dream hasn’t died–it just looks a lot different than it did in the 1950s. It looks a lot different than it did a decade ago.

So this report may not really be a repudiation of the suburbs but rather a new vision for suburbia: private yet dense (with still a clear 80% preference for single-family homes) and with neighborhood amenities. I am a little surprised that there aren’t more specific questions about preferred housing size or housing costs. Additionally, the survey seems set up to ask a lot of questions about smart growth with little explanation why this was the main focus.

(A side note: the study was a web survey:

The 2011 BRS/NAR Community Preference Survey is a web-enabled survey of adults nationwide using the Knowledge Networks panel. Knowledge Networks uses probability methods to recruit its panel, allowing results to be generalized to the population of adults in the U.S. A total of 2,071 questionnaires were completed from February 15 to 24, 2011. The data have been weighted by gender, age, race, region, metropolitan status, and Internet access. The margin of sampling error for the sample of 2,071 is plus or minus 2.2 percentage points at the 95% level of confidence. A detailed methodology can be found in Appendix A.

Knowledge Networks (KN) is a firm that gets around some of the common problems of web surveys (typically having to do with having a representative sample) by having representative panels who take web surveys. In order to get a representative sample, KN employs this technique:  “Since almost three in ten U.S. households do not have home Internet access, we supply these households a free netbook computer and Internet service.”)
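
For reference, the margin of error quoted in the methodology follows from the standard formula for a sample proportion at 95% confidence. A sketch, assuming simple random sampling and the worst case p = 0.5 (the survey’s weighting would change this slightly):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a sample proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(2071)
print(f"+/- {moe * 100:.1f} percentage points")  # matches the reported 2.2
```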

Pew using word frequencies to describe public’s opinion of budget negotiations

In the wake of the standoff over a federal government shutdown last week, Pew conducted a poll of Americans regarding their opinions on this event. One of the key pieces of data that Pew is reporting is a one-word opinion of the proceedings:

The public has an overwhelmingly negative reaction to the budget negotiations that narrowly avoided a government shutdown. A weekend survey by the Pew Research Center for the People & the Press and the Washington Post finds that “ridiculous” is the word used most frequently to describe the budget negotiations [29 respondents], followed by “disgusting,” [22 respondents] “frustrating,” [14 respondents] “messy,” [14 respondents] “disappointing” [13 respondents] and “stupid.” [13 respondents]

Overall, 69% of respondents use negative terms to describe the budget talks, while just 3% use positive words; 16% use neutral words to characterize their impressions of the negotiations. Large majorities of independents (74%), Democrats (69%) and Republicans (65%) offer negative terms to describe the negotiations.

The full survey was conducted April 7-10 among 1,004 adults; people were asked their impressions of the budget talks in interviews conducted April 9-10, following the April 8 agreement that averted a government shutdown.

I would be hesitant about leading off an article or headline (“Budget Negotiations in a Word – ‘Ridiculous’”) with these word frequencies since they generally were used by few respondents: the most common response, “ridiculous,” was given by only 2.9% of the survey respondents (based on the figures here of 1,004 total respondents). I think the better figures to use would be the broader ones about negative responses, where 69% used negative terms and a majority of all political stripes used a negative descriptor.

You also have to dig into the complete report for some more information. Here is the exact wording of the question:

PEW.2A If you had to use one single word to describe your impression of the budget negotiations in Washington, what would that one word be? [IF “DON’T KNOW” PROBE ONCE: It can be anything, just the first word that comes to mind…] [OPEN END: ENTER VERBATIM RESPONSE]

Additionally, the full report says that this descriptor question was asked of only 427 respondents on April 9-10 (so my above percentage should be altered: it should be 29/427 = 6.8%). So this is a smaller sample answering this particular question; how generalizable are the results? And the most common response to this question is the “other” category with 202 respondents. Presumably, the “others” are mostly negative since we are told 69% use negative terms. (As a side note, why not separate out the “don’t knows” and “refused”? There are 45 people in this category but these seem like different answers.)
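
The arithmetic behind the correction is simple enough to sketch, using the counts reported above:

```python
# Counts from the Pew report as discussed above.
ridiculous = 29
full_sample = 1004            # all April 7-10 respondents
word_question_sample = 427    # subset actually asked the one-word question

naive = ridiculous / full_sample
corrected = ridiculous / word_question_sample
print(f"Against the full sample: {naive:.1%}")           # 2.9%
print(f"Against those actually asked: {corrected:.1%}")  # 6.8%
```

Either way, the headline word was offered by a small minority; dividing by the right denominator just makes the minority slightly less small.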

One additional thought I have: at least this wasn’t put into a word cloud in order to display the data.

Sir James Dyson discusses the value of failure

Sir James Dyson, noted inventor of the Dyson vacuum cleaners, discusses how failure is necessary on the path to innovation:

It’s time to redefine the meaning of the word “failure.” On the road to invention, failures are just problems that have yet to be solved…

From cardboard and duct tape to ABS polycarbonate, it took 5,127 prototypes and 15 years to get it right. And, even then there was more work to be done. My first vacuum, DC01, went to market in 1993. We’re up to DC35 now, having improved with each iteration. More efficiency, faster motors, new materials…

The ability to learn from mistakes — trial and error — is a valuable skill we learn early on. Recent studies show that encouraging children to learn new things on their own fosters creativity. Direct instruction leads to children being less curious and less likely to discover new things.

Unfortunately, society doesn’t always look kindly on failure. Punishing mistakes doesn’t lead to better solutions or faster results. It stifles invention.

If the American Dream is now about attaining perfection, where is there room for failure? Dyson goes on to talk about how education might be changed to incorporate more room for failure but getting to the point where the broader society would be more accepting of failure is another matter.

I wonder how much this idea about innovation and failure could be tied to issues regarding publishing “negative findings” in academia.

If you want peace, you should head to Maine

The Institute for Economics and Peace has released its rankings of the most peaceful states in the United States and Maine tops the list. Here is some more information on this ranking:

The index, which defines peace as “the absence of violence,” looks at a set of five indicators, including homicide rates, violent crimes, percentage of the population in jail, number of police officers and availability of small arms (per 100,000 people) to rank the states. The data are drawn from the Bureau of Justice Statistics, FBI and Centers for Disease Control and Prevention.

On that basis, the institute finds that peace in the USA improved by 8% from 1995 to 2009.

It notes a significant correlation between a state’s level of peace and its economic opportunity, education and health but finds peacefulness is politically neutral — neither Republican nor Democratic states have an advantage.

Maine was ranked first overall because it topped the list of states on three of the five USPI indicators: number of violent crimes, number of police officers and incarceration rate.

There is some interesting regional variation, with the Northeast generally being more peaceful and the South being less peaceful. I’m sure there are a number of commentators and sociologists who could comment on these findings about the South.

But, like many such rankings (see a recent example here), I’m sure people would ask whether these measures actually get at the presence or absence of violence. The percentage of the population in jail could be related to violence but there are plenty of other ways to end up in jail. The number of police officers could be related to violence but it could also be linked to funding and perceptions about crime. In terms of the availability of small arms, does this necessarily lead to violence?

Using these measures seems linked to how this organization views peace. According to the full report (page 8 of the PDF), “The methodological framework was based on envisaging a society that is perfectly at peace; a society where there is no violence, no police and no one in jail.” Here is the explanation about using the measure of small arms (also page 8 of the PDF): “Additionally, this logic also applies to small arms: the USPI does not make judgments about appropriate levels of small arms in society but rather considers their prevalence a reflection of the need for self-defense and a potential to generate violence.”

I don’t study in this area, so it is interesting to read about how some of these things can even be measured. Regarding getting a measure of small arms availability (page 10 of the PDF):

Although the U.S. has excellent data for many statistics, there is no reliable data on small arms availability, small arms ownership, or small arms sales within the U.S. or within the states of the U.S. An accurate measure of gun prevalence cannot be calculated from administrative records alone. For this reason many studies on gun prevalence use a quantitative proxy. The proxy used in the USPI is: firearm suicides as a percentage of total suicides (FS/S). As this indicator varied significantly from year to year for some states, a five year moving average was used in order to smooth out the variance. For example, the figure used for Alabama for 2008 was an average of FS/S for 2003-2007. More detail on why this proxy was chosen is supplied in Appendix B to this report.
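
The smoothing described here is a plain five-year trailing average, ending the year before the figure being reported. A sketch with made-up FS/S values (the report’s actual state figures are not reproduced here):

```python
def trailing_average(series, year, window=5):
    """Average of the `window` years ending the year BEFORE `year`,
    mirroring the report's use of 2003-2007 data for the 2008 figure."""
    values = [series[y] for y in range(year - window, year)]
    return sum(values) / window

# Hypothetical firearm-suicide shares (FS/S) for one state, illustrative only.
fs_s = {2003: 0.62, 2004: 0.58, 2005: 0.65, 2006: 0.60, 2007: 0.55}
print(round(trailing_average(fs_s, 2008), 3))  # 0.6
```

The averaging trades responsiveness for stability: a one-year spike in any state moves the indicator by only a fifth of its size.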

The availability of small arms also had the lowest weighting in the rankings.

Debating the decline of religion in America

For several decades now, sociologists have upheld the idea that, when compared to other industrialized nations, the United States is uniquely religious. An argument for secularization, which gained prominence in the 1960s, was eventually refuted as Americans showed a remarkable religious vitality.

But some argue that new data about religion in America suggests that religion may indeed be on the decline. In a new book titled The Decline of American Religion, sociologist Mark Chaves looks at some of the evidence:

His conclusion: “The burden of proof has shifted to those who want to claim that American religiosity is not declining.”…

“…[E]very indicator of traditional religiosity is either stable or declining. This is why I think it is reasonable to conclude that American religion has in fact declined in recent decades — slowly, but unmistakably,” Chaves said.

Those indicators of decline, taken from General Social Survey data, include:

  • From 1990 to 2008, the percent of people who never attend religious services rose from 13 percent to 22 percent.
  • Just 45 percent of adult respondents born after 1970 reported growing up with religiously active fathers.
  • In the 1960s, about 1 percent of college freshmen expected to become clergy. Now, about three-tenths of a percent have the same expectation.
  • The percentage of people saying they have a great deal of confidence in leaders of religious institutions has declined from about 35 percent in the 1970s to about 25 percent today.

This particular data would seem to suggest a very slow decline – though Chaves himself seems careful to note that the data could also be interpreted as showing stability.

Sociologist Bradley Wright looks at some similar data in his book Christians Are Hate-Filled Hypocrites (read a description of the argument here) and comes to a slightly different conclusion. Wright suggests some of the people who now identify as non-religious simply don’t like to identify with organized religion and that many of them still say they have religious beliefs and practices. Wright also briefly argues that the number of committed religious people may not have changed; rather, “cultural” Christians may be those who are now identifying as non-religious.

Time will help settle this debate: in the United States, will religion continue to decline in future years and exactly what shape will this decline take? In the meantime, we will have to see how Chaves’ claim plays out that the burden of proof now rests with those who argue there is not a decline.