Facebook runs 2010 voting experiment with over 61 million users

Experiments don’t just take place in laboratories; they also happen on Facebook.

On November 2nd, 2010, more than 61 million adults visited Facebook’s website, and every single one of them unwittingly took part in a massive experiment. It was a randomised controlled trial, of the sort used to conclusively test the worth of new medicines. But rather than drugs or vaccines, this trial looked at the effectiveness of political messages, and the influence of our friends, in swaying our actions. And unlike most medical trials, this one had a sample size in the millions.

It was the day of the US congressional elections. The vast majority of the users aged 18 and over (98 percent of them) saw a “social message” at the top of their News Feed, encouraging them to vote. It gave them a link to local polling places, and a clickable button that said “I voted”. They could see on a counter how many people had clicked the button, and see which of their friends had done so through a set of randomly selected profile pictures.

But the remaining 2 percent saw something different, thanks to a team of scientists, led by James Fowler from the University of California, San Diego. Half of them saw the same box, wording, button and counter, but without the pictures of their friends—this was the “informational message” group. The other half saw nothing—they were the “no message” group.

By comparing the three groups, Fowler’s team showed that the messages mobilised people to express their desire to vote by clicking the button, and the social ones even spurred some to vote. These effects rippled through the network, affecting not just friends, but friends of friends. By linking the accounts to actual voting records, Fowler estimated that tens of thousands of votes eventually cast during the election were generated by this single Facebook message.
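To make the design concrete, here is a rough, scaled-down sketch of a three-arm comparison like the one described above. The group shares mirror the article (98 percent, 1 percent, 1 percent), but the turnout probabilities are invented for illustration; they are not the study’s estimates.

```python
# A hypothetical sketch of a three-arm randomized comparison.
# Group shares follow the article (98% / 1% / 1%); turnout probabilities
# are invented for illustration and are NOT the study's estimates.
import random

random.seed(0)
N = 100_000  # a much smaller user pool than Facebook's 61 million

# hypothetical turnout probability in each arm
turnout_prob = {"social": 0.400, "informational": 0.398, "none": 0.395}

results = {"social": [], "informational": [], "none": []}
for _ in range(N):
    r = random.random()
    arm = "social" if r < 0.98 else ("informational" if r < 0.99 else "none")
    results[arm].append(random.random() < turnout_prob[arm])

for arm, votes in results.items():
    rate = sum(votes) / len(votes)
    print(f"{arm:>13}: n={len(votes):>6}  turnout={rate:.3f}")

# The estimated effect is the difference in turnout between arms; at the
# scale of tens of millions of users, even a fraction of a percentage point
# adds up to tens of thousands of votes.
```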

The effects appear to be small but could still be influential when multiplied through large social networks.

I suspect we’ll continue to see more and more of this in the future. Platforms like Facebook, Google, and Amazon have access to millions of users and can run experiments that barely change a user’s experience of the site.

Argument: we could have skewed survey results because we ignore prisoners

Several sociologists suggest American survey results may be off because they tend to ignore prisoners:

“We’re missing 1% of the population,” said Becky Pettit, a University of Washington sociologist and author of the book, “Invisible Men.” “People might say, ‘That’s not a big deal.’” But it is for some groups, she writes — particularly young black men. And for young black men, especially those without a high-school diploma, official statistics paint a rosier picture than reality on factors such as employment and voter turnout.

“Because many surveys skip institutionalized populations, and because we incarcerate lots of people, especially young black men with low levels of education, certain statistics can look rosier than if we included” prisoners in surveys, said Jason Schnittker, a sociologist at the University of Pennsylvania. “Whether you regard the impact as ‘massive’ depends on your perspective. The problem of incarceration tends to get swept under the rug in lots of different ways, rendering the issue invisible.”

Further commentary in the article suggests that sociologists and others, like the Census Bureau, are split on whether including prisoners in surveys is necessary.

Based on this discussion, I wonder if there is another issue: would picking up that missing 1% of the population improve survey results enough to significantly affect findings and policy decisions? If not, some would conclude it is not worth the effort. But Pettit argues some statistics could change a lot:

Among the generally accepted ideas about African-American young-male progress over the last three decades that Becky Pettit, a University of Washington sociologist, questions in her book “Invisible Men”: that the high-school dropout rate has dropped precipitously; that employment rates for young high-school dropouts have stopped falling; and that the voter-turnout rate has gone up.

For example, without adjusting for prisoners, the high-school completion gap between white and black men has fallen by more than 50% since 1980, says Prof. Pettit. After adjusting, she says, the gap has barely closed and has been constant since the late 1980s. “Given the data available, I’m very confident that if we include inmates” in more surveys, “the trends are quite different than we would otherwise have known,” she says…

For instance, commonly accepted numbers show that the turnout rate among black male high-school dropouts age 20 to 34 surged between 1980 and 2008, to the point where about one in three were voting in presidential races. Prof. Pettit says her research indicates that instead the rate was flat, at around one in five, even after the surge in interest in voting among many young black Americans with Barack Obama in the 2008 race.
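To see why leaving out the incarcerated can flatter a statistic, here is a small illustration with invented numbers (these are not Pettit’s figures):

```python
# Hypothetical illustration of how excluding the incarcerated can inflate a
# rate. All numbers are invented for the example, not Pettit's estimates.

def completion_rate(completers, total):
    return completers / total

# Household survey: counts only the non-institutionalized population
survey_pop = 1_000_000
survey_completers = 800_000          # 80% appear to have finished high school

# Incarcerated men missed by the survey, with a much lower completion rate
incarcerated = 100_000
incarcerated_completers = 40_000     # 40%

unadjusted = completion_rate(survey_completers, survey_pop)
adjusted = completion_rate(survey_completers + incarcerated_completers,
                           survey_pop + incarcerated)

print(f"Unadjusted completion rate: {unadjusted:.1%}")  # 80.0%
print(f"Adjusted completion rate:   {adjusted:.1%}")    # about 76.4%
```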

It will be interesting to see how this plays out.

Disconnect between how much Americans say they give to church and charity versus what they actually give

Research drawing on recent data on charitable and religious giving suggests an interesting disconnect: some people say they give more than they actually do.

A quarter of respondents in a new national study said they tithed 10 percent of their income to charity. But when their donations were checked against income figures, only 3 percent of the group gave more than 5 percent to charity…

But other figures from the Science of Generosity Survey and the 2010 General Social Survey indicate how little large numbers of people actually give to charity.

The generosity survey found just 57 percent of respondents gave more than $25 in the past year to charity; the General Social Survey found 77 percent donated more than $25, Price and Smith reported in their presentation on “Religion and Monetary Donations: We All Give Less Than We Think.”

In one indication of the gap between perception and reality, 10 percent of the respondents to the generosity survey reported tithing 10 percent of their income to charity although their records showed they gave $200 or less.

Two thoughts, more about methodological issues than the subject at hand:

1. What people say on surveys or in interviews doesn’t always match what they actually do. There are a variety of reasons for this, not all malicious or intentional. But, this leads me to thought #2…

2. I like the way some of these studies make use of multiple sources of data to find the disconnect between what people say and what they do. When looking at an important area of social life, like altruism, having multiple sources of data goes a long way. Measuring attitudes is often important in and of itself, but we also need data on practices and behaviors (a toy example of such a cross-check is sketched below).
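Here is what that kind of cross-check can look like when self-reports are matched to a second data source. The records and field names are invented; the actual studies linked survey answers to income and donation data in a more involved way.

```python
# Hypothetical sketch of checking self-reports against a second data source.
# All records are invented for illustration.

respondents = [
    # (id, says "I tithe 10%", annual income, recorded donations)
    (1, True,  50_000, 5_200),
    (2, True,  60_000,   180),
    (3, False, 45_000, 1_000),
    (4, True,  80_000,   900),
]

for rid, says_tithe, income, gave in respondents:
    actual_share = gave / income
    if says_tithe and actual_share < 0.10:
        print(f"Respondent {rid}: claims to tithe but gave {actual_share:.1%}")
```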


Analyst looks at “racial breakdown of [presidential election] polls”

An analyst for RealClearPolitics takes a look at possible issues with the racial breakdown in the samples of presidential election polls. A few of the issues:

First, as Chait repeatedly concedes, we don’t know what the ultimate electorate will look like this November. That really should be the end of the argument — if we don’t know what the racial breakdown is going to be, it’s hard to criticize the pollsters for under-sampling minorities. After all, almost all pollsters weight their base sample of adults to CPS (current population survey) estimates to ensure the base sample reflects the actual population; after that, the data simply are what they are.

It’s true that the minority share of the electorate increased every year from 1996 through 2008. But there’s a reason that 1996 is always used as a start date: After declining every election from 1980 through 1988, the white share of the vote suddenly ticked up two points in 1992. In other words, these things aren’t one-way ratchets (and while there is no H. Ross Perot this year, the underlying white working-class angst that propelled his candidacy is very much present, as writers on the left repeatedly have observed)…

“The U.S. Census Bureau allows for multiple responses when it asks respondents what race they are, and Gallup attempts to replicate the Census in that respect. While most pollsters ask two separate questions about race and Hispanic ancestry, Gallup goes a step further, asking five separate questions about race. They ask respondents to answer whether or not they consider themselves White; Black or African American; Asian; Native American or Alaska Native; and Native Hawaiian or Pacific Islander.”

In other words, how you ask the question could impact how people self-identify with regard to race and ethnicity, which could in turn affect how your weighted data look. This is a polling issue that will likely become more significant as the nation grows more diverse, and more multi-racial.
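For readers unfamiliar with the weighting the excerpt mentions, here is a bare-bones sketch of post-stratification: each respondent gets a weight equal to their group’s population share divided by that group’s share of the sample. The shares below are invented for illustration, not actual CPS figures.

```python
# Minimal sketch of post-stratification weighting. Each respondent's weight
# is (population share of their group) / (sample share of their group).
# Shares are invented for illustration, not actual CPS figures.

sample = {"white": 800, "black": 80, "hispanic": 70, "other": 50}   # respondents
population_share = {"white": 0.66, "black": 0.12, "hispanic": 0.15, "other": 0.07}

n = sum(sample.values())
weights = {g: population_share[g] / (sample[g] / n) for g in sample}

for g, w in weights.items():
    print(f"{g:>8}: sample share {sample[g]/n:.2%}, weight {w:.2f}")

# A weighted estimate then counts each respondent in proportion to their
# group's weight, so underrepresented groups count for more. Note that the
# weights depend entirely on how respondents were classified in the first
# place, which is the measurement issue the excerpt raises.
```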

Trying to figure out who exactly is going to vote is a tricky proposition and it is little surprise that different polling organizations have slightly different figures.

I hope people don’t see stories like this and conclude that polls can’t be trusted after all. Polling is not an exact science; every poll carries a margin of error. However, polling is so widely used because it is incredibly difficult to capture information about whole populations. Even one of the most comprehensive surveys we have, the US Census, was only able to get about 70-75% cooperation, and that was with an enormous outlay of money and workers. Websites like RealClearPolitics are helpful here because you can see averages of the major polls, which can smooth out some of their differences.

A final note: this is another reminder that measuring race and ethnicity is difficult. As noted above, the Census Bureau and some of these polling organizations use different measures and therefore get different results. Of course, because race and ethnicity are fluid, the measures have to change over time.

How to define a good college town

Livability recently released a list of the Top 10 college towns and here is some discussion of how they defined such communities:

And for starters, we need a basic definition of a college town. “True college towns are places where the identity of the city is both shaped by and complementary to the presence of its university, creating an environment enjoyable to all residents, whether they are enrolled in classes or not,” Livability’s editors write. “They’re true melting pots, where young minds meet old traditions, and political, social, and cultural ideas of all kinds are welcomed.”

That’s pretty broad. But the editors go on: In a college town, “the college is not only a major employer, but also the reason for more plentiful shops, restaurants, and entertainment businesses.” And it has to look like a college town, too: “It doesn’t seem right to call a place a college town if you can’t tell classes are in session with a quick glance at the mix of people on a busy sidewalk.”…

For example, what would Baltimore be without the Johns Hopkins University? The economic equivalent of a smoldering hole in the ground, that’s what. Or consider Rochester or Syracuse, N.Y., from the same perspective. And what about Boston and Philadelphia—are they “college towns”?

As you’ll see from the list below, most of Livability’s “best” college towns are relatively small, remote places, based on colleges that are highly ranked by the Princeton Review. Livability, true to its name, also factored in cost of living and walkability. (College towns, by their nature, should be among the most pedestrian-friendly communities America has left.)

This sounds like a very traditional use of the term “college town”: places that are heavily dependent on the university or college and that are quaint yet cosmopolitan enough. I like the contrast with the big cities which often have a variety of colleges and amenities that cater to college students, faculty, and staff.

This leads to a few thoughts:

1. How many college students today pick a college based on whether it is in a “college town”? The surrounding atmosphere must matter at least somewhat.

2. How have college towns been affected by the recent economic downturn and its effects on college campuses? Let’s say the college bubble bursts, as some are predicting: how badly hit will college towns be? Another way to put it: how resilient would these communities be if the college or university started struggling? Or is this another example of what can happen to communities that rely too heavily on one industry?

3. Why not include an attitudinal component, asking local residents how much they like, approve of, or even know about what is going on at the college? Town-and-gown relationships can be difficult, and simply because a place is a “college town” doesn’t mean there isn’t some tension.

4. It would be interesting to trace the history of college towns and their appeal. Historically, were there advantages to having colleges in communities that were heavily dependent on them?

5. Just because a place looks like a place where learning should happen (and this look seems very constructed), does it actually improve learning?

Uptick in sociology job market?

Inside Higher Ed summarizes an ASA report suggesting that the number of open jobs in 2011 was near 2008 levels:

In 2011, the number of faculty jobs posted either for assistant professors or positions for which any faculty rank is possible was just 4 percent below the level in 2008, the year in which the economic downturn hit in the fall. And so many of the openings announced in 2008 were canceled that it is possible there were more actual openings in 2011. Those are among the results in a new job market report issued by the American Sociological Association.

The number of faculty jobs in 2009 fell 35 percent, and the 2010 total was 14 percent below the 2008 level, so the new figures represent a significant rebound in job openings.

The data are based on openings listed with the ASA. Not all departments list positions there, so the totals don’t reflect every opening, but sociologists say that the ASA reports accurately reflect trends in the discipline, even considering positions listed elsewhere.

The top 5 specialties in demand: social control/law/crime/deviance, open (any specialty), race and ethnicity, medicine and health, and work/economy/organizations. The bottom 5 (last being the lowest): comparative and historical approaches, sociology of culture, education, qualitative approaches, and application and practice.

Overall, this seems like good news, though it will likely take some time to work through the backlog of candidates who couldn’t find jobs in recent years.

Just a thought: I wonder what exactly the job figures from year to year tell us. Overall, is there a better way to get at whether the discipline is expanding or doing well? Is it better for big departments to get bigger? For new schools to add sociology undergraduate and graduate programs? For new graduate programs to be launched? For existing faculty to get more recognition or better salaries? To compare the growth of sociology to other disciplines?

Sociologist: “one-year change in test results doesn’t make a trend”

A sociologist provides some insights into how firms “norm” test scores from year to year and what this means for how to interpret the results:

The most challenging part of this process, though, is trying to place this year’s test results on the same scale as last year’s results, so that a score of 650 on this year’s test represents the same level of performance as a score of 650 on last year’s test. It’s this process of equating the tests from one year to the next which allows us to judge whether scores this year went up, declined or stayed the same. But it’s not straightforward, because the test questions change from one year to the next, and even the format and content coverage of the test may change.

Different test companies even have different computer programs and statistical techniques to estimate a student’s score and, hence, the overall picture of how a student, school or state is performing. (Teachers too, but that’s a subject for another day.)

All of these variables – different test questions from year to year; variations in test length, difficulty and content coverage; and different statistical procedures to calculate the scores – introduce some uncertainty about what the “true” results are…

In testing, every year is like changing labs, in somewhat unpredictable ways, even if a state hires the same testing contractor from one year to the next. For this reason, I urge readers to not react too strongly to changes from last year to this year, or to consider them a referendum on whether a particular set of education policies – or worse, a particular initiative – is working.

One-year changes have many uncertainties built into them; if there’s a real positive trend, it will persist over a period of several years. Schooling is a long-term process, the collective and sustained work of students, teachers and administrators; and there are few “silver bullets” that can be counted on to elevate scores over the period of a single school year.
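As a very simplified illustration of the equating problem described above, here is linear (mean-sigma) equating, one of the most basic techniques in this family. Real testing programs use far more elaborate models, and the scores below are invented.

```python
# A sketch of linear (mean-sigma) equating: place this year's scores on last
# year's scale so both forms share a mean and standard deviation on a common
# (anchor) group. Scores are invented; operational equating is more complex.
from statistics import mean, pstdev

last_year = [610, 640, 655, 670, 700]    # anchor-group scores, old form
this_year = [598, 632, 645, 661, 690]    # anchor-group scores, new form

mu_old, sd_old = mean(last_year), pstdev(last_year)
mu_new, sd_new = mean(this_year), pstdev(this_year)

def equate(score_new):
    """Place a new-form score on last year's scale."""
    return mu_old + (sd_old / sd_new) * (score_new - mu_new)

print(f"A 650 on this year's form is roughly a {equate(650):.0f} on last year's scale")
```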

Overall, this piece gives us some important things to remember: one data point is hard to put into context; two data points let you draw a trend line, but not much more; and more data points give a better indication of what is happening over time. However, just having statistics isn’t enough; we also need to consider the reliability and validity of the data. Politicians and administrators seem to like test scores because they offer concrete numbers that can help them point to progress or argue that changes need to be made. Yet just because these are numbers doesn’t mean there isn’t a process behind them, or that we don’t need to understand exactly what the numbers involve.

Questions about a study of the top Chicago commuter suburbs

The Chaddick Institute for Metropolitan Development at DePaul just released a new study that identifies the “top [20] transit suburbs of metropolitan Chicago.” Here is the top 10, starting with number one: LaGrange, Wilmette, Arlington Heights, Glenview, Elmhurst, Wheaton, Downers Grove, Naperville, Des Plaines, and Mount Prospect. Here are the criteria used to identify these suburbs:

The DePaul University team considered 45 measurable factors to rank the best transit suburbs based on their:

1. Station buildings and platforms;

2. Station grounds and parking;

3. Walkable downtown amenities adjacent to the station; and

4. Degree of community connectivity to public transportation, as measured by the use of commuter rail services.

A couple of things strike me as interesting:

1. These tend to be wealthier suburbs but not the wealthiest. On one hand, this seems strange as living in a nicer place doesn’t necessarily translate into nicer mass transit facilities (particularly if more people can afford to drive). On the other hand, having a thriving, walkable downtown nearby is probably linked to having the money to make that happen.

2. There are several other important factors that influence which suburbs made the list:

Communities in the northern and northwestern parts of the region tended to outperform those in the southern parts, with much of the differences due to their published Walk Scores. Similarly, communities on the outer periphery of the region tend to have lower scores due to the tendency for the density of development to decline as one moves farther from downtown Chicago. As a result, both Walk Scores and connectivity to transit tended to be lower in far-out suburbs than closer-in ones.

It might be more interesting here to pick out suburbs that buck these trends and have truly put a premium on attractive transportation options. For example, can a suburb 35 miles out of Chicago put together mass transit facilities that truly draw new residents, or does the distance simply matter too much?

3. I’m not sure why they didn’t include “city suburbs.” Here is the explanation from the full report (p.11 of the PDF):

All suburbs with stations on metropolitan Chicago’s commuter-rail system, whether they are located in Illinois or Indiana, are considered for analysis except those classified as city suburbs, such as Evanston, Forest Park, and Oak Park, which have CTA rapid transit service to their downtown districts. Gary, Hammond, and Whiting, Indiana, also are generally considered cities or city suburbs rather than conventional suburbs, because all of these communities have distinct urban qualities. To assure meaningful and fair comparisons, these communities were not included in the study.

Hammond is not a “conventional suburb”? CTA service isn’t a plus over Metra commuter rail service?

4. The included suburbs had to meet three criteria (p.11 of the PDF):

1) commuter-rail service available seven days a week, with at least 14 inbound departures on weekdays, including some express trains;
2) at least 150 people who walk or bike to the train daily; and
3) a Walk Score of at least 65 on a 100-point scale at its primary downtown station (putting it near the middle of the category, described as “somewhat walkable”).

These are fairly strict criteria, so not that many Chicago suburbs qualified for the study (p.11 of the PDF):

Twenty-five communities, all on the Metra system, met these three criteria (Figure 2). All were adjacent to downtown districts that support a transit-oriented lifestyle and tend to have a transit culture that many find appealing. Numerous communities, such as Buffalo Grove, Lockport, and Orland Park, were not eligible because they do not currently meet the first criteria, relating to train frequency. Some smaller suburbs, such as Flossmoor, Kenilworth and Glencoe, while heavily oriented toward transit, lack diversified downtown amenities and the services of larger stations, and therefore did not have published Walk Scores above the minimum threshold of 65.

I can imagine what might happen: all suburbs in the top 20 are going to proclaim that they are a top-20 commuter suburb! But it was only out of 25…

5. There are some other intriguing methodological bits here. Stations earned points for having coffee available or displaying railroad heritage. Parking lot lighting was measured this way (p.24 of the PDF):

The illumination of the parking lot was evaluated using a standard light meter. Readings were collected during the late-evening hours between June 23 and July 5, 2012 at three locations in the main parking lots:
1) locations directly under light poles (which tend to be the best illuminated parts of the lots);
2) locations midway between the light poles (which tend to be among the most poorly illuminated parts of the lot); and
3) tangential locations, 20 and 25 feet perpendicular to the alignment of light poles and directly adjacent to the poles (in some cases, these areas having lighting provided from lamps on adjacent streets).

At least three readings were collected for category 1 and at least two readings were collected for categories two and three.

There is no widely accepted standard on parking lot lighting that balances aesthetics and security. Research suggests, however, that lighting of 35 or more lumens is preferable, but at a minimum, 10 lumens is necessary for proper pedestrian activity and safety. Scores for parking lot illumination were based on a relative scale, as noted below. In effect, the scale grades on a “curve”, resulting in a relatively equal distribution of high and low scores for each category. In several instances, Category 3 readings were not possible due to the configuration of the parking lot. In these instances, final scores were determined by averaging the Category 1 and 2 scores.
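As a rough sketch of the fallback described in that last paragraph, here is how a lighting score might average the category readings, dropping Category 3 when the lot layout made those readings impossible. The readings are invented, and this glosses over the report’s relative (“curved”) scaling.

```python
# Hypothetical sketch of the fallback described above: a station's lighting
# score averages the per-category means, or just Categories 1 and 2 when the
# lot layout made Category 3 readings impossible. Readings are invented.

def lighting_score(cat1, cat2, cat3=None):
    """Average the category means; omit Category 3 if no readings exist."""
    cats = [cat1, cat2] + ([cat3] if cat3 else [])
    means = [sum(c) / len(c) for c in cats]
    return sum(means) / len(means)

print(lighting_score([38, 41, 36], [12, 9], [22, 18]))   # all three categories
print(lighting_score([38, 41, 36], [12, 9]))             # no Category 3 readings
```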

I don’t see any evidence that commuters themselves were asked about the amenities, though there was some direct observation. Why not also get information directly from those who consistently use the facilities?

Overall, I’m not sure how useful this study really is. I can see how it might be utilized by some interested parties including people in real estate and planners but I don’t know that it really captures enough of the full commuting experience available to suburbanites in the Chicago suburbs.

Four tips for making a good infographic

The head of a new infographic website suggests four tips for making a good infographic:

1. Apply a journalist’s code of ethics

An infographic starts with a great data set. Even if you’re not a journalist — but an advertiser or independent contractor, say — you need to represent the data ethically in order to preserve your credibility with your audience. Don’t source from blogs. Don’t source from Wikipedia. Don’t misrepresent your data with images.

2. Find the story in the data

There’s a popular misconception that creating a great infographic just requires hiring a great graphic designer. But even the best designer can only do so much with poor material. Mapping out the key points in your narrative should be the first order of business. “The most accessible graphics we’ve ever done are the ones that tell a story. It should have an arc, a climax and a conclusion,” Langille says. When you find a great data set, mock up your visualization first and figure out what you want to say, before contacting a designer.

3. Make it mobile and personal

As the media becomes more sophisticated, designers are developing non-static infographics. An interactive infographic might seem pretty “sexy,” Langille says, but it’s much less shareable. A video infographic, on the other hand, is both interactive and easy to port from site to site. Another way to involve readers is to create a graphic that allows them to input and share their own information.

4. Don’t let the code out

One of the easiest ways to protect your work is to share it on a community site. Visual.ly offers Creative Commons licensing to users who upload a graphic to the site. When visitors who want to use the graphic grab embed code from the site, the embedded image automatically links back to its creator. Langille suggests adding branding to the bottom of your work and never releasing the actual source file — only the PNG, JPEG, or PDF. And what if your work goes viral without proper credit? For god’s sake, don’t be a pain and demand that the thieves take it down. “It’s better to let it go and ask for a link back and credits on the graphics,” Langille said.

The first two points apply to all charts and graphs: you need to have good and compelling data and then use the graphic to tell this story. Infographics should make the relevant data easier to understand than having someone read through denser text. An easy temptation is to try new ways of displaying data without thinking through whether they are easily readable.

It would be interesting to know whether infographics are actually more effective in conveying information to viewers. In other words, is a traditional bar graph made in Excel really worse in the basic task of sharing information than a snazzy infographic? I imagine websites and publications would rather have infographics because they look better and take advantage of newer tools but a better visual does not necessarily equal connecting more with viewers.

Side note: the “meta Infographic” at the beginning of this article and the “Most Popular Infographics You Can Find Around the Web” at the end are amusing.

What happens when you let Boston residents crowdsource neighborhood boundaries

Here is a fascinating online experiment: let residents of a city, in this case Boston, draw the neighborhood boundaries as they see them. Here are the conclusions of the effort thus far:

Although we talk a lot about boundaries, this post included, the maps here should also remind us that neighborhoods are not defined by their edges—essentially, what is outside the neighborhood—but rather by their contents. And it’s not just a collection of roads and things you see on a map; it’s about some shared history, activities, architecture, and culture. So while the neighborhood summaries above rely on edges to describe the maps, let’s also think about the areas represented by the shapes and what’s inside them. What are the characteristics of these areas? Why are they the shapes that they are? Why is consensus easy or difficult in different areas? What is the significance of the differences in opinion between residents of a neighborhood and people outside the neighborhood?

We’ll revisit those questions in further detail in future posts, and also generate maps of other facets of the data. Next up: areas of overlap between neighborhoods. Here we’ve looked neighborhood-by-neighborhood at how much people agree, so now let’s map those zones that exhibit disagreement. Meanwhile, thanks so much for all the submissions for this project; and if you haven’t drawn some neighborhoods, what’s your problem? Get on it!

This gets at a recurring issue for urban sociologists: how best to define communities or neighborhoods. The most practical option for data is to use Census geographies such as tracts, block groups, blocks, and perhaps ZIP codes. These data are collected regularly and in depth, and can be easily downloaded. However, such boundaries are crude approximations of culturally defined neighborhoods. People on the ground have little idea what Census tract they live in (though this is easy to figure out online).

So if Census definitions are not the best for the on-the-ground experience, what is left? This crowdsourcing project is a modern way of doing what some researchers have done: ask the residents themselves and also observe what happens. What streets are not crossed? Which features or landmarks define a neighborhood? Who “belongs” where? What are typical activities in different places? Of course, this is a much messier process than working with clearly defined and reliable Census data but it illustrates a key aspect about neighborhoods: they are continually changing and being redefined by their own residents and others.
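A side note on method: if you wanted to quantify how much two residents’ drawings of the “same” neighborhood agree, one simple approach is intersection-over-union of the two polygons. Here is a minimal sketch using the shapely library (my assumption; the project doesn’t say what tools it uses), with invented coordinates.

```python
# Minimal sketch of quantifying agreement between two drawn boundaries with
# intersection-over-union (IoU). Uses shapely, which is an assumption about
# tooling; coordinates are invented.
from shapely.geometry import Polygon

resident_a = Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])
resident_b = Polygon([(1, 0), (5, 0), (5, 3), (1, 3)])

overlap = resident_a.intersection(resident_b).area
union = resident_a.union(resident_b).area

# IoU of 1.0 means identical shapes; 0.0 means no overlap at all.
print(f"Agreement (IoU): {overlap / union:.2f}")
```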