Good data is foundational to doing good sociological work

I’ve had conversations in recent months with a few colleagues outside the discipline about debates within sociology over the work of ethnographers like Alice Goffman, Matt Desmond, and Sudhir Venkatesh. It is enlightening to hear how outsiders see the disagreements, and this has pushed me to consider more fully how I would explain the issues at hand. What follows is my one-paragraph response to what is at stake:

In the end, what separates the work of sociologists from that of perceptive non-academics or journalists? (An aside: many of my favorite journalists often operate like pop sociologists as they try to explain and not just describe social phenomena.) To me, it comes down to data and methods. This is why I enjoy teaching both our Statistics course and our Social Research course: undergraduates rarely come into them excited but they are foundational to who sociologists are. What we want is data that is (1) scientific – reliable and valid – and (2) generalizable – allowing us to see patterns across individuals and cases or settings. I don’t think it is a surprise that the three sociologists under fire above wrote ethnographies, where it is perhaps more difficult to fit the method under a scientific rubric. (I do think it can be done but it doesn’t always appear that way to outsiders or even some sociologists.) Sociology is unique in both its methodological pluralism – we do everything from ethnography to historical analysis to statistical models to lab or natural experiments to mass surveys – and its aim of finding causal explanations for phenomena rather than just describing what is happening. Ultimately, if you can’t trust a sociologist’s data, why bother considering their conclusions, and why would you prioritize their explanations over those of an astute person on the street?

Caveats: I know no data is perfect and sociologists are not in the business of “proving” things but rather we look for patterns. There is also plenty of disagreement within sociology about these issues. In a perfect world, we would have researchers using different methods to examine the same phenomena and develop a more holistic approach. I also don’t mean to exclude the role of theory in my description above; data has to be interpreted. But, if you don’t have good data to start with, the theories are abstractions.

Estimating big crowds accurately with weather balloons, bicycles, and counting

The best way to count large crowds – such as in Washington D.C. – may be by using a weather balloon and supplementing that data:

Well, technically, a “tethered aerostat.” Tethered because it is anchored to the ground, and aerostat because it will hold a static altitude in the air. A nine-lens camera is attached to its base, so it can capture the full 360-degree view of the proceedings. It will observe the entire Women’s March…

Their technique involves more than just the weather balloon. While the weather balloon records the events from above, Westergard and his team will bike or walk around the protest site. They’ll take note of how many people are taking cover under structures, like the massive elm trees on the Mall. Sometimes they’ll even lower the aerostat so that it can capture crowds in the shade. “At 400 feet, we’re looking under the trees. At 800 feet, you’re looking at the top of them,” he told me…

Once the data is collected, they return to their headquarters. Three days of work commences. First, they will measure the density of different parts of the crowd. They do this by counting heads in a specific area. “We sit there literally, head by head, going tick-tick-tick-tick-tick” with the images, he told me. “It’s painful, it’s long, but it’s far more accurate than these algorithms.”

Sometimes they outsource this task to Amazon’s Mechanical Turk service to increase their own accuracy: They ask a dozen strangers to count heads in a certain picture without telling them where the picture was taken.

Once they have this density map, they overlay it on a map of the topography. “If you have people surrounding the Washington Monument—which is on a moderately steep hill—and you look out at a crowd, you’re going to see more people because they’re tilted toward you,” he said. The computer model will correct for those kinds of inaccuracies.
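The density-counting steps described in the quote can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the firm's actual procedure: the zone layout, the areas, the sample counts, and the choice of the median to reconcile several independent counters are all hypothetical, and the topography correction is left out because the article does not say how it is computed.

```python
from statistics import median

def consensus_count(counts):
    """Reconcile several independent head counts of the same image sample.

    Using the median is an assumption; the article does not say how
    the counts from different people are combined.
    """
    return median(counts)

def estimate_crowd(zones):
    """Estimate crowd size as the sum of density * area over all zones."""
    total = 0.0
    for zone in zones:
        # People per square meter in the hand-counted sample.
        density = consensus_count(zone["sample_counts"]) / zone["sample_area_m2"]
        # Scale the sample density up to the full zone.
        total += density * zone["zone_area_m2"]
    return round(total)

# Hypothetical zones: a dense area near the stage and a sparser area under trees.
zones = [
    {"sample_counts": [48, 50, 52, 50], "sample_area_m2": 25.0, "zone_area_m2": 1000.0},
    {"sample_counts": [10, 12, 11], "sample_area_m2": 25.0, "zone_area_m2": 2000.0},
]

estimate = estimate_crowd(zones)
print(estimate)
```

Even in this toy version, the final number depends heavily on how the zones are drawn and which samples are counted, which is one reason the identity of the counter matters.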

See earlier posts (such as here and here) about counting crowds.

It is also interesting that this more accurate method is explained by the leader of a private firm: “Curt Westergard…is the president of Digital Design and Imaging Service based in Falls Church, Virginia, and he stressed that his company’s methods were ‘at the very top of the accuracy and ethical side.’” He is working for those who want to hire him, something that could be worthwhile for the article to explore. Are they impartial observers who are doing this work for science? In other words, crowd counting could be influenced by who exactly is doing the counting. Parties who often make the counts – police, local officials, the media – have vested interests. For example, take the case of the rally for the Cubs World Series victory.

Of course, as is noted in this article, the numbers themselves are often politicized. What will be the official count accepted by posterity for a Trump inauguration that likely stirs up emotions for everyone?

Chicago sets new record for tourism

The city of Chicago may have problems but the number of tourists continues to increase:

An estimated 54.1 million visitors came to the city in 2016, up 2.9 percent from the previous year’s record-setting count. The increase marks a step towards reaching Mayor Rahm Emanuel’s goal of annually attracting 55 million out-of-towners to Chicago by 2020…

Leisure proved to be the primary attraction behind Chicago’s rising tourism numbers. About four in five visitors last year (nearly 41 million) came to Chicago for fun, city officials say…

The long-running Blues Festival, the NFL Draft and the Chicago Cubs World Series victory parade were three major events last year that helped boost tourism numbers, Kelly said…

City officials also cite business visitation, which grew by 2.1 percent from the previous year, as another factor. Some 31 major conventions and meetings were hosted citywide throughout last year, drawing nearly one million attendees; 35 business meetings are slated for 2017.

I’d love to see how these numbers were calculated. Just take the suggestion that the Cubs World Series parade and rally are part of these totals; how big were those crowds? Early estimates were high but there was little commentary later about more solid figures. Were suburbanites who came in for the day counted as tourists? If the early estimate of 5 million holds and those attendees are included in the total, then this one event on its own pushed the city from a lower number than the previous year to a record number.
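A quick back-of-the-envelope check of that last claim, using only the figures quoted above. It assumes that the 2.9 percent increase applies to the total visitor count and that every one of the 5 million parade attendees was counted as a visitor, neither of which the article confirms.

```python
total_2016 = 54.1        # million visitors in 2016, from the article
growth = 0.029           # 2.9 percent increase over the previous year
parade_estimate = 5.0    # million; early parade estimate (assumed fully counted)

# Back out the previous year's total from the reported growth rate.
prior_year = total_2016 / (1 + growth)

# 2016 total if the parade crowd were excluded entirely.
without_parade = total_2016 - parade_estimate

print(f"Previous year: about {prior_year:.1f} million")
print(f"2016 without the parade: about {without_parade:.1f} million")
print("Record depends on the parade:", without_parade < prior_year)
```

Under these assumptions the previous year comes out near 52.6 million, so removing the parade crowd would drop 2016 below it.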

Where does the data on the number of Americans traveling for Thanksgiving come from?

It was widely reported this year that nearly 49 million Americans would be traveling for Thanksgiving. This data comes from AAA and here is the methodology according to their press release from November 15:

AAA’s projections are based on economic forecasting and research by IHS Markit. The London-based business information provider teamed with AAA in 2009 to jointly analyze travel trends during major holidays. AAA has been reporting on holiday travel trends for more than two decades. The complete AAA/IHS 2016 Thanksgiving holiday travel forecast can be found here.

When numbers like this are used in public and reported on by the media, I would guess many Americans expect these figures to be based on surveys. So what is this projection based on? Surveys (probably via phone calls)? Historical models (based on factors like gas prices and broader economic indicators)? Certain retail and tourism figures like hotel and airfare bookings?

What makes this more complicated is that AAA is an organization that could benefit from increased travel, particularly driving. And as they note in the press release, their organization can provide benefits to travelers:

AAA will rescue thousands of motorists this Thanksgiving

AAA expects to rescue more than 370,000 motorists this Thanksgiving, with the primary reasons being dead batteries, flat tires and lockouts. AAA recommends that motorists check the condition of their battery and tires and pack emergency kits in their vehicles before heading out on a holiday getaway. Drivers should have their vehicles inspected by a trusted repair shop, such as one of the nearly 7,000 AAA Approved Auto Repair facilities across North America. Members can download the AAA Mobile app, visit AAA.com or call 1-800-AAA-HELP to request roadside assistance.

This does not necessarily mean that the data is inaccurate. At the same time, it would help to make the methodology of their projections available.

Another thought: are Americans helped or hindered by these broad projections of holiday travel? If you are traveling, does news like this change your plans (e.g., by leaving earlier)? If AAA projects more drivers, do traffic delays increase (such as on the 405 in Los Angeles)? If the BBC links the incidents, perhaps people take these figures seriously…

Middle-class incomes have biggest year-to-year rise – with a catch

New data suggests middle-class incomes rose in 2015:

The incomes of typical Americans rose in 2015 by 5.2 percent, the first significant boost to middle-class pay since the end of the Great Recession and the fastest increase ever recorded by the federal government, the Census Bureau reported Tuesday.

In addition, the poverty rate fell by 1.2 percentage points, the steepest decline since 1968. There were 43.1 million Americans in poverty on the year, 3.5 million fewer than in 2014…

The 5.2 percent increase was the largest, in percentage terms, ever recorded by the bureau since it began tracking median income statistics in the 1960s. Bureau officials said it was not statistically distinguishable from five other previous increases in the data, most recently the 3.7 percent jump from 1997 to 1998.

Rising incomes are generally good. But, note the catch in the third paragraph cited above: officials cannot say that the 5.2% increase is definitively higher than several previous increases. Why not? The 5.2% figure is based on a sample that has a margin of error of at least 1.5% either way. The data comes from these Census instruments:

The Current Population Survey Annual Social and Economic Supplement was conducted nationwide and collected information about income and health insurance coverage during the 2015 calendar year. The Current Population Survey, sponsored jointly by the U.S. Census Bureau and U.S. Bureau of Labor Statistics, is conducted every month and is the primary source of labor force statistics for the U.S. population; it is used to calculate the monthly unemployment rate estimates. Supplements are added in most months; the Annual Social and Economic Supplement questionnaire is designed to give annual, national estimates of income, poverty and health insurance numbers and rates.

According to the report (page 6), the margin of error for the percent change in income from 2014 to 2015 is 1.6 percentage points. Incomes may have risen even more than 5.2%! Or, they may have risen at lower rates. See the methodological document regarding the survey instruments here.
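To make the catch concrete, here is a small sketch of the interval implied by the reported figures. It treats the margin of error as symmetric around the estimate. A proper comparison of two increases would use the margins of error for both, and the margin for the 1997 to 1998 change is not given here, so this only checks whether that earlier point estimate falls inside the 2015 interval.

```python
def interval(estimate, margin):
    """Return the (low, high) range implied by an estimate and its margin of error."""
    return (estimate - margin, estimate + margin)

# Figures from the article and the Census report.
increase_2015 = 5.2      # percent change in median income, 2014 to 2015
margin_2015 = 1.6        # margin of error for that change
earlier_increase = 3.7   # percent change from 1997 to 1998

low, high = interval(increase_2015, margin_2015)

# If the earlier increase sits inside the 2015 interval, the data cannot
# rule out that the two increases were actually the same size.
inside = low <= earlier_increase <= high

print(f"2015 increase plausibly between {low:.1f}% and {high:.1f}%")
print("1997-98 increase falls inside that range:", inside)
```

The 3.7 percent jump sits just inside the range, which fits the bureau's statement that the two increases are not statistically distinguishable.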

The Census has in recent years moved to more frequent reports on key demographic measures. One of the trade-offs, however, is that these estimates are not as accurate as the decennial census, which requires a lot more resources to conduct and is more thorough.

A final note: it is good that the margin of error is hinted at in the article on rising middle-class incomes. On the other hand, it is mentioned in paragraph 12 and the headline clearly suggests that this was a record year. Statistically speaking, this may or may not be the case.

“Sociology is alien to literature”

One reviewer of a new book suggests the retelling of personal experiences cannot be equated with sociology:

Ben Simon writes in the introduction, “I am not sure that my immigration experience is representative of the immigrants from Morocco.” But elsewhere he also writes: “To date, no attempt has been made to decipher the sociology of the Moroccan immigration. This book is a modest step in that direction.”

I object to Ben Simon’s sociological aspirations in this book. In his work as a journalist, he aimed his efforts in this direction, always doing so in an interesting and profound manner. But that is not the story here, because this is a different sort of literary undertaking. Someone who seeks to tell about himself has to first employ tools of emotion, sharing experiences and memories, allowing the reader to learn the process involved in consciousness-in-the-making: a private and personal consciousness, not a “sociology,” not the diagnosis of a society, not a creation of a portrait of something – but rather literature.

By its nature, an autobiography is first and foremost a literary text. And it is enough to think of Sartre’s “The Words” to understand this. Sociology, by virtue of the alienation that underlies its definition, in its critical sense of observation from the outside – is alien to literature. Being Moroccan is, in any event, much more complex, and so too are its immigrant experiences. It is enough for me to think about my “Moroccan” family, about its consciousness, about how it coped, about its relationship to religion and its immigration experiences.

Ben Simon sets out on a journey that traces the impressive path he has forged, the consolidation of his own perspective on reality, his emotions. But in “The Moroccans,” he feels a need to package this in “sociology.” Clearly there is a context, a “period,” a reflection of reality, but it is marginal; it is not the main thing.

Without reading the book, it is hard to know exactly what is going on here. It sounds like the author wants to extrapolate a bit from his own experiences to those of all Moroccan immigrants and the reviewer suggests he can’t speak for such a large group. This kerfuffle may also be about style; autobiographies and sociological works are often written differently, with the emphasis of the first more on experiences and emotions and the second on larger generalizations, data, and theory.

This does hint at a larger issue in sociology and related disciplines where some research methods – particularly ethnography – allow researchers to draw on their own experiences while still attempting to remain objective and connect the research to bigger issues in the field. This line can be quite blurry; see earlier issues raised about the work of Venkatesh or Goffman. Yet, it is an issue that is not going to go away as (1) insider information continues to be valuable and (2) some look to connect with different (i.e., non-academic) audiences with more literary styles.

Doing social science research in Madagascar

One researcher discusses undertaking research in Madagascar:

My colleagues and I, from the UK, the US and South Africa, feel frustrated. It is December 2014 and we have gathered at a jungle lodge in the highlands of Madagascar with 25 academics and postgraduate students from Antananarivo’s departments of sociology and communication to hash out the methodology for a large-scale research study. However, our research partners’ greater apparent interest in discussing theoretical issues is slowing us down. It is also tough for the interpreters, grappling with three-way simultaneous translation from Malagasy to English, French to English and English to French. The day reaches a low point when I hear through my headphones: “The real problem is situated somewhere between the problematic and the problematisation.”

We feel like prisoners in a jungle of theory. However, over the next few months, I come to realise that the lecture on Weber – and other diversions into Marxist, literary or linguistic theory – are not mere academic posturing. They are – to use development jargon – capacity-building. Unicef has asked our team to build the capacity of Antananarivo staff and students to conduct social research. We know how to design a quantitative and qualitative study, do the data analysis and write the report. But we know little about Madagascar: its culture and turbulent history, or how our Malagasy colleagues regard research. Their priority for the seminar is not to draft survey questionnaires but to build an equal, trusting research partnership…

According to the research design, a quantitative study (two questionnaires, with about 1,500 respondents for each) is to be conducted first, to highlight issues to be explored in the subsequent qualitative research. Unfortunately, the eastern floods and southern drought put the project several months behind schedule, and the Antananarivo qualitative research teams go into the field at about the same time as the quantitative research is being conducted, working in different communities. They emerge with hundreds of hours of focus group and interview transcripts and field notes, and it is a formidable task to merge them with the quantitative data.

Ultimately, common sense and pragmatism prevail. We use geographic and economic criteria to classify communities into four types: interior, sub-coastal, coastal and urban. Some interior communities are two days by zebu cart from the main dirt road; including them would lengthen the research and strain the budget. We reduce the long list of variables to be analysed. Our Antananarivo colleagues have a therapeutic 15-minute debate over whether coding – or, indeed, any attempt to organise human experience – is a colonial imposition. And then everyone goes back to work.

Doing quality research in first-world countries is difficult enough, and yet working through the obstacles to doing good research in the developing world could lead to many positive consequences. It would be nice to see a follow-up article that shows what came of all these efforts.