Speculating on why sociology is less relevant to the media and public than economics

In calling for more sociological insight into economics, a journalist who attended the recent ASA meetings in Philadelphia provides two reasons why sociology lags behind economics in public attention:

Economists, you see, put draft versions of their papers online seemingly as soon as they’ve finished typing. Attend their big annual meeting, as I have several times, and virtually every paper discussed is available beforehand for download and perusal. In fact, they’re available even if you don’t go to the meeting. I wrote a column two years ago arguing that this openness had given economists a big leg up over the other social sciences in media attention and political influence, and noting that a few sociologists agreed and were trying to nudge their discipline — which disseminates its research mainly through paywalled academic journals and university-press books — in that direction with a new open repository for papers called SocArxiv. Now that I’ve experienced the ASA annual meeting for the first time, I can report that (1) things haven’t progressed much since 2016, and (2) I have a bit more sympathy for sociologists’ reticence to act like economists, although I continue to think it’s holding them back.

SocArxiv’s collection of open-access papers is growing steadily if not spectacularly, and Sociological Science, an open-access journal founded in 2014, is carving out a respected role as, among other things, a place to quickly publish articles of public interest. “Unions and Nonunion Pay in the United States, 1977-2015” by Patrick Denice of the University of Western Ontario and Jake Rosenfeld of Washington University in St. Louis, for example, was submitted June 12, accepted July 10 and published on Wednesday, the day after it was presented at the ASA meeting. These dissemination tools are used by only a small minority of sociologists, though, and the most sparsely attended session I attended in three-plus days at their annual meeting was the one on “Open Scholarship in Sociology” organized by the University of Maryland’s Philip Cohen, the founder of SocArxiv and one of the discipline’s most prominent social-media voices. This despite the fact that it was great, featuring compelling presentations by Cohen, Sociological Science deputy editor Kim Weeden of Cornell University and higher-education expert Elizabeth Popp Berman of the State University of New York at Albany, and free SocArxiv pens for all.

As I made the rounds of other sessions, I did come to a better understanding of why sociologists might be more reticent than economists to put their drafts online. The ASA welcomes journalists to its annual meeting and says they can attend all sessions where research is presented, but few reporters show up and it’s clear that most of those presenting research don’t consider themselves to be speaking in public. The most dramatic example of this in Philadelphia came about halfway through a presentation involving a particular corporation. The speaker paused, then asked the 50-plus people in the room not to mention the name of said corporation to anybody because she was about to return to an undercover job there. That was a bit ridiculous, given that there were sociologists live-tweeting some of the sessions. But there was something charming and probably healthy about the willingness of the sociologists at the ASA meeting to discuss still-far-from-complete work with their peers. When a paper is presented at an economics conference, many of the discussant’s comments and audience questions are attempts to poke holes in the reasoning or methodology. At the ASA meeting, it was usually, “This is great. Have you thought about adding …?” Also charming and probably healthy was the high number of graduate students presenting research alongside the professors, which you don’t see so much at the economists’ equivalent gathering.

All in all — and I’m sure there are sociological terms to describe this, but I’m not familiar with them — sociology seems more focused on internal cohesion than economics is. This may be partly because it’s what Popp Berman calls a “low-consensus discipline,” with lots of different methodological approaches and greatly varying standards of quality and rigor. Economists can be mean to each other in public yet still present a semi-united face to the world because they use a widely shared set of tools to arrive at answers. Sociologists may feel that they don’t have that luxury.

Disciplinary differences can be mystifying at times.

I wonder about a third possible difference in addition to the two provided: different conceptions in sociology and economics about what constitutes good arguments and data (hinted at above with the idea of “lots of different methodological approaches and greatly varying standards of quality and rigor”). Both disciplines aspire to the ideal of a social science in which empirical data is used to test hypotheses about how human behavior, usually in collectives, works. But this is tricky to do, as there are numerous pitfalls along the way. For example, accurate measurement is difficult even when a researcher has clearly identified a concept. Additionally, it is my sense that sociologists as a whole may be more open to using both qualitative and quantitative data (even with occasional flare-ups between researchers studying the same topic yet falling in different methodological camps). With these methodological questions, sociologists may feel they need more time to connect their methods to a convincing causal and scientific argument.

A fourth possible reason behind the differences (also hinted at above with the idea of economists having a “semi-united face” to present): sociology has a reputation as a more left-leaning discipline. Some researchers may prefer to have all their ducks in a row before they expose their work to full public scrutiny. The work of economists is more generally accepted by the public and some leaders while sociology regularly has to work against some backlash. (As an example, see conservative leaders complain about sociology excusing poor behavior when the job of the discipline is to explain human behavior.) Why expose your work to a less welcoming public earlier when you could take a little more time to polish the argument?

Online survey panels in first-world countries versus developing nations

While reading about the opposition Canadians have to self-driving cars, I ran into this explanation from Ipsos about conducting online surveys in countries around the world:


Having online panels is a regular practice among survey organizations. However, I do not recall seeing an explanation like this regarding differences in online panels across countries. The online sample in non-industrialized countries is simply unrepresentative as it reflects “a more ‘connected’ population.” Put another way, the online panel in places like Brazil, China, Russia, and Saudi Arabia reflects the upper class and people who live more like Westerners and not the vast majority of their population. The sample is also smaller in these countries: 500+ rather than 1000+. Finally, it would be interesting to see how much the data needs to be weighted to “best reflect the demographic profile of the adult population.”
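To make the weighting question concrete, here is a minimal sketch of one common approach, post-stratification, in which each group's weight is its population share divided by its sample share. The group names and numbers are hypothetical and are not taken from Ipsos, whose actual weighting procedure is not described here. The effective sample size uses Kish's approximation as a rough gauge of how much precision heavy weighting costs.

```python
def poststratification_weights(sample_counts, population_shares):
    """Return one weight per group: population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }


def effective_sample_size(sample_counts, weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    sum_w = sum(sample_counts[g] * weights[g] for g in sample_counts)
    sum_w2 = sum(sample_counts[g] * weights[g] ** 2 for g in sample_counts)
    return sum_w ** 2 / sum_w2


# Hypothetical 500-person online panel that over-represents "connected"
# urban respondents relative to an assumed 50/50 adult population.
sample_counts = {"urban": 400, "rural": 100}
population_shares = {"urban": 0.5, "rural": 0.5}

weights = poststratification_weights(sample_counts, population_shares)
n_eff = effective_sample_size(sample_counts, weights)

for group in sample_counts:
    print(group, round(weights[group], 3))
print("effective sample size:", round(n_eff))
```

In this made-up case each rural respondent counts four times as much as each urban one, and the 500 interviews carry roughly the precision of 320 unweighted ones. The more unrepresentative the panel, the more the weights have to do and the less the nominal sample size means.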

With all these caveats, is an online panel in a non-industrialized country worth it?

Collecting big data the slow way

One of the interesting side effects of the era of big data is finding out how much information is not actually automatically collected (or is at least not available to the general public or researchers without paying money). A quick example from the work of sociologist Matthew Desmond:

The new data, assembled from about 83 million court records going back to 2000, suggest that the most pervasive problems aren’t necessarily in the most expensive regions. Evictions are accumulating across Michigan and Indiana. And several factors build on one another in Richmond: It’s in the Southeast, where the poverty rates are high and the minimum wage is low; it’s in Virginia, which lacks some tenant rights available in other states; and it’s a city where many poor African-Americans live in low-quality housing with limited means of escaping it.

According to the Eviction Lab, here is how they collected the data:

First, we requested a bulk report of cases directly from courts. These reports included all recorded information related to eviction-related cases. Second, we conducted automated record collection from online portals, via web scraping and text parsing protocols. Third, we partnered with companies that carry out manual collection of records, going directly into the courts and extracting the relevant case information by hand.

In other words, it took a lot of work to put together such a database: various courts, websites, and companies had different pieces of information, but a researcher had to access all of that data and put it together.

Without a researcher or a company or government body explicitly starting to record or collect certain information, a big dataset on that particular topic will not happen. Someone or some institution, typically with resources at its disposal, needs to set a process into motion. And simply having the data is not enough; it needs to be cleaned up so it all works with the other pieces. Again, from the Eviction Lab:

To create the best estimates, all data we obtained underwent a rigorous cleaning protocol. This included formatting the data so that each observation represented a household; cleaning and standardizing the names and addresses; and dropping duplicate cases. The details of this process can be found in the Methodology Report (PDF).
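The cleaning steps named in that passage (standardizing names and addresses, dropping duplicate cases) can be sketched roughly as follows. This is only an illustration under my own assumptions: the field names, the sample records, and the normalization rules are hypothetical, and the Eviction Lab's actual protocol is the one documented in its Methodology Report.

```python
import re


def normalize(text):
    """Uppercase, strip punctuation, and collapse repeated whitespace."""
    text = re.sub(r"[^\w\s]", "", text.upper())
    return " ".join(text.split())


def clean_cases(cases):
    """Standardize names and addresses, then drop duplicate cases."""
    seen = set()
    cleaned = []
    for case in cases:
        record = {
            "name": normalize(case["name"]),
            "address": normalize(case["address"]),
            "filed": case["filed"],
        }
        # Treat two records as the same case when all three fields match.
        key = (record["name"], record["address"], record["filed"])
        if key not in seen:
            seen.add(key)
            cleaned.append(record)
    return cleaned


# Hypothetical records: the first two describe the same filing but were
# typed differently, as might happen across court systems.
raw_cases = [
    {"name": "Jane  Doe", "address": "12 Main St.", "filed": "2015-03-01"},
    {"name": "JANE DOE", "address": "12 main st", "filed": "2015-03-01"},
    {"name": "John Roe", "address": "4 Oak Ave", "filed": "2016-07-15"},
]

cleaned_cases = clean_cases(raw_cases)
print(len(raw_cases), "raw ->", len(cleaned_cases), "cleaned")
```

Even this toy version shows why cleaning is labor-intensive: every rule about what counts as the "same" name or address is a judgment call that has to be made, checked, and applied across millions of records.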

All of this can lead to a fascinating dataset of roughly 83 million records on an important topic.

We are probably still a ways off from a scenario where this information would automatically become part of a dataset. This data had a definite start and required much work. There are many other areas of social life that require similar efforts before researchers and the public have big data to examine and learn from.

The problem of archiving the Internet may be just the first problem; how do we make causal arguments from its contents?

Archiving the Internet so that it can be understood and studied by later researchers and scholars may be a big problem:

In a new paper, “Stewardship in the ‘Age of Algorithms,’” Clifford Lynch, the director of the Coalition for Networked Information, argues that the paradigm for preserving digital artifacts is not up to the challenge of preserving what happens on social networks.

Over the last 40 years, archivists have begun to gather more digital objects—web pages, PDFs, databases, kinds of software. There is more data about more people than ever before, however, the cultural institutions dedicated to preserving the memory of what it was to be alive in our time, including our hours on the internet, may actually be capturing less usable information than in previous eras…

Nick Seaver of Tufts University, a researcher in the emerging field of “algorithm studies,” wrote a broader summary of the issues with trying to figure out what is happening on the internet. He ticks off the problems of trying to pin down—or in our case, archive—how these web services work. One, they’re always testing out new versions. So there isn’t one Google or one Bing, but “10 million different permutations of Bing.” Two, as a result of that testing and their own internal decision-making, “You can’t log into the same Facebook twice.” It’s constantly changing in big and small ways. Three, the number of inputs and complex interactions between them simply makes these large-scale systems very difficult to understand, even if we have access to outputs and some knowledge of inputs.

In order to study something, you have to measure and document it well. This is an essential first step for many research projects.

But I wonder: even if it can all be documented well, what exactly would it tell us about behaviors and aspirations? Like any “text,” it may be difficult to make causal arguments based on the artifacts of our Internet or social media. They are controlled by a relatively small number of people. Social media is dominated by a relatively small number of users. Many people in society interact with both, but how exactly are their lives changed? The history of the Internet and social media and the forces behind it is one thing; it could be fascinating to see how the birth of the World Wide Web in the early 1990s or AOL or Facebook or Google are all viewed several decades into the future. But it will be much harder to clearly show how all these forces affected the average person. Did it change personalities? Did day-to-day life change in substantial ways? Did political opinions change? Did it disrupt or enhance relationships? What if Twitter dominates the media and the lives of 10% of the American population but has little impact on most lives?

There is a lot here to sort out and a lot of opportunities for good research. At the same time, there are a lot of chances for people to make vague claims and arguments based on correlations and broad patterns that cannot be explicitly linked.

Good data is foundational to doing good sociological work

I’ve had conversations in recent months with a few colleagues outside the discipline about debates within sociology over the work of ethnographers like Alice Goffman, Matt Desmond, and Sudhir Venkatesh. It is enlightening to hear how outsiders see the disagreements, and this has pushed me to consider more fully how I would explain the issues at hand. What follows is my one-paragraph response to what is at stake:

In the end, what separates the work of sociologists from perceptive non-academics or journalists? (An aside: many of my favorite journalists often operate like pop sociologists as they try to explain and not just describe social phenomena.) To me, it comes down to data and methods. This is why I enjoy teaching both our Statistics course and our Social Research course: undergraduates rarely come into them excited but they are foundational to who sociologists are. What we want to do is have data that is (1) scientific – reliable and valid – and (2) generalizable – allowing us to see patterns across individuals and cases or settings. I don’t think it is a surprise that the three sociologists under fire above wrote ethnographies where it is perhaps more difficult to fit the method under a scientific rubric. (I do think it can be done but it doesn’t always appear that way to outsiders or even some sociologists.) Sociology is unique in both its methodological pluralism – we do everything from ethnography to historical analysis to statistical models to lab or natural experiments to mass surveys – and its aim of finding causal explanations for phenomena rather than just describing what is happening. Ultimately, if you can’t trust a sociologist’s data, why bother considering their conclusions, and why would you prioritize their explanations over those of an astute person on the street?

Caveats: I know no data is perfect and sociologists are not in the business of “proving” things but rather we look for patterns. There is also plenty of disagreement within sociology about these issues. In a perfect world, we would have researchers using different methods to examine the same phenomena and develop a more holistic approach. I also don’t mean to exclude the role of theory in my description above; data has to be interpreted. But, if you don’t have good data to start with, the theories are abstractions.

Estimating big crowds accurately with weather balloons, bicycles, and counting

The best way to count large crowds – such as in Washington D.C. – may be by using a weather balloon and supplementing that data:

Well, technically, a “tethered aerostat.” Tethered because it is anchored to the ground, and aerostat because it will hold a static altitude in the air. A nine-lens camera is attached to its base, so it can capture the full 360-degree view of the proceedings. It will observe the entire Women’s March…

Their technique involves more than just the weather balloon. While the weather balloon records the events from above, Westergard and his team will bike or walk around the protest site. They’ll take note of how many people are taking cover under structures, like the massive elm trees on the Mall. Sometimes they’ll even lower the aerostat so that it can capture crowds in the shade. “At 400 feet, we’re looking under the trees. At 800 feet, you’re looking at the top of them,” he told me…

Once the data is collected, they return to their headquarters. Three days of work commences. First, they will measure the density of different parts of the crowd. They do this by counting heads in a specific area. “We sit there literally, head by head, going tick-tick-tick-tick-tick” with the images, he told me. “It’s painful, it’s long, but it’s far more accurate than these algorithms.”

Sometimes they outsource this task to Amazon’s Mechanical Turk service to increase their own accuracy: They ask a dozen strangers to count heads in a certain picture without telling them where the picture was taken.

Once they have this density map, they overlay it on a map of the topography. “If you have people surrounding the Washington Monument—which is on a moderately steep hill—and you look out at a crowd, you’re going to see more people because they’re tilted toward you,” he said. The computer model will correct for those kinds of inaccuracies.
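The core of the method described above is simple arithmetic: count heads in a sample patch, turn that into a density, and multiply by the area each density applies to. Here is a minimal sketch under my own assumptions. The zones, areas, and counts are invented, and taking the median of several independent counts is my stand-in for the "dozen strangers" step, not necessarily how Westergard's firm combines them. The topographic correction mentioned in the quote is left out.

```python
from statistics import median


def estimate_crowd(zones):
    """Sum (median head count / sample area) * zone area over all zones."""
    total = 0.0
    for zone in zones:
        # Several people count the same sample patch; the median damps
        # the effect of any one careless counter.
        heads = median(zone["head_counts"])
        total += heads * zone["zone_area_m2"] / zone["sample_area_m2"]
    return round(total)


# Hypothetical zones: a dense area near the stage and a sparse one
# farther back, each with three independent counts of a 25 m^2 patch.
zones = [
    {"head_counts": [48, 50, 53], "sample_area_m2": 25, "zone_area_m2": 2000},
    {"head_counts": [9, 10, 12], "sample_area_m2": 25, "zone_area_m2": 5000},
]

print(estimate_crowd(zones))
```

The sketch also shows where estimates can go wrong: a small error in the density of a large zone swamps careful counting everywhere else, which is presumably why the team walks the site to check what the camera cannot see.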

See earlier posts (such as here and here) about counting crowds.

It is also interesting that this more accurate method is explained by the leader of a private firm: “Curt Westergard…is the president of Digital Design and Imaging Service based in Falls Church, Virginia, and he stressed that his company’s methods were “at the very top of the accuracy and ethical side.”” He is working for those who want to hire him, something that could be worthwhile for the article to explore. Are they impartial observers who are doing this work for science? In other words, crowd counting could be influenced by who exactly is doing the counting. Parties who often make the counts – police, local officials, the media – have vested interests. For example, take the case of the rally for the Cubs World Series victory.

Of course, as is noted in this article, the numbers themselves are often politicized. What will be the official count accepted by posterity for a Trump inauguration that likely stirs up emotions for everyone?

Chicago sets new record for tourism

The city of Chicago may have problems but the number of tourists continues to increase:

An estimated 54.1 million visitors came to the city in 2016, up 2.9 percent from the previous year’s record-setting count. The increase marks a step towards reaching Mayor Rahm Emanuel’s goal of annually attracting 55 million out-of-towners to Chicago by 2020…

Leisure proved to be the primary attraction behind Chicago’s rising tourism numbers. About four in five visitors last year (nearly 41 million) came to Chicago for fun, city officials say…

The long-running Blues Festival, the NFL Draft and the Chicago Cubs World Series victory parade were three major events last year that helped boost tourism numbers, Kelly said…

City officials also cite business visitation, which grew by 2.1 percent from the previous year, as another factor. Some 31 major conventions and meetings were hosted citywide throughout last year, drawing nearly one million attendees; 35 business meetings are slated for 2017.

I’d love to see how these numbers were calculated. Just take the suggestion that the Cubs World Series parade and rally are part of these totals; how big were those crowds? Early estimates were high but there was little commentary later about more solid figures. Were suburbanites who came in for the day counted as tourists? If the 5 million figure holds, then this one event on its own pushed the city from a lower number than the previous year to a record number.
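The arithmetic behind that last point can be checked quickly. This sketch assumes, hypothetically, that every one of the 5 million estimated parade and rally attendees was counted as a visitor; the article does not say how those attendees were actually counted.

```python
visitors_2016 = 54.1   # millions, from the article
growth_rate = 0.029    # 2.9 percent increase over the previous year
parade_crowd = 5.0     # millions; early estimate, assumed fully counted

# Back out the previous year's total from the reported growth rate.
visitors_2015 = visitors_2016 / (1 + growth_rate)

# The 2016 total with the parade crowd removed.
without_parade = visitors_2016 - parade_crowd

print("2015 visitors:", round(visitors_2015, 1), "million")
print("2016 without parade:", round(without_parade, 1), "million")
print("record depends on parade:", without_parade < visitors_2015)
```

Under that assumption, the previous year comes out to about 52.6 million and 2016 without the parade to about 49.1 million, so the record would depend entirely on how that single event was counted.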