When I see “study” in a news story, I (wrongly) assume it is a peer-reviewed analysis

In the last week, I have run into two potentially interesting news stories that cite studies. Yet, when I looked into what kind of studies these were, they were not what I expected.


First, the Chicago Tribune online headline: “Why are Chicagoans moving away during the pandemic? As study suggests outbound migration is spiking, we asked them.” The opening to the story:

Chicago’s population has been on the decline for years, with the metropolitan area suffering some of the greatest losses of any major U.S. city. But new research suggests that the pandemic might be exacerbating the exodus.

For the first time in four years, moving concierge app Updater has helped more people move out of Chicago than to it, the company said. The catch-all moving service estimates that it takes part in one-third of all U.S. moves, providing unique, real-time insight into pandemic-driven trends, said Jenna Weinerman, Updater’s vice president of marketing.

“All these macro conditions — job insecurity, remote work, people wanting to gain more space — are coming together to create these patterns,” Weinerman said.

The Chicago figures are based on approximately 39,000 moves within city limits from March 1 to Sept. 30. Compared to 2019, this year saw more moving activity in general, with an 8% jump in moves into the city — but a 19% increase in the number of people leaving.

The second article involved a study from Cafe Storage with the headline “Average Home Size in the US: New Homes Bigger than 10 Years Ago but Apartments Trail Behind” (also cited in the Chicago Tribune). From the story:

According to the latest available US Census data, the average size of single family homes built in the US was trending upwards from 2010 until 2017, when sizes hit a peak of 2,643 square feet. Since then, single family homes began decreasing in size, with homes built in 2019 averaging 2,611 square feet…

Location matters when it comes to average home size. Some urban hotspots follow the national trend, while others move in the opposite direction. Here’s how single family home and apartment sizes look in the country’s top 20 largest cities, based on Yardi Matrix, Property Shark and Point2Homes data.

As an academic, here is what I expect when I hear the word study:

  1. Peer-reviewed work published in an academic outlet.
  2. Rigorous methodology and trusted data sources.

These expectations do not guarantee error-free research, but they do impose standards and procedures intended to reduce errors.

In both cases, the analyses do not meet those standards. Instead, they rely on proprietary data and serve the companies or websites publicizing the findings. This does not necessarily mean the findings are untrue. It does, however, make it much more difficult for journalists or the public to know how the study was conducted, what the findings are, and what it all means.

Use of the term “study” is related to a larger phenomenon: many organizations, businesses, and individuals have potentially interesting data to contribute to public discussions and policy making. For example, without official data about the number of people moving out of cities, we are left searching for other data sources. How reliable are they? What data is anecdotal and what can be trusted? Why don’t academics and journalists find better data?

If we use the word “study” to refer to any data analysis, we risk making it even harder for people to discern what is a trustworthy study and what is not. Call it an analysis, call it a set of findings. Make clear who conducted the research, how the analysis was conducted, and with what data. (These three steps would be good for any coverage of an academic study.) Help readers and interested parties put the findings in the context of other findings and ongoing conversations. Just do not suggest that this is a study in the same way that other analyses are studies.

A short overview of recent survey questions about Holocaust knowledge in the US

Although this article leads with recent survey results about what Americans know and think about the Holocaust, I’ll start with the summary of earlier surveys and move forward in time to the recent results:

Whether or not the assumptions in the Claims Conference survey are fair, and how to tell, is at the core of a decades long debate over Holocaust knowledge surveys, which are notoriously difficult to design. In 1994, Roper Starch Worldwide, which conducted a poll for the American Jewish Committee, admitted that its widely publicized Holocaust denial question was “flawed.” Initially, it appeared that 1 in 5, or 22 percent, of Americans thought it was possible the Holocaust never happened. But pollsters later determined that the question—“Does it seem possible or does it seem impossible to you that the Nazi extermination of the Jews never happened?”—was confusing and biased the sample. In a subsequent Gallup poll, when asked to explain their views on the Holocaust in their own words, “only about 4 percent [of Americans] have real doubts about the Holocaust; the others are just insecure about their historical knowledge or won’t believe anything they have not experienced themselves,” according to an Associated Press report at the time. More recently, the Anti-Defamation League was criticized for a 2014 worldwide study that asked respondents to rate 11 statements—“People hate Jews because of the way they behave, for example”—as “probably true” or “probably false.” If respondents said “probably true” to six or more of the statements, they were considered to harbor anti-Semitic views, a line that many experts said could not adequately represent real beliefs…

Just two years ago, the Claims Conference released another survey of Americans that found “Two-Thirds of Millennials Don’t Know What Auschwitz Is,” as a Washington Post headline summarized it. The New York Times reported on the numbers at the time as proof that the “Holocaust is fading from memory.” Lest it appear the group is singling out Americans, the Claims Conference also released surveys with “stunning” results from Canada, France, and Austria.

But a deeper look at the Claims Conference data, which was collected by the firm Schoen Cooperman Research, reveals methodological choices that conflate specific terms (the ability to ID Auschwitz) and figures (that 6 million Jews were murdered) about the Holocaust with general knowledge of it, and knowledge with attitudes or beliefs toward Jews and Judaism. This is not to discount the real issues of anti-Semitism in the United States. But it is an important reminder that the Claims Conference, which seeks restitution for the victims of Nazi persecution and also to “ensure that future generations learn the lessons of the Holocaust,” is doing its job: generating data and headlines that it hopes will support its worthy cause.

The new Claims Conference survey is actually divided into two, with one set of data from a 1,000-person national survey and another set from 50 state-by-state surveys of 200 people each. In both iterations, the pollsters aimed to assess Holocaust knowledge according to three foundational criteria: the ability to recognize the term the Holocaust, name a concentration camp, and state the number of Jews murdered. The results weren’t great—fully 12 percent of national survey respondents had not or did not think they had heard the term Holocaust—but some of the questions weren’t necessarily written to help respondents succeed. Only 44 percent were “familiar with Auschwitz,” according to the executive summary of the data, but that statistic was determined by an open-ended question: “Can you name any concentration camps, death camps, or ghettos you have heard of?” This type of active, as opposed to passive, recall is not necessarily indicative of real knowledge. The Claims Conference also emphasized that 36 percent of respondents “believe” 2 million or fewer Jews were killed in the Holocaust (the correct answer is 6 million), but respondents were actually given a multiple-choice question with seven options—25,000, 100,000, 1 million, 2 million, 6 million, 20 million, and “not sure”—four of which were lowball figures. (Six million was by far the most common answer, at 37 percent, followed by “not sure.”)

The first example above has made it into research methods textbooks as an illustration of how much the wording of survey questions matters. The ongoing discussion in this article could also illustrate those textbook discussions: how questions are asked and how researchers interpret the results are both very important.

There are other actors in this process that can help or harm the data interpretation:

  1. Funders/organizations behind the data. What do they do with the results?
  2. How the media reports the information. Do they accurately represent the data? Do they report on how the data was collected and analyzed?
  3. Does the public understand what the data means? Or, do they solely take their cues from the researchers and/or the media reports?
  4. Other researchers who look at the data. Would they measure the topics in the same way and, if not, what might be gained by alternatives?

These may seem like boring details to many, but going from choosing research topics and developing questions to sharing results with the public and having others interpret them is a long process. The hope is that all of the actors involved can help get as close as possible to what is actually happening – in this case, accurately measuring and reporting attitudes and beliefs.

Does new housing data support the claim that people are leaving cities?

Reuters tries to connect the dots between data on housing construction and claims that people are leaving cities:


U.S. homebuilding increased in June by the most in nearly four years amid reports of rising demand for housing in suburbs and rural areas as companies allow employees to work from home during the COVID-19 pandemic…

A survey on Thursday showed confidence among single-family homebuilders vaulting in July to levels that prevailed before the coronavirus crisis upended the economy in March.

Builders reported increased demand for single-family homes in lower density markets, including small metro areas, rural markets and large metro suburbs. The public health crisis has shifted office work from commercial business districts to homes, a trend that economists predict could become permanent…

Home building last month was boosted by a 17.2% jump in the construction of single-family housing units, which accounts for the largest share of the housing market, to a rate of 831,000 units. Groundbreaking activity increased in the Midwest, South and Northeast, but fell in the West.

It is widely assumed that large numbers of urban residents have left New York (and possibly other places) for suburbs and other parts of the country. If so, this could influence the housing industry. Yet, I would ask a few more questions.

First question to ask: is this activity due to people leaving cities or other factors? It would be helpful to consider other possible factors at play such as seasonal changes (more housing activity in warmer weather, more demand in warmer months) and the economy (ranging from confidence of different actors to mortgage rates to available capital to unemployment – all intertwined with COVID-19). Is the uptick in activity from roughly early March to today due to an urban exodus or to these other conditions?

Second question to ask: if there is evidence that things are happening simultaneously, is there more evidence to suggest causal patterns are at play? If people are leaving cities, it does not necessarily mean they are looking for new homes. Perhaps they want to return to the city, perhaps they are living with others, perhaps they are willing to rent for a while and see what happens.

And out of my own curiosity, the reporting I have seen about people leaving cities during COVID-19 seems to primarily apply to wealthier residents. Does this mean the new construction of homes will tilt toward larger, more expensive homes? If so, this is a continuation of a bifurcated housing market where those with resources will have options while many with limited resources or opportunities will not.

There is a lot to consider here and we may not know the full patterns for a while yet. Even if the housing industry thinks that people are fleeing cities for good, that belief matters regardless of what the actual data show.

Do we know that 500,000 people have fled NYC since the start of COVID-19?

On the heels of much discussion of residents leaving New York City, San Francisco, and other major cities because of COVID-19, the Daily Mail suggests 500,000 people have left New York City:


Parts of Manhattan, famously the ‘city that never sleeps’, have begun to resemble a ghost town since 500,000 mostly wealthy and middle-class residents fled when Covid-19 struck in March.

The number is also part of the headline.

But, how do we know this number is accurate? If there was ever a figure that required some serious triangulation, this could be it. Most of the news stories I have seen on people fleeing cities rely on real estate agents and movers who have close contact with people going from one place to another. Those articles rarely mention figures, settling for vaguer pronouncements about trends or patterns. Better data could come from sources like utility companies (presumably there would be a drop in the consumption of electricity and water), the post office (how many people have changed addresses), and more systematic analyses of real estate records.

A further point about the supposed figure: even if it is accurate, it does not reveal much about long-term trends. Again, the stories on this phenomenon have hinted that some of those people who left will never return while some do want to get back. We will not know until some time has gone by after the COVID-19 pandemic slows down or disappears. Particularly for those with resources, will they sell their New York property or will they sit on it for a while to give themselves options or in order to make sure they get a decent return on it? This may be a shocking figure now but it could turn out in a year or two to mean very little if many of those same people return to the city.

In other words, I would wait to see if this number is trustworthy and if so, what exactly it means in the future. As sociologist Joel Best cautions around numbers that seem shocking, it helps to ask good questions about where the data comes from, how accurate it is, and what it means.

5G over what percent of America? T-Mobile: covering over 5,000 cities and towns, 200 million Americans

T-Mobile is running a commercial touting their new 5G network. They claim it reaches 200 million Americans and over 5,000 cities and towns. What if we put those numbers in context?

On one hand, both figures sound impressive. Two hundred million people is a lot of people. This is a lot of text messages to send, TV shows and videos to stream, and social media and web pages to visit. This is a potentially large market for T-Mobile. And 5,000 cities and towns sounds like a lot. I don’t know how many places Americans could name, but many would probably struggle to name 5,000.

On the other hand, the figures suggest that the 5G coverage still does not reach a good portion of Americans or certain parts of the country. According to the Census Population Clock, the US population is over 329 million. So covering 200 million people comes to roughly 61% of Americans – more than half, but not quite two-thirds. As for the 5,000 cities and towns, some older data – from 2007 – suggests the United States has over 19,000 municipal governments, and the Census in 2012 also counted over 19,000. With these figures, 5G from T-Mobile covers a little more than one quarter of American communities.
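For readers who want to check the arithmetic, here is a minimal back-of-the-envelope sketch using the rough figures cited above (the population and municipality counts are approximations from this post, not T-Mobile’s own numbers):

```python
# Back-of-the-envelope check of the coverage figures cited above.
us_population = 329_000_000      # Census Population Clock figure cited in the post
covered_people = 200_000_000     # T-Mobile's claimed 5G reach
municipalities = 19_000          # approximate count of U.S. municipal governments
covered_places = 5_000           # T-Mobile's claimed cities and towns

print(f"Share of Americans covered: {covered_people / us_population:.0%}")       # ~61%
print(f"Share of municipalities covered: {covered_places / municipalities:.0%}")  # ~26%
```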

Perhaps T-Mobile is doing the best they can with the coverage they have. The numbers are big ones and I would guess they could catch the attention of viewers. Maybe the numbers do not matter if the goal is simply to be first. However, just because the numbers are large does not necessarily mean the product is great. Significant segments of Americans will not have access, even with the big numbers. The numbers look good, but they may not look as good once people dig into what they mean.

Sociology = studying facts and interpretations of those facts

David Brooks hits on a lesson I teach in my Social Research class: studying sociology involves both looking for empirical patterns (facts) and the interpretations of patterns, real or not (meanings). Here is how Brooks puts it:

An event is really two things. It’s the event itself and then it’s the process by which we make meaning of the event. As Aldous Huxley put it, “Experience is not what happens to you, it’s what you do with what happens to you.”

In my class, this discussion comes about through reading the 2002 piece by Roth and Mehta titled “The Rashomon Effect: Combining Positivist and Interpretivist Approaches in the Analysis of Contested Events.” The authors argue research needs to look at what actually happened (the school shootings under study here) as well as how people in the community understood what happened (which may or may not have aligned with what actually happened but had important consequences for local social life). Both aspects might be interesting to study on their own – here is a phenomenon, or here is what people make of this – but together researchers can capture a fuller human experience where facts and meanings interact.

Brooks writes this in the context of the media. A good example of how this would be applied is the matter of journalists looking to spot trends. There are new empirical patterns to spot and point out. New social phenomena develop often (and figuring out where they come from can be a whole different complex matter). At the same time, we want to know what these trends mean. If psychologist Jean Twenge says there are troubling patterns as the result of smartphone use among teenagers and young adults, we can examine the empirical data – is smartphone use connected to other outcomes? – and what we think about all of this – is it good that this might be connected to increased loneliness?

More broadly, Brooks is hinting at the realm of sociology of culture where culture can be defined as patterns of meaning-making. The ways in which societies, groups, and individuals make meaning of their own actions and the social world around them is very important.

Home value algorithms show consumers data with outliers, mortgage companies take the outliers out

A homeowner can look online to get an estimate of the value of their home but that number may not match what a lender computes:

Different AVMs are designed to deliver different types of valuations. And therein lies confusion.

Consumers don’t realize that there’s an AVM for nearly any purpose, which explains why different algorithms serve up different results, said Ann Regan, an executive product manager with real estate analytic firm CoreLogic. “The scores presented to consumers are not the same version that is being used by lenders to make decisions,” she said. “The consumer-facing AVMs are designed for consumer marketing purposes.”

For instance, more accurate models used by lenders do not include outliers — properties that sold for extremely high or low prices and that consequently would skew the averages and the comparable sales for a particular house, like yours. But models used by consumer websites, such as brokers’ sites and national listing sites, scoop in as much “sold” data as possible when concocting a valuation, because then they can claim to include all available data. That’s true, said Regan, but it’s more accurate to weed out misleading data.

AVMs used by lenders send along “confidence scores” that indicate how firm the estimate is. That is a factor typically not included alongside consumer AVMs, she added.

This is an interesting trade-off. The assumption is that the consumer wants to see all the data accounted for, which makes the estimate seem more worthwhile. More data = more accuracy. On the other hand, those who work with data know that measures of central tendency and variability can be thrown off by unusual cases, often known as outliers. If one home sold for an unusually high or low price – and there are many reasons why this could happen – it can throw off the estimates for comparable homes. If there are significant outliers, more data does not equal more accuracy.
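A small, hypothetical illustration of the outlier point (the sale prices below are invented, not drawn from any actual AVM): one unusual sale pulls the mean of a set of comparables noticeably, while the median barely moves, which is roughly why lender-grade models weed such cases out.

```python
from statistics import mean, median

# Hypothetical comparable sales for one neighborhood (invented numbers)
comps = [310_000, 295_000, 320_000, 305_000, 315_000]
# The same comps plus one unusual sale (e.g., a distressed transfer)
comps_with_outlier = comps + [90_000]

print(f"Mean without outlier:   ${mean(comps):,.0f}")                # $309,000
print(f"Mean with outlier:      ${mean(comps_with_outlier):,.0f}")   # $272,500
print(f"Median without outlier: ${median(comps):,.0f}")              # $310,000
print(f"Median with outlier:    ${median(comps_with_outlier):,.0f}") # $307,500
```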

Since this knowledge is out there (at least printed in a major newspaper), does this mean consumers will be informed of these algorithm features when they look at websites like Zillow? I imagine it could be tricky to easily explain how removing some of the housing comparison data is actually a good thing but if the long-term goal is better numeracy for the public, this could be a good addition to such websites.

Yahoo News leads with Chicago murders and then says it is not the murder capital

If the point of a news story/video is to say something is not true, would you lead with the data from the not true side?


Here is the way this seems to work: grab your attention with a publicly available statistic that stands out. Oh my, how could there be so many murders in one city?? But, several sentences later, tell the reader/viewer that multiple other cities have a higher murder rate. And include in the last sentence that the murder number in Chicago has been down in recent years. So, wait: Chicago really isn’t the murder capital?

I’m trying to figure out how this adds to the public discourse. Here are a few possibilities:

  1. It is simply about clicks. Get people’s attention with a statistic and a video, throw in some data. Easy to produce, not much content.
  2. The goal is to highlight the still-high number of murders in Chicago.
  3. The goal is to point out that other cities actually experience more murders per capita.
  4. To give those who teach statistics an example of how data can be twisted and/or used without telling much of a story.


Bad argument: “I turned out fine”

An Australian parenting expert details why making an “I turned out fine” argument does not work:

It’s what’s known as an anecdotal fallacy. This fallacy, in simple terms, states that “I’m not negatively affected (as far as I can tell), so it must be O.K. for everyone.” As an example: “I wasn’t vaccinated, and I turned out fine. Therefore, vaccination is unnecessary.” We are relying on a sample size of one. Ourselves, or someone we know. And we are applying that result to everyone.

It relies on a decision-making shortcut known as the availability heuristic. Related to the anecdotal fallacy, it’s where we draw on information that is immediately available to us when we make a judgment call. In this case, autobiographical information is easily accessible — it’s already in your head. We were smacked as kids and turned out fine, so smacking doesn’t hurt anyone. But studies show that the availability heuristic is a cognitive bias that can cloud us from making accurate decisions utilizing all the information available. It blinds us to our own prejudices.

It dismisses well-substantiated, scientific evidence. To say “I turned out fine” is an arrogant dismissal of an alternative evidence-based view. It requires no perspective and no engagement with an alternative perspective. The statement closes off discourse and promotes a single perspective that is oblivious to alternatives that may be more enlightened. Anecdotal evidence often undermines scientific results, to our detriment.

It leads to entrenched attitudes. When views inconsistent with our own are shared we make an assumption that whoever holds those views is not fine, refusing to engage, explore or grow. Perhaps an inability to engage with views that run counter to our own suggests that we did not turn out quite so “fine.”

One data point does not make for a broad understanding of how the world works. A single case can illustrate larger trends – but it does not necessarily describe all that happens.

I wonder if one of the issues with the health patterns discussed here is that many people do indeed turn out fine even though there is clear evidence that a certain behavior leads to bad outcomes. Take the example of not wearing a seat belt while riding in the car. Even though more than 30,000 Americans die each year in accidents, the majority of people do not die and most driving goes by without event. Accidents are common in the aggregate, but for any one person they are rare. Many people could indeed say they turned out fine even though the behavior in question is still bad for people overall.
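A rough calculation shows why this is so common; the population figure and the assumption that risk is spread evenly across everyone are simplifications for illustration, not a real risk model:

```python
# Simplified illustration: aggregate deaths vs. individual-level odds.
annual_traffic_deaths = 30_000    # figure cited above
us_population = 330_000_000       # rough U.S. population (assumption)

annual_risk = annual_traffic_deaths / us_population
print(f"Annual risk per person: about {annual_risk:.4%}")                   # ~0.0091%

# Even across 50 years of exposure, the vast majority are still "fine"
print(f"Share 'fine' after 50 years: about {(1 - annual_risk) ** 50:.1%}")  # ~99.5%
```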

Speculating on why sociology is less relevant to the media and public than economics

In calling for more sociological insight into economics, a journalist who attended the recent ASA meetings in Philadelphia provides two reasons why sociology lags behind economics in public attention:

Economists, you see, put draft versions of their papers online seemingly as soon as they’ve finished typing. Attend their big annual meeting, as I have several times, and virtually every paper discussed is available beforehand for download and perusal. In fact, they’re available even if you don’t go to the meeting. I wrote a column two years ago arguing that this openness had given economists a big leg up over the other social sciences in media attention and political influence, and noting that a few sociologists agreed and were trying to nudge their discipline — which disseminates its research mainly through paywalled academic journals and university-press books — in that direction with a new open repository for papers called SocArxiv. Now that I’ve experienced the ASA annual meeting for the first time, I can report that (1) things haven’t progressed much since 2016, and (2) I have a bit more sympathy for sociologists’ reticence to act like economists, although I continue to think it’s holding them back.

SocArxiv’s collection of open-access papers is growing steadily if not spectacularly, and Sociological Science, an open-access journal founded in 2014, is carving out a respected role as, among other things, a place to quickly publish articles of public interest. “Unions and Nonunion Pay in the United States, 1977-2015” by Patrick Denice of the University of Western Ontario and Jake Rosenfeld of Washington University in St. Louis, for example, was submitted June 12, accepted July 10 and published on Wednesday, the day after it was presented at the ASA meeting. These dissemination tools are used by only a small minority of sociologists, though, and the most sparsely attended session I attended in three-plus days at their annual meeting was the one on “Open Scholarship in Sociology” organized by the University of Maryland’s Philip Cohen, the founder of SocArxiv and one of the discipline’s most prominent social-media voices. This despite the fact that it was great, featuring compelling presentations by Cohen, Sociological Review deputy editor Kim Weeden of Cornell University and higher-education expert Elizabeth Popp Berman of the State University of New York at Albany, and free SocArxiv pens for all.

As I made the rounds of other sessions, I did come to a better understanding of why sociologists might be more reticent than economists to put their drafts online. The ASA welcomes journalists to its annual meeting and says they can attend all sessions where research is presented, but few reporters show up and it’s clear that most of those presenting research don’t consider themselves to be speaking in public. The most dramatic example of this in Philadelphia came about halfway through a presentation involving a particular corporation. The speaker paused, then asked the 50-plus people in the room not to mention the name of said corporation to anybody because she was about to return to an undercover job there. That was a bit ridiculous, given that there were sociologists live-tweeting some of the sessions. But there was something charming and probably healthy about the willingness of the sociologists at the ASA meeting to discuss still-far-from-complete work with their peers. When a paper is presented at an economics conference, many of the discussant’s comments and audience questions are attempts to poke holes in the reasoning or methodology. At the ASA meeting, it was usually, “This is great. Have you thought about adding …?” Also charming and probably healthy was the high number of graduate students presenting research alongside the professors, which you don’t see so much at the economists’ equivalent gathering.

All in all — and I’m sure there are sociological terms to describe this, but I’m not familiar with them — sociology seems more focused on internal cohesion than economics is. This may be partly because it’s what Popp Berman calls a “low-consensus discipline,” with lots of different methodological approaches and greatly varying standards of quality and rigor. Economists can be mean to each other in public yet still present a semi-united face to the world because they use a widely shared set of tools to arrive at answers. Sociologists may feel that they don’t have that luxury.

Disciplinary differences can be mystifying at times.

I wonder about a third possible difference in addition to the two provided: different conceptions in sociology and economics about what constitutes good arguments and data (hinted at above with the idea of “lots of different methodological approaches and greatly varying standards of quality and rigor”). Both disciplines aspire to the idea of social science, where empirical data is used to test hypotheses about how human behavior, usually in collectives, works. But this is tricky to do, as there are numerous pitfalls along the way. For example, accurate measurement is difficult even when a researcher has clearly identified a concept. Additionally, it is my sense that sociologists as a whole may be more open to both qualitative and quantitative data (even with occasional flare-ups between researchers studying the same topic yet falling in different methodological camps). With these methodological questions, sociologists may feel they need more time to connect their methods to a convincing causal and scientific argument.

A fourth possible reason behind the differences (also hinted at above with the idea of economists having a “semi-united face” to present): sociology has a reputation as a more left-leaning discipline. Some researchers may prefer to have all their ducks in a row before they expose their work to full public scrutiny. The work of economists is more generally accepted by the public and some leaders while sociology regularly has to work against some backlash. (As an example, see conservative leaders complain about sociology excusing poor behavior when the job of the discipline is to explain human behavior.) Why expose your work to a less welcoming public earlier when you could take a little more time to polish the argument?