Changes in methodology behind Naperville’s move to #16 best place to live in 2022 from #45 in 2021?

Money recently released its 2022 list of the Best Places to Live in the United States. The Chicago suburb of Naperville is #16 in the country. Last year, it was #45. How did it move so much in one year? Is Naperville that much better after one year, are other places that much worse, or is something else at work? I wonder if the methodology led to this. Here is what went into the 2022 rankings:


Chief among those changes included introducing new data related to national heritage, languages spoken at home and religious diversity — in addition to the metrics we already gather on racial diversity. We also weighted these factors highly. While seeking places that are diverse in this more traditional sense of the word, we also prioritized places that gave us more regional diversity and strove to include cities of all sizes by lifting the population limit that we often relied on in previous years. This opened up a new tier of larger (and often more diverse) candidates.

With these goals in mind, we first gathered data on places that:

  • Had a population of at least 20,000 people — and no population maximum
  • Had a population that was at least 85% as racially diverse as the state
  • Had a median household income of at least 85% of the state median

Here is what went into the 2021 rankings:

To create Money’s Best Places to Live ranking for 2021-2022, we considered cities and towns with populations ranging from 25,000 up to 500,000. This range allowed us to surface places large enough to have amenities like grocery stores and a nearby hospital, but kept the focus on somewhat lesser known spots around the United States. The largest place on our list this year has over 457,476 residents and the smallest has 25,260.

We also removed places where:

  • the crime risk is more than 1.5x the national average
  • the median income level is lower than its state’s median
  • the population is declining
  • there is effectively no ethnic diversity
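
To see how much those screens differ, here is a minimal sketch of the two filters, assuming a hypothetical pandas DataFrame with made-up column names and values; it omits 2021's crime-risk and population-trend filters for brevity.

```python
import pandas as pd

# Hypothetical places with invented numbers; the real screens used Census data.
places = pd.DataFrame({
    "name": ["Suburb A", "Big City B", "Town C"],
    "population": [45_000, 900_000, 24_000],
    "diversity_index": [0.55, 0.72, 0.30],        # assumed racial diversity measure
    "state_diversity": [0.60, 0.60, 0.60],
    "median_income": [82_000, 71_000, 54_000],
    "state_median_income": [68_000, 68_000, 68_000],
})

# 2022-style screen: at least 20,000 residents with no maximum; diversity
# and income each at least 85% of the state figure.
screen_2022 = places[
    (places["population"] >= 20_000)
    & (places["diversity_index"] >= 0.85 * places["state_diversity"])
    & (places["median_income"] >= 0.85 * places["state_median_income"])
]

# 2021-style screen: population between 25,000 and 500,000, and a median
# income at or above the state median.
screen_2021 = places[
    places["population"].between(25_000, 500_000)
    & (places["median_income"] >= places["state_median_income"])
]

print(screen_2022["name"].tolist())  # ['Suburb A', 'Big City B']
print(screen_2021["name"].tolist())  # ['Suburb A']
```

Lifting the population cap alone admits a tier of large cities the 2021 screen excluded, and the relaxed income threshold (85% of the state median rather than at or above it) changes the candidate pool further.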

In 2021, the top-ranked communities tended to be suburbs. In 2022, there is a mix of big cities and suburbs, with Atlanta at the top of the list and one Chicago neighborhood, Rogers Park, at #5.

So how will this get reported? Did Naperville make a significant leap? Is it only worth highlighting the #16 ranking in 2022 and ignoring the previous year's lower ranking? Even though Naperville has regularly featured in Money's list (and in other rankings as well), #16 can be viewed as an impressive feat.

Why it can take months for rent prices to show up in official data

It will take time for current rent prices to contribute to measures of inflation:


To solve this conundrum, the best place to start is to understand that rents are different from almost any other price. When the price of oil or grain goes up, everybody pays more for that good, at the same time. But when listed rents for available apartments rise, only new renters pay those prices. At any given time, the majority of tenants surveyed by the government are paying rent at a price locked in earlier.

So when listed rents rise or fall, those changes can take months before they’re reflected in the national data. How long, exactly? “My gut feeling is that it takes six to eight months to work through the system,” Michael Simonsen, the founder of the housing research firm Altos, told me. That means we can predict two things for the next six months: first, that official measures of rent inflation are going to keep setting 21st-century records for several more months, and second, that rent CPI is likely to peak sometime this winter or early next year.

This creates a strange but important challenge for monetary policy. The Federal Reserve is supposed to be responding to real-time data in order to determine whether to keep raising interest rates to rein in demand. But a big part of rising core inflation in the next few months will be rental inflation, which is probably past its peak. The more the Fed raises rates, the more it discourages residential construction—which not only reduces overall growth but also takes new homes off the market. In the long run, scaled-back construction means fewer houses—which means higher rents for everybody.

To sum up: This is all quite confusing! The annual inflation rate for new rental listings has almost certainly peaked. But the official CPI rent-inflation rate is almost certainly going to keep going up for another quarter or more. This means that, several months from now, if you turn on the news or go online, somebody somewhere will be yelling that rental inflation is out of control. But this exclamation might be equivalent to that of a 17th-century citizen going crazy about something that happened six months earlier—the news simply took that long to cross land and sea.

This sounds like a research methods problem: how do we get more up-to-date data into the current measures? A few quick ideas:

  1. Survey rent listings to see what landlords are asking for.
  2. Survey new renters to better track more recent rent prices.
  3. Survey landlords as to the prices of the recent units they rented.

Given how much rides on important economic measures such as the inflation rate, more up-to-date data would be helpful.
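
The lag itself is easy to simulate. Below is a toy sketch with invented numbers (a 10 percent jump in asking rents and one-twelfth of tenants renewing each month), not a model of the actual CPI methodology:

```python
# Toy model: asking rent jumps 10% (from $1,000 to $1,100) in month 0.
# Each month, 1/12 of tenants sign new leases at the asking rent;
# everyone else keeps paying a price locked in earlier.
asking = 1_100.0
avg_paid = 1_000.0
turnover = 1 / 12

for month in range(1, 25):
    avg_paid = (1 - turnover) * avg_paid + turnover * asking
    if month % 6 == 0:
        gap = (asking - avg_paid) / asking
        print(f"month {month:2d}: average paid ${avg_paid:,.0f} ({gap:.0%} below asking)")
```

Even this crude version leaves average paid rent trailing asking rents a year after the jump, in line with Simonsen's six-to-eight-month gut feeling about how long changes take to work through the system.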

Recent quote on doing agendaless science

Vaclav Smil’s How The World Really Works offers an analysis of the foundational materials and processes behind life in 2022 and what these portend for the near future. It also includes this as the second-to-last paragraph of the book:

Is it possible to have no agenda in carrying out analysis and writing such an overview?

Much of what Smil describes and then extrapolates from could be viewed as having an agenda. This agenda could be scientism. On one hand, he reveals some of the key forces at work in our world; on the other, he interprets what these mean now and at other times. The writing suggests he knows this; he makes points similar to the one quoted above throughout the book to address the issue.

I feel this tension when teaching Research Methods in Sociology. Sociology has had a stream of positivism and scientific analysis from its early days, wanting to apply a more dispassionate method to the social world. It also has early strands of a different approach, less beholden to the scientific method and allowing for additional forms of social analysis. These strands continue today and make for an interesting challenge and opportunity in teaching a plurality of methods within a single discipline.

I learned multiple things from the book. I also will need time to ponder the implications and the approach.

“Journalism is sociology on fast forward”

Listening to 670 The Score at 12:14 PM today, I heard Leila Rahimi say this about journalism:


Journalism is sociology on fast forward.

I can see the logic in this as journalists and sociologists are interested in finding out what is happening in society. They are interested in trends, institutions, patterns, people in different roles and with different levels of access to power and resources, and narratives.

There are also significant differences between the two fields. One is hinted at in the quote above: different timelines. A typical sociology project, from idea to publication in some form, could take 4-6 years (a rough average). Journalists usually work on shorter timelines and face stronger pressures to generate content quickly.

Related to this timing issue is the difference in methods for understanding and analyzing data and evidence. Sociologists use a large number of quantitative and qualitative methods, follow the scientific method, and take longer periods of time to analyze and write up conclusions. Sociologists see themselves more as social scientists, not just describers of social realities.

I am sure there are plenty of sociologists and journalists with thoughts on this. It would be interesting to see where they see convergence and divergence between the two fields.

The difficulty of collecting, interpreting, and acting on data quickly in today’s world

I do not think the issue is just limited to the problems with data during COVID-19:


If, after reading this, your reaction is to say, “Well, duh, predictions are difficult. I’d like to see you try it”—I agree. Predictions are difficult. Even experts are really bad at making them, and doing so in a fast-moving crisis is bound to lead to some monumental errors. But we can learn from past failures. And even if only some of these miscalculations were avoidable, all of them are instructive.

Here are four reasons I see for the failed economic forecasting of the pandemic era. Not all of these causes speak to every failure, but they do overlap…

In a crisis, credibility is extremely important to garnering policy change. And failed predictions may contribute to an unhealthy skepticism that much of the population has developed toward expertise. Panfil, the housing researcher, worries about exactly that: “We have this entire narrative from one side of the country that’s very anti-science and anti-data … These sorts of things play right into that narrative, and that is damaging long-term.”

My sense as a sociologist is that the world is in a weird position: people expect relatively quick solutions to complex problems, there is plenty of data to think about (even as the quality of the data varies widely), and there are a lot of actors interpreting and acting on data or evidence. Put this all together and it can be difficult to collect good data, make sound interpretations of that data, and make good choices about acting on those interpretations.

In addition, making predictions about the future is already difficult even with good information, interpretation, and policy options.

So, what should social scientists take from this? I would hope we can continue to improve our ability to respond quickly and well to changing conditions. Typical research cycles take years, but that timeline is not workable in certain situations. There are newer methodological options that allow for quicker data collection and new kinds of data; all of this needs to be evaluated and tested. We also need better processes for reaching consensus more quickly.

Will we ever be at a point where society is predictable? This might be the ultimate dream of social science if only we had enough data and the correct models. I am skeptical but certainly our methods and interpretation of data can always be improved.

Illinois lost residents 2010 to 2020; discrepancies in year-to-year estimates and decennial count

Illinois lost residents over the last decade. But, different Census estimates at different times created slightly different stories:


Those estimates showed Illinois experiencing a net loss of 9,972 residents between 2013 and 2014; 22,194 residents between 2014 and 2015; 37,508 residents between 2015 and 2016; about 33,700 residents between 2016 and 2017; 45,116 between 2017 and 2018; 51,250 between 2018 and 2019; and 79,487 between 2019 and 2020…

On April 26, the U.S. Census Bureau released its state-by-state population numbers based on last year’s census. These are the numbers that determine congressional apportionment. Those numbers, released every 10 years, show a different picture for Illinois: a loss of about 18,000 residents since 2010.

What’s the deal? For starters, the two counting methods for estimated annual population and the 10-year census for apportionment are separate. Apples and oranges. Resident population numbers and apportionment population numbers are arrived at differently, with one set counting Illinois families who live overseas, including in the military, and one not.

Additionally, the every-10-years number is gathered not from those county-by-county metrics but from the census forms we fill out and from door-to-door contacts made by census workers on the ground.

The overall story is the same but this is a good reminder of how different methods can produce different results. Here are several key factors to keep in mind:

  1. The time period is different: one estimate comes every year, one every ten years. The yearly estimates are helpful because people want timely data. That does not necessarily mean they can be trusted as much as the decennial figures.
  2. The method in each version – yearly versus every ten years – is different. The decennial count involves more responses and requires more effort.
  3. Because of #2, the confidence in the two kinds of figures differs. The decennial numbers are more valid because they rest on more data.

Theoretically, the year-to-year estimates could tell a different story than the decennial count. Imagine yearly data showing a slight increase in population while the ten-year numbers showed a slight decrease. Such a divergence would not necessarily mean the process went wrong, any more than the agreement between the yearly and ten-year figures here proves it went right. With estimates, researchers are trying their best to measure the full population patterns, but there is some room for error.
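
The scale of the discrepancy is easy to check by summing the annual losses quoted above, though note the windows do not align: the quoted annual estimates run 2013 to 2020, while the decennial count covers 2010 to 2020.

```python
# Annual net losses from the Census Bureau estimates quoted above (2013-2020).
annual_losses = [9_972, 22_194, 37_508, 33_700, 45_116, 51_250, 79_487]

estimated_total = sum(annual_losses)  # 279,227
decennial_loss = 18_000               # approximate loss in the 2020 count since 2010

print(f"Sum of annual estimated losses: {estimated_total:,}")
print(f"Decennial count loss:           {decennial_loss:,}")
print(f"Gap between the two stories:    {estimated_total - decennial_loss:,}")
```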

That said, now that Illinois is known as one of the three states that lost population over the last decade, it will be interesting to see how politicians and business leaders respond. I can predict some of the responses already, as different groups have practiced their talking points for years. Yet the same old rhetoric may not be enough, as these figures paint Illinois in a bad light in a country where population growth is seen as a good thing.

Researchers adjust as Americans say they are more religious when asked via phone versus responding online

Research findings suggest Americans answer questions about religiosity differently depending on the mode of the survey:


Researchers found the cause of the “noise” when they compared the cellphone results with the results of their online survey: social desirability bias. According to studies of polling methods, people answer questions differently when they’re speaking to another human. It turns out that sometimes people overstate their Bible reading if they suspect the people on the other end of the call will think more highly of them if they engaged the Scriptures more. Sometimes, they overstate it a lot…

Smith said that when Pew first launched the trend panel in 2014, there was no major difference between answers about religion online and over the telephone. But over time, he saw a growing split. Even when questions were worded exactly the same online and on the phone, Americans answered differently on the phone. When speaking to a human being, for example, they were much more likely to say they were religious. Online, more people were more comfortable saying they didn’t go to any kind of religious service or listing their religious affiliation as “none.”…

After re-weighting the online data set with better information about the American population from its National Public Opinion Reference Survey, Pew has decided to stop phone polling and rely completely on the online panels…

Pew’s analysis finds that, today, about 10 percent of Americans will say they go to church regularly if asked by a human but will say that they don’t if asked online. Social scientists and pollsters cannot say for sure whether that social desirability bias has increased, decreased, or stayed the same since Gallup first started asking religious questions 86 years ago.

This shift regarding studying religion highlights broader considerations about methodology that are always helpful to keep in mind:

  1. Both methods and people/social conditions change. More and more surveying (and other data collection) is done via the Internet and other technologies. This might change who responds, how people respond, and more. At the same time, actual religiosity changes and social scientists try to keep up. This is a dynamic process that should be expected to change over time to help researchers get better and better data.
  2. Social desirability bias is not the same as people lying to researchers or being dishonest with them. That implies an intentional false answer. This is more about context: the mode of the survey – phone or online – influences whom the respondent is responding to. In a human interaction, we might respond differently: we act with impression management in mind, wanting to be viewed in particular ways by the person with whom we are interacting.
  3. Studying any aspect of religiosity benefits from multiple methods and multiple approaches to the same phenomena under study. A single measure of church attendance can tell us something but getting multiple data points with multiple methods can help provide a more complete picture. Surveys have particular strengths but they are not great in other areas. Results from surveys should be put alongside other data drawn from interviews, ethnographies, focus groups, historical analysis, and more to see what consensus can be reached. All of this might be out of the reach of individual researchers or single research projects but the field as a whole can help find the broader patterns.
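
The re-weighting Pew mentions is, at its core, post-stratification: adjusting respondents' weights so the sample matches known population benchmarks. Here is a minimal sketch with hypothetical numbers; the actual adjustment against the National Public Opinion Reference Survey involves many more variables.

```python
# Hypothetical example: an online panel that overrepresents college graduates.
sample_shares = {"college": 0.55, "no_college": 0.45}
population_shares = {"college": 0.35, "no_college": 0.65}  # assumed benchmark

# Post-stratification weight = population share / sample share.
weights = {group: population_shares[group] / sample_shares[group]
           for group in sample_shares}

for group, weight in weights.items():
    print(f"{group}: weight {weight:.2f}")
# college: 0.64 (each response counts for less)
# no_college: 1.44 (each response counts for more)
```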

The Census is a national process yet works better with local census takers

Among other interesting tidbits about how data was collected for the 2020 census, here is why it is helpful for census takers to be from the community in which they collect data:


As it turns out, the mass mobilization of out-of-state enumerators is not just uncommon, but generally seen as a violation of the spirit of the census. “One of the foundational concepts of a successful door-knocking operation is that census takers will be knowledgeable about the community in which they’re working,” Lowenthal explained. “This is both so they can do a good job, because they’ll have to understand local culture and hopefully the language, but also so that the people who have to open their doors and talk to them have some confidence in them.”

Going door to door is a difficult task. Some connection to the community could help convince people to cooperate. And when cooperation means higher response rates and more accurate data, local knowledge is valuable.

As the piece goes on to note, this does not mean that outside census takers could not help. Having more people going to every address could help boost response rates even if the census takers were from a different part of the country.

I wonder how much local knowledge influences the response rates from proxies, other people who can provide basic demographic information when people at the address do not respond:

According to Terri Ann Lowenthal, a former staff director for the House census oversight subcommittee, 22 percent of cases completed by census takers in 2010 were done so using data taken from proxies. And of those cases, roughly a quarter were deemed useless by the Census Bureau. As a result, millions of people get missed while others get counted twice. These inaccuracies tend to be more frequent in urban centers and tribal areas, but also, as I eventually learned, in rural sections of the country.
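
Taken together, those two figures imply that a meaningful slice of all enumerator-completed cases rested on unusable proxy data:

```python
# Figures quoted above: 22% of enumerator-completed cases in 2010 used
# proxy data, and roughly a quarter of those were deemed useless.
proxy_share = 0.22
useless_among_proxies = 0.25

print(f"{proxy_share * useless_among_proxies:.1%} of enumerator-completed cases")  # 5.5%
```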

It is one thing to have the imprimatur of the Census when talking with a proxy; it would seem to be a bonus to also be a local.

More broadly, this is a reminder of how an important data collection process depends in part on local workers. With a little inside knowledge and awareness, the Census can gather better data, and that information can then effectively serve many.

Combating abysmally low response rates for political polling

One pollster describes the difficulty today in reaching potential voters:


As the years drifted by, it took more and more voters per cluster for us to get a single voter to agree to an interview. Between 1984 and 1989, when caller ID was rolled out, more voters began to ignore our calls. The advent of answering machines and then voicemail further reduced responses. Voters screen their calls more aggressively, so cooperation with pollsters has steadily declined year-by-year. Whereas once I could extract one complete interview from five voters, it can now take calls to as many as 100 voters to complete a single interview, even more in some segments of the electorate…

I offer my own experience from Florida in the 2020 election to illustrate the problem. I conducted tracking polls in the weeks leading up to the presidential election. To complete 1,510 interviews over several weeks, we had to call 136,688 voters. In hard-to-interview Florida, only 1 in 90-odd voters would speak with our interviewers. Most calls to voters went unanswered or rolled over to answering machines or voicemail, never to be interviewed despite multiple attempts.

The final wave of polling, conducted Oct. 25-27 to complete 500 interviews, was the worst for cooperation. We could finish interviews with only four-tenths of one percent from our pool of potential respondents. As a result, this supposed “random sample survey” seemingly yielded, as did most all Florida polls, lower support for President Trump than he earned on Election Day.

After the election, I noted wide variations in completion rates across different categories of voters, but nearly all were still too low for any actual randomness to be assumed or implied.

This is a basic Research Methods class issue: if you cannot collect a good sample, you are going to have a hard time reflecting reality for the population.
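
The quoted figures make the scale of the problem easy to verify:

```python
# Florida 2020 tracking-poll figures quoted above.
completed_interviews = 1_510
voters_called = 136_688

print(f"Response rate: {completed_interviews / voters_called:.2%}")                  # ~1.10%
print(f"Calls per completed interview: {voters_called / completed_interviews:.0f}")  # ~91

# When roughly 99% of sampled voters never respond, those who do answer
# may differ systematically from those who do not, and random sampling
# math alone cannot correct that nonresponse bias.
```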

Here is the part I understand less. This is not a new issue. As noted above, response rates have been falling for decades. Part of it is new technology. Some of it involves new behavior, such as ignoring phone calls or distrusting political polling. The sheer amount of polling and data collection that takes place now can also lead to survey fatigue.

But it is interesting that the techniques used to collect this data are roughly the same. Of course, polling has moved from landlines to cell phones, and perhaps even to texting or recruited online pools of potential voters. The technology has changed somewhat, but the idea is similar: reach out to a broad set of people and hope a sufficiently representative sample responds.

Perhaps it is time for new techniques. The old ones have some advantages: they can reach a large number of people relatively quickly, and researchers and consultants are used to them. I do not have the answers for what might work better. Researchers embedded in different communities who could collect data over time? Finding public spaces frequented by diverse populations and approaching people there? Working more closely with bellwether or representative places or populations to track what is going on there?

Even with these low response rates, polling can still tell us something. It is not as bad as picking randomly or flipping a coin. Yet it has not been accurate enough in recent years. If researchers want to collect valid and reliable polling data in the future, new approaches may be in order.

Fewer plane flights, worse weather forecasts, collecting data

The consequences of COVID-19 continue: with fewer commercial airline flights, weather models have less data.


During their time in the skies, commercial airplanes regularly log a variety of meteorological data, including air temperature, relative humidity, air pressure and wind direction — data that is used to populate weather prediction models…

With less spring meteorological data to work with, forecasting models have produced less accurate predictions, researchers said. Long-term forecasts suffered the most from a lack of meteorological data, according to the latest analysis…

Forecast accuracy suffered the most across the United States, southeast China and Australia, as well as more remote regions like the Sahara Desert, Greenland and Antarctica.

Though Western Europe experienced an 80 to 90 percent drop in flight traffic during the height of the pandemic, weather forecasts in the region remained relatively accurate. Chen suspects the region’s densely-packed network of ground-based weather stations helped forecasters continue to populate models with sufficient amounts of meteorological data.

Models, whether for pandemics or weather, need good input. Better data up front helps researchers adjust models to fit past patterns and predict future outcomes. Absent good data, it can be hard to fit models, especially in complex systems like weather.
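
As a crude illustration of the general point (synthetic data, not a weather model), fitting even a simple model with fewer observations produces noticeably less stable estimates:

```python
import random

random.seed(42)

def fit_slope(n_points):
    """Least-squares slope for noisy data generated from y = 2x + noise."""
    xs = [random.uniform(0, 10) for _ in range(n_points)]
    ys = [2 * x + random.gauss(0, 3) for x in xs]
    mean_x = sum(xs) / n_points
    mean_y = sum(ys) / n_points
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Many observations: estimates cluster near the true slope of 2.
# Few observations (as when flight data disappears): estimates wander.
for n in (500, 20):
    estimates = [round(fit_slope(n), 2) for _ in range(5)]
    print(f"n={n}: {estimates}")
```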

As noted above, there are other ways to obtain weather data. Airplanes offered a convenient way to collect it: thousands of regular flights could generate a lot of data. In contrast, constructing ground stations would require more resources in the short term.

Yet any data collector needs to remain flexible. One source of data can disappear, requiring a new approach. Or a new opportunity might arise that makes switching methods sensible. Or those studying and predicting weather could develop multiple good sources of data that provide options or redundancy amid black swan events.

Few may recognize that all of this is happening. Weather forecasts will continue. Behind the scenes, we might even get better weather models in the long run as researchers and meteorologists adjust.