The importance of the decision of where to raise a child

A data scientist argues that one of the most important parenting decisions is where to raise children:

Something interesting happens when we compare the study on adoptions with this work on neighborhoods. We find that one factor about a home—its location—accounts for a significant fraction of the total effect of that home. In fact, putting together the different numbers, I have estimated that some 25 percent—and possibly more—of the overall effects of a parent are driven by where that parent raises their child. In other words, this one parenting decision has much more impact than many thousands of others.

Why is this decision so powerful? Chetty’s team has a possible answer for that. Three of the biggest predictors that a neighborhood will increase a child’s success are the percent of households in which there are two parents, the percent of residents who are college graduates, and the percent of residents who return their census forms. These are neighborhoods, in other words, with many role models: adults who are smart, accomplished, engaged in their community, and committed to stable family lives.

There is more evidence for just how powerful role models can be. A different study that Chetty co-authored found that girls who move to areas with lots of female patent holders in a specific field are far more likely to grow up to earn patents in that same field. And another study found that Black boys who grow up on blocks with many Black fathers around, even if that doesn’t include their own father, end up with much better life outcomes.

I will add this to my list of why it matters where people choose to live: it affects the life chances of kids.

Having this data only goes so far, though. A few examples of where it gets trickier to figure out what to do with such information:

  1. How many parents would act on the information compared to other reasons for choosing where to live?
  2. How many parents could act on this information even if they wanted to?
  3. Are there enough neighborhoods in which children could benefit? Do the current residents of such neighborhoods want lots of people moving in?
  4. Are parents responsible for moving kids to such locations or are other actors responsible for helping kids live in these locations?

And so on. The implications of these findings could take decades to work out, particularly as Americans generally want to provide opportunities for their kids.

Estimating the undercounts and overcounts of the 2020 Census

The decennial census is a big undertaking. And the work continues: the Census Bureau just released their estimates of how well the 2020 counts reflect the population of the United States.

“Today’s results show statistical evidence that the quality of the 2020 Census total population count is consistent with that of recent censuses. This is notable, given the unprecedented challenges of 2020,” said Director Robert L. Santos. “But the results also include some limitations — the 2020 Census undercounted many of the same population groups we have historically undercounted, and it overcounted others.”

The two analyses are from the Post-Enumeration Survey (PES) and Demographic Analysis Estimates (DA) and estimate how well the 2020 Census counted everyone in the nation and in certain demographic groups. They estimate the size of the U.S. population and then compare those estimates to the census counts…

The results show that the 2020 Census undercounted the Black or African American population, the American Indian or Alaska Native population living on a reservation, the Hispanic or Latino population, and people who reported being of Some Other Race.

On the other hand, the 2020 Census overcounted the Non-Hispanic White population and the Asian population. The Native Hawaiian or Other Pacific Islander population was neither overcounted nor undercounted according to the findings.

Among age groups, the 2020 Census undercounted children 0 to 17 years old, particularly young children 0 to 4 years old. Young children are persistently undercounted in the decennial census.

I can imagine how some might read this story: the Census uses estimates and additional data to make claims about what is supposed to be a comprehensive count? Here are some quick thoughts in response:

  1. The numbers might sound like a lot: an undercount of the total population of 18.8 million? Yet, the error rates for separate groups are often reported between 1% and 4% and the total is off by less than 6% (see the quick arithmetic after this list).
  2. If the official numbers are known to be overcounts or undercounts, how might researchers take that into account when using the data?
  3. The Census is using multiple data sources to try to both get the most accurate statistics and improve its methodology. Explaining this publicly hopefully helps build trust in the process and the numbers.
  4. It will be interesting to see how all of this informs future data gathering efforts. If there are consistent undercounts with certain groups, what changes in the coming years? If other data sources provide useful information, such as vital records, can these be incorporated into the data? And so on.
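
To make the first item concrete, here is a minimal back-of-the-envelope sketch of that arithmetic. The only number not in the post is the denominator; I assume the 2020 Census total resident population of roughly 331.4 million.

```python
# Rough check on the census error figures cited in item 1.
# Assumption: 2020 Census total resident population of ~331.4 million.
total_population = 331_400_000
reported_discrepancy = 18_800_000  # the 18.8 million figure from item 1

error_rate = reported_discrepancy / total_population
print(f"Total error rate: {error_rate:.1%}")  # ~5.7%, i.e., "less than 6%"
```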

Collecting data about the population of a large country is no easy task and is a work in progress.

Illinois lost residents 2010 to 2020; discrepancies between year-to-year estimates and the decennial count

Illinois lost residents over the last decade. But, different Census estimates at different times created slightly different stories:

Those estimates showed Illinois experiencing a net loss of 9,972 residents between 2013 and 2014; 22,194 residents between 2014 and 2015; 37,508 residents between 2015 and 2016; about 33,700 residents between 2016 and 2017; 45,116 between 2017 and 2018; 51,250 between 2018 and 2019; and 79,487 between 2019 and 2020…

On April 26, the U.S. Census Bureau released its state-by-state population numbers based on last year’s census. These are the numbers that determine congressional apportionment. Those numbers, released every 10 years, show a different picture for Illinois: a loss of about 18,000 residents since 2010.

What’s the deal? For starters, the two counting methods for estimated annual population and the 10-year census for apportionment are separate. Apples and oranges. Resident population numbers and apportionment population numbers are arrived at differently, with one set counting Illinois families who live overseas, including in the military, and one not.

Additionally, the every-10-years number is gathered not from those county-by-county metrics but from the census forms we fill out and from door-to-door contacts made by census workers on the ground.

The overall story is the same, but this is a good reminder of how different methods can produce different results. Here are several key factors to keep in mind:

  1. The time period is different. One estimate comes every year; the other comes every ten years. The yearly estimates are helpful because people like data, but that does not necessarily mean they can be trusted as much as the decennial count.
  2. The method in each version – yearly versus every ten years – is different. The decennial data involves more responses and requires more effort.
  3. The confidence in the two different kinds of estimates is different because of #2. The ten-year count is more valid because it collects more data.

Theoretically, the year-to-year estimates could lead to a different story than the decennial count. Imagine year-to-year data that told of a slight increase in population while the ten-year numbers showed a slight decrease. That would not necessarily mean the process went wrong, any more than it did here where the yearly and ten-year figures agreed on the direction. With estimates, researchers are trying their best to measure the full population patterns. But, there is some room for error.
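
As a quick illustration of how far apart the two methods ran for Illinois, here is a minimal sketch summing the year-to-year losses quoted above (the 2016–2017 figure was reported as approximate; note, too, that the yearly series starts in 2013 while the decennial comparison spans 2010 to 2020):

```python
# Year-to-year estimated population losses for Illinois, from the quoted article.
yearly_losses = {
    "2013-2014": 9_972,
    "2014-2015": 22_194,
    "2015-2016": 37_508,
    "2016-2017": 33_700,  # reported as "about 33,700"
    "2017-2018": 45_116,
    "2018-2019": 51_250,
    "2019-2020": 79_487,
}

total_estimated_loss = sum(yearly_losses.values())  # 279,227 over 2013-2020
decennial_loss = 18_000  # "a loss of about 18,000 residents since 2010"

print(f"Sum of yearly estimated losses: {total_estimated_loss:,}")
print(f"Decennial count loss: about {decennial_loss:,}")
```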

That said, now that Illinois is known as one of the three states that lost population over the last decade, it will be interesting to see how politicians and business leaders respond. I can predict some of the responses already, as different groups have practiced their talking points for years. Yet, the same old rhetoric may not be enough: these figures paint Illinois in a bad light in a country where population growth is generally viewed as a good thing.

Getting the categories right in the continued urban/rural-plus-suburbs-and-exurbs-in-the-middle divide

Numerous outlets have commented on the continued presence of an urban/rural divide in 2020 voting. Here is another example:

Rather than flipping more Obama-Trump counties, Biden instead exceeded previous Democratic win margins in Wisconsin’s two biggest cities, Milwaukee and Madison.

That pattern extended to Michigan and other battleground states, with Biden building upon Democrats’ dominance in urban and suburban jurisdictions but Trump leaving most of exurban and rural America awash in red.

The urban-rural divide illustrates the pronounced polarization evident in preliminary 2020 election results. The split underscores fundamental disagreements among Americans about how to control the coronavirus pandemic or whether to even try; how to revitalize the economy and restore jobs; how to combat climate change or whether it is an emergency at all; and the roles of morality, empathy and the rule of law in the body politic.

Four thoughts in reaction to this.

  1. The urban/rural divide is described in an interesting way above: it is cities and suburbs for Democrats and exurbs and rural areas for Republicans. This matches the patterns of this and recent elections. However, is separating the suburbs and exurbs worthwhile? Here is where county-level analysis may not be fine-grained enough to see the patterns. Another way to put it might be this: there is a gradient in voting by party as the distance from the big city increases. Does it shade over to Republicans only in the exurbs – the suburbs on the outer edges? Is the 50/50 split a little before the exurbs? A concentric-circles approach could help, though there still could be pockets that break with the overall pattern.
  2. Suburbs might just be too broad a term to be useful in such analysis. The exurb/suburb split is one way to put it. Might it help to also think of different types of suburbs (wealthier bedroom communities, ethnoburbs/majority-minority communities, working-class suburbs, industrial suburbs, etc.)?
  3. Explaining the differences as urban/rural has a nice short ring to it and it fits the data. Introducing more categories in the middle is interesting to campaigns, pundits, and researchers but is harder to quickly describe. Perhaps the urban/suburban versus exurban/rural divide?
  4. Is the urban/rural divide one of the most fundamental aspects of polarization? Or, is it a symptom? In this story, the divide leads off the discussion of polarization on a number of fronts. But, what leads to these spatial patterns in the first place? While the geography is helpful to think about, are the real issues behind the urban/rural divide about race/ethnicity and class? Given residential segregation patterns in the United States, using the spatial patterns as an explanation covers up a lot of important social forces that led to those patterns in the first place.

Trying to fit all the election results on one television screen

I briefly watched a number of election night broadcasts last night. One conclusion I came to: there is way too much data to fit on a television screen. And if you want more of the data, you need the Internet, not television.

The different broadcasts tried similar variations: flipping back and forth between a set of anchors and pundits at desks and analysts at a smart board showing election results from different states and locations. They have done this for enough election nights that the process is pretty established.

While they do this, there is often a lot of data on the screen. This could include: a map of the United States with states shaded; a chyron at the bottom with scrolling news; another panel at the bottom flipping through results from different races; and people talking, sometimes in connection to the data on the screen and sometimes not. If the analyst at the smart board is on the screen, there is another set of maps to consider.

CNN broadcast, November 4, 2020

This is a lot to take in and it still might not be enough. The broadcasts try to balance all of the levels of government – from the presidential race to congressional districts – while flipping back and forth. I appreciated the simpler approach of PBS, which went with a lot less data on the screen, bigger images of the talking heads, and simple summary graphics of the winners.

But, if you want the data, the television broadcast does not cut it. Numerous websites offered single pages where one could monitor all of the major races in real-time. Want to keep up on both local and national races? Have two pages open. Want reaction? Add social media in a third window. Use multiple Internet-connected devices including smartphones, tablets, and computers (and maybe Internet-enabled televisions).

Furthermore, web pages give users more control over the data they are seeing. Take the final 2020 election forecast from FiveThirtyEight:

On one page, readers could see multiple presentations of data plus explanations. Want to scroll through in 10 seconds and see the headlines? Fine. Want to spend 5 minutes analyzing the various graphics? That works. Want to click on all the links for the methodology and commentary? A reader could do that too.

The one big advantage of television is that it offers commentary and faces in real-time, plus the potential for live coverage from the scene (such as images of gatherings for candidates) and the feeling of being present when major announcements are made. The Internet has approximations of this – lively social media accounts, live blogs – but it is not the same feeling. (Of course, when you have more than ten live election night broadcasts available on your television, the audience will be pretty split there as well.) Elections are not just about data for many; they also include emotions, presence, and the potential for important memories.

Given these differences in media, I did what I am guessing many did last night: I consumed both television and Internet/social media coverage. Neither are perfect for the task. I had to go to sleep eventually. And whoever can figure out how to combine the best elements of both for election nights may do very well for themselves.

When I see “study” in a news story, I (wrongly) assume it is a peer-reviewed analysis

In the last week, I have run into two potentially interesting news stories that cite studies. Yet, when I looked into what kind of studies these were, they were not what I expected.

First, the Chicago Tribune online headline: “Why are Chicagoans moving away during the pandemic? As study suggests outbound migration is spiking, we asked them.” The opening to the story:

Chicago’s population has been on the decline for years, with the metropolitan area suffering some of the greatest losses of any major U.S. city. But new research suggests that the pandemic might be exacerbating the exodus.

For the first time in four years, moving concierge app Updater has helped more people move out of Chicago than to it, the company said. The catch-all moving service estimates that it takes part in one-third of all U.S. moves, providing unique, real-time insight into pandemic-driven trends, said Jenna Weinerman, Updater’s vice president of marketing.

“All these macro conditions — job insecurity, remote work, people wanting to gain more space — are coming together to create these patterns,” Weinerman said.

The Chicago figures are based on approximately 39,000 moves within city limits from March 1 to Sept. 30. Compared to 2019, this year saw more moving activity in general, with an 8% jump in moves into the city — but a 19% increase in the number of people leaving.

The second article involved a study from Cafe Storage with the headline “Average Home Size in the US: New Homes Bigger than 10 Years Ago but Apartments Trail Behind” (also cited in the Chicago Tribune). From the story:

According to the latest available US Census data, the average size of single family homes built in the US was trending upwards from 2010 until 2017, when sizes hit a peak of 2,643 square feet. Since then, single family homes began decreasing in size, with homes built in 2019 averaging 2,611 square feet…

Location matters when it comes to average home size. Some urban hotspots follow the national trend, while others move in the opposite direction. Here’s how single family home and apartment sizes look in the country’s top 20 largest cities, based on Yardi Matrix, Property Shark and Point2Homes data.

As an academic, I expect two things when I hear the word study:

  1. Peer-reviewed work published in an academic outlet.
  2. Rigorous methodology and trusted data sources.

These steps do not guarantee research free from error, but they do impose standards and procedures intended to reduce errors.

In both cases, the analyses do not meet those standards. Instead, they rely on proprietary data and serve the companies or websites publicizing the findings. This does not necessarily mean the findings are untrue. It does, however, make it much more difficult for journalists or the public to know how the study was conducted, what the findings are, and what it all means.

Use of the term study is related to a larger phenomenon: many organizations, businesses, and individuals have potentially interesting data to contribute to public discussions and policy making. For example, without official data about the number of people moving out of cities, we are left searching for other data sources. How reliable are they? What data is anecdotal and what can be trusted? Why don’t academics and journalists find better data?

If we use the word “study” to refer to any data analysis, we risk making it even harder for people to discern what is a trustworthy study and what is not. Call it an analysis, call it a set of findings. Make clear who conducted the research, how the analysis was conducted, and with what data. (These three steps would be good for any coverage of an academic study.) Help readers and interested parties put the findings in the context of other findings and ongoing conversations. Just do not suggest that this is a study in the same way that other analyses are studies.

A short overview of recent survey questions about Holocaust knowledge in the US

Although this article leads with recent survey results about what Americans know and think about the Holocaust, I’ll start with the summary of earlier surveys and move forward in time to the recent results:

Whether or not the assumptions in the Claims Conference survey are fair, and how to tell, is at the core of a decades long debate over Holocaust knowledge surveys, which are notoriously difficult to design. In 1994, Roper Starch Worldwide, which conducted a poll for the American Jewish Committee, admitted that its widely publicized Holocaust denial question was “flawed.” Initially, it appeared that 1 in 5, or 22 percent, of Americans thought it was possible the Holocaust never happened. But pollsters later determined that the question—“Does it seem possible or does it seem impossible to you that the Nazi extermination of the Jews never happened?”—was confusing and biased the sample. In a subsequent Gallup poll, when asked to explain their views on the Holocaust in their own words, “only about 4 percent [of Americans] have real doubts about the Holocaust; the others are just insecure about their historical knowledge or won’t believe anything they have not experienced themselves,” according to an Associated Press report at the time. More recently, the Anti-Defamation League was criticized for a 2014 worldwide study that asked respondents to rate 11 statements—“People hate Jews because of the way they behave, for example”—as “probably true” or “probably false.” If respondents said “probably true” to six or more of the statements, they were considered to harbor anti-Semitic views, a line that many experts said could not adequately represent real beliefs…

Just two years ago, the Claims Conference released another survey of Americans that found “Two-Thirds of Millennials Don’t Know What Auschwitz Is,” as a Washington Post headline summarized it. The New York Times reported on the numbers at the time as proof that the “Holocaust is fading from memory.” Lest it appear the group is singling out Americans, the Claims Conference also released surveys with “stunning” results from Canada, France, and Austria.

But a deeper look at the Claims Conference data, which was collected by the firm Schoen Cooperman Research, reveals methodological choices that conflate specific terms (the ability to ID Auschwitz) and figures (that 6 million Jews were murdered) about the Holocaust with general knowledge of it, and knowledge with attitudes or beliefs toward Jews and Judaism. This is not to discount the real issues of anti-Semitism in the United States. But it is an important reminder that the Claims Conference, which seeks restitution for the victims of Nazi persecution and also to “ensure that future generations learn the lessons of the Holocaust,” is doing its job: generating data and headlines that it hopes will support its worthy cause.

The new Claims Conference survey is actually divided into two, with one set of data from a 1,000-person national survey and another set from 50 state-by-state surveys of 200 people each. In both iterations, the pollsters aimed to assess Holocaust knowledge according to three foundational criteria: the ability to recognize the term the Holocaust, name a concentration camp, and state the number of Jews murdered. The results weren’t great—fully 12 percent of national survey respondents had not or did not think they had heard the term Holocaust—but some of the questions weren’t necessarily written to help respondents succeed. Only 44 percent were “familiar with Auschwitz,” according to the executive summary of the data, but that statistic was determined by an open-ended question: “Can you name any concentration camps, death camps, or ghettos you have heard of?” This type of active, as opposed to passive, recall is not necessarily indicative of real knowledge. The Claims Conference also emphasized that 36 percent of respondents “believe” 2 million or fewer Jews were killed in the Holocaust (the correct answer is 6 million), but respondents were actually given a multiple-choice question with seven options—25,000, 100,000, 1 million, 2 million, 6 million, 20 million, and “not sure”—four of which were lowball figures. (Six million was by far the most common answer, at 37 percent, followed by “not sure.”)

The first example above has made it into research methods textbooks regarding the importance of how survey questions are worded. The ongoing discussion in this article could also illustrate these textbook points: how questions are asked and how the results are interpreted by the researchers matter a great deal.

There are other actors in this process that can help or harm the data interpretation:

  1. Funders/organizations behind the data. What do they do with the results?
  2. How the media reports the information. Do they accurately represent the data? Do they report on how the data was collected and analyzed?
  3. Does the public understand what the data means? Or, do they solely take their cues from the researchers and/or the media reports?
  4. Other researchers who look at the data. Would they measure the topics in the same way and, if not, what might be gained by alternatives?

All of this may seem like boring detail to many, but going from choosing research topics and developing questions to sharing results with the public and having others interpret them is quite a process. The hope is that all of the actors involved can get as close as possible to what is actually happening – in this case, accurately measuring and reporting attitudes and beliefs.

Does new housing data support the claim that people are leaving cities?

Reuters tries to connect the dots between data on housing construction and claims that people are leaving cities:

U.S. homebuilding increased in June by the most in nearly four years amid reports of rising demand for housing in suburbs and rural areas as companies allow employees to work from home during the COVID-19 pandemic…

A survey on Thursday showed confidence among single-family homebuilders vaulting in July to levels that prevailed before the coronavirus crisis upended the economy in March.

Builders reported increased demand for single-family homes in lower density markets, including small metro areas, rural markets and large metro suburbs. The public health crisis has shifted office work from commercial business districts to homes, a trend that economists predict could become permanent…

Home building last month was boosted by a 17.2% jump in the construction of single-family housing units, which accounts for the largest share of the housing market, to a rate of 831,000 units. Groundbreaking activity increased in the Midwest, South and Northeast, but fell in the West.

It is widely assumed that large numbers of urban residents have left New York (and possibly other cities) for suburbs and other parts of the country. If so, this could influence the housing industry. Yet, I would ask a few more questions.

First question to ask: is this activity due to people leaving cities or to other factors? It would be helpful to consider other possible factors at play, such as seasonal changes (more housing activity and demand in warmer months) and the economy (ranging from the confidence of different actors to mortgage rates to available capital to unemployment – all intertwined with COVID-19). Is the uptick in activity since roughly early March (when, as noted above, the coronavirus crisis upended the economy) clearly attributable to people leaving cities rather than these other forces?

Second question to ask: if there is evidence that things are happening simultaneously, is there more evidence to suggest there are causal patterns at play? If people are leaving cities, it does not necessarily mean they are looking for new homes. Perhaps they want to return to the city, perhaps they are living with others, perhaps they are willing to rent for a while and see what happens.

And out of my own curiosity, the reporting I have seen about people leaving cities during COVID-19 seems to primarily apply to wealthier residents. Does this mean the new construction of homes will tilt toward larger, more expensive homes? If so, this is a continuation of a bifurcated housing market where those with resources will have options while many with limited resources or opportunities will not.

There is a lot to consider here and we may not know the patterns for a while yet. Still, if the housing industry thinks that people are fleeing cities for good, that belief matters regardless of the actual data.

Do we know that 500,000 people have fled NYC since the start of COVID-19?

On the heels of much discussion of residents leaving New York City, San Francisco, and other major cities because of COVID-19, the Daily Mail suggests 500,000 people have left New York City:

Parts of Manhattan, famously the ‘city that never sleeps’, have begun to resemble a ghost town since 500,000 mostly wealthy and middle-class residents fled when Covid-19 struck in March.

The number is also part of the headline.

But, how do we know this number is accurate? If there was ever a figure that required some serious triangulation, this could be it. Most of the news stories I have seen on people fleeing cities rely on real estate agents and movers who have close contact with people going from one place to another. Those articles rarely mention figures, settling for vaguer pronouncements about trends or patterns. Better data could come from sources like utility companies (presumably there would be a drop in the consumption of electricity and water), the post office (how many people have changed addresses), and more systematic analyses of real estate records.

A further point about the supposed figure: even if it is accurate, it does not reveal much about long-term trends. Again, the stories on this phenomenon have hinted that some of those people who left will never return while some do want to get back. We will not know until some time has gone by after the COVID-19 pandemic slows down or disappears. Particularly for those with resources, will they sell their New York property or will they sit on it for a while to give themselves options or in order to make sure they get a decent return on it? This may be a shocking figure now but it could turn out in a year or two to mean very little if many of those same people return to the city.

In other words, I would wait to see if this number is trustworthy and if so, what exactly it means in the future. As sociologist Joel Best cautions around numbers that seem shocking, it helps to ask good questions about where the data comes from, how accurate it is, and what it means.

5G over what percent of America? T-Mobile: covering over 5,000 cities and towns, 200 million Americans

T-Mobile is running a commercial touting their new 5G network. They claim it reaches 200 million Americans and over 5,000 cities and towns. What if we put those numbers in context?

On one hand, both figures sound impressive. Two hundred million people is a lot of people. That is a lot of text messages to send, TV shows and videos to stream, and social media and web pages to visit. This is a potentially large market for T-Mobile. And 5,000 cities and towns sounds like a lot. I don’t know how many places Americans could name, but many would probably struggle to name 5,000.

On the other hand, the figures suggest that the 5G coverage still does not reach a good portion of Americans or certain parts of the country. According to the Census Population Clock, the US population is over 329 million, so covering 200 million people comes to roughly 61% of Americans. That is more than half, but not quite two-thirds. And while 5,000 cities and towns sounds like a lot, some older data – from 2007 – suggests the United States has over 19,000 municipal governments, and the 2012 Census count also found over 19,000. By these figures, T-Mobile’s 5G covers a little more than one quarter of American communities.
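
For the record, here is a minimal sketch of the arithmetic behind those two comparisons, using only the figures cited above:

```python
# T-Mobile's claimed 5G reach versus national totals cited above.
covered_people = 200_000_000
us_population = 329_000_000  # Census Population Clock figure

covered_places = 5_000
us_municipalities = 19_000  # 2007/2012 counts of municipal governments

print(f"Share of Americans covered: {covered_people / us_population:.0%}")  # ~61%
print(f"Share of municipalities covered: {covered_places / us_municipalities:.0%}")  # ~26%
```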

Perhaps T-Mobile is doing the best it can with the coverage it has. The numbers are big ones, and I would guess they could catch the attention of viewers. Maybe the numbers do not matter if the goal is to be first. However, just because the numbers are large does not necessarily mean the product is great. Significant segments of Americans will not have access, even with the big numbers. The numbers look good, but they may not be as good as they first appear once viewers look into what they mean.