Show your knowledge of US metro areas with the US Census “Population Bracketology”

Even the United States Census Bureau is getting into brackets and bracketology. Go here to play “Population Bracketology,” which tests your knowledge of the populations of metropolitan areas in the United States.

Yes, it should be easy to select the winner. But I like that a lot of the initial pairings matched Sun Belt cities against Rust Belt cities, and some of these were hard to choose. On the other hand, the Los Angeles-New York City matchup in the first round knocked out a contender…

Misinterpreting a graph of income in the US by misreading the X-axis categories

Some graphs can be more difficult to interpret, particularly if the categories along one of the axes are not a consistent width. Here is an example of misreading a chart of income in the United States:

“When I was growing up in Canada,” says Jon Evans of Techcrunch, “I was taught that income distribution should and did look like a bell curve, with the middle class being the bulge in the middle. Oh, how naïve my teachers were. This is how income distribution looks in America today.”

[Figure: chart of U.S. household income distribution by income bracket]

“That big bulge up above? It’s moving up and to the left. America is well on the way towards having a small, highly skilled and/or highly fortunate elite, with lucrative jobs; a vast underclass with casual, occasional, minimum-wage service work, if they’re lucky; and very little in between.”…

Er, no. Look closely at those last two brackets. Now look at the brackets immediately to the left of them. What do you notice?

Probably, you notice the same thing that immediately struck me: the last two brackets cover a much, much wider income band than the rest of the brackets on the graph.

Each bar on that graph represents a $5,000 income band: Under $5,000, $5,000 to $9,999, and so forth. Except for the last two. The penultimate band is $200,000 to $250,000, which is ten times as wide as the previous band. And the last bar represents all incomes over $250,000–a group that runs from some law associate who pulled down $251,000 last year, through A-Rod’s $27 million annual salary, all the way to some Silicon Valley superstar who just cashed out the company for a one time windfall of hundreds of millions of dollars. Unsurprisingly, much wider bands have more people in them than they would if you kept on extrapolating out in $5,000 increments…

To put it another way, the apparent clustering of income along the rich right tail of the distribution is just an artifact of the way that the Census presents the data.  If they kept running through $5,000 brackets all the way out to A-Rod, the spreadsheet would be about a mile long, and there would only be a handful of people in each bracket.  So at the high end, where there are few households, they summarize.

The Census likely has good reasons for reporting these higher-income categories in such a way. First, because there are relatively few people in each $5,000 increment at the high end, wider categories keep the graph from becoming too wide. Second, I believe the Census topcodes income, meaning that incomes above a certain dollar threshold are recorded at that threshold rather than at their actual value. This is done to help protect the identity of respondents who might otherwise be easy to pick out of the data.
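As a rough illustration of what topcoding does to reported values, here is a minimal Python sketch. The `topcode` function and the $250,000 cap are assumptions made for illustration only; they are not the Census Bureau's actual procedure or threshold.

```python
def topcode(incomes, cap):
    """Record any income above `cap` as `cap` itself (hypothetical helper)."""
    return [min(income, cap) for income in incomes]

# Illustrative incomes only, echoing the examples quoted above.
reported = topcode([40_000, 251_000, 27_000_000], cap=250_000)
print(reported)
```

After topcoding, the law associate and the star athlete become indistinguishable in the data, which is the privacy benefit, but it also means the true shape of the right tail cannot be recovered from the published figures.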

But, this is a classic misinterpretation of a graph. As McArdle notes, this is a long-tail graph with very few people at the top end. The graph tries to alert readers to this by also marking some of the notable percentiles: above the $130,000 to $134,999 category, it reads “The top 10 percent reported incomes above $135,000” and above the top two categories, it reads, “approximately 4 percent of households.” Making the right interpretation depends not just on the relative shape of the graph, bell curve or otherwise, but also on looking closely at the axes and categories.
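One way to check whether a bulge is real or just an artifact of a wider category is to divide each bar's count by its width. The sketch below does this in Python. The household counts are made up for illustration and are not Census figures, and `per_5k_density` is a hypothetical helper name.

```python
def per_5k_density(bands):
    """For each (low, high, count) band, return households per $5,000 of width."""
    return [(low, high, count / ((high - low) / 5_000)) for low, high, count in bands]

# Hypothetical counts in thousands of households, NOT Census data.
bands = [
    (190_000, 195_000, 400),
    (195_000, 200_000, 350),
    (200_000, 250_000, 2_500),  # ten times as wide as the bands before it
]

for low, high, density in per_5k_density(bands):
    print(f"${low:,} to ${high:,}: {density:.0f} thousand households per $5,000")
```

In this invented example the wide band holds far more households than its neighbors, yet per $5,000 of income it is thinner than either of them, so the tail keeps declining. The open-ended top category cannot be normalized this way because it has no upper bound.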

Argument: Census Bureau could better count Hispanics by focusing on origins

As I wrote about a month ago, the Census Bureau is looking into ways to better count Hispanics in the 2020 Census. Here are a few more details:

“Many Hispanics, especially those who are immigrants, are unsure about how to respond to census questions about race because the concept of race that we use in the U.S. is not so firmly entrenched in Latin American cultures,” said Shannon Monnat, a UNLV assistant professor of sociology who studies demography…

In April the Pew Research Center published a report from a survey that verified cramming everyone together into one category was problematic.

More than half of the Pew survey respondents said they preferred to use their country of origin as an identifier, 24 percent said they would use “Hispanic” most often and 21 percent labeled themselves “American.”…

“Historically, the standard sociological practice has been to apply ‘race’ to distinctions based on physical appearance and apply ‘ethnicity’ to distinctions based on culture and language, but ethnicity now is used increasingly as an inclusive term to categorize all groups considered to share a common descent,” Monnat said. “Demographers have been predicting a much wider range of responses on census forms and increased blurring of racial categories as minority populations continue to grow and interracial marriage increases over the next several decades. The children produced from these unions will not fit neatly into any of the standard census categories.

“A more realistic approach may be to use the concept of ‘origins’ rather than the traditional concepts of race and ethnicity,” she said.

Keeping up with changing definitions is a difficult task for sociologists and demographers. And this seems like a two-step process: first, we need to know how people understand or identify themselves and then we need to get the survey questions right.

Moving toward “origins” data would be interesting. The Census has some data on this – I think this is from questions about ancestry on the long form. Here is a two-paragraph description of how this was done in 2000:

Ancestry refers to a person’s ethnic origin or descent, “roots,” or heritage, or the place of birth of the person or the person’s parents or ancestors before their arrival in the United States. Some ethnic identities, such as “German” or “Jamaican,” can be traced to geographic areas outside the United States, while other ethnicities such as “Pennsylvania Dutch” or “Cajun” evolved in the United States.

The intent of the ancestry question is not to measure the degree of attachment the respondent had to a particular ethnicity. For example, a response of “Irish” might reflect total involvement in an “Irish” community or only a memory of ancestors several generations removed from the individual. A person’s ancestry is not necessarily the same as his or her place of birth; i.e., not all people of German ancestry were born in Germany (in fact, most were not).

Ancestry has its own issues.

The 2020 Census to have different questions about race?

The Los Angeles Times reports that the US Census Bureau is looking into possibly changing the questions about race and ethnicity in the 2020 Census:

The bureau’s new recommendations were based on research findings of a number of experimental questions given to 500,000 households during the 2010 census. The findings showed that many Americans believe the racial and ethnic categories now used by the census are confusing and don’t always jibe with their own views of their identity.

For example, asked to state their race on the 2010 census, more than 19 million people, including millions of Latinos, chose “some other race,” rather than select from the five categories offered on the census form: white, black, Asian, American Indian/Native Alaskan or Hawaiian/Pacific Islander.

One of the changes proposed now would simply ask respondents to choose their race or origin, allowing them to check a single box next to categories that would include white, black or Hispanic.

Another would add write-in categories to allow those of Middle Eastern or Arab origin to specifically identify themselves, officials said.

A third change would end the practice of offering the controversial term “Negro” as an alternative for African-American or black. Some African-Americans in 2010 criticized the government’s continuing use of the word, saying it was outdated and offensive.

As cultural definitions change, so should the Census in order to better match the lived reality. Of course, this attempt to improve the validity of the results makes it more difficult for researchers and others to match up newer Census results with older ones, marring the reliability. And as the article notes, this has political implications, which could play into the definitions as well.

It would be interesting to hear more about the experimental results from the 2010 survey as this is a good example of an experiment that doesn’t require a laboratory. What else did people like or not like? I assume the Census Bureau is not going to cave in to those who don’t want to answer a race or ethnicity question at all and/or those who simply answer “American.”

Time magazine: “The Return of the McMansion”?

Time echoes some other commentators: new data suggests we may be headed back toward bigger houses and McMansions.

When the real estate market imploded and ushered in the Great Recession, one of the biggest casualties was the size of our homes. For years, we’d been building increasingly large homes because, well, we could — and because we assumed all those two-story foyers and master suites could only go up in value. The recession put a screeching halt to this trend: After peaking at 2,521 square feet in 2007, the average size of a new home has dropped, a trend many industry observers thought would continue…

Census data shows that the average size of a new home built last year was 2,480 square feet, the first increase after three years of successive declines. Nearly 40% of new homes built last year had four or more bedrooms, a return to the all-time high reached in 2005 and 2006. And nearly 20% have three-car garages, an increase following two years of declines…

This reversal is unexpected. In a 2010 report, the National Association of Home Builders speculated that the trend of smaller homes might be due to a secular shift and that our preference for small houses would continue after the recession ended. “Part can also be attributed to trends in factors like the desire to keep energy costs down, amounts of equity in existing homes available to roll into a new one, tightening credit standards, less emphasis on the pure investment motive for buying a home, and an increased share of homes sold to first-time buyers,” the report says. “Not all of these trends are likely to reverse themselves immediately at the end of a recession.”

This illustrates the problem with making sweeping predictions based on recent data: it is really difficult to predict long-term trends. Does the 2011 data now suggest we are going back the other direction toward bigger houses? What if the figures go down slightly again in 2012?

A second issue: moving back to bigger houses doesn’t necessarily mean that they are McMansions. The backlash against McMansions has been stiff in the last decade so these new big homes might be quite different. Perhaps they have emphases on customization (a concern of Sarah Susanka and the “Not So Big House”), more traditional looking neighborhoods (a concern of New Urbanists), and are greener and more sustainable homes.

The implications of discontinuing the American Community Survey

This didn’t exactly make the front page this week but a vote in the House of Representatives about the American Community Survey could have a big impact on how we understand the United States. Nate Berg explains:

So the Republican-led House of Representatives this week voted 232-190 to eliminate the American Community Survey, the annual survey of about 3 million randomly chosen U.S. households that’s like the Census only much more detailed. It collects demographic details such as what sort of fuel a household uses for heating, the cost of rent or mortgage payments, and what time residents leave home to go to work.

In a post on the U.S. Census Bureau’s website, Director Robert Groves says the bill “devastates the nation’s statistical information about the status of the economy and the larger society. Modern societies need current, detailed social and economic statistics. The U.S. is losing them.”

While the elimination of the ACS would take a slight nibble out of the roughly $3.8 trillion in government expenditures proposed in the 2013 federal budget, its negative impacts could be much greater – affecting the government’s ability to fund a wide variety of services and programs, from education to housing to transportation.

The issue is that the information collected in the ACS is used heavily by the federal government to figure out where it will spend a huge chunk of its money. In a 2010 report for the Brookings Institution, Andrew Reamer found that in the 2008 fiscal year, 184 federal domestic assistance programs used ACS-related datasets to help determine the distribution of more than $416 billion in federal funding. The bulk of that funding, more than 80 percent, went directly to fund Medicaid, highway infrastructure programs and affordable housing assistance. Reamer, now a research professor at George Washington University’s Institute of Public Policy, also found that the federal government uses the ACS to distribute about $100 billion annually to states and communities for economic development, employment, education and training, commerce and other purposes. He says that should the ACS be eliminated, it would be very difficult to figure out how to distribute this money where it’s needed…

And it’s not just government money that would be wasted. Reamer says many businesses are increasingly reliant on the market data available within the ACS, and that without it they would have much less success picking locations where their businesses would have market demand. It would affect businesses throughout the country, “from mom-and-pops to Walmart.”

Some history might also be helpful here. The United States has carried out a decennial census since 1790 but the American Community Survey began in the mid-1990s. There has been talk in recent years of replacing the expensive and complicated decennial census with a beefed up American Community Survey. There would be several advantages: it wouldn’t cost as much, plus the government (and the country) would have more current information rather than having to wait ten years between counts. In other words, our country is rapidly changing and we need regularly updated information that can tell us what is happening.

In my mind, as a researcher who consistently uses Census data, dropping the ACS would be a big loss. The government funding is important but even more important to me would be losing the more up-to-date information the ACS provides. Without this survey, we would likely have to rely on private data which is often restrictive and/or expensive. For example, I’ve used ACS data to track some housing issues but without this, I’m not sure where I could get similar data.

This is part of a larger issue of conservatives wanting to limit the reach of the Census Bureau. The argument often is that the Census is too intrusive, therefore invading the privacy of citizens (see this 2011 story about an insistent ACS worker), and the Constitution only provides for a decennial census. I wonder if these arguments are red herrings: there is a long history of battling over Census counts and timing depending on which political party might benefit. For example, see Republican claims that inappropriate sampling techniques were used to correct undercounts for big cities, claims that the Census “imputes” races to people (so mark your race as American!), or efforts by New York City to ask for a recount in order to boost their 2010 population figures, which are tied to funding. In other words, the Census can turn into a political football even though its data is very important and it uses social science research techniques.

Controversy in using sampling for the decennial Census

In a story about the resignation of sociologist Robert Groves as director of the United States Census Bureau, there is an overview of some of the controversy over Groves’ nomination. The issue: the political implications of using statistical sampling.

Dr. Robert M. Groves announced on Tuesday that he was resigning his position as director of the U.S. Census Bureau in order to take over as provost of Georgetown University. “I’m an academic at heart,” Groves told The Washington Post. He will leave the Bureau in August. Unlike some government officials who recently have had to resign under a cloud, such as Regina Dugan of DARPA and Martha Johnson of the General Services Administration, Groves received universal praise for the job he did directing the 2010 Census, a herculean task he completed on time and almost $2 billion under budget.

At the time of Groves’ nomination, Rep. Darrell Issa, (R-California), chairman of the House Committee on Oversight and Government Reform, said that he found it “an incredibly troubling selection that contradicts the administration’s assurances that the census process would not be used to advance an ulterior political agenda.” However, by the time Groves announced that he was leaving, Issa had changed his tune and issued a statement that “His tenure is proof that appointing good people makes a big difference.”

When President Barack Obama nominated Groves on April 2, 2009, he was viewed as a generally uncontroversial professor of sociology. However, his nomination turned out to be contentious anyway because his support for using statistical sampling, a statistical method commonly used to correct for errors and biases in the census, raised the ire of Republican critics, who believed that sampling would benefit minorities and the poor, who generally vote Democratic…

A specialist in survey methodology and statistics, Groves was no stranger to the Census Bureau, whose decennial census is one of the world’s largest and most sophisticated statistical exercises. Groves served there early in his career as a visiting statistician in 1982, and later as associate director of Statistical Design, Standards, and Methodology from 1990 to 1992. It was during the latter period that Groves became embroiled in the controversy over the proposed use of statistical sampling to correct known biases and deficiencies in the Census head count. Groves and others at the Census Bureau proposed using sampling techniques to correct an admitted 1.2% undercount in the 1990 Census, which failed to include millions of homeless, minority and poor persons mainly living in big cities, which lost millions of dollars in federal funds when Republican Commerce Secretary Robert Mosbacher vetoed the sampling proposal.

Considering Groves’ track record in sociology, I’m not surprised that he is now regarded as having done a good job in this position.

Perhaps this is a silly question in today’s world but does everything have to become politicized? Is the ultimate goal to get the most accurate count of American residents or do both parties simply assume that the other side wants to use the occasion for political gain? If you want to limit funding to cities based on population, why not go after this funding rather than try to skew the count?

Of course, this is not the first time that the decennial Census has been politicized…

Another note: a sociologist apparently saved the government $2 billion! That alone should draw some attention.

New Census figures: population 80.7% urban, most dense cities in the West

The US Census Bureau released some figures about cities in America on Monday. Here are the updated 2010 statistics about urbanization:

The nation’s urban population increased by 12.1 percent from 2000 to 2010, outpacing the nation’s overall growth rate of 9.7 percent for the same period, according to the U.S. Census Bureau…

Urban areas — defined as densely developed residential, commercial and other nonresidential areas — now account for 80.7 percent of the U.S. population, up from 79.0 percent in 2000. Although the rural population — the population in any areas outside of those classified as “urban” — grew by a modest amount from 2000 to 2010, it continued to decline as a percentage of the national population.

Translation: the proportion of Americans living in urban areas didn’t change very much over the last 10 years. In comparison, the urban population jumped 6% from 1970 to 1980, 3% from 1980 to 1990, and 3% from 1990 to 2000 (see figures on pg. 33 of this Census document). Does this mean we are nearing a plateau in terms of the proportion of Americans living in urban areas?
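When comparing shares over time, it helps to separate a change in percentage points from a relative percent change, since the two can look quite different. The Python sketch below applies both to the 79.0 and 80.7 percent figures quoted above. `share_change` is a hypothetical helper, and the source does not say which of the two measures the figures for earlier decades use.

```python
def share_change(old_share, new_share):
    """Return (percentage-point change, relative percent change) between two shares."""
    points = new_share - old_share
    relative = points / old_share * 100
    return points, relative

# Urban share of the U.S. population in 2000 and 2010, from the quoted release.
points, relative = share_change(79.0, 80.7)
print(f"{points:.1f} percentage points, {relative:.1f}% relative increase")
```

Note that this is a change in the urban share, which is different from the 12.1 percent growth in the number of urban residents reported in the same release.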

And here are the new figures for the densest urbanized areas:

The nation’s most densely populated urbanized area is Los Angeles-Long Beach-Anaheim, Calif., with nearly 7,000 people per square mile. The San Francisco-Oakland, Calif., area is the second most densely populated at 6,266 people per square mile, followed by San Jose, Calif. (5,820 people per square mile) and Delano, Calif. (5,483 people per square mile). The New York-Newark, N.J., area is fifth, with an overall density of 5,319 people per square mile…

Of the 10 most densely populated urbanized areas, nine are in the West, with seven of those in California. Urbanized areas in the U.S., taken together, had an overall population density of 2,534 people per square mile.

These new figures continue to support one of the trick questions about cities: which city is the most dense? A common answer is New York City because of Manhattan but the densest is actually Los Angeles. Of course, some of this has to do with Southern and Western cities having more space because of the drying up of annexation opportunities in Midwestern and Northeastern cities in the early 1900s.

While these are very interesting figures, where is the percentage of Americans who live in suburbs?

“Wrestling with how to get more Latinos to pick a race”

Here is another overview of the problems the US Census is having with measuring the Latino population in the United States:

So when they encounter the census, they see one question that asks them whether they identify themselves as having Hispanic ethnic origins and many answer it as their main identifier. But then there is another question, asking them about their race, because, as the census guide notes, “people of Hispanic, Latino or Spanish origin may be of any race,” and more than a third of Latinos check “other.”

This argument over identity has gained momentum with the growth of the Latino population, which in 2010 stood at more than 50 million. Census Bureau officials have acknowledged that the questionnaire has a problem, and say they are wrestling with how to get more Latinos to pick a race. In 2010, they tested different wording in questions and last year they held focus groups, with a report on the research scheduled to be released by this summer.

Some experts say officials are right to go back to the drawing table. “Whenever you have people who can’t find themselves in the question, it’s a bad question,” said Mary C. Waters, a sociology professor at Harvard who specializes in the challenges of measuring race and ethnicity…

Latinos, who make up close to 20 percent of the American population, generally hold a fundamentally different view of race. Many Latinos say they are too racially mixed to settle on one of the government-sanctioned standard races — white, black, American Indian, Alaska native, native Hawaiian, and a collection of Asian and Pacific Island backgrounds.

American conceptions of race usually center on black and white without having much room for middle or other categories. There is a long history of this in the United States as various new groups struggled to become labeled as white.

I like the admission here that the Census needs to find a definition that also fits Latinos’ own understanding. Imposing social science categories on the world can be problematic, particularly if they are not understood in the same ways by all people. Survey questions are not that great if people don’t understand the question or can’t see where they fit among the possible answers.

This isn’t the first acknowledgment that the Census Bureau has issues here. I would be curious to hear sociologists and others project forward: how will the Census and others measure race, ethnicity, and culture in 2050 when the United States will look very different? Are there ways to measure race and ethnicity in the Census without the pressure of it being tied to federal dollars?

“Startling” number of “near poor” in the United States

The Census Bureau released figures recently showing a growing number of Americans living below the poverty line. But the figures also showed a population increase in another group: the “near poor.”

When the Census Bureau this month released a new measure of poverty, meant to better count disposable income, it began altering the portrait of national need. Perhaps the most startling differences between the old measure and the new involves data the government has not yet published, showing 51 million people with incomes less than 50 percent above the poverty line. That number of Americans is 76 percent higher than the official account, published in September. All told, that places 100 million people — one in three Americans — either in poverty or in the fretful zone just above it…

The Census Bureau, which published the poverty data two weeks ago, produced the analysis of those with somewhat higher income at the request of The New York Times. The size of the near-poor population took even the bureau’s number crunchers by surprise.

“These numbers are higher than we anticipated,” said Trudi J. Renwick, the bureau’s chief poverty statistician. “There are more people struggling than the official numbers show.”…

Of the 51 million who appear near poor under the fuller measure, nearly 20 percent were lifted up from poverty by benefits the official count overlooks. But more than half were pushed down from higher income levels: more than eight million by taxes, six million by medical expenses, and four million by work expenses like transportation and child care.

It would be interesting to know more about this group of “near poor”: is this a consistent position they hold in society? Is there much downward or upward mobility from this group? Is this a group that grows dramatically in tough economic times for the whole country? The story makes it sound like this is a group that could easily go either way: a better job opportunity might push a household upward while a large medical bill or the need to replace an aging car might push them back much closer to the poverty line. And after knowing more, what policies would help improve the lot of this group – jobs, education, a bigger safety net?

I wonder additionally how much of this story is really that Census Bureau researchers are “surprised” by these findings. This is what the headline emphasizes. “Surprised” suggests that no one saw this coming. Should they have been surprised? After all, the median household income in the United States is around $50,000, suggesting that there are lots of people not too far from the official poverty line. Past context is important here to know how much larger this group is now compared to past time periods.
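The figures in the quoted story can be cross-checked with a little arithmetic. The Python sketch below backs out the official near-poor count implied by “76 percent higher” and the number in poverty implied by the 100 million total. It uses only numbers from the quote, and `implied_baseline` is a hypothetical helper name.

```python
def implied_baseline(new_value, pct_higher):
    """Back out the baseline that `new_value` exceeds by `pct_higher` percent."""
    return new_value / (1 + pct_higher / 100)

near_poor_new = 51.0   # millions, under the new measure (from the quote)
total_struggling = 100.0  # millions in poverty or near poverty (from the quote)

official_near_poor = implied_baseline(near_poor_new, 76.0)
in_poverty = total_struggling - near_poor_new

print(f"Implied official near-poor count: about {official_near_poor:.0f} million")
print(f"Implied number in poverty under the new measure: about {in_poverty:.0f} million")
```

These are rounded, implied values rather than published statistics, but they give a sense of scale: the new measure adds roughly 22 million people to the near-poor group compared with the official account.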