Argument that obesity and McMansions are linked

One “muckraker” tries to suggest that bigger houses – such as McMansions – make it easier for people to be obese:

No, the truth is that like cars, McMansion houses, food portions and soft drink sizes, Americans are getting bigger every day–and because it is happening everywhere, few notice. Worse, the harder we try to lose poundage with low calorie foods, fitness centers and personal trainers, the bigger we are becoming. Even people in non-industrialized countries are packing on the pounds as Big Food peddles its high calorie, addictive processed food in “new markets.”

This is a correlation without causation argument. And you do not have to go to McMansions to make the same claim: the average size of new homes has increased from roughly 1,000 square feet to 2,500 square feet over sixty years. But how might we really show that having some bigger items in our lives leads to having other bigger items in our lives? Would the reverse also be true: if we had increasingly smaller items in our lives, would we desire smallness overall? If these are all linked, perhaps we could tie this to the big American frontier or the large American ideals at the founding of the country.

Perhaps there are other arguments to be made here. Do McMansions offer more space for people to spread out? Or, could heavier people be more likely to purchase McMansions (and is this related more to their stage in life)?

The perils of analyzing big real estate data

Two leaders of Zillow recently wrote Zillow Talk: The New Rules of Real Estate, which is a sort of Freakonomics look at all the real estate data they have. While it is an interesting book, it also illustrates the difficulties of analyzing big data:

1. The key to the book is all the data Zillow has harnessed to track real estate prices and make predictions on current and future prices. They don’t say much about their models. This could be for two good reasons: this is aimed at a mass market and the models are their trade secrets. Yet, I wanted to hear more about all the fascinating data – at least in an appendix?

2. Problems of aggregation: the data is analyzed usually at a metro area or national level. There are hints at smaller markets – a chapter on NYC for example and another looking at some unusual markets like Las Vegas – but there are not different chapters on cheaper/starter homes or luxury homes. An unanswered question: is real estate more similar within markets or across them? Put another way, are the features of the Chicago market unique and patterned, or are cheaper homes in the Chicago region more like similar homes in Atlanta or Los Angeles than they are like more expensive homes in their own market?

3. Most provocative argument: in Chapter 24, the authors suggest that pushing homeownership for lower-income Americans is a bad idea as it can often trap them in properties that don’t appreciate. This was a big problem in the 2000s: Presidents Clinton and Bush pushed homeownership but after housing values dropped in the late 2000s, poorer neighborhoods were hit hard, leaving many homeowners in default or seriously underwater. Unfortunately, unless demand picks up in these neighborhoods (and gentrification is pretty rare), these homes are not good investments.

4. The individual chapters often discuss small effects that may be significant but don’t have large substantive effects. For example, there is a section on male vs. female real estate agents. The effects for each gender are small: at most, a few percentage points difference in selling price as well as slight variations in speed of sale. (Women are better in both categories: higher prices, faster sales.)

5. The authors are pretty good at repeatedly pointing out that correlation does not mean causation. Yet, they don’t catch all of these moments and at other times present patterns in ways that distort the axes. For example, here is a chart from page 202:

[Chart: Zillow Talk, p. 202]

These two things may be correlated (as one goes up so does the other and vice versa) but why fix the axes so you are comparing half percentages to five percentage increments?

6. Continuing #4, I suppose a buyer and seller would want to use all the tricks they can, but the tips here mean that those in the real estate market are supposed to string together all of these small effects to maximize what they get. On the final page, they write: “These are small actions that add up to a big difference.” Maybe. With margins of error on the effects, some buyers and sellers aren’t going to get the effects outlined here: some will benefit more but some will benefit less.

7. The moral of the whole story? Use data to your advantage even as it is not a guarantee:

In the new realm of real estate, everyone faces a rather stark choice. The operative question now is: Do you wield the power of data to your advantage? Or do you ignore the data, to your peril?

The same is true of the housing market writ large. Certainly, many macro-level dynamics are out of any one person’s control. And yet, we’re better equipped than ever before to choose wisely in the present – to make the kinds of measured judgments that can prevent another coast-to-coast bubble and calamitous burst. (p.252)

In the end, this book is aimed at the mass market where a buyer or seller could hope to string together a number of these small advantages. Yet, there are no guarantees and the effects are often small. Having more data may be good for markets and may make participants feel more knowledgeable (or perhaps more overwhelmed) but not everyone can take advantage of this information.
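The point in items 4 and 6 about stringing together small effects that each carry a margin of error can be illustrated with a quick simulation. This is only a sketch with made-up numbers: the five average effects and the noise level are hypothetical and are not taken from the book.

```python
import random

# Hypothetical average effects (percent of sale price) for five "tips".
# These numbers are invented for illustration; they are not from Zillow Talk.
MEAN_EFFECTS = [0.5, 1.0, 0.8, 0.3, 0.4]
NOISE_SD = 2.0  # assumed seller-to-seller variation per tip, in percent


def simulate_total_gain(rng):
    """Total gain (in percent) for one seller who follows every tip."""
    return sum(rng.gauss(mean, NOISE_SD) for mean in MEAN_EFFECTS)


def summarize(n_sellers=10_000, seed=0):
    """Return (average total gain, share of sellers with a negative total)."""
    rng = random.Random(seed)
    gains = [simulate_total_gain(rng) for _ in range(n_sellers)]
    average = sum(gains) / n_sellers
    share_negative = sum(g < 0 for g in gains) / n_sellers
    return average, share_negative


if __name__ == "__main__":
    average, share_negative = summarize()
    print(f"average total gain: {average:.2f}%")
    print(f"share of sellers who end up behind: {share_negative:.1%}")
```

Under these assumed numbers the average gain is positive by construction, yet roughly a quarter of the simulated sellers still come out behind. Different assumptions about the noise would change that share considerably.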

University press releases exaggerate scientific findings

A new study suggests exaggerations about scientific findings – for example, suggesting causation when a study only found correlation – start at the level of university press releases.

Yesterday Sumner and colleagues published some important research in the journal BMJ that found that a majority of exaggeration in health stories was traced not to the news outlet, but to the press release—the statement issued by the university’s publicity department…

The goal of a press release around a scientific study is to draw attention from the media, and that attention is supposed to be good for the university, and for the scientists who did the work. Ideally the endpoint of that press release would be the simple spread of seeds of knowledge and wisdom; but it’s about attention and prestige and, thereby, money. Major universities employ publicists who work full time to make scientific studies sound engaging and amazing. Those publicists email the press releases to people like me, asking me to cover the story because “my readers” will “love it.” And I want to write about health research and help people experience “love” for things. I do!

Across 668 news stories about health science, the Cardiff researchers compared the original academic papers to their news reports. They counted exaggeration and distortion as any instance of implying causation when there was only correlation, implying meaning to humans when the study was only in animals, or giving direct advice about health behavior that was not present in the study. They found evidence of exaggeration in 58 to 86 percent of stories when the press release contained similar exaggeration. When the press release was staid and made no such errors, the rates of exaggeration in the news stories dropped to between 10 and 18 percent…

Sumner and colleagues say they would not shift liability to press officers, but rather to academics. “Most press releases issued by universities are drafted in dialogue between scientists and press officers and are not released without the approval of scientists,” the researchers write, “and thus most of the responsibility for exaggeration must lie with the scientific authors.”

Scientific studies are often complex and probabilistic. It is difficult to model and predict complex natural and social phenomena and scientific studies often give our best estimate or interpretation of the data. But, science tends to steadily accumulate findings and knowledge more than a model where every single study definitively proves things. This can mean that individual studies contribute to the larger whole but often don’t set the agenda or have a radically new finding.

Yet, translating that understanding into something fit for public consumption is difficult. Academics are often criticized for dense and jargon-filled language so pieces for the general public have to be written differently. Academics want their findings to matter and colleges and universities like good publicity as well. Presenting limited or weaker findings doesn’t get as much attention.

All that said, there is an opportunity here to improve the reporting of scientific findings.

Is more Internet use correlated to a decline in religious affiliation?

A new study suggests using the Internet more is correlated with lower levels of religious affiliation:

Downey analyzed data from the General Social Survey, a well-respected annual research survey carried out by the University of Chicago, to make his findings.

Downey says the single biggest cause of religious affiliation is upbringing: those who are raised in religious households are much more likely to remain in their family’s religion as adults…

By far the largest factor, says Downey, is Internet use.

In the 1980s, Internet use was virtually non-existent, but in 2010, 53 per cent of people spent two hours online a week and 25 per cent spent more than seven hours…

Downey says that his research has controlled for ‘most of the obvious candidates, including income, education, socioeconomic status, and rural/urban environments’ to discount a third factor, one that is responsible both for the rise of Internet use and the drop in religiosity.

Since the full story is behind a subscriber wall, two speculations about the methodology of this study:

1. This sounds like a regression and/or ANOVA analysis based on R-squared changes. In other words, when one explanatory factor is in the model, how much more of the variation in the dependent variable (religiosity) is explained? You can then add or subtract different factors singly or in combination to see how that percent of variation explained changes.

2. Looking at religious affiliation is just one way to measure religiosity. Affiliation is based on self-identification (do you consider yourself a Catholic, mainline Protestant, conservative Protestant, etc.) or what religious congregation you regularly attend or interact with. But, levels of religious affiliation have been falling in recent years even as not all measures of religiosity are falling. Research about the rise of the “religious nones” shows a number of these people still are spiritual or perform religious practices.
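The first speculation, comparing how much variation is explained as factors are added to a model, can be sketched as follows. This is not Downey's analysis: the data are synthetic, the variable names are hypothetical stand-ins, and the two-predictor R-squared is computed with the standard formula based on pairwise correlations.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_squared_two_predictors(y, x1, x2):
    """R-squared of a linear regression of y on x1 and x2."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)


def make_synthetic_data(n=5_000, seed=1):
    """Synthetic, made-up data: NOT the General Social Survey."""
    rng = random.Random(seed)
    upbringing = [rng.gauss(0, 1) for _ in range(n)]
    internet = [rng.gauss(0, 1) for _ in range(n)]
    religiosity = [
        0.6 * u - 0.3 * i + rng.gauss(0, 1)
        for u, i in zip(upbringing, internet)
    ]
    return religiosity, upbringing, internet


if __name__ == "__main__":
    religiosity, upbringing, internet = make_synthetic_data()
    r2_base = pearson(religiosity, upbringing) ** 2
    r2_full = r_squared_two_predictors(religiosity, upbringing, internet)
    print(f"R-squared, upbringing only:       {r2_base:.3f}")
    print(f"R-squared, upbringing + internet: {r2_full:.3f}")
    print(f"R-squared change:                 {r2_full - r2_base:.3f}")
```

The R-squared change is the share of variation that the added factor explains beyond what was already in the model. It describes association in the sample and says nothing by itself about causal direction.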

If there is a strong causal relationship between increased Internet use and less religiosity, why might this be the case? A few ideas:

1. The Internet opens people up to a whole realm of information beyond themselves. Traditionally, people would look to those around them, whether individuals or institutions, within relatively close proximity. The Internet breaks a lot of these social boundaries and allows people to search for information way beyond themselves.

2. The Internet offers social interactions in a way that religion used to. Instead of going to a religious congregation to meet people, the Internet offers the possibilities of finding like-minded people in all sorts of areas from hobbies and interests, people in the same career field, dating websites, and people you want to sell goods to. In other words, some of the social aspects of religion can now be replicated online.

3. The Internet in its medium and content tends to be individualistic. Anyone with an Internet connection can do all sorts of things without relying on others (outside of having a service provider). This simply feeds into individualistic attitudes that already existed in the United States.

It sounds like there is a lot more here for researchers to explore and unpack.

Sociologists argue it is difficult to find causal data for how inequality leads to different outcomes

Two sociologists tackle the question of how exactly inequality is related to a variety of social outcomes and argue it is difficult to find causal, and not correlative, data:

For all the brain power thrown at the problem since then, however, specific evidence about inequality’s effects has been hard to find. Mr. Jencks said he could already picture the book’s reviews, “Professor Doesn’t Know What He Is Talking About.”…

One problem with these analyses is that they are based on correlations between levels of inequality and variables like life expectancy or the odds of poor children climbing the income ladder. But such correlations can’t prove inequality causes other social ills. They can’t disentangle inequality from the myriad things pushing American society this way and that.

Life expectancy in the United States might lag that of other countries because the United States still does not have universal health care. Scandinavia may enjoy higher upward mobility than the United States because governments in Sweden, Denmark and other Scandinavian countries invest a lot in early childhood education and the United States does not.

Lane Kenworthy, a sociologist at the University of Arizona, is all too aware of these limitations. He was to be Mr. Jencks’s co-author on the book about inequality’s consequences. Now he is going it alone, hoping to publish “Should We Worry About Inequality?” next year.

“People that worry about inequality for normative reasons have been very quick to jump on plausible hypothesis and a little bit of evidence to make sweeping conclusions about its consequences,” Professor Kenworthy told me.

It sounds like these sociologists are asking for some more methodological rigor in studying how inequality affects social life. Finding direct relationships between social forces and outcomes can be difficult but I look forward to seeing more work on the subject.

Read more in this follow-up interview with Lane Kenworthy.

Correlation between migration patterns and state freedom in the United States?

A new report suggests there is a correlation between how free a state is and migration to it, with the freer states tending to be more conservative:

It found that the freest states tended to be conservative “red” states, while the least free were liberal “blue” states.

The freest state overall, the researchers concluded, was North Dakota, followed by South Dakota, Tennessee, New Hampshire and Oklahoma. The least free state by far was New York, followed by California, New Jersey, Hawaii and Rhode Island.

The study also compared its measures of economic and personal freedom to population shifts and income growth, and found that freer states tend to do better on both scores than those less free.

For example, it found a strong correlation between a state’s freedom ranking and migration, which means that Americans are gravitating toward states that have less-intrusive governments.

This might be part of an explanation for migration. But the website itself makes it difficult to find the correlation – go to the FAQs and then you can click through to a 234-page PDF file. And then I can’t find exact correlations. Here is what the regression results suggest (page 105 of the PDF):

The estimates from equation 2 imply that a half-unit change in fiscal policy score, for instance from Michigan to New Hampshire (2011 values), is associated with an increase in net interstate migration of about 2 percent of 2000 population; a half-unit change in regulatory policy score, for instance from New Jersey to Virginia (2011 values), is associated with an increase in net interstate migration of about 4.2 percent of 2000 population; and a quarter-unit change in personal freedom score, for instance from Alabama to Maine (2011 values), is associated with an increase in net interstate migration of about 2.5 percent of 2000 population. If we can interpret these relationships as causal, then to policy makers interested in attracting new residents and businesses we would recommend measures to increase freedom and reduce cost of living.

I would want to see some other variables tested to rule out other competing factors.

Correlations that get at why big cities lean toward Democrats

Richard Florida discusses several reasons, based on correlations, why big cities now so clearly lean toward the Democratic party:

Density played a key role in the metro vote. (To capture it we use a measure we of population-based density, which accounts for the concentration of people in metro). The average Obama metro was more than twice as dense as the average Romney metro, 412 versus 193 people per square mile. With a correlation of .50, density was an even bigger factor than population (where the correlation is .34). The reverse pattern holds for the share of Romney votes; the negative correlation for density (-.51) was significantly higher than that for population (-.33)…

The chart below plots the relationship between a metro’s share of college grads and its share of Obama votes. The line slopes steeply upward showing how the share of Obama votes increase alongside metro density. The share of college grads in a metro is positively correlated with the share of Obama votes (.42) and negatively with the share of Romney votes (-.44)…

The chart above shows the relationship between the share of the creative class and the share of Obama votes across metro areas. The line slopes steeply upward, indicating a considerable positive relationship. The share of creative class workers is positively correlated with the share of Obama votes (.40) and negatively with the share of Romney votes (-.41)…

Republicans may still be the party of the rich, but most of the country’s more-affluent metros lined up squarely in the Obama camp. The correlation between the average wages and salaries of metros and the share of Obama votes is positive (.50) and it is negative for Romney votes (-.51). This makes sense too, as larger metros have greater concentrations of knowledge-based talent and industries and are wealthier to begin with. (The associations we find are even more substantial for metros with more than one million people, with the correlations increasing to .71 for Obama and -.72 for Romney.) This follows the “Red State, Blue State, Rich State, Poor State” pattern identified by Andrew Gelman of Columbia University, who infamously found that while rich voters continue to trend Republican, rich states trend Democratic.

Florida argues this is evidence of class-based differences in American life, specifically, differences between the creative class and those in knowledge industries compared to the rest of the United States.

However, this raises a few questions:

1. The analysis here seems to be done across metropolitan areas while some of these voting patterns break down when we compare cities and suburbs. For example, there are those who suggest it is really cities and inner-ring suburbs that vote Democratic while more far-flung suburbs and exurbs vote Republican. See earlier posts about the analysis of Joel Kotkin – here and here.

2. Making claims with correlations is tricky. Florida acknowledges this before he rolls out the analysis: “As usual, I point out that correlation points to associations between variables only, not causation.” But then why stop the analysis at correlations here? Looking at the relationships just between two variables at a time ignores the complex relationships between factors like race, class, location, jobs, and more. Why not quickly run some regressions?

3. If this analysis is correct (and we need more in-depth analysis to check), why are Republicans so bad at appealing to the creative class?
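Point 2 above, that looking at two variables at a time can mislead, can be illustrated with a partial correlation, which is one simple way to hold a third variable constant. This is a minimal sketch on synthetic data: the variable names are hypothetical stand-ins, and the numbers are invented so that density drives both of the other variables while they have no direct link to each other.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def make_synthetic_metros(n=5_000, seed=2):
    """Made-up metros: density influences both other variables."""
    rng = random.Random(seed)
    density = [rng.gauss(0, 1) for _ in range(n)]
    college_share = [0.7 * d + rng.gauss(0, 1) for d in density]
    obama_share = [0.7 * d + rng.gauss(0, 1) for d in density]
    return college_share, obama_share, density


if __name__ == "__main__":
    college, obama, density = make_synthetic_metros()
    raw = pearson(college, obama)
    controlled = partial_correlation(college, obama, density)
    print(f"raw correlation, college share vs. Obama share: {raw:.2f}")
    print(f"same correlation, controlling for density:      {controlled:.2f}")
```

In this constructed example the raw correlation is clearly positive while the partial correlation is close to zero. Whether anything similar holds in Florida's data is an open question; a regression with several predictors would be the natural way to check.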

Correlation and not causation: Redskins games predict results of presidential election

Big events like presidential elections tend to bring out some crazy data patterns. Here is my nomination for the oddest one of this election season: how the Washington Redskins do in their final game before the election predicts the presidential election.

Since 1940 — when the Redskins moved to D.C. — the team’s outcome in its final game before the presidential election has predicted which party would win the White House each time but once.

When the Redskins win their game before the election, the incumbent party wins the presidential vote. If the Redskins lose, the non-incumbent wins.

The only exception was in 2004, when Washington fell to Green Bay, but George W. Bush still went on to win the election over John Kerry.

This is simply a quirk of data: how the Redskins do should have little to no effect on voting in other states. This is exactly what correlation without causation is about; there may be a clear pattern but it doesn’t necessarily mean the two related facts cause each other. There may be some spurious association here, some variable that predicts both outcomes, but even that is hard to imagine. Yet, the Redskins Rule has garnered a lot of attention in recent days. Why? A few possible reasons:

1. It connects two American obsessions: presidential elections and the NFL. A sidelight: both may involve a lot of betting.

2. So much reporting has been done on the 2012 elections that this adds a more whimsical and mysterious element.

3. Humans like to find patterns, even if these patterns don’t make much sense.

What’s next, an American octopus who can predict presidential elections?
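The "quirk of data" point can be made concrete with a little arithmetic. The sketch below rests on two assumptions that are not in the article: that the rule covers the 18 elections from 1940 through 2008 with 17 matches, and that there is a hypothetical pool of 10,000 unrelated yes/no patterns someone could have compared against election results.

```python
from math import comb


def prob_at_least(hits, trials, p=0.5):
    """Probability of at least `hits` successes in `trials` independent tries."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def chance_of_some_match(single_prob, n_patterns):
    """Probability that at least one of n independent patterns matches."""
    return 1 - (1 - single_prob) ** n_patterns


# Assumption: 18 elections (1940 through 2008), 17 consistent with the rule.
ELECTIONS = 18
MATCHES = 17
# Assumption: a hypothetical number of unrelated patterns one could check.
CANDIDATE_PATTERNS = 10_000

if __name__ == "__main__":
    single = prob_at_least(MATCHES, ELECTIONS)
    some = chance_of_some_match(single, CANDIDATE_PATTERNS)
    print(f"one coin-flip pattern matching this well: {single:.6f}")
    print(f"at least one of {CANDIDATE_PATTERNS} doing so: {some:.2f}")
```

With these assumed numbers, a single coin-flip pattern would match this well less than one time in ten thousand, yet a pool of 10,000 candidate patterns would turn up at least one such match about half the time. If enough patterns are available to search, one that looks striking is not surprising.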

Century 21 says winning NFL teams boost housing prices

A new study from Century 21 suggests housing values rise when NFL teams win:

The question was this: What is the impact on a city when the hometown team does well or doesn’t do well? Century 21 looked at teams’ successes, population growth from census numbers, home value appreciation and attendance rates. And the correlation between on-the-field success and real estate prices was evident:

Four of the five cities with teams that went from a losing record in 2010 to a winning record in 2011 saw average home sales prices increase between 2010 and 2011.

After winning the Super Bowl, Green Bay, Wis., saw a population growth of 1.7 percent in 2011, compared with runner-up Pittsburgh’s 0.6 percent growth.

Going from a record of 10-6 in 2010 to 2-14 in 2011, Indianapolis, the home of the Colts, saw a 19.8 percent decrease in home sales.

Eight of the nine cities with a team that had attendance rates of 100 percent or more in 2011 saw average home sales prices rise that year.

Here is the original Century 21 blog post with this information.

The NFL is a powerful entity but does it have this much power? Is this due to a small sample size (this article mentions only one year of data)? Are there other factors behind this correlation? If I had to guess at what is going on here, I suspect this is too small of a sample and that 2011 prices in certain cities happened to coincide with NFL results. Why not look at the housing crisis years and see the relationship between records and housing values?
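One way to see how little a sample this small can say is to put an interval around the reported proportions. The sketch below uses the Wilson score interval, a standard interval for a proportion that I am choosing for illustration and that the Century 21 post does not use. It also treats the cities as independent draws, which is itself an assumption.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def wilson_interval(successes, n, z=Z_95):
    """Wilson score interval for a proportion, returned as (low, high)."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    spread = math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2))
    half_width = z * spread / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Counts quoted from the Century 21 summary above.
    for label, successes, n in [
        ("losing-to-winning teams, prices up", 4, 5),
        ("full-attendance teams, prices up", 8, 9),
    ]:
        low, high = wilson_interval(successes, n)
        print(f"{label}: {successes}/{n}, 95% interval {low:.2f} to {high:.2f}")
```

For four of five cities the interval runs from roughly 38 to 96 percent. That range is wide enough to include whatever share of all cities saw prices rise that year, a figure the article does not report.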

I’m generally skeptical of sports fans and others that claim sports are important for the civic pride of a community or that new stadiums need to be funded by taxpayers because the loss of a team will hurt the local economy. However, this could be pure genius from Century 21. What better way to boost business than to hook your services to the popular NFL? Hey, there was even a Century 21 2012 Super Bowl ad!

Participating in culturally elite activities related to lower BMI?

A new sociological study suggests there is a relationship between participating in certain cultural activities and having a lower BMI:

The study uses survey data from 17 nations, most of which are in Europe. In each country, a representative sample of the population was asked not only about height and weight, but also about time spent in a variety of activities. These included reading, going to cultural events, socializing with family and friends, attending sporting events, watching TV, going shopping, and exercising.

A scale that measures interest in ideas, art, and knowledge—by surveying the amount of time spent reading, attending cultural events, going to movies, and using the Internet—is associated as strongly as exercise with a lower body-mass index, or BMI (a measure of weight relative to height). In other words, reading and exercise appear similarly beneficial in terms of BMI.

In contrast, people participating in other activities such as watching TV, socializing, playing cards, attending sporting events, and shopping have higher average BMI. Although time spent reading and time spent watching TV both expend few calories, one is associated with lower weight, and the other with higher weight…

So why might reading and related cultural activities be associated with thinness? The social meaning of the activity rather than the activity itself must be important for weight control. Leisure-time activities involve more than the calories burned; they also reflect differences across social groups in motives and means for good health.

These sound like interesting findings but I wonder if this is a classic example of “correlation does not imply causation.” Since these cultural activities might be related to social class, how do these findings line up with current statistics about weight (and health) by social class?