Wait, “RIP, McMansion” or are McMansions making a comeback?

Depending on who you read and what statistics are cited, McMansions are either returning or dead. Here is a new article in the second category:

The “McMansion” is dead.

That jumbo-sized, aspirational edifice, often with vaulted foyers, vast bathrooms and granite countertops, has become a relic of the housing bust in the Hudson Valley, builders and real estate experts say.

“It all boils down to the caution that buyers have adopted since the downturn,” said J. Philip Faranda, whose J. Philip Real Estate business is based in Briarcliff…

This is one way to interpret recent data: baby boomers and younger adult Americans, in particular, want smaller homes in more urban areas. Yet there is also evidence that big homes are rebounding: Toll Brothers is doing okay and plenty of big houses are still being built. So which side is correct? As I’ve suggested before, there are two possibilities. First, it will take some time to sort out the longer-term trends and whether the housing patterns of the economic crisis persist for years to come. Second, it may be that both trends are happening at once: more Americans want smaller homes even as a decent segment of wealthy Americans can still afford supersized homes.

You can collect lots of Moneyball-type data but it still has to be used well

Another report from the MIT Sloan Sports Analytics Conference provides this useful reminder about statistics and big data:

Politics didn’t come up at the conference, except for a single question to Nate Silver, the FiveThirtyEight election oracle who got his start doing statistical analysis on baseball players. Silver suggested there wasn’t much comparison between the two worlds.

But even if there’s no direct correlation, there was an underlying message I heard consistently throughout the conference that applies to both: Data is an incredibly valuable resource for organizations, but you must be able to communicate its value to stakeholders making decisions — whether that’s in the pursuit of athletes or voters.

And the Obama 2012 campaign successfully put this into practice. Here is one example:

Data played a major role. There’s perhaps no better example than the constant testing of email subject lines. The performance of the Obama email with the subject line “I will be outspent” earned the campaign an estimated $2.6 million. Had the campaign gone with the lowest-performing subject line, it would have raised $2.2 million less, according to “Inside the Cave,” a detailed report from Republican strategist Patrick Ruffini and the team at Engage.
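As a rough illustration of the arithmetic here (the dollar figures are the ones cited in the excerpt; the small test-send numbers below are invented), here is a minimal Python sketch of how subject-line testing might rank candidates by revenue per recipient:

```python
# Back-of-the-envelope arithmetic on the subject-line figures quoted above.
# The $2.6 million estimate and the "$2.2 million less" gap come from the
# cited "Inside the Cave" report; everything else here is hypothetical.
best_line_revenue = 2_600_000   # estimated haul of "I will be outspent"
gap_vs_worst = 2_200_000        # how much less the worst-tested line would have raised

worst_line_revenue = best_line_revenue - gap_vs_worst
print(f"Implied revenue of the worst line: ${worst_line_revenue:,.0f}")
print(f"The best line raised roughly {best_line_revenue / worst_line_revenue:.1f}x as much")

# A minimal sketch of how a campaign might rank candidate subject lines from
# a small test send (these test numbers are made up for illustration):
test_results = {
    "Subject line A": {"recipients": 10_000, "dollars_raised": 2_500},
    "Subject line B": {"recipients": 10_000, "dollars_raised": 1_100},
    "Subject line C": {"recipients": 10_000, "dollars_raised": 400},
}
ranked = sorted(test_results.items(),
                key=lambda kv: kv[1]["dollars_raised"] / kv[1]["recipients"],
                reverse=True)
for subject, r in ranked:
    print(f"{subject}: ${r['dollars_raised'] / r['recipients']:.2f} raised per recipient")
```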

This is an important reminder about statistics: they still have to be used well and effectively shared with leaders and the public. We are now in a world where more data is available than ever before but this doesn’t necessarily mean life is getting better.

I was recently in a conversation about the value of statistics. I suggested that if colleges and others could effectively train today’s students in statistics and how to use them in the real world, we might be better off as a society in a few decades, as these students go on to become leaders who make statistics a regular part of their decision-making. We’ll see if this happens…

Tim Tebow is America’s favorite pro athlete…with 3% of the vote!

That Tim Tebow is America’s favorite pro athlete may be a great headline, but it covers up the fact that very few people actually selected him:

How big is Tebow-mania? According to the ESPN Sports Poll, Tim Tebow is now America’s favorite active pro athlete.

The poll, calculated monthly, had the Denver Broncos quarterback ranked atop the list for the month of December. In the 18 years of the ESPN Sports Poll only 11 different athletes — a list that includes Michael Jordan, Tiger Woods and LeBron James — have been No. 1 in the monthly polling.

In December’s poll, Tebow was picked by 3 percent of those surveyed as their favorite active pro athlete. That put him ahead of Kobe Bryant (2 percent), Aaron Rodgers (1.9 percent), Peyton Manning (1.8 percent) and Tom Brady (1.5 percent) in the top-five of the results.

The poll results were gathered from 1,502 interviews from a nationally representative sample of Americans ages 12 and older.

Tebow is the favorite and yet he was selected by only 3% of respondents? This is not a lot. That he was picked so early in his career says something, but we need more data to think this through. What percentages have previous favorite athletes received? Have previous iterations of this poll had larger gaps between the favorite and second place? Are responses to this poll more dispersed across many athletes now than in the past?

I wonder about the validity of questions that ask Americans to pick a single favorite, since the top answers can garner such low totals. Isn’t Tebow’s advantage over Bryant easily within the margin of error of the survey? The issues here are even greater than with a recent poll asking about favorite Presidents. If you are a marketer, does this result clearly tell you that Tebow should sell your product?
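A back-of-the-envelope calculation suggests the answer is yes. Here is a minimal sketch, assuming a simple random sample of 1,502 respondents (the actual poll’s design and weighting may differ, and the two shares are not truly independent, so this is only approximate):

```python
import math

# Rough margin-of-error check on the Tebow vs. Bryant gap, assuming a simple
# random sample of n = 1,502 (the poll's actual design and weighting may differ).
n = 1502
p_tebow, p_bryant = 0.03, 0.02

def moe(p, n, z=1.96):
    """95% margin of error for a single proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Standard error of the gap, treating the two shares as independent
# (a simplification; answers to a "pick one" question are correlated).
se_gap = math.sqrt(p_tebow * (1 - p_tebow) / n + p_bryant * (1 - p_bryant) / n)

print(f"MOE on Tebow's 3%:  +/- {moe(p_tebow, n) * 100:.1f} points")
print(f"MOE on Bryant's 2%: +/- {moe(p_bryant, n) * 100:.1f} points")
print(f"95% margin on the 1-point gap: +/- {1.96 * se_gap * 100:.1f} points")
```

Under these assumptions the one-point gap between Tebow and Bryant carries a margin of roughly a point, so the ranking at the top of the poll is far from settled.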

Some quick history of the ESPN Sports Poll.

Using sociological surveys as political weapons

One commentator suggests that sociological surveys were used as political weapons recently in Russia:

Long before the State Duma elections of Dec 4, the ultra-rightist and liberal mass media, collaborating with anti-Russian elements in the West, forecast that the ruling United Russia party would suffer a serious defeat.

They organized all sorts of sociological surveys to support this thoroughly planned campaign and to push their “predictions” on the “crisis” facing Russian leaders and “sharply declining rating” of Prime Minister Vladimir Putin and President Dmitry Medvedev. The anti-Putin campaign became really vociferous when the United Russia congress officially and unanimously approved Putin as its nominee for the presidential election in March 2012.

It is true that the election results showed the correlation of political forces and sentiments in Russia, which is experiencing the difficult strategic consequences of the disintegration of the erstwhile Soviet Union and the impact of the global economic crisis.

I’m less interested in dissecting recent events in Russia (which are very interesting to read about) and more interested in thinking about using sociological findings as political weapons. The argument made here is that these surveys are part of a larger, unfair, ideological campaign waged by pundits and the media. Perhaps more importantly, there is a claim that the surveys were “organized,” suggesting they were only undertaken in order to push a particular viewpoint.

I don’t doubt that sociological findings are used in struggles for power. Indeed, sociologists are not value-neutral as they themselves have their own interests and class position within society. However, I tend to think the primary purpose of sociological data is to explain what is happening in society. If sociological surveys in Russia show dissatisfaction with Putin, is it incorrect to report this? Of course, statistics and facts are open to interpretation and need to be approached carefully.

Where is the line between using sociological surveys to illuminate social structures, practices, and beliefs and holding viewpoints and using sociological data to push those perspectives? Max Weber’s writings on value-neutrality are still useful today as we think about the proper use of sociological data.

Two issues with Most Admired poll: a large gap between #1 and others, low numbers for #1

While it is interesting to note that sitting presidents tend to lead in Gallup’s “Most Admired Lists,” two other things immediately struck me when looking at the tables:

1. There is a relatively big gap between #1 for most admired man and woman and everyone else. This year, President Obama is at 17% and his next closest competitor is at 3% while Hillary Clinton is also at 17% and her next competitor is at 7%. Since Gallup asks this as an open-ended question (exact phrasing: “What man that you have heard or read about, living today in any part of the world, do you admire most? And who is your second choice?”), it suggests that people name famous people, particularly types who are likely to be in the news a lot and whose positions are notable. If this is the case, is this really a survey about who is most admired or more about who is most well-known?

2. The leaders in each category are only at 17% and their competitors are quite a ways back. This suggests several possibilities. Perhaps Americans don’t think in these terms much: for men, 32% said none or had no opinion, and for women, 29% said the same. Additionally, 9% named a friend or relative when asked about men and 12% did so when asked about women. Even the sitting President is most admired by only 17%, suggesting that Americans are not necessarily looking to admire their political leaders. Another possible explanation is that there is a wide range of admirable famous people in the United States. For men, the top 10 account for only 31% of responses, while the top 10 women account for 47%. This might reflect the smaller number of women in positions of power or leadership, so attention is concentrated on a select few. (A quick sketch of these concentration figures follows below.)
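Here is that sketch, using the percentages cited above and treating the reported categories as non-overlapping, which is an assumption that may not exactly match Gallup’s own tabulation:

```python
# Quick arithmetic on the concentration of "Most Admired" responses, using
# the percentages cited above. Treating the categories as non-overlapping is
# an assumption and may not match Gallup's exact tabulation.
men = {"leader": 17, "runner_up": 3, "top10": 31, "none_no_opinion": 32, "friend_relative": 9}
women = {"leader": 17, "runner_up": 7, "top10": 47, "none_no_opinion": 29, "friend_relative": 12}

for label, d in (("Men", men), ("Women", women)):
    gap = d["leader"] - d["runner_up"]
    scattered = 100 - d["top10"] - d["none_no_opinion"] - d["friend_relative"]
    print(f"{label}: #1 leads #2 by {gap} points; the top 10 hold {d['top10']}% of responses; "
          f"roughly {scattered}% of answers are scattered among everyone else")
```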

This leads me to think that this poll may not really tell us much about anything. Those selected as admired draw relatively low figures, certain positions in society lead to being selected, and there are clear leaders but also a mass of closely clustered figures behind them.

UPDATE 12/28/11 10:11 PM – There seems to be similar variability in a recent poll that asked Americans which celebrity they would most want to live next door to. Also:

The majority of surveyed adults (42 percent) said they did not want to live next to any celebrities. “As a voyeuristic culture that breathlessly tracks every celebrity movement, it’s extremely surprising to see so many Americans saying they wouldn’t like to live next to any celebrity at all,” said Zillow Chief Marketing Officer Amy Bohutinsky. “In fact, more people opted out of a celebrity neighbor in 2012 than in any of the past years we’ve run this poll.”

Perhaps Americans are more tired of famous people this year?

Space for sociological factors when looking at scientific research

I ran into this blog post discussing a recent study published in Hormones and Behavior titled “Maternal tendencies in women are associated with estrogen levels and facial femininity.” This particular blogger at Scientific American starts out by suggesting she doesn’t like the results:

Friend of the blog Cackle of Rad was the first person to send me this paper, and when I first tried to read it, I got…pretty angry. Being a rather obsessively logical person, I know why I felt angry about this paper, and I worked very hard to step back from it and approach it in a thoroughly scientific manner.

It didn’t work, I called in Kate. That helped a little.

In the end, it’s not a bad paper. The data are the data, as my graduate advisor always says. But data need to be interpreted, and interpretations require context. And I think what’s missing from this paper is not data or adequate methods. It’s context.

In the end, the blogger suggests the needed “context” really amounts to a number of sociological factors that might influence perceptions:

So I wonder if the authors should make more effort to look into sociological factors. How does the intense pressure on women to become wives and mothers change as a function of how feminine the girl looks? I think you can’t separate any of this from this whole “women with higher estrogen want to be mothers” idea. This is why papers like this bug me, because they try to sell this as a evolutionary thing, without really acknowledging how much sociological pressure goes in to making women want to be mothers. And of course now I read them and I instantly get bristly, because what I see is people making assumptions about what I want, and what I must feel like, based on a few aspects of my physiology. It can be of value scientifically…but I don’t want it to apply to ME. I know it might be science, but I also find it more than a bit insulting.

I don’t know this area of research, so I don’t have much room to dispute the results of the original study. However, how this blogger goes about arguing for adding sociological factors is interesting. Here are two ways this argument could be made:

1. Argument #1: the study actually could benefit from sociological factors. Definitions of femininity are wrapped up in cultural assumptions and patterns. There is a lot of research to back this up, and perhaps we could point to specific parts of this study that would be altered if context were taken into account. But this doesn’t seem to be the conclusion of this blog post.

2. Argument #2: there must be some sociological factors involved here because I don’t like these results. On one hand, perhaps it is admirable to admit one doesn’t like certain research results; this is often true of scientific findings, which can challenge our personal understandings of the world. But why end the post by again emphasizing that the blogger doesn’t like the results? Does this simply reduce sociology to a backup science, one called in only to suggest that everything is cultural or relative or socially conditioned?

Perhaps I am simply reading too much into this. I don’t know how much natural science research could be improved by including sociological factors, whether it is often considered, or whether this is simply an unusual blog post. Argument #1 is the stronger scientific argument and is the one that should be emphasized more here.

Same data, different conclusions about poverty in “Rick Perry’s Texas”

With Texas Governor Rick Perry’s increased national exposure comes more scrutiny of his political career. While Perry has been quick to tout Texas’ economic progress during his tenure, the same data on the state’s poverty rate can be used to reach different conclusions.

A CNN article titled “Poverty grows in Rick Perry’s Texas” has this to say:

While it’s true that Texas is responsible for 40% of the jobs added in the U.S. over the past two years, its poverty rate also grew faster than the national average in 2010.

Texas ranks 6th in terms of people living in poverty. Some 18.4% of Texans were impoverished in 2010, up from 17.3% a year earlier, according to Census Bureau data released this week. The national average is 15.1%.

And being poor in Texas isn’t easy. The state has one of the lowest rates of spending on its citizens per capita and the highest share of those lacking health insurance. It doesn’t provide a lot of support services to those in need: Relatively few collect food stamps and qualifying for cash assistance is particularly tough.

“There are two tiers in Texas,” said Miguel Ferguson, associate professor of social work at University of Texas at Austin. “There are parts of Texas that are doing well. And there is a tremendous number of Texans, more than Perry has ever wanted to acknowledge, that are doing very, very poorly.”

This is the more negative interpretation of the data, one that highlights a growing underclass in Texas. Perry may talk about job growth, but a growing segment of the population isn’t sharing in it.

On the other side of the spectrum, a “Democrat and urbanist” (Instapundit’s description) suggests “The Texas Story Is Real”:

Lastly, the poverty rate is higher in Texas than in the US as a whole – 17.2% vs. 14.3%, not a small difference. However, the gap actually narrowed between the two during the 2000s, as the chart below in the percentage point change in the poverty rate illustrates.

[The graph shows the “Change in % of Population For Whom Poverty Status Is Determined (2000-2009).” Texas is at roughly 1.8%, the United States as a whole at roughly 1.95%.]

While every statistic isn’t a winner for Texas, most of them are, notably on the jobs front. And if nothing else, it does not appear that Texas purchased job growth at the expense of job quality, at least not at the aggregate level.  There are certainly deeper places one might drill into and find areas of concern or underperformance, but that’s true of everywhere.  And these top line statistics are commonly used to compare cities and states. Unless Texas critics are ready to retire these measures from their own arsenal, it seems clear that Texas is a winner.  The Texas story is real.

While acknowledging that Texas has a higher poverty rate (and these figures don’t include 2010 data), this commentator suggests that Texas saw a smaller increase in poverty over the 2000s than the United States as a whole.

This is a classic example of how two sides looking at the same data can come to very different conclusions. For one side, the poverty data indicates that Rick Perry is allowing part of Texas’ population to fall behind; for the other, the data isn’t so bad since the state’s poverty rate grew less than that of the United States as a whole. In this case, I suspect the data itself won’t win over either side since ideology trumps the data.
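A small sketch makes the two framings concrete, using the figures cited in the excerpts above (the 2000–2009 changes are the approximate values read off the chart):

```python
# Same poverty data, two framings. All figures are the ones cited above.
texas_2010, texas_2009 = 18.4, 17.3   # Texas poverty rate, % of population (CNN)
us_2010 = 15.1                        # national poverty rate in 2010 (CNN)
texas_change, us_change = 1.8, 1.95   # approx. 2000-2009 change in points (chart)

# Framing 1: compare levels -- Texas looks worse than the nation.
print(f"2010 levels: Texas {texas_2010}% vs. US {us_2010}% ({texas_2010 - us_2010:+.1f} points)")
print(f"One-year change in Texas: {texas_2010 - texas_2009:+.1f} points")

# Framing 2: compare changes over the decade -- Texas looks slightly better.
print(f"2000-2009 change: Texas {texas_change:+.2f} points vs. US {us_change:+.2f} points")
```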

More broadly, will most Americans weigh these fine-grained arguments when considering Rick Perry as a candidate? Probably not. To quote a sociologist from a post yesterday, “Questioning someone’s religious sincerity is totally a factor of whether you already like that person.” The same may apply to a candidate’s supposed economic impact.

Example of problems with statistics: “nearly 1,500 millionaires” (out of more than 235,000) “paid no federal taxes”

Statistics can be used well and they can be used not so well. Here is an example where the headline statistic suggests something different from the rest of the story:

Of an already small pool of millionaires and billionaires, 1,470 didn’t pay any federal income taxes in 2009, according to the Internal Revenue Service.

Just over 0.1% of taxpayers — or 8,274 out of 140 million total — made more than $10 million in 2009, according to the agency. More than 235,000 taxpayers earned $1 million or more, according to a recent report from the agency.

But of the high earners who avoided paying income taxes, many did so due to heavy charity donations or foreign investments.

About 46% of all American households won’t pay federal income tax in 2011, many due to low income, tax credits for child care and exemptions, according to the nonpartisan Tax Policy Center.

The headline makes it sound like a lot of millionaires are avoiding taxes. The actual percentage hinted at in the story suggests something else: less than 0.63% of all millionaires (1,470 out of more than 235,000, or fewer than 1 in 100) paid no federal income taxes. In the midst of a political debate about whether to raise taxes on the wealthy in America, each side could grab onto factual yet different figures: the 1,500 figure sounds high, as if the country is missing out on a lot of money, while the 0.63% figure suggests almost all millionaires pay some taxes. It wouldn’t take much to include both figures, the actual number and the percentage, in the story.
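The arithmetic is simple enough to show directly, using the counts reported in the story:

```python
# The arithmetic behind the headline, using the counts reported in the story.
no_tax_millionaires = 1_470
total_millionaires = 235_000   # "more than 235,000" taxpayers earned $1 million or more

share = no_tax_millionaires / total_millionaires
print(f"Share of millionaires paying no federal income tax: {share:.2%}")  # about 0.63%
print(f"Roughly 1 in {round(1 / share)} millionaires")                     # about 1 in 160
```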

Examples like this help contribute to the reaction some people have when they see statistics in the media: how can I trust any of them if they will just use the figures that suit them? All statistics become suspect and it is then hard to get a handle on what is going on in the world.

More difficulty with housing vacancy data

I’ve written about this before but here is some more evidence that one should be careful in looking at housing vacancy data:

In early 2009 the Richmond, Virginia press wrote numerous articles after quarterly HVS data on metro area rental vacancy rates “showed” that the rental vacancy rate in the Richmond, Virginia metro area in the fourth quarter of 2008 was 23.7%, the highest in the country. This shocked local real estate folks, including folks who tracked rental vacancy rates in apartment buildings in the area. The Central Virginia Apartment Association, e.g., found that the rental vacancy rate based on a survey of 52 multi-family properties in the Richmond, VA metro area was around 8% — above a more “normal” 5%, but no where close to 23.7%. And while the HVS attempts to measure the overall rental vacancy rate (and not just MF apartments for rent), the data seemed “whacky.”

When I talked to Census folks back then, they said that their quarterly metro area vacancy rates were extremely volatile and had extremely high standard errors, and that folks should focus on annual data.

However, “annual average” data from the HVS showed MASSIVELY different rental vacancy rates in Richmond, Virginia than did the American Community Survey, which also produces estimates of the vacancy rate in the overall rental market…

There are several other MSAs where the HVS rental vacancy rates just look plain “silly.” Some Census analysts agree that the HVS MSA data aren’t reliable, and even that several state data aren’t reliable, but, well, er, the national data are probably “ok” – which they are not.

If you want to read more on the issue, there are a number of links at the bottom of the story.

If the estimates are so far off from sources generally regarded as reliable, like the American Community Survey or the decennial Census, it looks like a new system is needed to calculate the quarterly vacancy rates.
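To see why a quarterly, metro-level estimate can swing so wildly, here is a minimal sketch of the sampling math. The sample sizes are hypothetical, and the simple-random-sample formula understates the error of a clustered survey design, so the actual standard errors would be even larger:

```python
import math

# Margin of error for an estimated vacancy rate under a simple random sample.
# Sample sizes are hypothetical; a clustered design like the HVS would have
# larger errors than this simple calculation suggests.
def vacancy_moe(p, n, z=1.96):
    """Approximate 95% margin of error for an estimated vacancy rate p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

local_rate = 0.08  # the ~8% rate the Central Virginia Apartment Association survey found
for n in (100, 250, 1000):
    print(f"n = {n:4d} rental units: an 8% estimate carries "
          f"+/- {vacancy_moe(local_rate, n) * 100:.1f} points")
```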

I wonder how much these figures could hurt a particular community. Take the case of Richmond: if data suggests the vacancy rate is the highest in the country even though it is not, is this simply bad publicity or would it actually affect decisions made by residents, businesses, and local governments?

Defining the poverty line in Indonesia

One statistic that tends to generate discussion, including in the United States, is where to draw the poverty line (see a quick overview here). The issue is also drawing attention in Indonesia:

According to the Central Statistics Agency (BPS), based on the one-dollar-a-day poverty line, there are about a million fewer poor Indonesians this year. The new BPS statistics released on Friday showed that the poor now constitute 12.5 percent of Indonesia’s population, down from 13.3 percent last year. BPS says this translates to 30.02 million poor Indonesians, as opposed to the 31.02 million in March last year…

BPS head Rusman Heriawan said this drop was recorded even though the government raised the poverty line to Rp 233,740 ($27.35) per capita per month from Rp 211,726 last year.

Despite the raised figure, the definition of poverty still worried experts. “The poverty line indicator is the minimum income for people to survive,” said Bambang Shergi Laksmono, dean of the University of Indonesia’s Social and Political Science Faculty.

Statistics are rarely just statistics: they are numbers politicians and others want to use to shed light on a particular issue. Here, the government wants to suggest that poverty has been reduced. On the other side, academics suggest there are plenty of people living in difficult situations and that the poverty threshold doesn’t really capture them. Who is right, or at least perceived as right, will be adjudicated in the court of public opinion.

While it appears that the number of people living in critical poverty has been reduced, this is also a reminder that one needs to look behind claims of progress to see what exactly is being measured and whether the measurements have simply changed.
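As a closing illustration of how much a headline poverty count depends on where the line is drawn, here is a minimal sketch. The two thresholds are the BPS figures quoted above; the list of incomes is entirely made up:

```python
# How a poverty headcount depends on the threshold. The two lines are the BPS
# figures cited above (rupiah per capita per month); the incomes are invented.
old_line = 211_726
new_line = 233_740

hypothetical_incomes = [150_000, 200_000, 215_000, 230_000, 240_000,
                        300_000, 450_000, 800_000, 1_200_000, 2_000_000]

below_old = sum(income < old_line for income in hypothetical_incomes)
below_new = sum(income < new_line for income in hypothetical_incomes)

print(f"Poor under the old line (Rp {old_line:,}): {below_old} of {len(hypothetical_incomes)}")
print(f"Poor under the new line (Rp {new_line:,}): {below_new} of {len(hypothetical_incomes)}")
# Raising the line mechanically raises the count, all else equal, so a reported
# drop despite a higher line is notable -- but the headline number still hinges
# on the threshold chosen.
```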