Nate Silver: “The World May Have A Polling Problem”

Looking at the disparities between polls and recent election results in the United States and the U.K., Nate Silver suggests the polling industry may be in some trouble:

Consider what are probably the four highest-profile elections of the past year, at least from the standpoint of the U.S. and U.K. media:

  • The final polls showed a close result in the Scottish independence referendum, with the “no” side projected to win by just 2 to 3 percentage points. In fact, “no” won by almost 11 percentage points.
  • Although polls correctly implied that Republicans were favored to win the Senate in the 2014 U.S. midterms, they nevertheless significantly underestimated the GOP’s performance. Republicans’ margins over Democrats were about 4 points better than the polls in the average Senate race.
  • Pre-election polls badly underestimated Likud’s performance in the Israeli legislative elections earlier this year, projecting the party to about 22 seats in the Knesset when it in fact won 30. (Exit polls on election night weren’t very good either.)
  • And in the recent U.K. general election, the final polls showed the Conservatives and Labour in a near dead heat; instead, the Conservatives won the popular vote by roughly 6 percentage points and captured an outright majority of seats.

At least the polls got the 2012 U.S. presidential election right? Well, sort of. They correctly predicted President Obama to be re-elected. But Obama beat the final polling averages by about 3 points nationwide. Had the error run in the other direction, Mitt Romney would have won the popular vote and perhaps the Electoral College.

Perhaps it’s just been a run of bad luck. But there are lots of reasons to worry about the state of the polling industry. Voters are becoming harder to contact, especially on landline telephones. Online polls have become commonplace, but some eschew probability sampling, historically the bedrock of polling methodology. And in the U.S., some pollsters have been caught withholding results when they differ from other surveys, “herding” toward a false consensus about a race instead of behaving independently. There may be more difficult times ahead for the polling industry.

It sounds like there are multiple areas for improvement:

1. Methodology. How can polls reach the average citizen this far into the 21st century? How can they collect representative samples?

2. The behavior of pollsters, the media, and political operatives. How are these polls reported? Is the media more interested in political horse races than in accurate poll results? Who can be viewed as an objective polling organization? Who can be viewed as an objective source for reporting and interpreting polling figures?

3. A decision for academics as well as pollsters: how accurate should polls be expected to be, and what counts as an acceptable margin of error (see the sketch below)? Should there be penalties for work that does not accurately reflect public opinion?
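To put rough numbers on the margin-of-error question in item 3, here is a minimal sketch (not from the original post) of the textbook 95% margin of error for a sampled proportion; the sample sizes are purely illustrative:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sampled proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative only: a result near 50% measured with samples of various sizes.
for n in (400, 800, 1500, 3000):
    print(f"n={n:>4}: +/- {margin_of_error(0.5, n) * 100:.1f} points")
```

Even a textbook-perfect sample of 1,500 respondents carries a margin of roughly 2.5 points, and that covers sampling error alone; the nonresponse and coverage problems Silver describes sit on top of it.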

Comparing the overall diversity of cities versus neighborhood diversity within cities

Nate Silver looks at how large cities can be diverse overall but still have high levels of residential segregation:

This is what the final metric, the integration-segregation index, gets at. It’s defined by the relationship between citywide and neighborhood diversity scores. If we graph the 100 most populous cities on a scatterplot, they look like this:

[Scatterplot: citywide diversity versus neighborhood diversity for the 100 most populous U.S. cities, with the regression line used to define the index]

The integration-segregation index is determined by how far above or below a city is from the regression line. Cities below the line are especially segregated. Chicago, which has a -19 score, is the most segregated city in the country. It’s followed by Atlanta, Milwaukee, Philadelphia, St. Louis, Washington and Baltimore.

Cities above the red line have positive scores, which mean they’re comparatively well-integrated. Sacramento’s score is a +10, for instance.

But here’s the awful thing about that red line. It grades cities on a curve. It does so because there aren’t a lot of American cities that meet the ideal of being both diverse and integrated. There are more Baltimores than Sacramentos.

Furthermore, most of the exceptions are cities like Sacramento that have large Hispanic or Asian populations. Cities with substantial black populations tend to be highly segregated. Of the top 100 U.S. cities by population, 35 are at least one-quarter black, and only 6 of those cities have positive integration scores.
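To make the index concrete, here is a rough sketch of the residual-from-the-regression-line idea using made-up numbers (these are not FiveThirtyEight’s actual diversity scores or data):

```python
import numpy as np

# Made-up (citywide diversity, neighborhood diversity) scores on a 0-100 scale;
# these are NOT FiveThirtyEight's numbers, just an illustration of the calculation.
cities = {
    "City A": (70, 62),
    "City B": (68, 45),
    "City C": (55, 50),
    "City D": (40, 33),
}

citywide = np.array([v[0] for v in cities.values()], dtype=float)
neighborhood = np.array([v[1] for v in cities.values()], dtype=float)

# Fit the line relating neighborhood diversity to citywide diversity
# (the red regression line in the scatterplot).
slope, intercept = np.polyfit(citywide, neighborhood, 1)

# A city's integration-segregation score is its residual: positive means more
# integrated than the line predicts, negative means more segregated.
for name, (cw, nb) in cities.items():
    expected = slope * cw + intercept
    print(f"{name}: score = {nb - expected:+.1f}")
```

Because the score is a residual, it only says how a city compares with the line fit through other cities, which is exactly the grading-on-a-curve problem Silver points out.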

So perhaps the Chicago School was correct: neighborhoods really do matter, even within cities of millions of people. An interesting recent map of Detroit showed something similar: parts of the city clearly cluster into traditional neighborhoods while other parts are closer to vacant urban prairie. The sociological literature on poor neighborhoods that emerged starting in the 1970s gets at a similar concept: there are unique conditions and processes at work in such neighborhoods. And Silver’s analysis confirms the sociological research on residential segregation that for decades (highlighted perhaps most memorably in American Apartheid) has argued that black-white residential segregation is in a league of its own.

Based on this kind of analysis, should sociologists use city-wide or region-wide measures of segregation only in conjunction with measures or methods that account for neighborhoods?

2014 Democrats echo 2012 Republicans in arguing political polls are skewed

Apparently, this is a strategy common to both political parties: when the poll numbers aren’t in your favor on the national stage, argue that the numbers are flawed.

The [Democratic] party is stoking skepticism in the final stretch of the midterm campaign, providing a mirror image of conservative complaints in 2012 about “skewed” polls in the presidential race between President Obama and Republican Mitt Romney.

Democrats who do not want their party faithful to lose hope — particularly in a midterm election that will be largely decided on voter turnout — are taking aim at the pollsters, arguing that they are underestimating the party’s chances in November.

At the center of the storm, just as he was in 2012, is Nate Silver of fivethirtyeight.com…

This year, Democrats have been upset with Silver’s predictions that Republicans are likely to retake the Senate. Sen. Heidi Heitkamp (D-N.D.) mocked Silver at a fundraising luncheon in Seattle that was also addressed by Vice President Biden, according to a White House pool report on Thursday.

“Pollsters and polling have sort of elbowed their way to the table in terms of coverage,” Berkovitz said. “Pollsters have become high profile: They are showing up on cable TV all the time.”

This phenomenon, in turn, has led to greatly increased media coverage of the differences between polling analyses. In recent days, a public spat played out between Silver and the Princeton Election Consortium’s Sam Wang, which in turn elicited headlines such as The Daily Beast’s “Why is Nate Silver so afraid of Sam Wang?”

There are lots of good questions to ask about political polls, including how they sample, what questions they ask, and how they make their projections. Yet that doesn’t automatically mean that everything has been manipulated to produce a certain outcome.

One way around this? Try to aggregate across various polls and projections. RealClearPolitics has a variety of polls in many races for the 2014 elections. Aggregation also helps get around the issue of celebrity, where people like Nate Silver build careers on being right until they are wrong.
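As a toy illustration of the basic idea (this is not how RealClearPolitics or FiveThirtyEight actually does it), a simple aggregator might average the polls in a race while giving larger samples more weight:

```python
# Hypothetical polls for one race: (pollster, sample size, candidate share in %).
# The names and numbers are made up for illustration.
polls = [
    ("Pollster A", 800, 48.0),
    ("Pollster B", 1200, 45.5),
    ("Pollster C", 600, 47.0),
    ("Pollster D", 1000, 44.0),
]

# A simple average treats every poll equally; weighting by sample size gives larger
# surveys more say. Real aggregators also adjust for recency and house effects.
simple_avg = sum(share for _, _, share in polls) / len(polls)
weighted_avg = sum(n * share for _, n, share in polls) / sum(n for _, n, _ in polls)

print(f"Simple average:   {simple_avg:.1f}%")
print(f"Weighted average: {weighted_avg:.1f}%")
```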

At the most basic level, the argument about flawed polls is probably about turning out the base to vote. If some people won’t vote because they think their vote can’t change the outcome, then you have to find ways to convince them that their vote still matters.

Krugman: prediction problems in economics due to the “sociology of economics”

Looking at the predictive abilities of macroeconomics, Paul Krugman suggests there is an issue with the “sociology of economics”:

So, let’s grant that economics as practiced doesn’t look like a science. But that’s not because the subject is inherently unsuited to the scientific method. Sure, it’s highly imperfect — it’s a complex area, and our understanding is in its early stages. And sure, the economy itself changes over time, so that what was true 75 years ago may not be true today — although what really impresses you if you study macro, in particular, is the continuity, so that Bagehot and Wicksell and Irving Fisher and, of course, Keynes remain quite relevant today.

No, the problem lies not in the inherent unsuitability of economics for scientific thinking as in the sociology of the economics profession — a profession that somehow, at least in macro, has ceased rewarding research that produces successful predictions and rewards research that fits preconceptions and uses hard math instead.

Why has the sociology of economics gone so wrong? I’m not completely sure — and I’ll reserve my random thoughts for another occasion.

This is an occasional discussion in social sciences like economics or sociology: how much are they really sciences, in the sense of making testable predictions (about social behavior rather than the natural world), and how much are they more interpretive disciplines? I’m not surprised Krugman takes this stance, but it is interesting that he locates the issue within the discipline itself, which he says rewards the wrong things. If this is the case, what could be done to reward successful predictions? At this point, Krugman is suggesting a problem without offering much of a solution. As a number of people, like Nassim Taleb and Nate Silver, have noted in recent years, making predictions is quite difficult, requires a more humble approach, and calls for particular methodological and statistical tools.

Will citing the Kindle location, and not the page number, become the norm?

Nate Silver’s book The Signal and the Noise contains an interesting bibliographic twist: he sometimes cites Kindle locations, not page numbers. Here is an example: footnote 42 in Chapter 8.

McGrayne, The Theory That Would Not Die, Kindle location 46.

Silver doesn’t do this for every book, though he does sometimes note that a book is the Kindle edition when giving the full citation. Is Silver on to a new trend? Will readers and scholars want Kindle locations?

I think we’re probably a long way from this becoming standard. The problem is that it requires having all of your books in Kindle form. Ebooks are popular, but I’m not sure how far people are willing to go to replace all of their older books with Kindle editions. (Particularly if you are dealing with more esoteric published material.) I could see this happening more for new books, which are more likely to be purchased in Kindle form. Or perhaps we are headed for a world where everyone has Kindle access to all major books (a subscription service? an expanded Project Gutenberg?) on their phone, tablet, or computer, and looking up a Kindle location becomes really easy.

Perhaps this won’t really matter until I see it in a student research paper…

“The Nate Silver of immigration reform”

Want a statistical model that tells you which Congressman to lobby on immigration reform? Look no further than a political scientist at UC San Diego:

In the mold of Silver, who is famous for his election predictions, Wong bridges the gap between equations and shoe-leather politics, said David Damore, a political science professor at the University of Nevada, Las Vegas and a senior analyst for Latino Decisions, a political opinion research group.

Activists already have an idea of which lawmakers to target, but Wong gives them an extra edge. He can generate a custom analysis for, say, who might be receptive to an argument based on religious faith. With the House likely to consider separate measures rather than a comprehensive bill, Wong covers every permutation.

“In the House, everybody’s in their own unique geopolitical context,” Damore said. “What he’s doing is very, very useful.”

The equations Wong uses are familiar to many political scientists. So are his raw materials: each lawmaker’s past votes and the ethnic composition of his or her district. But no one else appears to be applying those tools to immigration in quite the way Wong does.

So is there something extra in the models that others don’t have, or is Wong especially good at interpreting the results? The article suggests there are some common factors all political scientists would consider, but it also hints at more hidden factors like religiosity or district-specific dynamics.
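The article does not spell out Wong’s actual specification, but as a purely hypothetical sketch of the general approach it describes (past votes plus district composition, perhaps with something like a religiosity measure), a minimal model might look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up training data, one row per lawmaker; this is NOT Wong's data or model.
# Hypothetical features: share of past pro-immigration votes, Hispanic share of
# the district, and a religiosity proxy for the district (all scaled 0-1).
X = np.array([
    [0.9, 0.40, 0.3],
    [0.7, 0.25, 0.5],
    [0.6, 0.30, 0.4],
    [0.3, 0.15, 0.9],
    [0.2, 0.10, 0.8],
    [0.1, 0.05, 0.7],
])
# Outcome: 1 = supported a past immigration measure, 0 = opposed it.
y = np.array([1, 1, 1, 0, 0, 0])

model = LogisticRegression().fit(X, y)

# Estimated probability that a hypothetical lawmaker would support a new measure.
new_lawmaker = np.array([[0.5, 0.20, 0.6]])
print(f"Estimated support probability: {model.predict_proba(new_lawmaker)[0, 1]:.2f}")
```

The point of the sketch is only the shape of the exercise: turn each lawmaker into a row of features, estimate a probability of support, and let activists sort and target accordingly.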

A fear I have for Nate Silver as well: what happens when the models are wrong? Those who work with statistics know that these are just predictions and that statistical models always carry error, but this isn’t necessarily how the public sees things.

Will Nate Silver ruin his brand with NCAA predictions?

Statistical guru Nate Silver, known for his 2012 election predictions, has been branching out into other areas recently on the New York Times site. Check out his 2013 NCAA predictions. Or look at his 2013 Oscar predictions.

While Silver has a background in sports statistics, I wonder if these forays into new areas with the imprimatur of the New York Times will eventually backfire. In many ways, these new areas have less data than presidential elections, and thus Silver has to step further out on a limb. For example, consider his predictions for the 2013 NCAA bracket.

The top pick for 2013, Louisville, only has a 22.7% chance of winning. If Silver goes with Louisville as his pick, and he does, then by his own figures he will be wrong 77.3% of the time. These are not good odds.

I’m not sure Silver can really win much by predicting the NCAA champion or the Oscars because the odds of making a wrong prediction are higher. What happens if he is wrong a number of times in a row? Will people still listen to him in the same way? What happens when the 2016 presidential election comes along? Of course, Silver could continue to develop better models and make more accurate picks, but even this takes attention away from his political predictions.
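To put the wrong-several-times-in-a-row worry in perspective, here is a quick calculation built on the post’s own 22.7% figure, with the simplifying assumptions that the headline pick has a similar chance each year and that years are independent:

```python
# The 22.7% figure comes from the post: even the most likely champion loses the
# tournament more than three-quarters of the time.
p_right = 0.227

# Assume (simplistically) a similar chance each year and independence across years.
for years in (1, 2, 3, 5):
    p_all_wrong = (1 - p_right) ** years
    print(f"Chance the headline pick misses {years} straight year(s): {p_all_wrong:.0%}")
```

Even if the model is exactly right, missing the headline pick three years running is close to a coin flip, which is precisely the communication problem: the public tends to judge the pick, not the stated probability.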

One chart that situates the 2012 Republican presidential contenders

One of the key purposes of a chart or graph is to distill a lot of complicated information into a simple graphic so readers can quickly draw conclusions. In the midst of a crowded field of people who may (or may not) be vying to be the Republican candidate for president in 2012, one chart attempts to do just that.

This chart has two axes: moderate to conservative and insider to outsider. While these may be fuzzy concepts, creator Nate Silver suggests these axes give us some important information:

With that said, it is exceptionally important to consider how the candidates are positioned relative to one another. Too often, I see analyses of candidates that operate through what I’d call a checkbox paradigm, tallying up individual candidates’ strengths and weaknesses but not thinking deeply about how they will compete with one another for votes.

Silver then explains two other pieces of information encoded in the circle used to place each candidate on the graph: its color indicates the candidate’s region and its size represents the candidate’s relative stock on Intrade.
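As a rough sketch of how a chart like this packs four variables into one graphic (position on two axes, color, and circle size), here is a minimal matplotlib example; the candidate names, positions, regions, and stock values are all invented for illustration rather than taken from Silver’s chart:

```python
import matplotlib.pyplot as plt

# All names, positions, regions, and "stock" values are invented for illustration;
# Silver's chart used his own candidate placements and Intrade prices.
candidates = {
    # name: (moderate <-> conservative, insider <-> outsider, region, relative stock)
    "Candidate A": (-0.7, -0.8, "Northeast", 30),
    "Candidate B": (0.8, 0.9, "West", 15),
    "Candidate C": (0.1, -0.2, "Midwest", 20),
    "Candidate D": (0.3, 0.1, "South", 18),
}
region_colors = {"Northeast": "blue", "West": "red", "Midwest": "green", "South": "orange"}

fig, ax = plt.subplots()
for name, (ideology, outsider, region, stock) in candidates.items():
    # Position encodes the two axes, color encodes region, size encodes relative stock.
    ax.scatter(ideology, outsider, s=stock * 20, c=region_colors[region], alpha=0.6)
    ax.annotate(name, (ideology, outsider))

ax.set_xlabel("Moderate  <->  Conservative")
ax.set_ylabel("Insider  <->  Outsider")
plt.show()
```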

Based on this chart, it looks like we have a diagonal running from top left to bottom right, from moderate insider (Mitt Romney) to conservative outsider (Sarah Palin) with Tim Pawlenty and Mike Huckabee trying to straddle the middle. We will have to see how this plays out.

But as a statistics professor who is always on the lookout for cool ways of presenting information, I find this an interesting graphic.