SurveyMonkey made good 2014 election predictions based on experimental web polls

Here is an overview of some experimental work at SurveyMonkey on political polling ahead of the 2014 elections:

For this project, SurveyMonkey took a somewhat different approach. They did not draw participants from a pre-recruited panel. Instead, they solicited respondents from the millions of people who complete SurveyMonkey’s “do it yourself” surveys every day, run by their customers for companies, schools and community organizations. At the very end of these customer surveys, they asked respondents if they could answer additional questions to “help us predict the 2014 elections.” That process yielded over 130,000 completed interviews across the 45 states with contested races for Senate or governor.

SurveyMonkey tabulated the results for all adult respondents in each state after weighting to match Census estimates for gender, age, education and race for adults — a relatively simple approach analogous to the way most pollsters weight random sample telephone polls. SurveyMonkey provided HuffPollster with results for each contest tabulated among all respondents as well as among subgroups of self-identified registered voters and among “likely voters” — those who said they had either already voted or were absolutely certain or very likely to vote (full results are published here).

“We sliced the data by these traditional cuts so we could easily compare them with other surveys,” explains Jon Cohen, SurveyMonkey’s vice president of survey research, “but there’s growing evidence that we shouldn’t necessarily use voters’ own assessments of whether or not they’ll vote.” In future elections, Cohen adds, they plan “to dig in and build more sophisticated models that leverage the particular attributes of the data we collect.” (In a blog post published separately on Thursday, Cohen adds more detail about how the surveys were conducted).

The results are relatively straightforward. The full SurveyMonkey samples did very well in forecasting winners, showing the ultimate victor ahead in all 36 Senate races and missing in just three contests for Governor (Connecticut, Florida and Maryland)…

The more impressive finding is the way the SurveyMonkey samples outperformed the estimates produced by HuffPost Pollster’s poll tracking model. Our models, which are essentially averages of public polls, were based on all available surveys and calibrated to correspond to results from the non-partisan polls that had performed well in previous elections. SurveyMonkey’s full samples in each state showed virtually no bias, on average. By comparison, the Pollster models overstated the Democrats’ margins against Republican candidates by an average of 4 percentage points. And while SurveyMonkey’s margins were off in individual contests, the spread of those errors was slightly smaller than the spread of those for the Pollster averages (as indicated by the total error, the average of the absolute values of the errors on the Democratic vs. Republican margins).
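To make that last distinction concrete, here is a minimal sketch, using invented numbers, of how signed bias and total (absolute) error summarize the same set of polling misses differently:

```python
# Hypothetical final-poll margins (Dem minus Rep, in points) and actual
# results for five races; all numbers are invented for illustration.
poll_margins   = [3.0, -1.5, 6.0, -4.0, 2.5]
actual_margins = [-1.0, -4.0, 3.5, -2.0, -2.0]

errors = [p - a for p, a in zip(poll_margins, actual_margins)]

# Signed bias: do the misses lean toward one party on average?
bias = sum(errors) / len(errors)

# Total error: how big are the misses, regardless of direction?
total_error = sum(abs(e) for e in errors) / len(errors)

print(f"bias = {bias:+.1f} pts, total error = {total_error:.1f} pts")
```

A set of polls can show near-zero bias while still carrying a sizable total error (the misses cancel out but are individually large), which is why the excerpt reports the two measures separately.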

The general concerns with web surveys involve obtaining a representative sample, either because it is difficult to identify respondents who meet the appropriate demographics or because the survey is open to everyone. But SurveyMonkey was able to produce good predictions for this past election cycle. Was it because (a) their samples were large enough that the data better approximated the general population (they were able to reach a large number of people who use their services), or (b) their weighting was particularly good?
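On option (b), here is a minimal sketch of one common weighting approach, raking (iterative proportional fitting), which adjusts respondent weights until the sample matches known population margins. The categories, census targets, and respondents below are all invented, and this is an assumption about the general technique, not SurveyMonkey’s actual procedure:

```python
import pandas as pd

# Invented respondents from a frame that skews young and female.
df = pd.DataFrame({
    "gender": ["F", "F", "F", "M", "F", "M", "F", "M"],
    "age":    ["18-44", "18-44", "45+", "18-44", "18-44", "45+", "45+", "18-44"],
})

# Hypothetical census margins to weight toward.
targets = {
    "gender": {"F": 0.52, "M": 0.48},
    "age":    {"18-44": 0.47, "45+": 0.53},
}

# Raking: repeatedly rescale weights so each variable's weighted
# distribution matches its target margin.
df["weight"] = 1.0
for _ in range(25):  # a few dozen passes is plenty for this toy example
    for var, margin in targets.items():
        share = df.groupby(var)["weight"].sum() / df["weight"].sum()
        df["weight"] *= df[var].map(lambda cat: margin[cat] / share[cat])

# The weighted shares now match the targets even though the raw sample didn't.
print(df.groupby("gender")["weight"].sum() / df["weight"].sum())
print(df.groupby("age")["weight"].sum() / df["weight"].sum())
```

Note that option (a) alone cannot do this job: a sample drawn from a skewed frame converges on the frame’s skew as it grows, not on the population’s true values, so to whatever extent the frame was imperfect, the weighting step was doing real work.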

The real test will come when a major organization, particularly a media outlet, relies solely on web polls ahead of a major election. Given these positive results, perhaps we will see this in 2016. Yet I imagine there may be some kinks to work out of the system, or that some organizations would only be willing to do it if they paired the web data with more traditional forms of polling.

The bias toward one party in 2014 election polls is a common problem

Nate Silver writes that 2014 election polls were generally skewed toward Democrats. However, this isn’t an unusual problem in election years:

This type of error is not unprecedented — instead it’s rather common. As I mentioned, a similar error occurred in 1994, 1998, 2002, 2006 and 2012. It’s been about as likely as not, historically. That the polls had relatively little bias in a number of recent election years — including 2004, 2008 and 2010 — may have lulled some analysts into a false sense of security about the polls.

Interestingly, this year’s polls were not especially inaccurate. Between gubernatorial and Senate races, the average poll missed the final result by an average of about 5 percentage points — well in line with the recent average. The problem is that almost all of the misses were in the same direction. That reduces the benefit of aggregating or averaging different polls together. It’s crucially important for psephologists to recognize that the error in polls is often correlated. It’s correlated both within states (literally every nonpartisan poll called the Maryland governor’s race wrong, for example) and amongst them (misses often do come in the same direction in most or all close races across the country).

This is something we’ve studied a lot in constructing the FiveThirtyEight model, and it’s something we’ll take another look at before 2016. It may be that pollster “herding” — the tendency of polls to mirror one another’s results rather than being independent — has become a more pronounced problem. Polling aggregators, including FiveThirtyEight, may be contributing to it. A fly-by-night pollster using a dubious methodology can look up the FiveThirtyEight or Upshot or HuffPost Pollster or Real Clear Politics polling consensus and tweak their assumptions so as to match it — but sometimes the polling consensus is wrong.

It’s equally important for polling analysts to recognize that this bias can just as easily run in either direction. It probably isn’t predictable ahead of time.
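Silver’s point about correlated error has a simple statistical consequence worth spelling out: averaging polls cancels only their independent noise, not any error component they share. Here is a toy simulation, with invented error magnitudes, of a polling average whose inputs share a common miss:

```python
import random

random.seed(0)
n_polls, trials = 10, 50_000
shared_sd, indep_sd = 3.0, 3.0  # points of shared vs. poll-specific error (invented)

avg_errors = []
for _ in range(trials):
    shared = random.gauss(0, shared_sd)  # a miss that hits every poll the same way
    poll_errors = [shared + random.gauss(0, indep_sd) for _ in range(n_polls)]
    avg_errors.append(sum(poll_errors) / n_polls)  # error of the polling average

rmse = (sum(e * e for e in avg_errors) / trials) ** 0.5
print(f"RMSE of a {n_polls}-poll average: {rmse:.2f} pts")
# With purely independent errors this would shrink toward 0 as polls are
# added; with a shared component it can never drop below ~3.0 pts here.
```

That floor on the average’s error is exactly why a year like 2014, in which nearly all the misses ran the same direction, hurt every aggregator at once.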

The key issue here seems to be the assumptions that pollsters make before the election: Who is going to turn out? Who is most energized? How do we predict who exactly is a likely voter? What percentage of a voting district identifies as Republican, Democrat, or Independent?

One thing that Silver doesn’t address is how this affects both perceptions of and reliance on such political polls. Having a large number of these polls lean in one direction (or lean Republican, as in previous election cycles) suggests there is more work to do in perfecting such polls. None of this is an exact science, yet the numbers seem to matter more than ever; both parties jump on the results either to trumpet their coming success or to try to get their base out to reverse the tide. I’ll be curious to see what innovations are introduced heading into 2016, when the polls matter even more for a presidential race.

Facebook not going to run voting experiments in 2014

Facebook is taking an increasing role in curating your news but has decided not to conduct experiments with the 2014 elections:

Election Day is coming up, and if you use Facebook, you’ll see an option to tell everyone you voted. This isn’t new; Facebook introduced the “I Voted” button in 2008. What is new is that, according to Facebook, this year the company isn’t conducting any experiments related to election season.

That’d be the first time in a long time. Facebook has experimented with the voting button in several elections since 2008, and the company’s researchers have presented evidence that the button actually influences voter behavior…

Facebook’s experiments in 2012 are also believed to have influenced voter behavior. Of course, everything is user-reported, so there’s no way of knowing how many people are being honest and who is lying; the social network’s influence could be larger or smaller than reported.

Facebook has not been very forthright about these experiments. It didn’t tell people at the time that they were being conducted. This lack of transparency is troubling, but not surprising. Facebook can introduce and change features that influence elections, and that means it is an enormously powerful political tool. And that means the company’s ability to sway voters will be of great interest to politicians and other powerful figures.

Facebook will still have the “I Voted” button this week:

On Tuesday, the company will again deploy its voting tool. But Facebook’s Buckley insists that the firm will not this time be conducting any research experiments with the voter megaphone. That day, he says, almost every Facebook user in the United States over the age of 18 will see the “I Voted” button. And if the friends they typically interact with on Facebook click on it, users will see that too. The message: Facebook wants its users to vote, and the social-networking firm will not be manipulating its voter promotion effort for research purposes. How do we know this? Only because Facebook says so.

It seems like there are two related issues here:

1. Should Facebook promote voting? I would guess many experts would welcome popular efforts to get people to vote. After all, how good is a democracy if many people don’t take advantage of their right to vote? Facebook is a popular tool, and if it can help boost political and civic engagement, what could be wrong with that?

2. However, Facebook is also a corporation that is collecting data. Its efforts to promote voting might be part of experiments; users aren’t immediately aware that they are participating in an experiment when they see an “I Voted” button. Or the company may decide to try to influence elections.

Facebook is not alone in promoting elections. Hundreds of media outlets promote election news. Don’t they encourage voting? Aren’t they major corporations? The key here appears to be the experimental angle: people might be manipulated. Might this be okay if (1) they know they are taking part (voluntary participation is key to social science experiments) and (2) it promotes the public good? This sort of critique implies that the first part is necessary because fulfilling a public good is not enough to justify the potential manipulation.

Political campaigns combining big data, ground games

Close elections mean both political parties are combining ground games and big data to try to eke out victories:

Workers like Ms. Wellington and Mr. Noble are, in the end, critical to any ground campaign, no matter how sophisticated data collection and targeting models are, said Sasha Issenberg, author of “The Victory Lab: The Secret Science of Winning Campaigns.”

“The great irony of the modern ground game is it’s this meeting of incredibly modern analytics and data married to very old-fashioned delivery devices,” he said. “It’s people knocking on doors; it’s people making phone calls out of phone banks; but the calculations that are determining which door and which phone are different.”…

The Democratic Senatorial Campaign Committee ramped up its commitment, creating the “Bannock Street project,” a multimillion dollar, data-driven effort to persuade, register and turn out voters.

“The easiest way to look at it is our strategy to winning is expanding the voting universe,” said Preston Elliott, Hagan’s campaign manager, in an interview in his Greensboro office. “It’s a little more machineish than just catching a wave and riding momentum.”

Republicans say they are catching up. In Raleigh, campaign workers and volunteers showed off a new smartphone app that helps canvassers target their door knocks. But Republican officials refused to reveal volunteer numbers, paid staff totals, field office locations or a tabulation of voter contacts. Nor would they allow reporters to recount the phone-bank pitch, “the secret sauce,” as they called it.

This is taking new information about voters – something political parties always want – and feeding it into real-time (or near-real-time) models in order to produce more effective targeted efforts in the limited time before elections.
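As a toy illustration of what such targeting models do, here is a hypothetical walk-list ranker: score each voter by the expected votes gained from one door knock, then sort. The voter records, probabilities, and assumed contact effect are all invented for the sketch:

```python
# Invented voter-file scores; in practice these come from turnout and
# support models built on voter-file and canvass data.
voters = [
    {"name": "A", "turnout_prob": 0.35, "support_prob": 0.80},
    {"name": "B", "turnout_prob": 0.90, "support_prob": 0.85},
    {"name": "C", "turnout_prob": 0.50, "support_prob": 0.55},
    {"name": "D", "turnout_prob": 0.20, "support_prob": 0.30},
]

def knock_value(voter, lift=0.10):
    """Expected extra votes from one knock, assuming (hypothetically) that
    a contact closes 10% of the voter's remaining turnout gap."""
    turnout_gain = lift * (1.0 - voter["turnout_prob"])
    return turnout_gain * voter["support_prob"]

# Knock the highest-value doors first.
walk_list = sorted(voters, key=knock_value, reverse=True)
for v in walk_list:
    print(v["name"], f"{knock_value(v):.3f}")
```

Under these made-up numbers, the low-propensity supporter (A) tops the list while the near-certain voter (B) falls to the bottom, which matches the “expanding the voting universe” strategy described in the quote above.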

Two other thoughts:

1. It would be interesting to then see how these new efforts fit with broad appeals politicians make to the public. Does this new kind of information and targeting mean that politicians will spend less time making big claims and instead focus on smaller segments of voters?

2. Americans aren’t always thrilled with the kind of information corporations or tech companies have about them. Are they happy with political parties having more information? Of course, people don’t have to give out this information, but it is going into the hands of political parties, which don’t exactly have the highest ratings these days.

2014 Democrats echo 2012 Republicans in arguing political polls are skewed

Apparently, this is a strategy common to both political parties: when the poll numbers aren’t in your favor on the national stage, argue that the numbers are flawed.

The [Democratic] party is stoking skepticism in the final stretch of the midterm campaign, providing a mirror image of conservative complaints in 2012 about “skewed” polls in the presidential race between President Obama and Republican Mitt Romney.

Democrats who do not want their party faithful to lose hope — particularly in a midterm election that will be largely decided on voter turnout — are taking aim at the pollsters, arguing that they are underestimating the party’s chances in November.

At the center of the storm, just as he was in 2012, is Nate Silver of fivethirtyeight.com…

This year, Democrats have been upset with Silver’s predictions that Republicans are likely to retake the Senate. Sen. Heidi Heitkamp (D-N.D.) mocked Silver at a fundraising luncheon in Seattle that was also addressed by Vice President Biden, according to a White House pool report on Thursday.

“Pollsters and polling have sort of elbowed their way to the table in terms of coverage,” Berkovitz said. “Pollsters have become high profile: They are showing up on cable TV all the time.”

This phenomenon, in turn, has led to greatly increased media coverage of the differences between polling analyses. In recent days, a public spat played out between Silver and the Princeton Election Consortium’s Sam Wang, which in turn elicited headlines such as The Daily Beast’s “Why is Nate Silver so afraid of Sam Wang?”

There are lots of good questions to ask about political polls, including questions about their sampling, the wording of their questions, and how they make their projections. Yet that doesn’t automatically mean everything has been manipulated to lead to a certain outcome.

One way around this? Try to aggregate across various polls and projections. RealClearPolitics has a variety of polls in many races for the 2014 elections. Aggregation also helps get around the issue of celebrity, where people like Nate Silver build careers on being right – until they are wrong.

At the most basic level, the argument about flawed polls is probably about turning out the base to vote. If some people won’t vote because they think their vote won’t overturn the majority, then you have to find ways to convince them that their vote still matters.