Google Street View, machine learning, and social patterns

I have wondered why more researchers do not make use of Google Street View. Here is a new study that connects vehicles in neighborhoods with voting patterns and demographics:

Abstract: The United States spends more than $250 million each year on the American Community Survey (ACS), a labor-intensive door-to-door study that measures statistics relating to race, gender, education, occupation, unemployment, and other demographic factors. Although a comprehensive source of data, the lag between demographic changes and their appearance in the ACS can exceed several years. As digital imagery becomes ubiquitous and machine vision techniques improve, automated data analysis may become an increasingly practical supplement to the ACS. Here, we present a method that estimates socioeconomic characteristics of regions spanning 200 US cities by using 50 million images of street scenes gathered with Google Street View cars. Using deep learning-based computer vision techniques, we determined the make, model, and year of all motor vehicles encountered in particular neighborhoods. Data from this census of motor vehicles, which enumerated 22 million automobiles in total (8% of all automobiles in the United States), were used to accurately estimate income, race, education, and voting patterns at the zip code and precinct level. (The average US precinct contains 1,000 people.) The resulting associations are surprisingly simple and powerful. For instance, if the number of sedans encountered during a drive through a city is higher than the number of pickup trucks, the city is likely to vote for a Democrat during the next presidential election (88% chance); otherwise, it is likely to vote Republican (82%). Our results suggest that automated systems for monitoring demographics may effectively complement labor-intensive approaches, with the potential to measure demographics with fine spatial resolution, in close to real time.

And a little more explanation from a news source:

The researchers created an algorithm to identify the brand, model and year of every car sold in the US since 1990.

The types of cars also provided information about the race, income and education levels of a neighborhood, the study said.

Volkswagens and Aston Martins were associated with white neighborhoods while Chryslers, Buicks and Oldsmobiles tended to appear in African-American neighborhoods, the study found.

This study seems to do two things that get at different areas of research:

  1. Linking lifestyle choices to voting behavior as well as other social traits. Researchers and marketers have done this for decades. For example, see this earlier post about media consumption and voting behavior. This hints at the work of Bourdieu, who suggested class status is defined by cultural tastes and lifestyles in addition to access to resources and power.
  2. Connecting different publicly available big data sets to find connections. Google Street View is available to all and election outcomes are also accessible. All it takes is a method to put these two things together. Here, it was a machine learning algorithm that could identify different kinds of vehicles. It would take humans a long time to connect these pieces of data, but algorithms, once they are correctly identifying vehicles, can do this very quickly.
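
The sedan versus pickup rule quoted in the abstract is simple enough to write down directly. Here is a minimal sketch, assuming the vehicle counts have already been produced by some image classifier; the function name, the city names, and the counts are all made up for illustration and are not from the study:

```python
def predict_party(sedans: int, pickups: int) -> str:
    """Apply the abstract's rule of thumb: more sedans than pickups
    suggests a Democratic vote, otherwise a Republican vote."""
    return "Democrat" if sedans > pickups else "Republican"


# Hypothetical vehicle counts (sedans, pickups) for two invented cities.
counts = {"City A": (1200, 800), "City B": (500, 900)}

predictions = {city: predict_party(s, p) for city, (s, p) in counts.items()}
for city, party in predictions.items():
    print(city, "->", party)
```

The hard part of the study is the vehicle identification, not this comparison, and the abstract reports the rule as right only 88% (Democrat) and 82% (Republican) of the time.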

Of course, this still leaves us with questions about what to do with it all. The authors seem interested in helping facilitate more efficient national data-gathering efforts. The American Community Survey and the Decennial Census are both costly efforts. Could machine learning help reduce the effort needed while providing accurate results? At the same time, the causal mechanisms behind these findings are less clear: do people buy pick-up trucks because they are Republican? How does this choice of a vehicle fit with a larger constellation of behaviors and beliefs? If someone wanted to change voting patterns, could encouraging the purchase of more pick-up trucks or sedans actually change voting patterns (or are these merely correlations)?

The power grid, Wall Street, and presidential elections as “critical infrastructure”

The Department of Homeland Security is considering oversight of the critical infrastructure of the presidential election:

“We should carefully consider whether our election system, our election process, is critical infrastructure like the financial sector, like the power grid,” Homeland Security Secretary Jeh Johnson said.

“There’s a vital national interest in our election process, so I do think we need to consider whether it should be considered by my department and others critical infrastructure,” he said at media conference earlier this month hosted by the Christian Science Monitor…

DHS describes it this way on their website: “There are 16 critical infrastructure sectors whose assets, systems, and networks, whether physical or virtual, are considered so vital to the United States that their incapacitation or destruction would have a debilitating effect on security, national economic security, national public health or safety, or any combination thereof.”…

Johnson also said that the big issue at hand is that there isn’t a central election system since the states run elections. “There’s no one federal election system. There are some 9,000 jurisdictions involved in the election process,” Johnson said.

The term infrastructure usually brings to mind public services like electricity, water, and transportation. This is a broader definition that hints at what the government thinks is essential to American society. Wall Street as infrastructure? If something crashed for a significant amount of time – whether through error or malfunction or nefarious intervention – the ripple effects could be huge. If the national election system couldn’t be trusted, it could have significant implications for a democracy.

See the full list of the 16 areas identified as critical infrastructure – including food and agriculture as well as critical manufacturing – at the DHS website. I wonder what other sectors could be added in coming years…

New Naperville mayor/era approved by 10% of Naperville adults

Naperville just had an election for the successor to long-time mayor George Pradel but the winning candidate did not receive support from much of the community:

 

[Image: Naperville mayoral election results]

According to the Census, Naperville has over 144,000 residents, of which over 71% are 18 or over. That gives roughly 102,000 potential voters. Yet, under 18,000 people voted for mayor. This is less than a fifth of the adults. The winning candidate, local business owner Steve Chirico, won with 60.5% of the vote. But, those who voted for him only made up a little more than a tenth of the adults in the suburb.
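
The arithmetic behind these figures can be checked in a few lines. This is a rough sketch using the rounded numbers above ("over 144,000" residents, "over 71%" adults, "under 18,000" votes), so the results are approximations rather than official turnout figures; the variable names are my own:

```python
residents = 144_000      # "over 144,000" residents, rounded down
adult_share = 0.71       # "over 71%" are 18 or over
votes_cast = 18_000      # "under 18,000" votes, used as an upper bound
winner_share = 0.605     # winning candidate's share of the votes cast

adults = residents * adult_share               # potential voters
turnout = votes_cast / adults                  # share of adults who voted
winner_votes = votes_cast * winner_share       # votes for the winner
winner_share_of_adults = winner_votes / adults

print(f"Potential voters: {adults:,.0f}")
print(f"Turnout among adults: {turnout:.1%}")
print(f"Winner's share of all adults: {winner_share_of_adults:.1%}")
```

With these inputs, turnout comes out at about 17.6% of adults and the winner's support at about 10.7% of adults, which matches "less than a fifth" and "a little more than a tenth."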

Turnout is a big issue in many elections, particularly local elections that are held separately from major national or statewide races. Theoretically, this frees up more attention for local candidates. Yet, for a suburb like Naperville that has a high quality of life and often claims that it has a strong community spirit, the election that was said by some to be about a new era is really more of a whimper than a resounding suggestion about what direction Naperville is headed.

Facebook not going to run voting experiments in 2014

Facebook is taking an increasing role in curating your news but has decided not to conduct experiments with the 2014 elections:

Election Day is coming up, and if you use Facebook, you’ll see an option to tell everyone you voted. This isn’t new; Facebook introduced the “I Voted” button in 2008. What is new is that, according to Facebook, this year the company isn’t conducting any experiments related to election season.

That’d be the first time in a long time. Facebook has experimented with the voting button in several elections since 2008, and the company’s researchers have presented evidence that the button actually influences voter behavior…

Facebook’s experiments in 2012 are also believed to have influenced voter behavior. Of course, everything is user-reported, so there’s no way of knowing how many people are being honest and who is lying; the social network’s influence could be larger or smaller than reported.

Facebook has not been very forthright about these experiments. It didn’t tell people at the time that they were being conducted. This lack of transparency is troubling, but not surprising. Facebook can introduce and change features that influence elections, and that means it is an enormously powerful political tool. And that means the company’s ability to sway voters will be of great interest to politicians and other powerful figures.

Facebook will still have the “I voted” button this week:

On Tuesday, the company will again deploy its voting tool. But Facebook’s Buckley insists that the firm will not this time be conducting any research experiments with the voter megaphone. That day, he says, almost every Facebook user in the United States over the age of 18 will see the “I Voted” button. And if the friends they typically interact with on Facebook click on it, users will see that too. The message: Facebook wants its users to vote, and the social-networking firm will not be manipulating its voter promotion effort for research purposes. How do we know this? Only because Facebook says so.

It seems like there are two related issues here:

1. Should Facebook promote voting? I would guess many experts would like popular efforts to try to get people to vote. After all, how good is democracy if many people don’t take advantage of their rights to vote? Facebook is a popular tool and if this can help boost political and civic engagement, what could be wrong with that?

2. However, Facebook is also a corporation that is collecting data. Their efforts to promote voting might be part of experiments. Users aren’t immediately aware that they are participating in an experiment when they see an “I voted” button. Or, the company may decide to try to influence elections.

Facebook is not alone in promoting elections. Hundreds of media outlets promote election news. Don’t they encourage voting? Aren’t they major corporations? The key here appears to be the experimental angle: people might be manipulated. Might this be okay if (1) they know they are taking part (voluntary participation is key to social science experiments) and (2) it promotes the public good? This sort of critique implies that the first part is necessary because fulfilling a public good is not enough to justify the potential manipulation.

2014 Democrats echo 2012 Republicans in arguing political polls are skewed

Apparently, this is a strategy common to both political parties: when the poll numbers aren’t in your favor on the national stage, argue that the numbers are flawed.

The [Democratic] party is stoking skepticism in the final stretch of the midterm campaign, providing a mirror image of conservative complaints in 2012 about “skewed” polls in the presidential race between President Obama and Republican Mitt Romney.

Democrats who do not want their party faithful to lose hope — particularly in a midterm election that will be largely decided on voter turnout — are taking aim at the pollsters, arguing that they are underestimating the party’s chances in November.

At the center of the storm, just as he was in 2012, is Nate Silver of fivethirtyeight.com…

This year, Democrats have been upset with Silver’s predictions that Republicans are likely to retake the Senate. Sen. Heidi Heitkamp (D-N.D.) mocked Silver at a fundraising luncheon in Seattle that was also addressed by Vice President Biden, according to a White House pool report on Thursday.

“Pollsters and polling have sort of elbowed their way to the table in terms of coverage,” Berkovitz said. “Pollsters have become high profile: They are showing up on cable TV all the time.”

This phenomenon, in turn, has led to greatly increased media coverage of the differences between polling analyses. In recent days, a public spat played out between Silver and the Princeton Election Consortium’s Sam Wang, which in turn elicited headlines such as The Daily Beast’s “Why is Nate Silver so afraid of Sam Wang?”

There are lots of good questions to ask about political polls, including looking at their sampling, the questions they ask, and how they make their projections. Yet, that doesn’t automatically mean that everything has been manipulated to lead to a certain outcome.

One way around this? Try to aggregate among various polls and projections. RealClearPolitics has a variety of polls in many races for the 2014 elections. Aggregation also helps get around the issue of celebrity where people like Nate Silver build careers on being right – until they are wrong.
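
As a rough illustration of what aggregation means, here is a minimal sketch that takes a plain unweighted average of several polls in one race. The pollster names and margins are invented, and this is not a description of how RealClearPolitics or any other aggregator actually computes its numbers; real aggregators may weight by sample size, recency, or pollster track record:

```python
from statistics import mean

# Hypothetical poll results for one race: margin is Republican minus
# Democrat, in percentage points. These numbers are made up.
polls = [
    {"pollster": "Poll A", "margin": 4.0},
    {"pollster": "Poll B", "margin": -1.0},
    {"pollster": "Poll C", "margin": 3.0},
]


def average_margin(polls):
    """Unweighted mean of the poll margins."""
    return mean(p["margin"] for p in polls)


avg = average_margin(polls)
print(f"Average margin across {len(polls)} polls: {avg:+.1f} points")
```

Even this simple average shows the appeal: one outlying poll moves the combined number far less than it moves the headlines.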

At the most basic level, the argument about flawed polls is probably about turning out the base to vote. If some people won’t vote because they think their vote won’t overturn the majority, then you have to find ways to convince them that their vote still matters.

Large “sociological exercise”: nearly 1 in 6 global residents to vote in India’s elections

While Americans may think our country does things on a large scale, nothing quite matches the “sociological exercise” of democracy in India:

The world’s largest democracy is bracing itself for the most anticipated event every 5 years. To keep things in perspective, almost 1 in 6 on earth would be voting this April-May 2014. More than the election extravaganza, this is the world’s largest sociological exercise; an exercise that places everything else outside and puts the Indian at heart and mind while casting the ballot. As much as the focus on this has been the youth, there is a particular section of society which is slightly undermined yet equally important; the Indian women.

India has over 1.2 billion people while the US has over 310 million. While the American Revolution led to a new kind of country and government sometimes referred to as the American experiment (attributed to de Tocqueville), this is quite different than developing a modern government and economy for so many people.

I sometimes think some of the current issues in the United States simply have to do with our relatively large population. Coming to a consensus among so many groups and interests is difficult. In comparison, other industrialized nations have smaller populations and are often more homogeneous. But these issues are multiplied in India with even more interests.

More aldermen voting with Emanuel than did with Daley

Chicago may have a newer mayor but a new study shows voting with the mayor is now even more pronounced for Chicago aldermen:

After analyzing 30 divided roll calls in the nearly two years since Emanuel took office, University of Illinois at Chicago researchers concluded that Emanuel has enjoyed more iron-fisted control over the council than former mayors Richard M. Daley, Richard J. Daley or Ed Kelly, the Democratic machine co-founder.

Twenty-one aldermen supported the mayor’s programs 100 percent of the time, while 18 others were more than 90 percent in lock-step.

There have been no shortage of controversies — ranging from speed cameras, police station and mental health clinic closings to the mayor’s Infrastructure Trust and his plan to nearly double water and sewer fees.

But only seven of the 30 issues drew six or more dissenting votes. Emanuel’s average level of support on all of the divided roll calls was 93 percent, compared to 83 percent during Richard J. Daley’s first two years in office and Kelly’s 88 percent…

Pressed to explain the City Council’s obedience, Simpson pointed to the take-no-prisoners reputation Emanuel built while working under former President Bill Clinton and current President Barack Obama and as chief architect of the 2006 Democratic takeover of the U.S. House.

Still Chicago, “the city that works”?

One issue with this analysis is that it still leaves Chicago residents with little knowledge of whether these voting patterns are unusual. Do other major cities have more contentious voting patterns? Or is this fairly normal for big cities outside of the occasional wide disagreement? There are always references to more contentious times in the history of the Chicago City Council (see the short-lived Council Wars) but how about even a long view within Chicago for the sake of comparison? I imagine this consistent voting together is fairly unusual but once you are around Chicago long enough, this becomes normal.

And regardless of the voting patterns, how about more analysis about whether Mayor Emanuel’s decisions have been good for Chicago in the long term? Some of this will take time to sort out…