Google Street View, machine learning, and social patterns

I have wondered why more researchers do not make use of Google Street View. Here is a new study that connects vehicles in neighborhoods with voting patterns and demographics:

Abstract: The United States spends more than $250 million each year on the American Community Survey (ACS), a labor-intensive door-to-door study that measures statistics relating to race, gender, education, occupation, unemployment, and other demographic factors. Although a comprehensive source of data, the lag between demographic changes and their appearance in the ACS can exceed several years. As digital imagery becomes ubiquitous and machine vision techniques improve, automated data analysis may become an increasingly practical supplement to the ACS. Here, we present a method that estimates socioeconomic characteristics of regions spanning 200 US cities by using 50 million images of street scenes gathered with Google Street View cars. Using deep learning-based computer vision techniques, we determined the make, model, and year of all motor vehicles encountered in particular neighborhoods. Data from this census of motor vehicles, which enumerated 22 million automobiles in total (8% of all automobiles in the United States), were used to accurately estimate income, race, education, and voting patterns at the zip code and precinct level. (The average US precinct contains 1,000 people.) The resulting associations are surprisingly simple and powerful. For instance, if the number of sedans encountered during a drive through a city is higher than the number of pickup trucks, the city is likely to vote for a Democrat during the next presidential election (88% chance); otherwise, it is likely to vote Republican (82%). Our results suggest that automated systems for monitoring demographics may effectively complement labor-intensive approaches, with the potential to measure demographics with fine spatial resolution, in close to real time.

And a little more explanation from a news source:

The researchers created an algorithm to identify the brand, model and year of every car sold in the US since 1990.

The types of cars also provided information about the race, income and education levels of a neighborhood, the study said.

Volkswagens and Aston Martins were associated with white neighborhoods while Chryslers, Buicks and Oldsmobiles tended to appear in African-American neighborhoods, the study found.

This study seems to do two things that get at different areas of research:

  1. Linking lifestyle choices to voting behavior as well as other social traits. Researchers and marketers have done this for decades. For example, see this earlier post about media consumption and voting behavior. This hints at the work of Bourdieu, who suggested class status is defined by cultural tastes and lifestyles in addition to access to resources and power.
  2. Connecting different publicly available big data sets to find connections. Google Street View is available to all and election outcomes are also accessible. All it takes is a method to put these two things together. Here, it was a machine learning algorithm by which different kinds of vehicles could be identified. It would take humans a long time to connect these pieces of data but algorithms, once they are correctly identifying vehicles, can do this very quickly.
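
The sedan-versus-pickup rule from the abstract is simple enough to write down. Here is a minimal sketch in Python. The `VehicleCounts` class, the `predict_lean` function, and the counts are all hypothetical illustrations, not the authors' code or data, and the hard part of the study (identifying vehicles in images with deep learning) is not attempted here.

```python
from dataclasses import dataclass


@dataclass
class VehicleCounts:
    """Hypothetical tally of vehicles detected in one city's street images."""
    city: str
    sedans: int
    pickups: int


def predict_lean(counts: VehicleCounts) -> str:
    """Rule of thumb from the abstract: more sedans than pickups -> Democrat.

    Ties fall under "otherwise" and are treated as Republican.
    """
    return "Democrat" if counts.sedans > counts.pickups else "Republican"


# Made-up counts for illustration only; these are not figures from the study.
examples = [
    VehicleCounts("City A", sedans=1200, pickups=400),
    VehicleCounts("City B", sedans=300, pickups=900),
]

for c in examples:
    print(c.city, predict_lean(c))
```

Keep in mind that the abstract reports this rule as right 88% of the time for Democratic-voting cities and 82% for Republican-voting ones, so any single prediction is a likelihood and not a certainty.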

Of course, this still leaves us with questions about what to do with it all. The authors seem interested in helping facilitate more efficient national data-gathering efforts. The American Community Survey and the Decennial Census are both costly efforts. Could machine learning help reduce the effort needed while providing accurate results? At the same time, the causal mechanisms behind these findings are less clear: do people buy pick-up trucks because they are Republican? How does this choice of a vehicle fit with a larger constellation of behaviors and beliefs? If someone wanted to change voting patterns, could encouraging the purchase of more pick-up trucks or sedans actually change voting patterns (or are these merely correlations)?

Comparing the McMansions of Matt Ryan and Tom Brady

Relive some of the excitement of the Super Bowl by comparing the McMansion of Matt Ryan in Duluth, Georgia versus Tom Brady’s homes:

I’d say Matty Ice picked himself the most conventional McMansion possible…

But what his house, or houses? I bet he has no taste…This house in Brookline, Massachusetts? You’re kidding me. It’s kind of tasteful. Okay, it’s great, it’s perfect…What about the house in LA? I bet that’s hideous…You know what though, Tommy Boy? You are not McMansion material…

The winner here is Matt Ryan for keeping it real.

All those hours of coverage of the big game and you didn’t see important information like this. Both clearly have large homes but there are notable differences. This analysis suggests this comes down to personal taste but I think there are some other factors at work:

  1. Brady operates in different locations where expectations about large homes may be different. Compared to the Atlanta area, are there fewer McMansions in Brookline (probably) or in the Los Angeles area (maybe not but there are also more legitimate mansions)?
  2. Brady operates in a different social circle than Ryan. With his model wife, Brady has to fit in with a range of famous people while Ryan is with the football crowd. Both have plenty of money but there is a difference in social class and taste a la Bourdieu.
  3. Both grew up in suburban areas: Ryan in Exton, Pennsylvania (outside Philadelphia) and Brady in San Mateo, California (Bay Area). This could influence both wanting to live in suburban areas now.
  4. Ryan is younger than Brady and perhaps he hasn’t had the time or experience to move to a more “mature” home.

Overall, I suspect many pro athletes have homes critics would call McMansions.

The cultural bubbles of popular TV shows tell us what exactly?

This is a cool set of maps of the popularity of 50 different TV shows across zip codes in the United States. But, what is the data and what exactly can it tell us? Here is the brief explanation:

When we looked at how many active Facebook users in a given ZIP code “liked” certain TV shows, we found that the 50 most-liked shows clustered into three groups with distinct geographic distributions. Together they reveal a national culture split among three regions: cities and their suburbs; rural areas; and what we’re calling the extended Black Belt — a swath that extends from the Mississippi River along the Eastern Seaboard up to Washington, but also including city centers and other places with large nonwhite populations.

Some quick thoughts:

  1. Can we assume that Facebook likes are an accurate measure? How many people are represented per zip code? Who tends to report their TV show preferences on Facebook? Why not use Nielsen data which likely has a much smaller sample but could be considered more reliable and valid?
  2. How exactly does television watching influence everyday beliefs and actions? Or, does it work the other way: people have certain beliefs and behaviors and they watch what confirms what they already like? Sociologists and others that study the effects of television don’t always have data on the direct connections between viewing and other parts of life. (I’m not suggesting television has no influence. Given that the average American still watches several hours a day, it is still a powerful medium even with the rise of
  3. The opening to the article suggests both that TV viewing and the related cultures fall along an urban/rural divide and that they split across three groups. The maps display three main groups – metro areas, rural areas, and areas with higher concentrations of African Americans. I would want to know more about two areas. First, political data – and this article wants to make the link between TV watching and the 2016 election – suggests the final divide is really in the suburbs between areas further out from the big city and those closer. Can we get finer grained data between exurbs and inner-ring suburbs? Second, does this mean that Latinos and Asian Americans aren’t differentiated enough to be their own TV watching cultures?
  4. The introduction to this article also repeats a common line among those that study television:

In the 1960s and ’70s, even if you didn’t watch a show, you at least probably would have heard of it. Now television, once the great unifier, amplifies our divisions.

We certainly are way into the cable era of television (and probably beyond with all of the options now available through the Internet and streaming) but could we argue instead that the earlier era of fewer channels and viewing options simply papered over differences? As numerous historians and other scholars have argued, the 1950s might have appeared to be a golden era but most of the benefits went to white, middle class, suburban families.

In other words, I would be hesitant to state that these TV patterns are strong evidence of three clearly different cultures in the United States. Could these television viewing patterns fit in with other cultural tastes differentiating various groups based on class and race and ethnicity? Yes, though I’d much rather see serious academic work on this developing Bourdieu’s ideas and encompassing all sorts of consumption items treasured by Americans (homes, vehicles, sports fandom, making those hard choices like Coke and Pepsi or McDonald’s and Burger King or Walmart and Target). Also, limiting ourselves to geography may not work as well – this approach has been tried by many including in books like The Big Sort or Our Patchwork Nation – as it did in the past.

High performing school districts driving residential segregation

A new sociological study suggests schools are helping drive residential segregation:

Study author Ann Owens, an assistant professor of sociology at USC Dornsife College of Letters, Arts and Sciences, examined census data from 100 major U.S. metropolitan areas, from Los Angeles to Boston. She found that, among families with children, neighborhood income segregation is driven by increased income inequality in combination with a previously overlooked factor: school district options.

For families with high income, school districts are a top consideration when deciding where they will live, Owens said. And for those in large cities, they have multiple school districts where they could choose to buy homes.

Income segregation between neighborhoods rose 20 percent from 1990 to 2010, and income segregation between neighborhoods was nearly twice as high among households that have children compared to those without…

She recommended that educational leaders should consider redrawing boundaries to reduce the number and fragmentation of school districts in major metropolitan areas. They also should consider designing inter-district choice plans and strengthening current plans within districts to address inequities.

Generally, wealth and race lead to residential segregation but it is interesting to see through what mechanisms this works. As Bourdieu (and others) suggested, schools tend to reproduce existing social stratification and here they work to reify desirable housing locations.

“The rise and rise of Pierre Bourdieu in US sociology”

A French sociologist looks at the popularity of Pierre Bourdieu:

Pierre Bourdieu would have turned 85 on 1 August 2015. Thirteen years after his death, the French sociologist remains one of the leading social scientists in the world. His work has been translated into dozens of languages (Sapiro & Bustamante 2009), and he is one of the most cited social theorists worldwide, ahead of major thinkers like Jurgen Habermas, Anthony Giddens, or Erving Goffman (Santoro 2008). That Bourdieu is one of the most prominent social theorists will come as no surprise to those accustomed to the academic scene. A more surprising fact, however, is that he is probably the most cited scholar in the social sciences. In a forthcoming paper on the reception of French sociologists in the United States, Andrew Abbott and I show that, at the turn of this decade, he is referenced in more than 100 sociological articles a year. Important authors like Paul DiMaggio or James Coleman are only cited 60 times, while Mark Granovetter has nearly 50 mentions. Bourdieu is also referenced more often than Émile Durkheim, who for a long time epitomized (French) sociology…

This diversity of topics influenced the reception of Bourdieu’s work abroad. As has been pointed out (see Sallaz and Zavisca, 2008), it was initially read by different (unrelated) groups. Though it happened fairly early, the reception of his work remained confined to local areas for over two decades. In the United States, this situation changed in the late 1980s following a number of efforts to emphasize the systematic character of Bourdieu’s research. The key initiative among these was the 1992 interview book co-authored by Bourdieu and Loïc Wacquant, Invitation to a Reflexive Sociology. Written in English with a US audience in mind, it aims at presenting Bourdieu’s system to a foreign audience. Our data shows that after publication of this book, his work subsequently gained widespread exposure beyond the limited local fields in which it was already popular. Not only were his concepts now used outside of those fields, but references to his work also increasingly pertained to theoretical aspects rather than to empirical ones. Starting in the mid-1990s, Bourdieu was regarded as a general social theorist and read across sub-disciplinary lines—as well as across disciplines.

What will happen next? Although prediction and social science don’t square well, several signs indicate that Bourdieu is currently entering the canon of worldwide sociology. In the United States, our study shows that while the number of references to his work continues to increase, scholars’ level of engagement with the text is decreasing. In fact, over the last few years, references to Bourdieu have become more allusive. To measure this change, we hand-coded several hundred references from different periods. The proportion of those extensively citing Bourdieu has decreased steadily since the 2000s. This trait is characteristic of a process of canonization, when an author becomes equated with an idea or a set of ideas (e.g. Foucault and power, Goffman and face-to-face interactions, etc.), and is therefore considered a mandatory reference on the topic. The citation becomes a ritual. In some cases, the author has obviously not read the text in question.

Has Bourdieu become a museum piece? It does not seem so, at least for now. Scholarly interest is still strong and his work is still very much discussed. A good indicator of this is the number of references to an author per article, and comparison with other authors is telling here. Whereas Durkheim is routinely cited but not much debated, and receives an average of one reference per article citing him (fig2a), Bourdieu’s work is still an object of active investment (fig2b). At least 25% of the articles citing Bourdieu make two references to his work, sometimes many more. Bourdieu may well be entering the canon, but his appropriation abroad still fosters debates.

This sort of analysis could be undertaken with any major figure in an academic field: how did their work spread, who was spreading it, when did it peak, and how did the citations coalesce around particular topics or ideas?

The case of Bourdieu is interesting for several reasons. It involves a sociologist from another country who wrote in another language and American sociology can sometimes be provincial. Bourdieu’s work began in the ethnographic realm but hit upon key areas in sociology after the 1960s including social class, culture, and education. His findings have been utilized in multiple disciplines and across countries.

At some point, will there be a Bourdieu backlash or major opposition? The article suggested the next stage is “museum piece” and this seems to imply that his work will remain important but fade into history. Who has the big ideas to replace Bourdieu or significantly tweak his work?

Patterns in college major by parent’s income

College students with parents with higher incomes study different subjects:

Once financial concerns have been covered by their parents, children have more latitude to study less pragmatic things in school. Kim Weeden, a sociologist at Cornell, looked at National Center for Education Statistics data for me after I asked her about this phenomenon, and her analysis revealed that, yes, the amount of money a college student’s parents make does correlate with what that person studies. Kids from lower-income families tend toward “useful” majors, such as computer science, math, and physics. Those whose parents make more money flock to history, English, and performing arts.

http://www.theatlantic.com/business/archive/2015/07/college-major-rich-families-liberal-arts/397439/
The explanation is fairly intuitive. “It’s … consistent with the claim that kids from higher-earning families can afford to choose less vocational or instrumental majors, because they have more of a buffer against the risk of un- or under-employment,” Weeden says. With average earnings for different types of degrees as well-publicized as they are—the difference in lifetime earnings among majors can be more than $3 million, one widely covered study found—it’s not hard to imagine a student deciding his or her academic path based on its expected payout. And it’s especially not hard to imagine poorer kids making this calculation out of necessity, while richer kids forgo that means-to-an-end thinking.

Another trend expressed in the data, Weeden notes, is that lower-income families and higher-income families tend to send their children to schools with different options for majors: Most of the priciest, top-tier schools don’t offer Law Enforcement as a major, for instance. There is also the possibility that children from higher-income families were more exposed to the sorts of art, music, and literature that colleges deem worthy of study, an exposure that might inspire them to pursue those subjects when they get to college…

From this angle, college majors and occupations start to look more and more like easily-interpreted, if slightly crude, badges doled out to people based on the wealth and educational levels of the parents they were born to. There’s a reason that the first question asked at parties is often “So, what do you do?” “If we tend to avoid asking acquaintances about their income,” four prominent sociologists wrote in the 2011 anthology The Inequality Reader, “it’s not just because doing so is viewed as too intrusive and personal but also because we suspect that querying about occupation will yield more in the way of useful information.”

Four quick thoughts:

1. Of course, what majors actually lead to what jobs is not as clear as people might make it out to be. Just because someone has a particular major doesn’t mean that is where they will be working in 10 or 20 years. At the same time, some majors might lend themselves to particular jobs right after college.

2. Outside of an associate’s degree, the majors with the lowest parent incomes (top of the chart) are helping professions. This might indicate a bigger interest in wanting to work with people or directly give back to the community. Reading uncharitably, do the majors with higher parent incomes lend themselves to a certain distance from people?

3. It is interesting that sociology, political science, and anthropology are higher up on the list of parent’s incomes. Students sometimes seem to suggest that these are luxury subjects – interesting perhaps (if they don’t think it is just common sense) but too difficult for finding a career.

4. This would all make sense in Bourdieu’s ideas about social class. Those with less economic capital tend to favor more functional items while those with more capital lean toward the abstract. Why should college major be exempt from the powerful organizing forces of social class?

“McAnger” over new big homes in New York City suburbs

Some new large homes in Westchester County have drawn some “McAnger”:

“This is really stupid,” wrote Laura Kerns. “No one needs this much house.”…”It’s sad, really,” David Raguso wrote. “This county just doesn’t care about the average person.”

Said Dana Doyle, “Bye bye, middle-class! The rich folk are taking over!”…

Like others, Daphne Philipson questioned the need for so much square footage. “The Gilded Age is back – and we know how well that went for everyone.”…

“Wretched excess,” he wrote. “There is nothing wrong with being financially successful, but why then not be reserved about it? How much house does a man need? Find meaning in meaningful things.”…Some were not so much annoyed but still critical of the new homes, critiquing the exterior appearance specifically as a hodgepodge of conflicting architectural styles. “Looks like it was thrown together at different times by different moods,” wrote Erika Kislaki-Bauer.

Eileen Healy Rehill lamented the addition of “more overly priced McMansions” in Westchester rather than “nice yet affordable housing for the middle class.” She was far from the only one, with housing for seniors and the disabled also mentioned.

Some familiar comments when McMansions are involved. Three quick thoughts, with the first two mentioned briefly in this summary of feedback:

1. Westchester County already is a wealthy county. It was known as the home to many wealthy estates as New York City was growing. A number of high-profile companies moved there post-World War II, including IBM. It is home to “Hipsterurbia.” In other words, McMansions are just symptomatic of a wealthy county where many communities would not welcome affordable housing and builders see ongoing opportunities for wealthy buyers.

2. These new homes are indeed large and luxurious. But, the conversation about “who needs this” can get sticky. How much do Westchester County residents consume? How many suburbanites buy a home that is too small for them? How many people don’t seek to signal their social status through the exterior of their home or the things inside? On one hand, Americans have historically tended to frown upon opulent wealth (hence, everyone wants to be middle class) yet consumption is rampant and the American middle class is very well off by global standards (though there may be a big gap between them and many Westchester County residents).

3. The critique of the architecture might seem class neutral. After all, people could build both big and small houses that match the local styles or are done in good taste. Yet, architectural styles and design are likely class-based tastes, a la Bourdieu.