Media and product consumption by political views

This article looks at how political campaigns are using media and product consumption data to appeal to voters. It also includes some interesting charts that map out the differences between people with different political leanings:

Inside microtargeting offices in Washington and across the nation, individual voters are today coming through in HDTV clarity — every single digitally-active American consumer, which is 91 percent of us, according to Pew Internet research. Political strategists buy consumer information from data brokers, mash it up with voter records and online behavior, then run the seemingly-mundane minutiae of modern life — most-visited websites, which soda’s in the fridge — through complicated algorithms and: pow! They know with “amazing” accuracy not only if, but why, someone supports Barack Obama or Romney, says Willie Desmond of Strategic Telemetry, which works for the Obama reelection campaign…

All of these online movements contribute to what Gage calls “data exhaust.” Email, Amazon orders, resume uploads, tweets — especially tweets — cough out fumes that microtargeters or data brokers suck up to mold hyper-specific messaging. We’ve been hurled into an era of “Big Data,” Gage said. In the last eight years the amount of information slopped up by firms like his, which sell information to politicians, has tripled, from 300 distinct bits on each voter in 2004 to more than 900 today. We have the rise of social media and mobile technology to thank for this.

What I like about this analysis is that it starts to get at the lifestyle behaviors or groups that underlie both consumer choices and political choices. Neither voting decisions nor consumer choices are made in a vacuum: they are guided by larger concerns that sociologists often talk about, such as class, education level, race/ethnicity, and two factors that don’t get as much attention as perhaps they should: where people live and who they interact with on a regular basis (not necessarily the same things but related to each other). While microtargeting may help tailor individual appeals, it might also obscure some of these larger concerns.

While the article suggests this data collection is all very creepy, one fact complicates the picture: some of this information is offered voluntarily by users.

Both Obama and Romney’s sites allow, if not encourage, visitors to login to their campaign websites with a Facebook account, thereby unveiling a wealth of information: email address, friend list, birthday, gender, and user ID. Obama’s team, in accordance with the president’s call for greater transparency, details his campaign’s privacy policy in an exhaustive 2,600-word treatise. It begins like an online Miranda Rights: “Make sure that you understand how any personal information you provide will be used.” Then things get a little weird.

Among other points, the policy says the campaign can monitor users’ messages and emails between members, share their personal information with any like-minded organization it chooses, and follow up by sending them news it deems they’d find worthwhile. In other words, target anger points. Then there’s something called “passive collection,” which means cookies — lots and lots of cookies. Obama’s campaign, as well as third-party vendors working with, spray trackers so other websites can flash personalized ads based on knowledge of the trip to barackobama.com. And finally, near the end of the policy, comes one more caveat: “Nothing herein restricts the sharing of aggregated or anonymized information, which may be shared with third parties without your consent.”

Romney’s site apparently wants even more from its visitors, asking users who login with Facebook to “post on (their) behalf” and “access (their) data any time” they’re not using the application. You can deny both functions.

Perhaps at the least, users should be made more aware upfront of how their information is going to be used. This could be similar to the new boxes included on credit card statements: the consumer should be able to clearly see what is going to happen rather than have to dig through online user agreements. At the same time, making users aware is different than stopping companies from using information in certain ways. I also wonder how these online companies, like banks and credit card providers, will find other ways to collect data and money if these avenues are closed off. For example, would the average internet user rather give up some of this personal information for the sale of targeted advertisements or pay a small fee to access a website each year?

Differences in who blogs by race and education

A new sociological study shows that who blogs is affected by both race and education:

While African Americans as a whole are less likely to afford laptops and personal computers, Internet-savvy blacks, on average, blog one and a half times to nearly twice as much as whites, while Hispanics blog at the same rate as whites, according to a study published in the March online issue of the journal, Information, Communication & Society.

“Blacks consume less online content, but once online, are more likely to produce it,” said the study’s author, Jen Schradie, a doctoral candidate in sociology at UC Berkeley and a researcher at the campus’s Berkeley Center for New Media.

Schradie analyzed data from more than 40,000 Americans surveyed between 2002 and 2008 for the Pew Internet and American Life Project, which tracks Internet use and social media trends. Her latest findings follow up on a 2011 study in which Schradie found a “digital divide” among online content producers based on education and socio-economic status…

But, she said, “While blacks are more likely to blog than whites, it doesn’t mean the digital divide is over. People with more income and education are still more likely to blog than those with just a high school education and Internet access.”

There is not a whole lot of public discussion about this “digital divide” but it is interesting to see how this plays out with blogs. Of course, blogs are just one part of the content of the Internet and are a form that generally lends itself to longer pieces of writing (say compared to Twitter, Facebook, comment sections, discussion boards). In general, how involved are minorities in other forms of web content?

I wonder if the link between blogging and education is tied to the idea that more educated Internet users feel like they have something to say and contribute. Or perhaps education leads people to think that they should have a voice. For example, if you think about Annette Lareau’s theories about two types of parenting, “concerted cultivation” leads to adults who are assertive and comfortable in conversing with others.

Sociology professor developed and used computer program for grading papers

Sociologist Ed Brent has developed and used a grading program for student papers:

Brent designed software called a SAGrader to grade student papers in a matter of seconds. The program works by analyzing sentences and paragraphs for keywords and relationships between terms. Brent believes the program can be used as a tool to save time for teachers by zeroing in on the main points of an essay and allowing teachers to rate papers for the use of language and style.

“I don’t think we want to replace humans,” Brent says in an article in Wired. “But we want to do the fun stuff, the challenging stuff. And the computer can do the tedious but necessary stuff.”

Using the software still requires work on the teacher’s part, though. To prepare the program to grade papers, a teacher must enter all of the components they expect a paper to include. Teachers also have to consider the hundreds of ways a student might address the pieces of an essay.
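To make the description above a bit more concrete, here is a minimal Python sketch of keyword-and-relationship checking. This is not SAGrader's actual method, which the article does not detail. The rubric format, the `grade` and `related` functions, and the sample essay are all hypothetical, and a real system would need far more phrasings than the handful listed here.

```python
import re

# Hypothetical rubric: each expected component lists alternative phrasings
# (the "hundreds of ways" a student might express it, reduced to a few).
RUBRIC = {
    "names the concept": ["concerted cultivation"],
    "links concept to class": ["middle class", "middle-class"],
    "names the author": ["lareau"],
}

def split_sentences(text):
    """Split text into rough sentences on ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def grade(essay, rubric):
    """Return (score, found), where found maps each component to True/False."""
    lowered = essay.lower()
    found = {
        component: any(phrase in lowered for phrase in phrases)
        for component, phrases in rubric.items()
    }
    score = sum(found.values()) / len(rubric)
    return score, found

def related(essay, term_a, term_b):
    """Crude relationship check: do both lowercase terms share a sentence?"""
    return any(
        term_a in s.lower() and term_b in s.lower()
        for s in split_sentences(essay)
    )

essay = (
    "Lareau argues that concerted cultivation is common in middle-class homes. "
    "Children raised this way learn to speak up."
)
score, found = grade(essay, RUBRIC)
print(round(score, 2), found)
print(related(essay, "concerted cultivation", "middle-class"))
```

Even this toy version shows where the teacher's work goes: the grading itself is trivial, while writing a rubric that anticipates every reasonable phrasing is not.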

Interestingly, one person in the testing business argues that the biggest issue is not how well the software does at grading but whether people believe the program can do a good job:

But it’s tough to tout a product that tinkers with something many educators believe only a human can do.

“That’s the biggest obstacle for this technology,” said Frank Catalano, a senior vice president for Pearson Assessments and Testing, whose Intelligent Essay Assessor is used in middle schools and the military alike. “It’s not its accuracy. It’s not its suitability. It’s the believability that it can do the things it already can do.”

If this were used widely and became normal practice, it could redefine what it means to be a professor or teacher. This is not a small issue in an era where many argue that learning online or from a book could be as effective (or at least as cost-effective) as sending students to pricey colleges.

I wonder what percentage of sociologists would support using such grading programs in their own classrooms and throughout academic institutions.

Fighting the “artificial positivity” on Facebook with EnemyGraph

A new Facebook app called EnemyGraph allows users to openly mark their “enemies” on Facebook:

EnemyGraph, a new app for Facebook, allows users to do just that: Declare their enemies on the world’s most popular social network.

It may sound sinister, but the motivation is more sociological, say the developers, a group from the Emerging Media + Communications program at the University of Texas at Dallas.

“Facebook has this artificial positivity kind of forced upon it,” said Harrison Massey, a student at UT Dallas who, along with Dean Terry, the director of the program, and Bradley Griffith, a graduate student, collaborated to develop the app. “We believe that there is a certain amount of health in saying that you don’t like something, that something is your enemy, because you can create conversations about that. You can bond with people over that.”

Massey said, for example, that users could bond over the common dislike of a company or a political party…

“We are misusing the word ‘enemy’ the same way that Facebook misuses the word ‘friend,’” Terry, the UT professor behind the project, told HuffPost. “It’s totally inaccurate. It’s not about individuals. It’s really about things in popular culture.”

Several thoughts about this:

1. I think Facebook is pretty smart to limit the negativity within the software itself. Of course, users can make negative statements on walls but even these can be deleted. Since Facebook is about connecting people, formal negativity could detract from this. Let’s say that you don’t like someone’s posts: Facebook’s easiest answer these days is to simply block them from showing up on your news feed. The genius is that the other person doesn’t know this, so the negative interaction between the two people is limited and life goes on.

2. EnemyGraph seems to be channeling the sort of sentiment sometimes expressed by users asking for a “dislike” button to balance the “like” button. Isn’t it more “balanced” to have both options?

3. Perhaps in EnemyGraph’s favor, social interactions, particularly group interactions, are often reliant on clearly labeling who is “in” and who is “out.” Without the easy ability to mark who is “out” in Facebook, marking symbolic and moral boundaries becomes more difficult. Developing deeper relationships through having common enemies could be more difficult. Of course, drawing strong subgroup boundaries can lead to other issues such as antagonism between groups.

4. I bet Facebook would argue (and perhaps could even prove) that their current system actually increases social interaction and introducing more negative capabilities would limit social interaction. Think about other areas of web interaction (comment sections or pages like Digg) and see how the ability to formally report negative feelings leads to a different kind of environment.

5. I’m amused about broadly defining “enemy” just as Facebook broadly defines “friend.”

Reflections on reasons some people hold out against smartphones

I found this overview of reasons why some people haven’t yet adopted smartphones quite interesting, having been one of “those people” up until a few months ago. Here are the five reasons given for why some people haven’t made the switch:

  • Fear of addiction. “I don’t want to end up falling victim to the smartphone, where I dive in and get lost for hours at a time,” dumbphone owner 24-year-old Jim Harig, 24 told The Times’ Teddy Wayne.
  • The benefits of disconnectivity. “I also fear my own susceptibility to an e-mail-checking addiction,” writes Wayne. “The pressure to always be in communication with people is overwhelming,” Erica Koltenuk tells the Journal’s Sue Shellenbarger.
  • Cost. “These die-hards say they are reducing waste and like sidestepping costly service contracts,” writes Shellenbarger.
  • Durability. “I want a phone that you could drop-kick into a lake and go get it and still be able to make a call,” says Patrick Crowley, who bought a new phone 5 years ago.
  • Anti-consumerism. “[David] Blumenthal sees no need to ‘keep running out and buying new things if you can patch them and they hold together,’” explains the Journal.

Until this past December, I would have argued for the first three reasons. Here are my experiences of these three reasons in the four months I have had a smartphone:

1. Fear of addiction. I didn’t want to be a person who pulls out their phone at every dull moment. I don’t think I do this today but the phone is undeniably handy in several situations. First, since I love learning and information, it is invaluable to be able to look things up. Second, in moments where I would have been waiting anyway, say at the barber shop or in line, I can quickly look things up and use my time well (what a rationalization…). Third, a smartphone is indispensable while traveling, whether one needs a map, restaurant reviews, airline info, or more. I would say that addiction is hard to combat though.

2. Disconnectivity. I like the occasional experience of being disconnected. In fact, I think it is necessary to disconnect occasionally from all electronic/digital media. Here is my personal measure of addiction: if I can still enjoy a longer period of time (a few hours to a few days) without feeling a consistent need to check my phone, I’m in good shape. The smartphone should be a tool, not my life. The phone can enhance my interaction with others but it can also be a hindrance and I want to be mindful of this. Additionally, I have refused to connect my phone to my work email and I don’t want any apps that would allow me to do work through my phone.

3. Cost. I’m still irritated about this issue but there are cheaper options than the contract carriers. My wife and I got phones from Virgin Mobile and while it is not perfect, it is cheaper than any of the contract options. Perhaps this is simply the price of living in the modern world and considering that these phones are like little computers, it is a worthwhile investment.

All in all, the smartphone world is a nice one even if I have lost the “pride” mentioned in this article of being someone who can still hold out against the powerful forces of technology and consumerism. But I can still be part of the camp that relishes not having an iPhone.

The rise of “data science” as illustrated by examining the McDonald’s menu

Christopher Mims takes a look at “data science” and one of its practitioners:

Before he was mining terabytes of tweets for insights that could be turned into interactive visualizations, [Edwin] Chen honed his skills studying linguistics and pure mathematics at MIT. That’s typically atypical for a data scientist, who have backgrounds in mathematically rigorous disciplines, whatever they are. (At Twitter, for example, all data scientists must have at least a Master’s in a related field.)

Here’s one of the wackier examples of the versatility of data science, from Chen’s own blog. In a post with the rousing title Infinite Mixture Models with Nonparametric Bayes and the Dirichlet Process, Chen delves into the problem of clustering. That is, how do you take a mass of data and sort it into groups of related items? It’s a tough problem — how many groups should there be? what are the criteria for sorting them? — and the details of how he tackles it are beyond those who don’t have a background in this kind of analysis.

For the rest of us, Chen provides a concrete and accessible example: McDonald’s

By dumping the entire menu of McDonald’s into his mathemagical sorting box, Chen discovers, for example, that not all McDonald’s sauces are created equal. Hot Mustard and Spicy Buffalo do not fall into the same cluster as Creamy Ranch, which has more in common with McDonald’s Iced Coffee with Sugar Free Vanilla Syrup than it does with Newman’s Own Low Fat Balsamic Vinaigrette.

This sounds like an updated version of factor analysis: break a whole into its larger and influential pieces.
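The quoted question of how many groups there should be is the part a Dirichlet process is meant to handle, since the number of clusters is not fixed in advance. One standard way to describe that idea is the Chinese restaurant process: each new item joins an existing group with probability proportional to the group's size, or starts a new group with probability proportional to a concentration parameter. The Python sketch below simulates only that grouping rule. It is not Chen's analysis, it uses no McDonald's data, and the function name and the `alpha` value are illustrative assumptions.

```python
import random

def chinese_restaurant_process(n_items, alpha, seed=0):
    """Assign n_items to clusters without fixing the number of clusters.

    Item i joins existing cluster k with probability size_k / (i + alpha)
    and opens a new cluster with probability alpha / (i + alpha).
    """
    rng = random.Random(seed)
    assignments = []   # cluster index for each item, in arrival order
    sizes = []         # sizes[k] = number of items currently in cluster k
    for _ in range(n_items):
        # Weights for each existing cluster, plus one slot for a new cluster.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)      # open a new cluster
        else:
            sizes[k] += 1        # join an existing cluster
        assignments.append(k)
    return assignments, sizes

assignments, sizes = chinese_restaurant_process(n_items=50, alpha=1.0)
print(len(sizes), "clusters with sizes", sizes)
```

A larger `alpha` tends to produce more, smaller groups. A full clustering model would combine this rule with a measure of how well each item fits each group, which is where the analysis in the quoted post goes well beyond this sketch.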

Here is how Chen describes the field:

I agree — but it depends on your definition of data science (which many people disagree on!). For me, data science is a mix of three things: quantitative analysis (for the rigor necessary to understand your data), programming (so that you can process your data and act on your insights), and storytelling (to help others understand what the data means). So useful skills for a data scientist to have could include:

* Statistics, machine learning (on the quantitative analysis side). For example, it’s impossible to extract meaning from your data if you don’t know how to distinguish your signals from noise. (I’ll stress, though, that I believe any kind of strong quantitative ability is fine — my own background was originally in pure math and linguistics, and many of the other folks here come from fields like physics and chemistry. You can always pick up the specific tools you’ll need.)

* General programming ability, plus knowledge of specific areas like MapReduce/Hadoop and databases. For example, a common pattern for me is that I’ll code a MapReduce job in Scala, do some simple command-line munging on the results, pass the data into Python or R for further analysis, pull from a database to grab some extra fields, and so on, often integrating what I find into some machine learning models in the end.

* Web programming, data visualization (on the storytelling side). For example, I find it extremely useful to be able to throw up a quick web app or dashboard that allows other people (myself included!) to interact with data — when communicating with both technical and non-technical folks, a good data visualization is often a lot more helpful and insightful than an abstract number.
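For readers unfamiliar with the MapReduce pattern mentioned in the list above, here is a rough single-process illustration in Python. It only shows the shape of the pattern (map, group by key, reduce). It is not Scala or Hadoop, the function names are hypothetical, and the two sample records are invented for illustration.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}

# Invented sample records, standing in for a much larger data set.
tweets = ["big data is big", "data exhaust"]
counts = reduce_phase(shuffle(map_phase(tweets)))
print(counts)
```

In a real cluster the map and reduce steps run in parallel across many machines, which is why the pattern suits the volume of data described in the quote.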

I would be interested in hearing whether data science is primarily after descriptive data (like Twitter mood maps) or explanatory data. The McDonald’s example is interesting but what kind of research question does it answer? Chen mentions some more explanatory research questions he is pursuing but it seems like there is a ways to go here. I would also be interested in hearing Chen’s thoughts on how representative the data he typically works with is. In other words, how confident are he and others that the results are generalizable beyond the population of technology users or whatever the specific sampling frame is? Can we ask and answer questions about all Americans or world residents from the data that is becoming available through new data sources?

h/t Instapundit

The legality of a prospective employer asking for your Facebook login information

I’ve seen several stories about this: more employers are asking prospective employees to provide their Facebook login information (or to log in in front of them) so that the employers can look over their profiles. While this is sure to anger some people, how legal is it?

Questions have been raised about the legality of the practice, which is also the focus of proposed legislation in Illinois and Maryland that would forbid public agencies from asking for access to social networks…

Companies that don’t ask for passwords have taken other steps — such as asking applicants to friend human resource managers or to log in to a company computer during an interview. Once employed, some workers have been required to sign non-disparagement agreements that ban them from talking negatively about an employer on social media…

Giving out Facebook login information violates the social network’s terms of service. But those terms have no real legal weight, and experts say the legality of asking for such information remains murky.

The Department of Justice regards it as a federal crime to enter a social networking site in violation of the terms of service, but during recent congressional testimony, the agency said such violations would not be prosecuted.

But Lori Andrews, law professor at IIT Chicago-Kent College of Law specializing in Internet privacy, is concerned about the pressure placed on applicants, even if they voluntarily provide access to social sites.

So when will we get our first court case that tackles this issue?

I assume these companies have weighed the negative consequences of following these practices. Perhaps the logic goes something like this: if people have nothing to hide online, then there should be no problem having employers see their information. But I can’t imagine this will lead to good publicity for many corporations. Privacy is a big concern to many people and corporations are often seen as the bad guys in the larger battle.

Additionally, don’t employers have other ways to find out information that don’t require asking for login information? Perhaps they wouldn’t be able to get at Facebook information but that is not the only way to find out about people. What about asking for more references instead, professional and perhaps personal, and calling those references and asking thorough questions?

I’m also struck by the idea that some employers seem to be very afraid of Facebook and social media. Yes, it can backfire on their corporation or organization. But employees are capable of doing all sorts of dumb things and this is not restricted to Facebook posts.

Obama campaign data mining information for fundraising, voters

Politico reports on how the Obama campaign is using data mining in its quest to win reelection:

Obama for America has already invested millions of dollars in sophisticated Internet messaging, marketing and fundraising efforts that rely on personal data sometimes offered up voluntarily — like posts on a Facebook page— but sometimes not.

And according to a campaign official and former Obama staffer, the campaign’s Chicago-based headquarters has built a centralized digital database of information about millions of potential Obama voters.

It all means Obama is finding it easier than ever to merge offline data, such as voter files and information purchased from data brokers, with online information to target people with messages that may appeal to their personal tastes. Privacy advocates say it’s just the sort of digital snooping that his new privacy project is supposed to discourage…

There’s an added twist for Obama: He’s making these moves at the same moment his administration is pushing the virtues of online privacy, last month proposing a consumer bill of rights to protect it.

This has been brewing for some time: back in July 2011, Ben Smith reported that the Obama campaign was advertising for “Predictive Modeling/Data Mining Scientists and Analysts.”

I really want to ask: what took so long? This is a gold mine for candidates.

I’ll be curious to see how far these hypocrisy charges go. If companies are going to make money off the Internet, don’t they have to have some of these abilities to put information together? Which group do people trust less to have their information: corporations or political parties?

Post political content on Facebook and risk losing friends

Results from a new study show that 18% of adults on Facebook say they have responded to political posts by friends by dropping those friends or blocking their posts:

Eighteen percent of the 2,253 adults surveyed by Pew said they had blocked, unfriended, or hidden a friend on a social network over a political post. It isn’t hard to see why: The Pew survey found that because people who post about politics tend to be very liberal or very conservative, the offending posts are more likely to be out of line with other people’s views. Indeed, only one in four users surveyed by Pew said they “usually” or “always” agree with their friends’ political posts; 73 percent said they only sometimes or never do.

Though most people—roughly two in three—take no action over political posts they disagree with, some 28 percent said they counter with a comment or competing post, another behavior the Pew survey said leads to friends going their own way.

Despite everyone’s apparent distaste for other people’s political views, the survey found most users continue to post their own: 75 percent of adults who use social sites said their friends post political content, and 37 percent said they post at least some of their own.

My interpretation (filtered through my own research): political comments (and some discussion?) are common on Facebook but they don’t appeal to everyone and some people can go over the line (either by posting more “extreme” political posts or by posting too many political comments).

I would be interested to hear a lot more about this: what is the threshold for appropriate political posts? Why are some users so uninterested in political posts that they go so far as to block/drop friends? Are there similar areas of discussion, perhaps religion, that evoke similarly strong reactions from other users?