Nielsen and Twitter combine to measure Twittering about television

With the rise of Twitter messages about television shows and events, Nielsen and Twitter just announced a new project to measure the connection:

“The Nielsen Twitter TV Rating is a significant step forward for the industry, particularly as programmers develop increasingly captivating live TV and new second-screen experiences, and advertisers create integrated ad campaigns that combine paid and earned media,” said Steve Hasker, President, Global Media Products and Advertiser Solutions at Nielsen. “As a media measurement leader we recognize that Twitter is the preeminent source of real-time television engagement data.”…

The Nielsen Twitter TV Rating will enhance the social TV analytics and metrics available today from SocialGuide by adding the first-ever measurement of the total audience for social TV activity – both those participating in the conversation and those who were exposed to the activity – providing the precise size of the audience and effect of social TV to TV programming.

SocialGuide, recently acquired by Nielsen and NM Incite, currently captures Twitter TV activity for all U.S. programming across 234 TV channels in English and Spanish, and more than 36,000 programs.  Through a sophisticated classification process, SocialGuide matches Tweets to TV programs to offer key social TV metrics including the number of unique Tweets associated with a given program and rankings for the most social TV programs.

This may be interesting in itself, but the key is likely translating it into information that TV networks can sell to advertisers:

Brad Adgate, an analyst at Horizon Media, said advertisers will view the Twitter ratings as a useful layer of information about a show’s popularity, but it is “not going to be close to the currency” of existing ratings metrics.

“It lets producers and creative directors know if the storyline is working, like a huge focus group,” Adgate said. “But I don’t think you can translate comments to ratings for a show. Right now I think the bark right now is bigger than its bite.”…

Mark Burnett, executive producer of NBC’s hit “The Voice,” argued that advertisers should value programs that can attract a high level of social media engagement from viewers. Deeply embedded social media elements, such as live Twitter polls, were critical in driving “The Voice” to the top of the Tuesday night ratings among viewers between 18 to 49, Burnett said.

“If you’re an advertiser, wouldn’t you want to know whether people are watching this show passively or if they’re actively engaged in the viewing experience?” Burnett said. “Five years from now this will make traditional television ratings seem archaic.”

In other words, if this metric works well, television networks will be able to charge advertisers more based on increased levels of Twitter engagement or find some way to provide more targeted advertising to Twitter users. What will Twitter-engaged TV watchers get out of it? I’m not sure. Will any of this measurement and action based on the data enhance the interactive element of TV watching? Theoretically, if TV networks could get more money for advertising based on social media engagement, they might have more money to put into developing quality programming. But there are few guarantees.

I’ll be very interested to see in coming years if Twitter and Facebook remain relatively ad-free or if the need to monetize these experiences takes precedence.

The rise of misattributed quotes on the Internet, social media

An editor at RealClearPolitics examines an erroneous online list of Mark Twain quotes and takes a broad view of quotes in the age of the Internet and social media:

The point of this example is that lists of quotes without specific and verifiable citations — where and when it appeared — are useless, and invariably rife with errors. Websites with names like “Brainyquote” and “Thinkexist.com” are essentially Internet compost piles.

In the pre-Internet days, “Bartlett’s Familiar Quotations” and “The Oxford Dictionary of Quotations” were the gold standards, although sometimes misattributed quotes found their way into those volumes. Much of this material is now online, but the best source of accurate quotes today is the “Yale Book of Quotations,” edited by the rigorous and charming Fred R. Shapiro.

Many of the most frequently misquoted historical figures have websites devoted to keeping the record straight for their heroes. These range from one established by a conscientious amateur Twain aficionada named Barbara Schmidt to WinstonChurchill.org, which is run by the Churchill Centre and Museum in London. The latter site even has a section called “Quotes Falsely Attributed.”

In his anthology, Shapiro goes the extra mile in tracking down the origin of erroneous quotes. Thus, he is no stranger to the misuse of quotations or even obvious forgeries. But even he was astonished at the casual speciousness of the Huffington Post inventory.

This has been a widespread issue in recent years – remember the fake MLK viral quote after the death of Osama bin Laden? While Wikipedia might have relatively good information that is regularly edited, quotations are simply floating around the Internet and social media.

I think this is tied to two other phenomena related to the Internet and social media:

1. The desire people have to find a quote that represents them. In an era of profiles and status updates, people are defined more and more by short, snappy bursts. There is simply not space to write more and who wants to read a long piece about your existence (except on blogs)? Finding the right sentence or two that sums up one’s existence or current state is a difficult task that can be aided with quotes attributed to famous figures. If you don’t want to use quotes, you can always use pictures – witness the rise of Instagram.

2. Many of these quotes are inspirational or witty. If you look at the inspirational quotes on Facebook profiles or Twitter feeds, many suggest people are continually facing and then overcoming challenges and obstacles. The overcoming-type quotes are empowering as individuals can quickly equate their challenges to some of the greatest in history. The witty quotes do something else; they suggest the user is facing life with verve and can find and wield profound words. Witty quotes can then become another status game as users try to one-up each other with piercing and whimsical takes on the world.

Perhaps this is how the average person gets to participate on a daily basis in a sound bite culture.

Highlights from the Nielsen Social Media Report 2012

Nielsen just released the Social Media Report 2012 (more data here). Here are a few things to note:

Facebook remains the most-visited social network in the U.S. via PC (152.2 million visitors), mobile apps (78.4 million users) and mobile web (74.3 million visitors), and is multiple times the size of the next largest social site across each platform.  The site is also the top U.S. web brand in terms of time spent, as some 17 percent of time spent online via personal computer is on Facebook.

-More than 70% of Pinterest’s users are female.

-The top three reasons by far for why social network users become connected/friends: know person in real life, interested in keeping up, mutual friends. This is more evidence that social networks are mainly about maintaining existing connections rather than creating new connections.

-Watching TV is increasingly linked to tablet, smartphone, and Twitter usage. Multitasking is alive and well and perhaps TV can be interactive after all.

There is also some fascinating data at the end about social media usage around the world.

Summarizing sociological theories in 140 characters or less

A sociology instructor is having his students tweet criminal-justice theories:

“They have all these theories to learn,” Atherton said. “Some of them are very dense, and complex. What I try to get them to do, and I tie some extra credit to it, is see if they can boil the theory down, the essence of it, to 140 characters.”…

In a recent class session, Atherton shared tweets from a lesson on a theory of social disorganization, displaying the tweets under Twitter’s signature bluebird.

“Social disorganization refers to communities as a whole not coming together for common goals, ultimately causing a disruption,” the first tweet stated.

Another tweet on the topic read: “theory suggests criminal activity comes from the neighborhood where someone lives and how it shapes them living there.”

If the American Sociological Association is working on a Wikipedia initiative, why not also start a Twitter push? Since it looks like Karl Marx’s Das Kapital is being tweeted (over 41,000 tweets and counting), there is work to be done.

While I think this could be an interesting pedagogical exercise as it allows students to use a current medium as well as put complex theories into their own terms, I wonder if this doesn’t perfectly illustrate the issues with Twitter. Sociological theories are often messy and complex, taking some time to explain and think through. For a very basic understanding, 140 characters could work but if this is all students know about sociological theories, is this worthwhile in the long run?

Buying followers on Twitter

The New York Times examines the market for buying followers on Twitter:

The practice is surprisingly easy. A Google search for “buy Twitter followers” turns up dozens of Web sites like USocial.net, InterTwitter.com, and FanMeNow.com that sell Twitter followers by the thousands (and often Facebook likes and YouTube views). At BuyTwitterFollow.com, for example, users simply enter their Twitter handle and credit card number and, with a few clicks, see the ranks of their followers swell in three to four days…

“And it’s so cheap, too,” he said. In one instance, Mr. Mitchell said, he bought 250,000 for $2,500, or a penny each…

Twitter followers are sold in two ways: “Targeted” followers, as they are known in the industry, are harvested using software that seeks out Twitter users with similar interests and follows them, betting that many will return the favor. “Generated” followers are from Twitter accounts that are either inactive or created by spamming computers — often referred to as “bots.”

When numbers are taken as a measure of success or popularity, why should we be surprised by this? It is also interesting that people figured out how to discover the fake followers. Here is what one tool revealed:

If accurate, the number of fake followers out there is surprising. According to the StatusPeople tool, 71 percent of Lady Gaga’s nearly 29 million followers are “fake” or “inactive.” So are 70 percent of President Obama’s nearly 19 million followers.

So if paying for followers is supposed to boost status, could the discovery that a user has a lot of fake followers reduce that user’s status? Lady Gaga is frequently cited as having the most Twitter followers; how would her brand be reduced if that weren’t really true?
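To get a feel for the scale of the figures quoted above, here is a minimal sketch of the arithmetic. It assumes the rounded totals from the excerpt ("nearly" 29 million and 19 million followers), so the results are rough estimates rather than exact counts, and the function names are my own.

```python
# Figures quoted above; the follower totals are "nearly" 29M and 19M,
# so the results are rough upper estimates, not exact counts.
ACCOUNTS = {
    "Lady Gaga": (29_000_000, 71),        # (followers, percent fake or inactive)
    "President Obama": (19_000_000, 70),
}


def flagged_followers(total, percent_flagged):
    """Return how many followers the quoted percentage would correspond to."""
    return total * percent_flagged // 100


def cents_per_follower(price_dollars, followers):
    """Return the price per purchased follower in cents."""
    return price_dollars * 100 / followers


estimates = {name: flagged_followers(*stats) for name, stats in ACCOUNTS.items()}

if __name__ == "__main__":
    for name, count in estimates.items():
        print(f"{name}: about {count:,} followers flagged as fake or inactive")
    print(f"Bulk price: {cents_per_follower(2_500, 250_000):.0f} cent per follower")
```

On these rounded inputs, the quoted percentages would correspond to roughly 20.6 million and 13.3 million flagged accounts, and the bulk purchase in the article works out to one cent per follower.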

I am struck by the contrast with Facebook. While the term “friends” has been roundly panned, it does denote a stronger relationship than “follower.” Facebook users tend to look down on other users who accumulate too many friends. After all, Dunbar’s number suggests we can only maintain about 150 stable relationships in the offline world. Perhaps Facebook got this more right than Twitter…

More on Twitter co-founder and his teardown vs. neighbors in San Francisco

I recently wrote about Twitter co-founder Evan Williams’ fight with his San Francisco neighbors over his proposed teardown McMansion. Here is more information about the story:

“We don’t want nouveau riches McMansions sprouting up all over our ridges,” one resident wrote to San Francisco’s Planning Department.

And here, at least, is one local example of the side-effect of a tech boom that the city has fought hard to fuel. San Francisco worked hard in particular to convince Twitter to keep its headquarters in town in hopes that it would amp up the tech scene north of Silicon Valley. Williams, who is 40, was Twitter’s CEO before stepping down in 2010 to support more tech startups…

The strife started after Williams and Lundberg Design, the design firm hired by Williams, contacted neighbors about the couple’s plans. A couple of longtime residents quickly began circulating a handwritten flyer around the neighborhood, decrying the “APPALLING” plan to demolish a “widely coveted, unique and historic (to most) house.”

“TEAR DOWN is NEEDLESS, WASTEFUL, POLLUTION, DISRESPECTFUL,” the flyer said in all caps. It asked people to send in one letter per person if possible because “volume counts.”…

Williams isn’t alone in his neighborhood woes. Other high tech moguls have run into opposition from neighbors, including late Apple CEO Steve Jobs, who was trying to demolish a Woodside property and rebuild as well, and Oracle CEO Larry Ellison, who sued his Pacific Heights neighbors last year for their overgrown trees. Ellison’s Pacific Heights residence was, coincidentally, designed by Lundberg Design.

Sounds quite contentious. The columnist suggests San Francisco might have to change a little if it wants to keep important firms; what if the Twitter co-founder threatened to move away, taking away tax revenue and jobs? Communities compete against each other by offering tax breaks or other incentives so couldn’t corporations and their leaders make stipulations about housing issues?

Using Twitter to predict when you will get sick with 90% accuracy

A new study uses tweets in New York City to predict when a user will get sick – and does so with 90% accuracy.

Using 4.4 million tweets with GPS location from over 630,000 users in New York City, Sadilek and his team were able to predict when an individual would get sick with the flu and tweet about it up to eight days in advance of their first symptoms. Researchers found they could predict said results with 90 percent accuracy.

Similar to Google’s Flu trends, which uses “flu” search trends to pinpoint where and how outbreaks are spreading, Sadilek’s system uses an algorithm to differentiate between alternative definitions of the word ‘sick.’ For example, “My stomach is in revolt. Knew I shouldn’t have licked that door knob. Think I’m sick,” is different from “I’m so sick of ESPN’s constant coverage of Tim Tebow.”

Of course, Sadilek’s system isn’t an exhaustive crystal ball. Not everyone tweets about their symptoms and not everyone is on Twitter. But considering New York City has more Twitter users than any other city in the world, the Big Apple is as good as a place as any for this study.

While one could look at this and marvel at the power of Twitter, I think the real story here is about two things: (1) the power of big data and (2) the power of social networks that Twitter harnesses. If you have people volunteering information about their lives, access to the data, and information about who users are connected to, you can do things that would have been very difficult even ten years ago.

It is interesting that this study was conducted in New York City where there is a high percentage of Twitter users. How good are predictions in cities with lower usage rates? Are we headed toward a world where public health requires people to report on their health so that outbreaks can be contained or quelled?
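The distinction between the two senses of "sick" quoted above can be illustrated with a toy rule-based filter. This is only a sketch under my own assumptions: the excerpt does not describe the study's actual algorithm, and the cue list and the `looks_like_illness` function are hypothetical, chosen just to separate the two quoted examples.

```python
import re

# Hypothetical cue lists, chosen only to make the two quoted examples work.
FIGURATIVE = re.compile(r"\bsick (and tired )?of\b", re.IGNORECASE)
SYMPTOM_CUES = ("stomach", "fever", "flu", "cough", "headache", "throat")


def looks_like_illness(tweet):
    """Guess whether 'sick' in a tweet refers to actual illness."""
    text = tweet.lower()
    if "sick" not in text:
        return False
    if FIGURATIVE.search(text):
        return False  # "sick of ..." is a complaint, not a symptom
    # Crude substring match: a real system would need far better features.
    return any(cue in text for cue in SYMPTOM_CUES)


ILL = "My stomach is in revolt. Knew I shouldn't have licked that door knob. Think I'm sick"
ANNOYED = "I'm so sick of ESPN's constant coverage of Tim Tebow."

if __name__ == "__main__":
    print(looks_like_illness(ILL), looks_like_illness(ANNOYED))
```

Hand-written rules like these break quickly on real tweets, which is presumably why a study at this scale would rely on a trained classifier instead.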

Naperville government leads Illinois’ top 20 cities in social media use

Naperville is used to accolades – see this well-trumpeted #2 ranking in Money’s Best Places to Live in 2006. Here is a new measure of excellence: the Naperville government ranks #1 among Illinois’ 20 largest cities in its use of social media.

A University of Illinois at Chicago study ranks the western suburb No. 1 among local government websites in a study of social media use by Illinois’ 20 largest cities.

Researchers from the university’s College of Urban Planning and Public Affairs analyzed the websites using at least 90 criteria to determine how well each provided residents with information and the opportunity to interact with officials. Chicago and Elgin round out the top three…

In addition to its main website, Naperville uses Facebook, Twitter, YouTube, RSS feeds and about two dozen e-newsletters to communicate with residents. It also is looking into starting a mass notification system that Community Relations Manager Nadja Lalvani likened to a “reverse 311.”

“It’s very important for us to be able to communicate effectively and efficiently with residents and other constituents,” Lalvani said. “Social media is very prevalent and another tool to make sure the message is penetrating our audience.”

The UIC study also found increasing use of social media by cities around the country. In 2011, 87 percent of the 75 largest U.S. cities used Twitter, compared with 25 percent in 2009. Likewise, 87 percent used Facebook, compared with 13 percent two years prior.

It doesn’t surprise me that Naperville would lead the way: they seem to have the resources to make this happen as well as the interest in being efficient, taking advantage of new technology (see the ongoing debate over wireless electricity meters – the city’s view and an opposition group), and communicating with people.

I wonder if the study included talking to residents to see if these efforts are reaching them. This is an ongoing issue for many communities: the city/village/town claims that it is putting out information while residents suggest they are blindsided at the last minute or aren’t informed at all. I think both sides are often right: many communities have newsletters and websites where information can be found. However, searching out and reading this information does require some effort on the part of residents. Add in the issue that many communities are without local newspapers and it is more difficult to transmit this information broadly. If this plan of attack in Naperville is successful, I imagine more communities will follow its lead.

A second issue could still limit the effectiveness of the social media outreach. I was reminded of this by a talk I heard last week: governments may make information publicly available but they don’t necessarily make the information easily understandable. For example, a community may release some data or an important report but the language and data require interpretation that the average citizen may not be equipped to do. There is a translation issue here from technical or government-speak to what people can understand and then react to. Or a large dataset may be public but it requires knowledge of statistics and specialized software to make some sense of it. Granted, it can be hard to boil down complex issues into newsletter items but it also shouldn’t be the case that newsletters and tweets only cover basic stuff like brush pick-up and meeting times.

 

The rise of “data science” as illustrated by examining the McDonald’s menu

Christopher Mims takes a look at “data science” and one of its practitioners:

Before he was mining terabytes of tweets for insights that could be turned into interactive visualizations, [Edwin] Chen honed his skills studying linguistics and pure mathematics at MIT. That’s typically atypical for a data scientist, who have backgrounds in mathematically rigorous disciplines, whatever they are. (At Twitter, for example, all data scientists must have at least a Master’s in a related field.)

Here’s one of the wackier examples of the versatility of data science, from Chen’s own blog. In a post with the rousing title Infinite Mixture Models with Nonparametric Bayes and the Dirichlet Process, Chen delves into the problem of clustering. That is, how do you take a mass of data and sort it into groups of related items? It’s a tough problem — how many groups should there be? what are the criteria for sorting them? — and the details of how he tackles it are beyond those who don’t have a background in this kind of analysis.

For the rest of us, Chen provides a concrete and accessible example: McDonald’s

By dumping the entire menu of McDonald’s into his mathemagical sorting box, Chen discovers, for example, that not all McDonald’s sauces are created equal. Hot Mustard and Spicy Buffalo do not fall into the same cluster as Creamy Ranch, which has more in common with McDonald’s Iced Coffee with Sugar Free Vanilla Syrup than it does with Newman’s Own Low Fat Balsamic Vinaigrette.

This sounds like an updated version of factor analysis: break a whole into its larger, more influential pieces.
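The question in the excerpt, "how many groups should there be?", is the part that the Dirichlet process in Chen's post title is meant to address. A minimal sketch of the idea is the Chinese restaurant process, the prior over groupings that corresponds to a Dirichlet process. To be clear about assumptions: this is not Chen's code, it uses no menu data at all, and it only simulates the prior rather than fitting a mixture model; the function name and the `alpha` value are my own choices.

```python
import random


def chinese_restaurant_process(n_items, alpha, seed=0):
    """Assign n_items to clusters without fixing the cluster count in advance.

    Each item joins an existing cluster with probability proportional to that
    cluster's size, or opens a new one with probability proportional to alpha.
    """
    rng = random.Random(seed)
    sizes = []        # sizes[k] = number of items currently in cluster k
    assignments = []  # assignments[i] = cluster index of item i
    for _ in range(n_items):
        weights = sizes + [alpha]  # last slot stands for "open a new cluster"
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(sizes):
            sizes.append(1)
        else:
            sizes[choice] += 1
        assignments.append(choice)
    return assignments, sizes


if __name__ == "__main__":
    _, cluster_sizes = chinese_restaurant_process(n_items=100, alpha=1.0)
    print(f"{len(cluster_sizes)} clusters with sizes {cluster_sizes}")
```

The number of clusters emerges from the process instead of being chosen up front, and a larger `alpha` tends to produce more clusters. A full model of the kind Chen describes would combine a prior like this with the actual item data.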

Here is how Chen describes the field:

I agree — but it depends on your definition of data science (which many people disagree on!). For me, data science is a mix of three things: quantitative analysis (for the rigor necessary to understand your data), programming (so that you can process your data and act on your insights), and storytelling (to help others understand what the data means). So useful skills for a data scientist to have could include:

* Statistics, machine learning (on the quantitative analysis side). For example, it’s impossible to extract meaning from your data if you don’t know how to distinguish your signals from noise. (I’ll stress, though, that I believe any kind of strong quantitative ability is fine — my own background was originally in pure math and linguistics, and many of the other folks here come from fields like physics and chemistry. You can always pick up the specific tools you’ll need.)

* General programming ability, plus knowledge of specific areas like MapReduce/Hadoop and databases. For example, a common pattern for me is that I’ll code a MapReduce job in Scala, do some simple command-line munging on the results, pass the data into Python or R for further analysis, pull from a database to grab some extra fields, and so on, often integrating what I find into some machine learning models in the end.

* Web programming, data visualization (on the storytelling side). For example, I find it extremely useful to be able to throw up a quick web app or dashboard that allows other people (myself included!) to interact with data — when communicating with both technical and non-technical folks, a good data visualization is often a lot more helpful and insightful than an abstract number.
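For readers unfamiliar with the MapReduce pattern Chen mentions, here is a single-process sketch of its shape: a map step emits key/value pairs, the pairs are grouped by key, and a reduce step collapses each group. This is an illustration under my own assumptions, not Chen's workflow: he describes writing jobs in Scala on Hadoop, while this is plain Python, and the sample tweets and function names are invented.

```python
from collections import defaultdict


def map_phase(tweet):
    """Emit a (hashtag, 1) pair for every hashtag in one tweet."""
    return [(word.lower(), 1) for word in tweet.split() if word.startswith("#")]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def count_hashtags(tweets):
    """Run map, shuffle, and reduce in sequence over a list of tweets."""
    pairs = [pair for tweet in tweets for pair in map_phase(tweet)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Invented sample tweets, for illustration only.
SAMPLE = ["Watching #TheVoice tonight", "#thevoice finale! #TV", "no tags here"]

if __name__ == "__main__":
    print(count_hashtags(SAMPLE))
```

The appeal of the pattern is that the map and reduce steps are independent per tweet and per key, so a framework can spread them across many machines; the sketch above keeps everything in one process for readability.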

I would be interested in hearing whether data science is primarily after descriptive data (like Twitter mood maps) or explanatory data. The McDonald’s example is interesting but what kind of research question does it answer? Chen mentions some more explanatory research questions he is pursuing but it seems like there is still a way to go here. I would also be interested in hearing Chen’s thoughts on how representative the data he typically works with is. In other words, how confident are he and others that the results are generalizable beyond the population of technology users or whatever the specific sampling frame is? Can we ask and answer questions about all Americans or world residents from the data that is becoming available through new data sources?

h/t Instapundit

What happens when Tim Pawlenty comes to your sociology class

Courtesy of modern technology, you could have been following a live Twitter stream chronicling what happens when former Minnesota governor and former Republican presidential candidate Tim Pawlenty visits a sociology class at the University of Kansas:

“23 minutes later and I have no idea what he’s talking about,” tweeted Ray. “Freedom, drugs, a kickass pool, meatpacking, MLK.”

It sounded interesting, so I called Ray for an after-action report. The room, he said, was somewhat full and somewhat interested.

“A few hundred students are enrolled in class,” he said, “but maybe a hundred show up. I figure that a lot of the people in the class are freshmen who are just taking it to take it. They probably know Romney, they know Santorum, but Pawlenty dropped out so early that they might not know him.”

But what did the great man say? “Somebody asked him what he thought about Santorum’s victories yesterday,” remembered Gray. “He congratulated him, but he brought up the fact that John McCain lost 19 states and still won the nomination.” Gray paused. “It sounded like a backhanded compliment. And he referred to Minnesota as one of the smaller states, in terms of political power.”

A few quick thoughts:

1. Should we trust a single student’s report in a large 100-level lecture class where, by his own estimate, most of the enrolled students don’t attend? I always find it interesting to hear what students remember or find noteworthy.

2. Politicians are now tracked at almost every turn.

3. What exactly does Tim Pawlenty know about sociology? The class is titled “American Identity”…was Pawlenty talking about what he thinks this identity is? I would be really curious to hear (1) what Pawlenty thinks sociology is and (2) whether he thinks sociology has any value.

4. It sounds like Pawlenty was on campus to talk about how the still-to-be-determined candidate for President will run a campaign and govern.