Getting the data to model society like we model the natural world

A recent session at the American Association for the Advancement of Science included a discussion of how to model the social world:

Dirk Helbing was speaking at a session entitled “Predictability: from physical to data sciences”. This was an opportunity for participating scientists to share ways in which they have applied statistical methodologies they usually use in the physical sciences to issues which are more ‘societal’ in nature. Examples stretched from use of Twitter data to accurately predict where a person is at any moment of each day, to use of social network data in identifying the tipping point at which opinions held by a minority of committed individuals influence the majority view (essentially looking at how new social movements develop) through to reducing travel time across an entire road system by analysing mobile phone and GIS (Geographical Information Systems) data…

With their eye on the big picture, Dr Helbing and multidisciplinary colleagues are collaborating on FuturICT, a 10-year, 1 billion EUR programme which, starting in 2013, is set to explore social and economic life on earth to create a huge computer simulation intended to simulate the interactions of all aspects of social and physical processes on the planet. This open resource will be available to us all and particularly targeted at policy and decision makers. The simulation will make clear the conditions and mechanisms underpinning systemic instabilities in areas as diverse as finance, security, health, the environment and crime. It is hoped that knowing why and being able to see how global crises and social breakdown happen, will mean that we will be able to prevent or mitigate them.

Modelling so many complex matters will take time but in the future, we should be able to use tools to predict collective social phenomena as confidently as we predict physical pheno[men]a such as the weather now.

This will require a tremendous amount of data. It may also require asking for a lot more data from individual members of society in a way that has not happened yet. To this point, individuals have been willing to volunteer information in places like Facebook and Twitter, but we will need much more consistent information than that to truly develop models like those suggested here. Additionally, once that minute-to-minute information is collected, it needs to be put in a central dataset or location to see all the possible connections. Who is going to keep and police this information? People might be convinced to participate if they could see the payoff. A social model will be able to do what exactly – limit or stop crime or wars? Help reduce discrimination? Thus, getting the data from people might be as much of a problem as knowing what to do with it once it is obtained.

Oregon testing out five different ways to pay vehicle-miles traveled tax

The state of Oregon is currently running a small test program with five different ways of paying a vehicle-miles driven tax:

The new usage charge pilot program, which began in November and runs through the end of this month, involves about 40 volunteers from state government. Participants chose the tracking plan that best fit their privacy tastes and will pay 1.56 cents for each mile driven — receiving a credit for any gas tax paid during the test period. The idea is to make sure each tracking option works in practice…

The five tracking plans vary in terms of oversight. Two are managed by the Oregon D.O.T., three by a third-party vendor. They also vary in terms of payment: some require setting up an online account tied to credit or debit information, others go the old fashion route of monthly bills payable by check.

The key difference is the tracking system. Two advanced plans track mileage data as well as movement with a G.P.S.; the advantage here is that users aren’t charged a fee for driving on private or out-of-state roads — only public roads in Oregon. Two basic plans involve an odometer-type device that collects mileage data but has no G.P.S. to track movement. Users may end up paying a little more, but they’re getting privacy in return.

The most primitive plan, for people who want the most privacy, uses no tracking device at all. Users pre-pay a flat fee that assumes a monthly mileage. At some point, say when the car gets official inspections, the odometer is checked and the difference between miles paid and miles driven is reconciled…

Despite these cautions, Oregon is preparing to take its system public soon. The state legislature has prepared a bill that would implement a V.M.T. fee on all vehicles getting 55 miles per gallon or better. (The change only applies to car models beginning in 2015, however, and as currently written the law wouldn’t go into effect until that year.) Olson says the bill will be introduced sometime in 2013.

It sounds like this small test is more about finding out which of the five options are doable and/or appealing, mainly on the dimension of privacy, than about asking whether a vehicle-miles tax should be implemented at all. As the article notes, a bill will come up this year to start the ball rolling. If this is the case, why not run a test bigger than 40 state employees?

Another thought: the system is set up so that drivers only pay for driving on Oregon’s public roads. Wouldn’t a comprehensive system of driving tax collection have to account for driving in other states?
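To make the arithmetic in the article concrete, here is a minimal Python sketch of how the pilot's charge might work out for one driver. The 1.56 cents per mile rate comes from the article; the mileage, fuel economy, and 30-cent gas tax figures, along with the function names, are my own illustrative assumptions and not details of Oregon's actual system.

```python
RATE_PER_MILE = 0.0156  # dollars per mile, the pilot rate quoted in the article


def net_charge(miles, mpg, gas_tax_per_gallon):
    """Mileage charge minus a credit for gas tax already paid at the pump."""
    gross = miles * RATE_PER_MILE
    credit = (miles / mpg) * gas_tax_per_gallon
    return gross - credit


def reconcile_prepaid(assumed_miles, odometer_miles):
    """Flat-fee plan: amount still owed (positive) or refunded (negative)."""
    return (odometer_miles - assumed_miles) * RATE_PER_MILE


def break_even_mpg(gas_tax_per_gallon):
    """Fuel economy at which the gas tax and the mileage charge are equal."""
    return gas_tax_per_gallon / RATE_PER_MILE


# Hypothetical driver: 1,000 miles, 25 mpg, assumed 30-cent-per-gallon gas tax.
print(round(net_charge(1000, 25, 0.30), 2))
# Hypothetical flat-fee driver who pre-paid for 800 miles but drove 1,000.
print(round(reconcile_prepaid(800, 1000), 2))
print(round(break_even_mpg(0.30), 1))
```

Under that assumed 30-cent gas tax, the two taxes break even at roughly 19 miles per gallon, so any car more efficient than that would owe more under the mileage charge than it pays at the pump today.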

Claim: 90% of information ever created by humans was created in the last two years

An article on big data makes a claim about how much information humans have created in the last two years:

In the last two years, humans have created 90% of all information ever created by our species. If our data output used to be a sprinkler, it is now a firehose that’s only getting stronger, and it is revealing information about our relationships, health, and undiscovered trends in society that are just beginning to be understood.

This is quite a bit of data. But a few points in response:

1. I assume this refers only to recorded data. While there are more people on earth than before, humans are expressive creatures and have been for a long time.

2. This article could be interpreted by some to mean that we need to pay more attention to online privacy but I would guess much of this information is volunteered. Think of Facebook: users voluntarily submit information their friends and Facebook can access. Or blogs: people voluntarily put together content.

3. This claim also suggests we need better ways to sort through and make sense of all this data. How can the average Internet user put all this data together in a meaningful way? We are simply awash in information and I wonder how many people, particularly younger people, know how to make sense of all that is out there.

4. Of course, having all of this information out there doesn’t necessarily mean it is meaningful or worthwhile.

How powerful is the distrust of Facebook among its 900 million plus users?

A commentator who praises Facebook tries to get at why so many users are suspicious about Facebook and willing to believe rumors like the recent one that Facebook was revealing private messages on walls:

The problem is that when technologists talk about data and privacy, for many of us it is still in the abstract. For technologists and computer scientists, data is a thing that lives somewhere, it has a logic and can be parsed, made sense of, organized into databases. It can be searched and ultimately sold. But as Nathan Jurgenson, a social-media theorist, points out, for most people “data is this weird nebulous concept that somebody knows something about me, but I don’t know what they know.”…

A Democratic candidate for the Maine State Senate was attacked recently by her Republican opponent for her playing of the multiplayer online game “World of Warcraft.” According to her critics, the politician playing a “rogue orc assassin” was unbecoming. This collision of two seemingly different personalities — on the one hand, a social worker and moderate politician, and on the other, a violent assassin (online) who likes stabbing things — is what sociologists have called “role strain.”

“Identities that were cultivated in little tide pools, that were conceived to be separate, come clashing together,” says Marc A. Smith, a sociologist and social-media expert. “The issue now is that all of these other identities, the idea that we can perform them on separate stages and that they had separate audiences, that is collapsing and the sound of its collapse is the sound of people squealing.”

In his 1959 “Presentation of Self In Everyday Life,” the sociologist Erving Goffman wrote about the idea of “front stage” and “back stage.” In Goffman’s theory, when they’re “front stage,” people engage in “impression management,” choosing their clothing, speech, and adapting the way they present themselves to their audience. “Back stage” they can be more themselves, which might mean shedding their societal role. In the era of social media, Smith says that “we live in a culture where the back stage keeps disappearing.” We think the conversations we are having are in private, but, in fact, they are publicly accessible and data has a long half-life. When U.S. presidential candidate Mitt Romney spoke to a select audience about the “47 percent,” he was, in fact, speaking to everyone. What happens in “World of Warcraft” doesn’t always stay in “World of Warcraft.”…

Or perhaps front stage there is a deep sense of unease about Facebook, but back stage we are not half as worried as we seem.

The suggestion here is that the world of audience segregation and impression management, where we can and do craft our actions, words, and behaviors for a particular audience, is slowly fading away. As we do more things online, these different parts of life are coming together in new ways. And I tend to agree with this journalist: there are over 900 million Facebook users, many of whom have calculated that they are willing to at least put a little information out there in return for the benefits that Facebook offers, like keeping in touch with friends, being able to access information about others that was previously unavailable, or even acquiring the status that comes with keeping up with everyone else. A good number of users express complaints about Facebook or point to features that make them uneasy, but relatively few are willing to give it up altogether.

Indeed, we might be in the middle of a very important era where individuals are slowly thinking about and practicing new ways to present themselves and see others through mediums like Facebook. Mark Zuckerberg has expressed the goal of Facebook being a more open society where even less information on Facebook would be private, hidden, or restricted to friends. We could also look at this from the other angle: isn’t it remarkable that millions of people around the world have, in a span of less than 10 years, voluntarily put out information about themselves? One key might be that Facebook doesn’t force them to reveal everything; users can still practice impression management by crafting a profile. However, these are not “fake” or “untrue” profiles; rather the information is an approximation of the user’s true self.

Employers to applicants: not being a member of Facebook means you are suspicious

Beware job applicants: not having a Facebook account could cast suspicion on you.

On a more tangible level, Forbes.com reports that human resources departments across the country are becoming more wary of young job candidates who don’t use the site.

The common concern among bosses is that a lack of Facebook could mean the applicant’s account could be so full of red flags that it had to be deleted…

It points out that Holmes, who is accused of killing 12 people and an unborn child and wounding 58 others at a movie theater in Aurora, Colorado, and Breivik, who murdered  77 people with a car bomb and mass shooting, did not use Facebook and had small online footprints…

And this is what the argument boils down to: It’s the suspicion that not being on Facebook, which has become so normal among young adults, is a sign that you’re abnormal and dysfunctional, or even dangerous, ways.

Facebook is the new normal, but the idea that people not on Facebook are necessarily suspicious is a gross overgeneralization, particularly when tied to just two tragedies. I can imagine a variety of good reasons for being a nonuser that don’t indicate one is a psychopath.

The attention employers pay to Facebook is certainly interesting. I blogged a while back about some employers wanting applicants’ passwords so they could look over their profiles. How does looking at a profile stack up against other ways of getting information such as reading a resume, doing a background check, and checking references?

Quick Review: The Immortal Life of Henrietta Lacks

After a few people mentioned a particular New York Times bestseller to me recently, I decided to read The Immortal Life of Henrietta Lacks. While the story itself was interesting, there is a lot of material here that could be used in research methods and ethics classes. A few thoughts about the book:

1. The story is split into two narratives. One is about both the progress science has made with Lacks’s cells and the struggle of her family to understand what has actually been done with them. The story of scientific progress is unmistakable: we have come a long way in identifying and curing some diseases in the last sixty years. (This narrative reminded me of the book The Emperor of All Maladies.)

2. The second narrative is about the personal side of scientific research and how patients and relatives interpret what is going on. The author initially finds that the Lacks know very little about how their sister or mother’s cells have been used. These problems are compounded by race, class, and educational differences between the Lacks and the doctors utilizing Henrietta’s cells. In my opinion, this aspect is understated in this book. At the least, this is a reminder about how inequality can affect health care. But I think this personal narrative is the best part of the book. When I talk in class about the reasons for Institutional Review Boards, informed consent, and ethics, students often wonder how much social science research can really harm people. As this book discusses, there are some moments in relatively recent history that we would agree were atrocious: Nazi experiments, the Tuskegee experiments, experiments in Guatemala, and so on. Going beyond those egregious cases, this book illustrates the kind of mental and social harm that can result from research even if using Henrietta’s cells never physically harmed the Lacks. I’m thinking about using some sections of this narrative in class to illustrate what could happen; even if new research appears to be safe, we have to make sure we are protecting our research subjects.

3. This book reminded me of the occasional paternalistic side of the medical field. It seems to suggest this isn’t just an artifact of the 1950s or a racial division; doctors appear slow in addressing concerns some people might have about the use of human tissue in research. I realize that there is a lot at stake here: the afterword of the book makes clear how difficult it would be to regulate all of this and how regulation might severely limit needed medical research. At the same time, doctors and other medical professionals could go further in explaining the processes and the possible outcomes to patients. Perhaps this is why the MCAT is moving toward involving more sociology and psychology.

4. There is room here to contrast the discussions about using body tissue for research and online privacy. In both cases, a person is giving up something personal. Are people more disturbed by their tissue being used or their personal information being used and sold online?

All in all, this book discusses scientific breakthroughs, how patients can be hurt by the system, and a number of ethical issues that have yet to be resolved.

The legality of a prospective employer asking for your Facebook login information

I’ve seen several stories about this: more employers are asking prospective employees to provide their Facebook login information (or to log in in front of them) so that the employer can look over the applicant’s profile. While this is sure to anger some people, how legal is it?

Questions have been raised about the legality of the practice, which is also the focus of proposed legislation in Illinois and Maryland that would forbid public agencies from asking for access to social networks…

Companies that don’t ask for passwords have taken other steps — such as asking applicants to friend human resource managers or to log in to a company computer during an interview. Once employed, some workers have been required to sign non-disparagement agreements that ban them from talking negatively about an employer on social media…

Giving out Facebook login information violates the social network’s terms of service. But those terms have no real legal weight, and experts say the legality of asking for such information remains murky.

The Department of Justice regards it as a federal crime to enter a social networking site in violation of the terms of service, but during recent congressional testimony, the agency said such violations would not be prosecuted.

But Lori Andrews, law professor at IIT Chicago-Kent College of Law specializing in Internet privacy, is concerned about the pressure placed on applicants, even if they voluntarily provide access to social sites.

So when will we get our first court case that tackles this issue?

I assume these companies have weighed the negative consequences of following these practices. Perhaps the logic goes something like this: if people have nothing to hide online, then there should be no problem having employers see their information. But I can’t imagine this will lead to good publicity for many corporations. Privacy is a big concern to many people and corporations are often seen as the bad guys in the larger battle.

Additionally, don’t employers have other ways to find out information that doesn’t require asking for login information? Perhaps they wouldn’t be able to get at Facebook information but that is not the only way to find out about people. What about asking for more references instead, professional and perhaps personal, and calling those references and asking thorough questions?

I’m also struck by the idea that some employers seem to be very afraid of Facebook and social media. Yes, it can backfire on their corporation or organization. But employees are capable of doing all sorts of dumb things and this is not restricted to Facebook posts.

Obama campaign data mining information for fundraising, voters

Politico reports on how the Obama campaign is using data mining in its quest to win reelection:

Obama for America has already invested millions of dollars in sophisticated Internet messaging, marketing and fundraising efforts that rely on personal data sometimes offered up voluntarily — like posts on a Facebook page— but sometimes not.

And according to a campaign official and former Obama staffer, the campaign’s Chicago-based headquarters has built a centralized digital database of information about millions of potential Obama voters.

It all means Obama is finding it easier than ever to merge offline data, such as voter files and information purchased from data brokers, with online information to target people with messages that may appeal to their personal tastes. Privacy advocates say it’s just the sort of digital snooping that his new privacy project is supposed to discourage…

There’s an added twist for Obama: He’s making these moves at the same moment his administration is pushing the virtues of online privacy, last month proposing a consumer bill of rights to protect it.

This has been brewing for some time: back in July 2011, Ben Smith reported that the Obama campaign was advertising for “Predictive Modeling/Data Mining Scientists and Analysts.”

I really want to ask: what took so long? This is a gold mine for candidates.

I’ll be curious to see how far these hypocrisy charges go. If companies are going to make money off the Internet, don’t they have to have some of these abilities to put information together? Which group do people trust less to have their information: corporations or political parties?

Battening down the Facebook privacy hatches

The Pew Internet & American Life Project released a new study yesterday that suggests Facebook users are paying more attention to their privacy settings, meaning they are editing comments and photos more and being more selective about their friendships:

The report released Friday by the Pew Internet & American Life Project found that people are managing their privacy settings and their online reputation more often than they did two years earlier. For example, 44 percent of respondents said in 2011 that they deleted comments from their profile on a social networking site. Only 36 percent said the same thing in 2009…

Along those lines is “profile pruning,” which Pew reports is on the rise. Nearly two-thirds of people on social networks said last year that they had deleted friends, up from 56 percent in 2009. And more people are removing their names from photos than two years ago. This practice is especially common on Facebook, where users can add names of their friends to photos they upload…

Women are much more likely than men to restrict their profiles. Pew found that 67 percent of women set their profiles so that only their “friends” can see it. Only 48 percent of men did the same…

Possibly proving that with age comes wisdom, young adults were more likely to post something regrettable than their older counterparts. Fifteen percent of social network users aged 18 to 29 said they have posted something regrettable. Only 5 percent of people over 50 said the same thing.

Several thoughts about this:

1. This isn’t a huge trend: for both deleting comments and deleting friends, the share of users doing so rose by a little less than 10 percentage points over two years. If this is a long-term trend that keeps climbing by that much every few years, it would be especially noteworthy.

2. This is still a low number of people who say they “posted something regrettable.” These figures seem to suggest that many users are ahead of the game here: they are making sure they are being presented in a good light before it could turn into something regrettable. These figures go against a common media image that social media users regularly do crazy things, are always at risk, or don’t know what they are doing.

3. Is privacy the best word to describe all of this? I wonder if we could call this behavior “selective interaction” as it is more about limiting the display of information to certain people rather than hiding information from everyone. If people truly wanted online privacy, they wouldn’t have a Facebook profile in the first place.

4. The removal of friends is interesting. I wonder if this is more a function of how long one has had Facebook (tied to realizing that one doesn’t really interact with that many people and that all of those friends don’t show up in one’s news feed even if they are updating their information) or of changes in life stages (once one leaves high school or college, does one need to remain friends with all of the people one once ran into or thought one might interact with?).
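Since the size of the change depends on how it is measured, here is a small Python sketch that contrasts the percentage-point change with the relative change, using the deleted-comments figures quoted from the Pew report (36 percent in 2009, 44 percent in 2011). The function names are my own and purely illustrative.

```python
def point_change(before_pct, after_pct):
    """Difference between two percentages, in percentage points."""
    return after_pct - before_pct


def relative_change(before_pct, after_pct):
    """Growth of the later figure relative to the earlier one, in percent."""
    return (after_pct - before_pct) / before_pct * 100


# Share of respondents who said they deleted comments from their profile.
comments_2009, comments_2011 = 36, 44

print(point_change(comments_2009, comments_2011))
print(round(relative_change(comments_2009, comments_2011), 1))
```

The same data reads as an 8-point rise or as growth of about 22 percent, which is worth keeping in mind when judging whether this is a "huge trend."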

h/t Instapundit

Why a small minority of Americans don’t use Facebook

The New York Times has a piece looking at why some Americans don’t use Facebook:

As Facebook prepares for a much-anticipated public offering, the company is eager to show off its momentum by building on its huge membership: more than 800 million active users around the world, Facebook says, and roughly 200 million in the United States, or two-thirds of the population…

Many of the holdouts mention concerns about privacy. Those who study social networking say this issue boils down to trust. Amanda Lenhart, who directs research on teenagers, children and families at the Pew Internet and American Life Project, said that people who use Facebook tend to have “a general sense of trust in others and trust in institutions.” She added: “Some people make the decision not to use it because they are afraid of what might happen.”…

Facebook executives say they don’t expect everyone in the country to sign up. Instead they are working on ways to keep current users on the site longer, which gives the company more chances to show them ads. And the company’s biggest growth is now in places like Asia and Latin America, where there might actually be people who have not yet heard of Facebook…

And whether there is haranguing involved or not, the rebels say their no-Facebook status tends to be a hot topic of conversation — much as a decision not to own a television might have been in an earlier media era…

Some quick thoughts:

1. This is a relatively small percentage of Americans who don’t use Facebook. If 200 million Americans are on Facebook, that is the vast majority of people 13 years old and above. Roughly 15-20% of Americans are not eligible for Facebook (older 2000 figures here). The comparison made in the article is to the percent of people without cell phones which is roughly 16%.

1a. Because of its general ubiquity, perhaps it would be more interesting to differentiate between people who check it frequently (multiple times a day?) and those who check infrequently (say once a week or less).

1b. Is this the activity Americans most share in common, perhaps besides watching TV?

2. Privacy issues don’t seem to bother most Facebook users. Even though there may be little revolts when Facebook changes its privacy policy or makes a mistake, this isn’t driving people away in large numbers. And, as I’ve said before, if you want to remain private you should probably stay off the Internet altogether. Another warning for non-users: Facebook may already have information about you anyway.

3. It would be interesting to see figures of how long people stay on Facebook. And speaking of getting people to see advertisements, this small study used eye tracking to see what catches people’s attention when they look at profiles.

3a. If Facebook does need to keep users’ attention, is there a line between always having to change things and helping people feel comfortable with the site? I say this as we await the Timeline change and the inevitable negative responses.

4. As the article hints at by briefly looking at the pressure non-users get from Facebook users, there is a whole set of social norms that have arisen around the use of Facebook.
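As a rough check on the first point above, here is a minimal Python sketch of the share of eligible Americans on Facebook. It uses only figures already mentioned in this post (200 million U.S. users, said to be two-thirds of the population, and my estimate that 15-20% of Americans are too young to join); the function name and the assumption that every U.S. account belongs to an eligible person are mine.

```python
US_USERS = 200_000_000        # U.S. users, per the New York Times article
SHARE_OF_POPULATION = 2 / 3   # the article's "two-thirds of the population"

# Population implied by those two figures (about 300 million).
implied_population = US_USERS / SHARE_OF_POPULATION


def share_of_eligible(ineligible_fraction):
    """Fraction of eligible (13 and older) Americans who are on Facebook."""
    eligible = implied_population * (1 - ineligible_fraction)
    return US_USERS / eligible


# Rough range from the post: 15-20% of Americans are too young to sign up.
for ineligible in (0.15, 0.20):
    print(ineligible, round(share_of_eligible(ineligible) * 100, 1))
```

On these assumptions, somewhere around 78 to 83 percent of eligible Americans would be on Facebook, which would leave the holdouts at roughly one in five.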