Lobbynomics v. empirical data

Ars Technica points to a UK report asserting that “lobbynomics” rather than empirical data drives much of the intellectual property policy debate:

There are three main practical obstacles to using evidence on the economic impacts of IP…[3] Much of the data needed to develop empirical evidence on copyright and designs is privately held. It enters the public domain chiefly in the form of “evidence” supporting the arguments of lobbyists (“lobbynomics”) rather than as independently verified research conclusions.

My own experience in dissecting IP developments supports this view. It is surprisingly difficult to find “hard data” about copyright piracy, which reduces any “debate” to a shouting match between proponents of bald assertions.

We need better data, and we all need to be more circumspect (and humble) before drawing sweeping conclusions from the little that is available.

Numbers to back claims about “SEAL-mania”?

I am often on the look-out for news stories that relate to data analysis and interpretation that I can then use in my Statistics and Social Research classes. Here is an example of the AP reporting on “SEAL-mania”:

Stumpf is one of a growing number of Americans putting themselves through grueling fitness programs modeled after Navy SEAL workouts as interest in the elite military unit has soared since one of its teams killed Osama bin Laden. Everyone these days seems to be dreaming of what it’s like to be a SEAL, know a SEAL or at least look like one.

Book publishers say they cannot order the printings of the memoirs of former SEALs fast enough, while people are dialing 1-800-Hooyah! like mad to get their hands on T-shirts emblazoned with the SEAL insignia and sayings like: “When it absolutely, positively must be destroyed overnight! Call in the US Navy SEALs.”

Awe over the covert operation is even putting the city of Fort Pierce, Fla., on the map for vacation destinations. The city’s National Navy UDT-SEAL Museum — the only museum dedicated to the secretive SEALs — has been flooded with calls from people planning to visit.

But nothing short of joining the SEALs offers a more true-to-life taste of their toughness than the workout places run by ex-Navy commandos.

There may actually be an uptick in interest in Navy SEALs (apparently Disney and others are interested) but the story gives us little actual data to support this. We are told about some books, t-shirts, calls to a museum, and an increase in interest in workouts but no hard numbers to go by. In fact, the story seems to revolve around this tentative sentence: “Everyone these days seems to be dreaming of what it’s like to be a SEAL, know a SEAL or at least look like one.” I am skeptical about claims about “everyone.” The story could at least cite Google trend data (a big spike occurs in early May when searching for “SEALs”) or Twitter trend data (another big spike). These may not be ideal data sources but at least they provide some data beyond broad claims. If a media source wants to make a causal claim (Navy SEALs’ participation in the Bin Laden raid has led to “SEAL-mania” among Americans), then it should provide some better evidence to back up its argument.

(Another odd thing about this story is that the rest of it is about SEAL workouts. It almost seems as if there was some copy about these workouts waiting to be attached to a larger story and this raid presented itself as an opportunity.)

Data on millennials’ life-long take on Osama Bin Laden

In an op-ed, a millennial considers some data regarding how the younger generation viewed Osama Bin Laden throughout their lives. While the media has suggested Bin Laden was a key figure in their young lives, this commentator suggests the data regarding his generation’s view of Bin Laden is more mixed:

Let’s start with the media’s attempts to establish Bin Laden’s impact on millennials. In addition to student sound bites and expert testimony, newspapers turned to sociological evidence to support their theories. To show how 9/11 inspired millennials to pursue public service, USA Today cited the increase in applications for nonprofit jobs. (The week before, this would have been proof of our struggling economy.) To show how 9/11 left millennials in a state of perpetual distress, the newspaper cited a Pew survey claiming that 83% of young people sleep with their cellphones on. (The week before, this would have been proof of our declining attention spans.)

Notice what USA Today didn’t cite: data on millennials’ opinions of Bin Laden from before his death. That’s because these data don’t support the narrative of a generation defining itself in the shadow of the Twin Towers. Not too long ago, the Woodrow Wilson National Fellowship Foundation ran a series of focus groups on college students’ attitudes toward 9/11. The foundation asked students to name the most important social or political event of their lifetime. The most common answer was not 9/11 — in fact, it was one of the least common — but the rise of the Internet.

Even data that support the media’s theories stop well short of suggesting a millennial reboot. In 2000, for example, UCLA’s Higher Education Research Institute reported that the number of freshmen who considered keeping up with political affairs to be “essential” or “very important” hit an election-year low: 28%. After 9/11, that number did bounce back — but only to 39% in 2008, well below the 60%-plus who answered affirmatively in 1966, the first year of the annual poll.

These statistics, I think, capture my generation’s real relationship to Bin Laden. It would be too much to say we had forgotten about him, but it also would be too much to say he haunted or defined us in any real way.

I, too, have heard this media narrative and now that I think about it, the primary data marshaled in support of it were the college student celebrations the night of the announcement of Bin Laden’s death. I would need to see more data on this to be convinced either way but it sounds like an interesting argument. If the media story is incorrect, it seems like it wouldn’t be too hard to put together more data to suggest this is the case. I assume most polling organizations have asked plenty of questions about Bin Laden and terrorism over the last ten years and these organizations could easily break out the data by age. If it turned out that millennials were not terribly impacted by 9/11 or Bin Laden’s death, what would be the reaction of older generations?

The rest of the op-ed contains opinions about the partying reaction of millennials. The public discussion regarding the celebration of and reaction to Bin Laden’s death has been intriguing though it is hard to know exactly what is going on and what it might say, if anything, about the larger American culture. My initial reaction to seeing the college students partying in front of the White House was to think that they were looking for an excuse to party on a Sunday night with school the next day…

The rankings of liveable cities

Architecture critic Edwin Heathcote of the Financial Times asks why the most livable cities in the world, such as Vancouver, are not necessarily the most loved cities.

This is another argument that deals with methodology: how exactly does one determine which cities are the “most liveable”? If just one or two factors are tweaked by certain publications, the list changes. Just like college rankings (recent thoughts here), such lists should be viewed with some skepticism.

Additionally, the criteria used by publications are not necessarily the criteria used by citizens who have some choices about where to move. Indeed, such lists seem to presume that these are the choices people would make if they had equal opportunity to move within their own country and/or around the world. Of course, most people have more restricted options due to job availability, price, personal preferences, location of family, and more.

In reading about this, it also strikes me that lists of liveable cities might not make sense to many Americans: why would they want to live in a city when a majority have already chosen a suburban life?

h/t Instapundit

Rising debt for college loans better than debt for a McMansion

The college Class of 2011 might expect more in life than simply to be known as “the most indebted ever”:

$22,900: Average student debt of newly minted college graduates

The Class of 2011 will graduate this spring from America’s colleges and universities with a dubious distinction: the most indebted ever.

Even as the average U.S. household pares down its debts, the new degree-holders who represent the country’s best hope for future prosperity are headed in the opposite direction. With tuition rising at an annual rate of about 5% and cash-strapped parents less able to help, the mean student-debt burden at graduation will reach nearly $18,000 this year, estimates Mark Kantrowitz, publisher of student-aid websites Fastweb.com and FinAid.org. Together with loans parents take on to finance their children’s college educations — loans that the students often pay themselves – the estimate comes to about $22,900. That’s 8% more than last year and, in inflation-adjusted terms, 47% more than a decade ago.

In the long run, the investment is probably worth it. Education is a much better reason to borrow money than buying cars or McMansions, and it endows people with economic advantages that the recession and slow recovery have only accentuated. As of 2009, the annual pre-tax income of households headed by people with at least a college degree exceeded that of less-educated households by 101%, up from 91% in 2006. As of April, the unemployment rate among college graduates stood at 4.5%, compared to 9.7% for those with only a high-school diploma and 14.6% for those who never finished high school.

I am intrigued by the McMansion comparison here as it is used to illustrate the foolishness of overspending on a big or expensive house versus the possible “good debt” of college loans. Of course, this is all in economic terms as the education is expected to pay off down the road while McMansion purchases of the last 15 years are not expected to yield such great values in this poor housing market. (And using a car as a debt comparison seems a bit strange: a car is rarely an investment but rather a black hole for money.) But this view of a house, as an investment opportunity, is a relatively recent development.

There is something about this data that could warrant a closer look: while it appears that the average college student debt has increased, is the average really the best measure here? I would much rather see a distribution of college debt in order to better know whether this mean is heavily influenced by people with massive amounts of college debt. Here is a paragraph from a recent New York Times article regarding college loans:

Two-thirds of bachelor’s degree recipients graduated with debt in 2008, compared with less than half in 1993. Last year, graduates who took out loans left college with an average of $24,000 in debt. Default rates are rising, especially among those who attended for-profit colleges.

And here is some additional data from recent years that sheds more light on the distribution of college debt:

These figures were calculated using the data analysis system for the 2007-2008 National Postsecondary Student Aid Study (NPSAS) conducted by the National Center for Education Statistics at the US Department of Education. (For comparison, cumulative education debt statistics from the 2003-2004 NPSAS are also available.) The 2007-2008 NPSAS surveyed 114,000 undergraduate students and 14,000 graduate and professional students. These statistics are not necessarily available from published NPSAS reports.

The median cumulative debt among graduating Bachelor’s degree recipients at 4-year undergraduate schools was $19,999 in 2007-08. One quarter borrowed $30,526 or more, and one tenth borrowed $44,668 or more. 9.5% of undergraduate students and 14.6% of undergraduate student borrowers graduating with a Bachelor’s degree graduated with $40,000 or more in cumulative debt in 2007-08. This compares with 6.4% and 10.0%, respectively, for Bachelor’s degree recipients graduating with $40,000 or more (2008 dollars) in cumulative debt in 2003-04.

This data provides a median that is somewhat similar to the two averages cited above. Based on these three figures and interpretations, it sounds like the rise in average debt reflects more college students taking on debt rather than a small group of students taking on a lot more debt.
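To see why the choice between a mean and a median matters here, a minimal Python sketch follows. The ten debt values are hypothetical and chosen only to show skew; the other inputs are the figures quoted above. Dividing the two NPSAS percentages to back out the share of graduates who borrowed is my own assumption that both percentages count the same group of graduates.

```python
from statistics import mean, median

# Hypothetical debts (in dollars) for ten graduates; NOT real data.
# Three graduates borrowed nothing and one has a very large balance.
hypothetical_debts = [0, 0, 0, 12_000, 15_000, 18_000,
                      20_000, 22_000, 28_000, 115_000]

mean_debt = mean(hypothetical_debts)      # pulled upward by the one large balance
median_debt = median(hypothetical_debts)  # the middle of the distribution

# Figures quoted above from the New York Times paragraph.
share_with_debt_2008 = 2 / 3      # "two-thirds ... graduated with debt"
avg_debt_of_borrowers = 24_000    # average among those who took out loans

# Average across ALL graduates, counting non-borrowers as zero debt.
avg_debt_all_graduates = share_with_debt_2008 * avg_debt_of_borrowers


def implied_borrower_share(pct_of_all, pct_of_borrowers):
    """Back out the share of graduates who borrowed.

    Assumes both percentages describe the same graduates with $40,000
    or more in debt, once as a share of all graduates and once as a
    share of borrowers only.
    """
    return pct_of_all / pct_of_borrowers


# Figures quoted above from the NPSAS summary.
borrower_share_2007_08 = implied_borrower_share(9.5, 14.6)
borrower_share_2003_04 = implied_borrower_share(6.4, 10.0)

print(f"hypothetical mean:   ${mean_debt:,.0f}")
print(f"hypothetical median: ${median_debt:,.0f}")
print(f"average over all graduates: ${avg_debt_all_graduates:,.0f}")
print(f"implied borrower share 2007-08: {borrower_share_2007_08:.1%}")
print(f"implied borrower share 2003-04: {borrower_share_2003_04:.1%}")
```

Under these assumptions, an average computed for borrowers alone is higher than what a typical graduate owes, which is one more reason to ask for the whole distribution rather than a single number.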

More appealing measurements of the American economy

The Economist looks at several ways in which the US federal government calculates certain economic statistics that might make our economic situation look more appealing. Here is their conclusion:

Conspiracy theorists might conclude that the American government is trying to nip and tuck its way to attractiveness. The persistent downward revisions to GDP growth do look suspicious. But in other areas American number-crunchers seem to believe that their measures are better; indeed, history shows that European statistical agencies have often later adopted their methods. The world’s biggest economy is also much less bothered about the international comparability of its numbers than smaller European countries. True, when the statisticians at the IMF or the OECD produce comparative data, they do so on the basis of standardised definitions. The snag comes if investors fail to grasp that official national figures can show the American economy in an overly flattering light.

Complex numbers, such as these, can be difficult to operationalize or calculate but they also need to be interpreted. Economic experts may know about these methodological differences and can account for these but I’m guessing that the average citizen of the US or European countries has less of an idea about what is going on.

Another US figure that has recently attracted methodological attention is unemployment. While the US unemployment rate has undoubtedly risen in the economic crisis of recent years, it has its own quirks. One part that has been discussed is that people have to have been actively looking for work in the last 4 weeks; once people move beyond that cut-off point, they are no longer counted as being unemployed. Another area involves those who work less than full-time but want full-time work and could be classified as “underemployed.” (You can see how the Bureau of Labor Statistics calculates unemployment here.)
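A rough Python sketch shows how much these two definitional choices can move the number. All of the counts are hypothetical, and the "broader" rate is only my simplified illustration of adding back people who stopped searching and part-timers who want full-time work; it is not presented as the official BLS formula.

```python
# Hypothetical counts of people; NOT real labor-market data.
employed = 900                      # includes the part-timers below
part_time_wanting_full_time = 50    # "underemployed", a subset of employed
unemployed_searching = 100          # looked for work in the last 4 weeks
stopped_searching = 30              # want work but last searched over 4 weeks ago


def headline_rate(employed, unemployed_searching):
    """Unemployed job seekers as a share of the labor force.

    The labor force here is the employed plus those who searched in the
    last 4 weeks, so people who stopped searching drop out entirely.
    """
    labor_force = employed + unemployed_searching
    return unemployed_searching / labor_force


def broader_rate(employed, unemployed_searching, stopped_searching,
                 part_time_wanting_full_time):
    """Simplified broader measure (an illustrative assumption).

    Adds people who stopped searching to both the numerator and the
    denominator, and counts underemployed part-timers in the numerator.
    """
    numerator = (unemployed_searching + stopped_searching
                 + part_time_wanting_full_time)
    denominator = employed + unemployed_searching + stopped_searching
    return numerator / denominator


headline = headline_rate(employed, unemployed_searching)
broader = broader_rate(employed, unemployed_searching, stopped_searching,
                       part_time_wanting_full_time)

print(f"headline rate: {headline:.1%}")
print(f"broader rate:  {broader:.1%}")
```

With these made-up counts the same population produces two quite different rates, which is the kind of gap that a reader who sees only the headline figure would never notice.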

(It is also interesting in this story that they compare the calculation of these statistics to cosmetic surgery, apparently an important marker of American culture.)

A call to collect better data in order to predict economic crises

Economist Robert Shiller says that we would be better able to predict economic crises if we only had better data:

Eventually, these advances led to quantitative macroeconomic models with substantial predictive power — and to a better understanding of the economy’s instabilities. It is likely that the “great moderation,” the relative stability of the economy in the years before the recent crisis, owes something to better public policy informed by that data.

Since then, however, there hasn’t been a major revolution in data collection. Notably, the Flow of Funds Accounts have become less valuable. Over the last few decades, financial institutions have taken on systemic risks, using leverage and derivative instruments that don’t show up in these reports.

Some financial economists have begun to suggest the kinds of measurements of leverage and liquidity that should be collected. We need another measurement revolution like that of G.D.P. or flow-of-funds accounting. For example, Markus Brunnermeier of Princeton, Gary Gorton of Yale and Arvind Krishnamurthy of Northwestern are developing what they call “risk topography.” They explain how modern financial theory can guide the collection of new data to provide revealing views of potentially big economic problems.

Even if more data were collected, it would still require interpretation. If we had had the right data before the current economic crisis, I wonder how confident Shiller would be that we would have made the right predictions (50%? 70%? 95%?). From the public narrative that has developed, it looks like there was enough evidence that the mortgage industry was doing some interesting things but few people were looking at the data or putting the story together.

And for the future, do we even know what data we might need to be looking at in order to figure out what might go wrong next?

Using cell phone data to research social networks

Social network analysis is a growing area within sociology and other disciplines. The Wall Street Journal reports on the advantages of examining cell phone data:

As a tool for field research, the cellphone is unique. Unlike a conventional land-line telephone, a mobile phone usually is used by only one person, and it stays with that person everywhere, throughout the day. Phone companies routinely track a handset’s location (in part to connect it to the nearest cellphone tower) along with the timing and duration of phone calls and the user’s billing address…

Advances in statistics, psychology and the science of social networks are giving researchers the tools to find patterns of human dynamics too subtle to detect by other means. At Northeastern University in Boston, network physicists discovered just how predictable people could be by studying the travel routines of 100,000 European mobile-phone users.

After analyzing more than 16 million records of call date, time and position, the researchers determined that, taken together, people’s movements appeared to follow a mathematical pattern. The scientists said that, with enough information about past movements, they could forecast someone’s future whereabouts with 93.6% accuracy.

The pattern held true whether people stayed close to home or traveled widely, and wasn’t affected by the phone user’s age or gender.

The rest of the article then goes on to talk about a lot of interesting research on topics like social contagions (see an example of this research here) and social relationships using this data.

Some may be concerned about privacy, particularly with recent reports about iPhones and iPads containing a file that records the movements of users. I have a few thoughts about this:

1. Compared to other possible data sources (surveys, time diaries, interviews, ethnography), this seems like a treasure trove of information. The article suggests that nearly 75% of people in the world have cell phones – what other data source can compare with that? Could the research potential outweigh individual privacy concerns? In thinking about some of these research questions, it would be very difficult to use more traditional methods to address the same concerns. And just the sheer number of cases a researcher could access and work with is fantastic. In order to build more complex models of human behavior, this is exactly the kind of data one could use.

2. I would be less concerned about researchers using this data than companies. Researchers don’t particularly care about the individual cases in the data but rather are looking for broad patterns. I would also guess that the cell phone data is anonymized so that researchers would have a difficult time pinpointing specific individuals even if they wanted to.

3. How much of a surprise is it that this available data is being used? Don’t cell phone carriers include some sort of statement in their contracts about using data in such ways? One option here would be to not get a smart phone. But if you want a smart phone (and it seems that a lot of Americans do), then this is the tradeoff. This is similar to the tradeoff with Facebook: users willingly give their information to enhance their social lives and then the company can look for ways to profit from this information.

h/t Instapundit

Conclusions about PC vs. Mac users based on an unscientific web survey

Based on the headline, this looks like an interesting story: “Mac vs. PC: The stereotypes may be true.” But there is a problem:

An unscientific survey by Hunch, a site that makes recommendations based on detailed user preferences, found that Mac users tend to be younger, more liberal, more fashion-conscious and more likely to live in cities than people who prefer PCs.

While the first part of this paragraph is treated as a clause that barely affects the rest of the text, it really is the key to the story. Hunch’s survey respondents identify as 52% PC users and 25% Mac users, with 23% identifying with neither (and what do we do with this category?). This compares to a PC vs. Mac world market share of 89% to 11%. This is evidence that the online sample doesn’t quite match up with what computer users are actually buying. Voluntary web surveys are difficult to work with for this reason: even if there are a lot of respondents, we don’t know whether these respondents are representative of larger populations.
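The mismatch is easier to see with the "neither" group set aside. The Python sketch below uses only the percentages quoted above; dropping the "neither" respondents and treating market share as a stand-in for the population of users are both my own simplifying assumptions.

```python
# Percentages quoted above.
survey_share = {"PC": 52, "Mac": 25, "neither": 23}
market_share = {"PC": 0.89, "Mac": 0.11}

# Assumption: drop "neither" and rescale so PC and Mac sum to 1.
chose_a_side = survey_share["PC"] + survey_share["Mac"]
sample_share = {
    platform: survey_share[platform] / chose_a_side
    for platform in ("PC", "Mac")
}

# Weight each group would need so the sample matches the market split.
# A weight above 1 means the group is under-represented in the sample.
weights = {
    platform: market_share[platform] / sample_share[platform]
    for platform in ("PC", "Mac")
}

for platform in ("PC", "Mac"):
    print(f"{platform}: sample {sample_share[platform]:.1%}, "
          f"market {market_share[platform]:.0%}, "
          f"weight {weights[platform]:.2f}")
```

Weights like these can line up the sample with one known benchmark, but they cannot tell us whether the people who chose to answer differ from other PC or Mac users in age, politics, or taste, which is the real problem with a voluntary survey.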

Perhaps CNN does cover itself. The headline does suggest that these stereotypes “may” be correct and the second paragraph suggests the stereotypes may contain “some truth.” But a more cynical take regarding both CNN and Hunch is that they simply want more web visits from devoted PC or Mac defenders. Perhaps the fact that all of this is based on an unscientific survey is less important than driving visitors to one’s site and asking people to comment at the bottom of both stories.

Venkatesh argues Anderson’s recent book highlights sociology’s identity problem

Sudhir Venkatesh reviews Elijah Anderson’s new book The Cosmopolitan Canopy (earlier review here) and argues that the text is emblematic of a larger identity crisis within sociology:

Anderson’s struggle to make sense of the current multicultural situation is not only a function of his own intellectual uncertainty. It is also a symptom of the field in which he is working, which is confused about its direction. Where sociology once gravitated to the most pressing problems, especially the contentious issues that drove Americans apart, it no longer seems so sure of its mission. With no obvious crisis, disaster, or glaring source of inequity as a backdrop demanding public action, a great American intellectual tradition gives every sign of weathering a troubled transition…

Anderson’s fascinating foray and his inability to tie together the seemingly contradictory threads highlight the new challenges that face our field. On the one hand, sociology has moved far away from its origins in thoughtful feet-on-the ground analysis, using whatever means necessary. A crippling debate now pits the “quants,” who believe in prediction and a hard-nosed mathematical approach, against a less powerful, motley crew—historians, interviewers, cultural analysts— who must defend the scientific rigor and objectivity of any deviation from the strictly quantitative path. In practice, this means everyone retreats to his or her comfort zone. Just as the survey researcher isn’t about to take up with a street gang to gather data, it is tough for an observer to roam free, moving from one place to another as she sees fit, without risking the insult: “She’s just a journalist!” (The use of an impenetrable language doesn’t help: A common refrain paralyzing our field is, “The more people who can understand your writing, the less scientific it must be.”)

For Anderson to give up “fly on the wall” observation, his métier, and put his corporate interviews closer to center-stage would risk the “street cred” he now regularly receives. This is sad because Anderson is on to the fact that we have to re-jigger our sociological methods to keep up with the changes taking place around us. Understanding race, to cite just one example, means no longer simply watching people riding the subway and playing chess in parks. The conflicts are in back rooms, away from the eavesdropper. They are not just interpersonal, but lie within large institutions that employ, police, educate, and govern us. A smart, nimble approach would be to do more of what Anderson does—search for clues, wherever they may lie, whether this means interviewing, observing, counting, or issuing a FOIA request for data.

If you search hard enough, you can find pockets of experimentation, where sociologists stay timely and relevant without losing rigor. It is not accidental they tend to move closer to our media-frenzied world, not away from it, because it’s there that some of the most illuminating social science is being done, free of academic conventions and strictures. At Brown and Harvard, sociologists are using the provocative HBO series, The Wire, to teach students about urban inequality. At Princeton and Michigan, faculty make documentary films and harness narrative-nonfiction approaches to invigorate their research and writing. At Boston University, a model turned sociologist uses her experiences to peek behind the unforgiving world of fashion and celebrity. And the Supreme Court’s decision to grant the plaintiffs a “class” status in the Wal-Mart gender-discrimination case will hinge on an amicus brief submitted by a sociologist of labor. None of this spirited work occurs without risk, as I’ve found out through personal experience. Each time I finish a documentary film, one of my colleagues will invariably ask, “When are you going to stop and get back to doing real sociology?”

I have several thoughts about this:

1. I think it is helpful (and perhaps unusual) to see this piece at Slate.com rather than in an academic journal. At the same time, is this only possible for an academic like Venkatesh who has a best-selling popular book (Gang Leader For a Day) and is also tied to the Freakonomics crowd?

2. Venkatesh seems to be bringing up two issues.

a. The first issue is one of direction: what are the main issues or areas in which sociology could substantially contribute to society? If some of the issues of the early days such as race (still an issue but Anderson’s data suggests it exists in different forms) and urbanization (generally settled in favor of suburbanization in America) are no longer that noteworthy, what is next? Consumerism? Gender? Inequality between the rich and poor? Exposing the contradictions still present in society (Venkatesh’s conclusion)?

This is not a new issue. Isn’t this what public sociology was supposed to solve? There also has been some talk about fragmentation within the discipline and whether sociology has a core. Additionally, there is occasional conversation about why sociology doesn’t seem to get the same kind of public or policy attention as other fields.

b. The second issue is one of data. While both Anderson and Venkatesh are well-known for practicing urban ethnography (as Venkatesh notes, a tradition going back to the early 20th century work of the Chicago School), Venkatesh notes that even Anderson had to move on to a different technique (interviewing) to find the new story. More broadly, Venkatesh places this change within a larger battle between quantitative and qualitative data where people on each side discuss what is “real” data.

This quantitative vs. qualitative debate has also been around for a while. One effort in recent years to address this moves to mixed methods where researchers use multiple sources and techniques to reach a conclusion. But it also seems that one common way to critique the work of others is to jump right to the methodology and suggest that it is limited to the point that one cannot come to much of a conclusion. Most (if not all) data is not perfect and there are often legitimate questions regarding validity and reliability but researchers are often working with the best available data given time and monetary constraints.

In the end, I’m not sure Venkatesh provides many answers. So, perhaps just like his own conclusions regarding Anderson’s book (“Better to point [these contradictions] out, however speculative and provisional the results may be, than to hide from the truth.”), we should be content just that these issues have been outlined.

(Here is an outsider’s take on this piece: “One thing that’s the matter with sociology is that like economics the discipline’s certitude of conclusion outran its methodological rigor. Being less charitable, sociology is just an ideology which occasionally dons the gown of dispassionate objectivity to maintain a semblance of respectability.” Ouch.)