Social inequalities in accessing open government data

Some governments are providing more open data. But this may not be enough, as citizens don’t necessarily have equal access to the data or the ability to interpret it:

At least 16 nations have major open data initiatives; in many more, pressure is building for them to follow suit. The US has posted nearly 400,000 data sets at Data.gov, and organizations like the Sunlight Foundation and MAPlight.org are finding compelling ways to use public data—like linking political contributions to political actions. It’s the kind of thing that seems to prove Louis Brandeis’ famous comment: “Sunlight is said to be the best of disinfectants.” But transparency alone is not a panacea, and it may even have a few nasty side effects. Take the case of the Bhoomi Project, an ambitious effort by the southern Indian state of Karnataka to digitize some 20 million land titles, making them more accessible. It was supposed to be a shining example of e-governance: open data that would benefit everyone and bring new efficiencies to the world’s largest democracy. Instead, the portal proved a boon to corporations and the wealthy, who hired lawyers and predatory land agents to challenge titles, hunt for errors in documentation, exploit gaps in records, identify targets for bribery, and snap up property. An initiative that was intended to level the playing field for small landholders ended up penalizing them; bribery costs and processing time actually increased.

A level playing field doesn’t mean much if you don’t know the rules or have the right sporting equipment. Uploading a million documents to the Internet doesn’t help people who don’t know how to sift through them. Michael Gurstein, a community informatics expert in Vancouver, British Columbia, has dubbed this problem the data divide. Indeed, a recent study on the use of open government data in Great Britain points out that most of the people using the information are already data sophisticates. The less sophisticated often don’t even know it’s there.

This touches on two issues of social inequality that are not discussed as much as they might be. First, not everyone has consistent access to the internet. It may be a necessity for younger generations, but web surveys, for example, are still problematic because internet users are not a representative cross-section of the US population. Making the data available on the internet would make it available to more users but not necessarily all users. This ties in with some earlier thoughts I’ve had about whether internet access will become a de facto or defined human right in the future.

Second, not everyone knows where the open data is or how to work through it. Government information dumps require sorting, and it takes time to figure out what is going on; there may or may not be a guide through the information. As someone who has worked with some large sociological datasets, it always takes time to become acclimated to the files and data before one can begin an analysis. This should legitimately become part of a college education: some training in how to sort through information and common databases. If we get to a point where the average informed citizen needs to be able to sort through government information online, wouldn’t this be a basic skill that all need to be taught? As the commentator suggests, the trained and sophisticated can take advantage of this data while the average citizen may be left behind.
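To make this concrete, here is a minimal sketch of the kind of first pass I have in mind, written in Python with pandas. The file name and the “district” column are hypothetical stand-ins for whatever an agency actually posts; the point is simply checking size, types, missing values, and basic distributions before attempting any analysis.

```python
# A minimal first-pass exploration of a hypothetical open data file.
# The file name and the "district" column are illustrative, not real.
import pandas as pd

df = pd.read_csv("agency_records.csv")

# Get oriented: how big is it, what are the columns, what is missing?
print(df.shape)
print(df.dtypes)
print(df.isna().sum().sort_values(ascending=False).head(10))

# Skim a few rows and the distribution of one categorical field.
print(df.head())
print(df["district"].value_counts().head(10))
```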

The idea of having more open government information should cause us to think about how the internet might help close the gap between people (though I don’t hold any utopian expectations about this) rather than sustain or exacerbate social inequalities.

A sociological view of science

A while back, I had a conversation with friends about how undergraduate students understand and use the word “proof” when talking about knowing about the world. Echoing some of our conversation, a sociologist describes science:

I am a sociologist and read philosophy guardedly. As a social scientist, I tell my students again and again that while a theory or a sparkling generalization may be beautiful, the real test is always an “appeal to the empirical.” A proposition may be very appealing and may seem to provide powerful and enticing descriptions and understandings. However, until we gather evidence that shows that the proposition can be supported by information confirmed by the senses we must hold any proposition as one possibility among other competing explanations. Further, even when a theory or a set of ideas has been measured repeatedly against the empirical world, science never leads to certainty. Rather, science is always a modest enterprise. Even at its best and most rigorous, science is inherently “probabilistic” — we can have varying degrees of confidence in a finding, but certainty is not possible. As humans, our knowing is contingent and limited. Even the best designed scientific tests carry with them the possibility of disconfirmation in later tests. Science at its best offers acceptable levels of persuasiveness but cannot offer final conclusions.

Several things stand out to me in this explanation:

1. The appeal to data and weighing information versus existing explanations.

2. The lack of certainty in science and a probabilistic view of the world. Certainty here might be defined as “100% knowledge.” I think we can be functionally certain about some things. But the last bit about persuasiveness is interesting.

3. Human knowledge is limited. There are always new things to learn, particularly about people and societies.

4. Scientific tests are undertaken to test existing theories and discover new information.

This sounds like a reasonable sociological perspective on science.

How a pharmacy receipt illustrates identity issues in the EU

Sociological ideas can come from all sorts of places. Here is how a French receipt for toothpaste provides insight into the unity of the European Union:

Harrington had spent her senior year of high school in France and had fallen in love with a specific toothpaste flavored with a lemony-minty herb called “verveine.” So she went to the nearest pharmacy, bought the place out (all two bottles worth), and forked over her euros.

But when Harrington looked at her receipt, she saw something that looked out of place. Below the price of her toothpaste in euros, there was a conversion statement that said 1 euro is equal to 6.5597 francs.

Handy? Well, sure, until one remembers that France has been on the Euro since 1999. That gives people more than a decade to practice converting euros to francs.

Most people would probably forget about this cultural oddity as soon as they crammed the receipt into their back pockets. But Harrington is an economic sociologist at the Copenhagen Business School, and decided to dig deeper. What she found was that this little line on the pharmacy receipt was indicative of larger identity issues in contemporary Europe.

With the advent of the European Union and a common currency, citizens had to reconcile their national identities with a new continental identity. The process has been far from frictionless — not only do many French people still talk about prices in terms of francs, but Germans still speak in terms of deutschmarks (and they don’t even have dual pricing receipts).

Harrington says the current debt crisis has revealed cracks in the spirit of European collectivism “because it was never a seamless whole to begin with.”

This story suggests a kind of intellectual curiosity I would guess a lot of sociologists would want to instill in their students: how might you see the world, including pharmacy receipts, from a sociological perspective? Within the field of sociology, Harrington would have to build upon this single piece of evidence. In order to draw publishable conclusions about “European collectivism,” she would need a broader dataset that demonstrates patterns of behavior.

Data to assess Ray Lewis’ claim that crime would increase if the NFL doesn’t play

A while back, I called for data to assess Ray Lewis’ claim that crime would rise if the NFL doesn’t have games. A group of journalists decided to use data to examine Lewis’ argument and even made at least one of the comparisons I suggested might be helpful:

The AJC accepted Lewis’ invitation to do that research, contacted the Northeastern’s Sport in Society center and was told that “there is very little evidence supporting Lewis’ claim that crime will increase the longer the work stoppage lasts.”…

The Sun looked at crime in Baltimore the four weeks before the season started and the first four weeks of the season. There was the same number of crimes. The Sun also examined the crime rate there at the end of the Ravens’ season and what happened afterward. What did it find? There was less crime after the season ended in early January.

The Sun stressed several times that its findings were unscientific…

The AJC then went to look at increases in crime during bye weeks, assuming that the no football/higher crime equation would fit a much shorter time frame. No real evidence was presented that would lead in one direction or another.

One criminologist we interviewed had a different take. Northeastern University professor James A. Fox heard Lewis’ comments and did a study. He looked at key FBI data from the last three years available, 2006 through 2008, focusing on the week before the Super Bowl because there were no games that week and there was intense interest in football around that time of the year. Fox, who was referred to us by the FBI, found no increase in crime the week there was no football.

This isn’t comprehensive data, but it’s a start. Of course, such studies need to control for many possible factors that could affect crime levels, and fairly large samples across multiple cities are needed.
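As an illustration of what “controlling for possible factors” might look like, here is a rough sketch assuming a hypothetical city-week panel with crime counts, an indicator for whether football was played that week, and a weather control. This is not the method used by the AJC, the Sun, or Fox; it is just one way the comparison could be set up.

```python
# A rough sketch of a multi-city comparison with controls, assuming a
# hypothetical city-week panel; not the approach used in the studies above.
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("city_week_crime.csv")  # hypothetical dataset

# Poisson model of weekly crime counts on a football indicator, with a
# weather control plus city and calendar-week fixed effects.
model = smf.poisson(
    "crime_count ~ football_played + avg_temp + C(city) + C(week_of_year)",
    data=panel,
).fit()
print(model.summary())
```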

I still don’t quite understand why the media’s data-driven response to Lewis’ claim has been either slow or not widely disseminated, particularly since Lewis’ argument was widely aired and discussed.

James Q. Wilson on the difficulties of studying culture

In a long opinion piece looking at possible explanations for the reduction in crime in America, James Q. Wilson concludes by suggesting that cultural explanations are difficult to test and develop:

At the deepest level, many of these shifts, taken together, suggest that crime in the United States is falling—even through the greatest economic downturn since the Great Depression—because of a big improvement in the culture. The cultural argument may strike some as vague, but writers have relied on it in the past to explain both the Great Depression’s fall in crime and the explosion of crime during the sixties. In the first period, on this view, people took self-control seriously; in the second, self-expression—at society’s cost—became more prevalent. It is a plausible case.

Culture creates a problem for social scientists like me, however. We do not know how to study it in a way that produces hard numbers and testable theories. Culture is the realm of novelists and biographers, not of data-driven social scientists. But we can take some comfort, perhaps, in reflecting that identifying the likely causes of the crime decline is even more important than precisely measuring it.

I find it a little strange that a social scientist wants to leave culture to the humanities (“novelists and biographers”). This sounds like a traditional social science perspective: culture is a slippery concept that is difficult to quantify and generalize about. I can imagine this viewpoint from quantitatively minded social scientists who would ask, “where is the data?”

But there is a lot of good research on culture that uses data. Some of it is fuzzier qualitative data that involves ethnographies, long interviews, and observation. But other data about culture comes from more traditional sources such as large surveys. If you put together a lot of these data-driven studies, qualitative and quantitative, I think you could develop some hypotheses and ideas about American culture and crime. Perhaps all of this data can’t fit into a regression, or this isn’t how crime is traditionally studied, but that doesn’t mean we have to simply abandon cultural explanations and studies.

The social history of the food pyramid

With the unveiling later this week of a replacement for the food pyramid (it will be a “plate-shaped symbol, sliced into wedges for the basic food groups and half-filled with fruits and vegetables”), the New York Times provides a quick look at the background of the food pyramid:

The food pyramid has a long and tangled history. Its original version showed a hierarchy of foods, with those that made up the largest portions of a recommended diet, like grains, fruit and vegetables, closest to the wide base. Foods that were to be eaten in smaller quantities, like dairy and meat, were closer to the pyramid’s tapering top.

But the pyramid’s original release was held back over complaints from the meat and dairy industry that their products were being stigmatized. It was released with minor changes in 1992.

A revised pyramid was released in 2005. Called MyPyramid, it turned the old hierarchy on its side, with vertical brightly colored strips standing in for the different food groups. It also showed a stick figure running up the side to emphasize the need for exercise.

But the new pyramid was widely viewed as hard to understand. The Obama administration began talking about getting rid of it as early as last summer. At that time, a group of public health experts, nutritionists, food industry representatives and design professionals were invited to a meeting in Washington where they were asked to discuss possible alternative symbols. One option was a plate.

Two things stand out to me:

1. This is partly about changing nutritional standards but also about politics and lobbying. Food groups are backed by businesses and industries that have a stake in this. Did they play any part in the new logo?

2. This is also a graphic design issue. The old food pyramid suggested that certain foods should be the base/foundation of eating. The most recent pyramid was a bit strange: it was broken into vertical slivers, so the tapering aspect of a pyramid seems to have been discarded. The new logo sounds like it will be a more proportion-based image where people can quickly see what percentage of their diet should be devoted to different foods. Since this logo is likely to be placed on many educational materials and food packages, it would help if it were easy to understand.

Columnist cites FBI data regarding Ray Lewis’ football lockout crime claim

Earlier this week, I posted about Ray Lewis’ comment that if there is a football lockout, crime rates will increase. While Lewis has taken a media beating, I suggested that I hadn’t seen anyone cite data to refute (or support) Lewis’ claim. A columnist in Salt Lake City does look at some data that perhaps sheds light on the relationship between football and crime:

Well, it turns out that crime rates among the general population do actually decrease during the football season. The FBI believes the trend is not connected to football, but to the change in weather and the end of summer break for students. Apparently, criminals like to do their work in warm weather and when they’re not on vacation.

Research indicates that the only crime connection to football might be the increase in domestic violence on NFL Sundays when home teams lose emotional games. Maybe Lewis is wrong; maybe the lockout will reduce crime in the home.

I wish there were specific citations in this column, but here is the gist of the cited data: overall, crime goes down in the fall (compared to summer), and domestic violence goes up after certain game outcomes. The problem is that it is difficult to separate the effects of fall (weather, kids back in school, etc.) from the effect of the football games themselves. And if there are no close football games, then domestic violence cases might go down. Per my earlier post, I still think we could get more specific data, particularly comparing crime rates on Sundays with and without games, and crime rates on other football nights (Monday, Thursday, Saturday) versus those same nights without games.

This columnist also throws out another idea that I had thought about:

Then it occurred to me: Maybe Lewis didn’t mean the fans would go on a crime wave without football; maybe he meant THE PLAYERS.

That’s not a big stretch. Look how Antonio Bryant has fared in recent months without football. Look what Michael Vick, Plaxico Burress and Ben Roethlisberger, among many others, did when they were away from football. Idle hands and all that. Maybe what Lewis meant was that we better end this lockout before the players starting (ran)sacking villages and throwing innocent bystanders for losses and intercepting Brinks trucks and so forth.

This image would fit with research suggesting NFL players are arrested at fairly high rates.

Looking for data on Ray Lewis’ comment about the football lockout and crime

Ray Lewis has gained a lot of attention with his recent comment that crime would increase if there was no football in the fall:

In an interview with ESPN, Lewis suggested that the NFL lockout could cause a spike in crime. “Do this research,” Lewis said. “If we don’t have a season. Watch how much evil, which we call crime, how much crime picks up, if you take away our game. There’s nothing else to do.”

Lewis offered no evidence from prior work stoppages that crime rose when the NFL season shut down. “I don’t know how he [Lewis] can make that kind of determination without having more facts,” New York Giants defensive end Mathias Kiwanuka told CBS Sports, in response to Lewis’ comments. If the current dispute between owners and the players crept into the season, forcing the cancellation of games, the economic impact would be harsh for the stadium workers and local retail establishments who serve fans on game-day. There will be significant spillover effects. But would a bored populace turn to street looting because the Steelers aren’t on TV?

Kiwanuka is in agreement with a lot of others who have suggested that Lewis’ comments were strange and unsubstantiated. But why haven’t we seen many people use data to refute Lewis’ claim? We could look at two possible scenarios that Lewis was describing:

1. The more limited scenario: during the times when football games would normally be played, crime would be higher. I remember news stories during the Chicago Bulls’ championship runs in the 1990s about the decrease in ambulance calls and emergency room visits during games, and I recall seeing similar stories about NFL playoff games. Couldn’t police departments or other agencies quickly go through their files to see whether crime rates differ during NFL games compared to other times? (Run a comparison between typical Sunday afternoon game times and Sunday afternoons during bye weeks or weeks when the team plays on Sunday, Monday, or Thursday night; a rough sketch of this comparison follows this list.)

2. The broader scenario: overall, crime would be higher when there are no football games at all. This would take more data work to find comparison periods in the fall/early winter when there is no football. How about looking at cities in years when they had an NFL playoff team compared to years when they did not? I’m sure someone could figure this out with the appropriate crime data and years of football records.
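Here is a minimal sketch of the first scenario’s comparison, assuming a hypothetical incident-level file with timestamps and a separate list of home-game dates. Neither file is real, and a serious analysis would add the kinds of controls discussed above.

```python
# Sketch of the game-day comparison in scenario 1, using hypothetical files.
import pandas as pd

incidents = pd.read_csv("incidents.csv", parse_dates=["occurred_at"])
game_dates = pd.to_datetime(pd.read_csv("home_game_dates.csv")["date"])

# Keep Sunday-afternoon incidents (roughly the 12 p.m. to 7 p.m. game window).
sundays = incidents[incidents["occurred_at"].dt.dayofweek == 6]
afternoon = sundays[sundays["occurred_at"].dt.hour.between(12, 18)].copy()

# Flag whether each incident fell on a home-game Sunday.
afternoon["game_day"] = afternoon["occurred_at"].dt.normalize().isin(game_dates.dt.normalize())

# Average number of incidents per Sunday afternoon, with and without a game.
per_day = afternoon.groupby(["game_day", afternoon["occurred_at"].dt.date]).size()
print(per_day.groupby(level="game_day").mean())
```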

Getting better data on how students use laptops in class: spy on them

Professors like to talk about how students use laptops in the classroom. Two recent studies shed some new light on this issue, and they are unique in how they obtained their data: they spied on students.

Still, there is one notable consistency that spans the literature on laptops in class: most researchers obtained their data by surveying students and professors.

The authors of two recent studies of laptops and classroom learning decided that relying on student and professor testimony would not do. They decided instead to spy on students.

In one study, a St. John’s University law professor hired research assistants to peek over students’ shoulders from the back of the lecture hall. In the other, a pair of University of Vermont business professors used computer spyware to monitor their students’ browsing activities during lectures.

The authors of both papers acknowledged that their respective studies had plenty of flaws (including possibly understating the extent of non-class use). But they also suggested that neither sweeping bans nor unalloyed permissions reflect the nuances of how laptops affect student behavior in class. And by contrasting data collected through surveys with data obtained through more sophisticated means, the Vermont professors also show why professors should be skeptical of previous studies that rely on self-reporting from students — which is to say, most of them.

While these studies might be useful for dealing with the growing use of laptops in classrooms, discussing the data itself would be interesting. A few questions come to mind:

1. What discussions took place with an IRB? This seems to have been an issue in the study that put spyware on student computers, and it showed up in the generalizability of the data: just 46% of students agreed to have the spyware installed. The other study could also run into issues if students were identifiable. (Just a thought: could a professor insist on spyware being on student computers if the students insisted on having a laptop in class?)

2. These studies get at the disparities between self-reported data and other forms of data collection. I would guess that students underestimate their off-task laptop use on self-reported surveys because they suspect that this is the answer they should give (social desirability bias). But the comparison could also reveal how cognizant computer/internet users are of how many windows and applications they actually cycle through. (See the sketch after this list.)

3. Both of these studies are on a relatively small scale: one had 45 students, and the other had a little more than 1,000, but its data was “less precise” since it involved TAs sitting in the back monitoring students. Expanding the Vermont study and linking laptop use to learning outcomes on a larger scale would be even better: move beyond just describing the classroom experience and look at the impact on learning. Why doesn’t someone do this on a larger scale and in multiple settings? Would it be too difficult to get past some of the IRB issues?
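For what it’s worth, here is a small sketch of the self-report versus logging comparison mentioned in the second point, assuming a hypothetical table with one row per student containing both self-reported and logged minutes of non-class laptop use. This is not how either study reported its analysis.

```python
# Sketch of a paired comparison of self-reported vs. logged laptop use,
# using a hypothetical per-student table; not either study's actual analysis.
import pandas as pd
from scipy import stats

usage = pd.read_csv("laptop_use.csv")  # hypothetical columns: self_reported, logged

print(usage[["self_reported", "logged"]].describe())

# Paired test of whether students systematically under-report non-class use.
gap = usage["logged"] - usage["self_reported"]
t_stat, p_value = stats.ttest_rel(usage["logged"], usage["self_reported"])
print(f"mean gap (logged minus self-reported): {gap.mean():.1f} minutes")
print(f"paired t = {t_stat:.2f}, p = {p_value:.3f}")
```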

In looking at the comments about this story, it seems like having better data on this topic would go a long way toward moving the discussion beyond anecdotal evidence.

Study human flourishing rather than happiness

A well-known psychologist suggests we should study human flourishing rather than just happiness:

In theory, life satisfaction might include the various elements of well-being. But in practice, Dr. Seligman says, people’s answers to that question are largely — more than 70 percent — determined by how they’re feeling at the moment of the survey, not how they judge their lives over all.

“Life satisfaction essentially measures cheerful moods, so it is not entitled to a central place in any theory that aims to be more than a happiology,” he writes in “Flourish.” By that standard, he notes, a government could improve its numbers just by handing out the kind of euphoriant drugs that Aldous Huxley described in “Brave New World.”

So what should be measured instead? The best gauge so far of flourishing, Dr. Seligman says, comes from a study of 23 European countries by Felicia Huppert and Timothy So of the University of Cambridge. Besides asking respondents about their moods, the researchers asked about their relationships with others and their sense that they were accomplishing something worthwhile.

Denmark and Switzerland ranked highest in Europe, with more than a quarter of their citizens meeting the definition of flourishing. Near the bottom, with fewer than 10 percent flourishing, were France, Hungary, Portugal and Russia.

Happiness researchers tend to ask about two areas: immediate happiness and longer-term happiness, typically referred to as “life satisfaction.” But Seligman suggests that these satisfaction questions don’t really move beyond the respondent’s immediate mood. Additionally, the questions need to account for relationships and whether the respondent feels a sense of accomplishment in life.

It is interesting to see some of the cross-country comparisons. How might national or smaller cultures influence how individuals feel about life satisfaction? In the long run, do people actually have to be accomplishing something satisfying or is it more about perceptions? Can living a decent life in the American suburbs be ultimately satisfying for Americans or do they just think that it should be?

I wonder how these findings line up with earlier findings that religion leads to higher levels of life satisfaction.

(I also wonder if people think that the language of “flourishing” seems archaic or overly humanistic.)