“The most closely studied troublemakers in history”

See this story for how a large study of Boston’s youths begun in 1939 sheds light on the recent arrest of mobster James “Whitey” Bulger:

It all began in 1939, when husband-and-wife researchers Sheldon and Eleanor Glueck assembled a team of investigators to go door to door through a number of poor Boston neighborhoods and collect data on boys who had grown up there. Their goal was to understand what causes some boys and not others to get involved with crime, a question which, as it happened, would be dramatically brought to life in the story of Whitey Bulger and his overachieving brother in the state Senate, William.

The Gluecks picked a sample of 1,000 boys, half of whom had stayed out of trouble while the other half had racked up records and gotten themselves locked up at one of two local reform schools, Lyman and Shirley. The boys were interviewed repeatedly – once when they were around 14, then again when they were 25 and 32 – as were their teachers, parents, and neighbors. Their world – Whitey’s world – was carefully documented, and their lives were charted as they grew from adolescents into adults…

The original researchers didn’t publish all of their data. Several decades later, two criminologists, Laub and Sampson, dug into the data and interviewed some of the original participants. Here is what they found:

Their study earned Laub and Sampson accolades in their field for their insights into the nature of crime. But it also points to a few truths specifically about Boston, and the way the city shaped the Glueck boys while they grew into the Glueck men. It mattered a lot where these boys came from, Laub and Sampson concluded: The city had influenced them like no other city could have. Specifically, according to Sampson, it had made them cynical about authority.

All the poor neighborhoods in Boston were isolated to some degree in the 1940s: As Sampson and Laub discovered, kids who grew up in ethnic enclaves like Southie or the North End during that time did not identify with the city as a whole. Their lives were just too separate from everyone else’s, their daily routines too local. Plus, they knew the people who ran the show on Beacon Hill thought of their neighborhoods as slums, and they resented it.

This is an interesting piece because large, long-running studies like this one can offer a wealth of data and insights. It makes me wonder whether other large datasets would benefit from later teams of researchers combing through the data to explore different questions and conduct follow-ups.

This is the sort of information that would provide broader context for Bulger’s case, but I suspect the media will mainly stick to his mob background.

Scientists call for more rules and regulations about data

Many academics and researchers collect data on a wide variety of topics. Some scientists argue that we need more rules about data so that researchers can access and work with data collected by others:

In 10 new articles, also published in Science, researchers in fields as diverse as paleontology and neuroscience say the lack of data libraries, insufficient support from federal research agencies, and the lack of academic credit for sharing data sets have created a situation in which money is wasted and information that could reveal better cancer treatments or the causes of climate change goes by the wayside…

A big problem is the many forms of data and the difficulty of comparing them. In neuroscience, for instance, researchers collect data on scales of time that range from nanoseconds, if they are looking at rates of neuron firing, to years, if they are looking at developmental changes. There are also differences in the kind of data that come from optical microscopes and those that come from electron microscopes, and data on a cellular scale and data from a whole organism…

He added that he was limited by how data are published. “When I see a figure in a paper, it’s just the tip of the iceberg to me. I want to see it in a different form in order to do a different kind of analysis.” But the data are not available in a public, searchable format.

Shared data libraries sound like they could be useful. In my experience, however, even when data is made available, it still takes a good amount of time to download it, read the documentation, and reshape it so that one can start to replicate findings from journal articles.