An example of statistics in action: measuring faculty performance by the grades students receive in subsequent courses

Assessment, whether of student or faculty outcomes, is a great area in which to find examples of statistics. This example comes from a discussion of assessing faculty by looking at how students do in subsequent courses:

[A]lmost no colleges systematically analyze students’ performance across course sequences.

That may be a lost opportunity. If colleges looked carefully at students’ performance in (for example) Calculus II courses, some scholars say, they could harvest vital information about the Calculus I sections where the students were originally trained. Which Calculus I instructors are strongest? Which kinds of homework and classroom design are most effective? Are some professors inflating grades?

Analyzing subsequent-course preparedness “is going to give you a much, much more-reliable signal of quality than traditional course-evaluation forms,” says Bruce A. Weinberg, an associate professor of economics at Ohio State University who recently scrutinized more than 14,000 students’ performance across course sequences in his department.

Other scholars, however, contend that it is not so easy to play this game. In practice, they say, course-sequence data are almost impossible to analyze. Dozens of confounding variables can cloud the picture. If the best-prepared students in a Spanish II course come from the Spanish I section that met at 8 a.m., is that because that section had the best instructor, or is it because the kind of student who is willing to wake up at dawn is also the kind of student who is likely to be academically strong?

It sounds like the relevant grade data for this sort of analysis would not be difficult to obtain. The hard part is making sure the analysis includes all of the potentially relevant factors, the “confounding variables,” that could influence student performance.
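Here is a minimal sketch of the confounding problem, using simulated data rather than real grades. It assumes a hypothetical "ability" trait that makes a student both more likely to pick the 8 a.m. section and more likely to do well in the next course, while the instructor effect is deliberately set to zero. All names and numbers are made up for illustration.

```python
import random
import statistics

def simulate(n_students=10_000, seed=0):
    """Simulate next-course grades when section choice depends on ability."""
    rng = random.Random(seed)
    early, late = [], []
    for _ in range(n_students):
        ability = rng.gauss(0, 1)  # hypothetical unobserved trait
        # Assumption: stronger students are more likely to choose the 8 a.m. section.
        picks_early = rng.random() < (0.7 if ability > 0 else 0.3)
        # The grade depends on ability and noise only; the instructor adds nothing.
        grade = 75 + 8 * ability + rng.gauss(0, 5)
        (early if picks_early else late).append(grade)
    return statistics.mean(early), statistics.mean(late)

early_mean, late_mean = simulate()
gap = early_mean - late_mean
print(f"8 a.m. section: {early_mean:.1f}, later section: {late_mean:.1f}, gap: {gap:.1f}")
```

Even though both simulated instructors are identical, the 8 a.m. section comes out ahead, which is exactly the pattern that could be mistaken for better teaching.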

One way to limit these issues is to limit student choice regarding sections and instructors. Interestingly, this article cites studies done at the Air Force Academy, where students don’t have many options in the Calculus I-II sequence. In summary, this setting means “the Air Force Academy [is] a beautifully sterile environment for studying course sequences.”

Some interesting findings both from the Air Force Academy and Duke: students who were in introductory/earlier classes that they considered more difficult or stringent did better in subsequent courses.

What may look like a decent survey is lacking generalizability, military officer edition

The latest issue of The Atlantic has an interesting article discussing why a number of US military officers are leaving the military. The argument: the military is too bureaucratic and doesn’t practice meritocracy, so the brightest and most entrepreneurial officers leave for other fields.

All of this is interesting but I was struck by the data used for the article. Here is how the author describes the surveys he conducted and draws conclusions from:

In a recent survey I conducted of 250 West Point graduates (sent to the classes of 1989, 1991, 1995, 2000, 2001, and 2004), an astonishing 93 percent believed that half or more of “the best officers leave the military early rather than serving a full career.” By design, I left the definitions of best and early up to the respondents. I conducted the survey from late August to mid-September, reaching graduates through their class scribes (who manage e-mail lists for periodic newsletters). This ensured that the sample included veterans as well as active-duty officers. Among active-duty respondents, 82 percent believed that half or more of the best are leaving. Only 30 percent of the full panel agreed that the military personnel system “does a good job promoting the right officers to General,” and a mere 7 percent agreed that it “does a good job retaining the best leaders.”

This sort of paragraph is very helpful and is toward the front of the story. And the numbers look overwhelming, particularly the first cited figure about 93% believing the best officers leave early.

But there is an issue here: the generalizability of this data. The article suggests surveys were conducted with 250 officers spread across six graduating classes (presumably to help control for time effects). But does this represent West Point graduates on the whole? Does this even represent each graduating class? If one looks at the class page for the graduating class of 2004, there were almost 1,200 entering students. Even if a decent number leave before graduating, this is a lot more than the 40 or so that would have been surveyed if we had equal representation out of the six graduating classes (250 total surveys divided by six graduating classes).

This does not necessarily mean that these survey results and their interpretation are wrong. But it should cast doubt: does this survey really speak for all West Point graduates or, even more broadly, military officers as a whole? While conducting some sort of survey is better than simply working with anecdotes one hears from officers and veterans, this survey could still be improved so that the results could be generalized to all officers. We need a larger N of officers to survey in order to have results that we could really trust.
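To make the arithmetic concrete, here is a rough sketch. The per-class figure simply divides 250 responses evenly across six classes, which is an assumption since the article does not give a breakdown. The margin of error uses the textbook normal-approximation formula, which assumes a simple random sample; a survey passed along through class e-mail lists is not one, so the formula says nothing about whether respondents resemble the graduates who did not respond.

```python
import math

total_responses = 250
classes = 6
per_class = total_responses / classes        # assumes an even split across classes
entering_2004 = 1200                         # "almost 1,200" entering students; approximate
share_of_class = per_class / entering_2004   # rough upper bound on coverage of one class

def margin_of_error(p, n, z=1.96):
    """95% normal-approximation margin of error for a proportion (simple random sample)."""
    return z * math.sqrt(p * (1 - p) / n)

moe_full = margin_of_error(0.93, total_responses)
moe_class = margin_of_error(0.93, per_class)

print(f"Responses per class if evenly split: {per_class:.1f}")
print(f"Share of a ~1,200-person class: {share_of_class:.1%}")
print(f"Margin of error, all 250 responses: +/- {moe_full:.1%}")
print(f"Margin of error, one class alone:   +/- {moe_class:.1%}")
```

The margin roughly doubles when looking at a single class, and that is before asking who chose to answer in the first place.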

Limited American meritocracy and the importance of a college education

A foundational cultural value in America is that residents should have equal opportunities and that if people work hard and grasp these opportunities, they will be able to get ahead. But academics have suggested for decades that while this might sound good, real chances to move up the social ladder are more limited. Some recent data suggests this is indeed the case: compared to other industrialized nations, being born into a poor American family is more limiting.

Among children born into low-income households, more than two-thirds grow up to earn a below-average income, and only 6% make it all the way up the ladder into the affluent top one-fifth of income earners, according to a study by economists at Washington’s Brookings Institution.

We think of America as a land of opportunity, but other countries appear to offer more upward mobility. Children born into poverty in Canada, Britain, Germany or France have a statistically better chance of reaching the top than poor kids do in the United States.

What’s gone wrong? Thanks to globalization, the economy is producing high-income jobs for the educated and low-income jobs for the uneducated — but few middle-income jobs for workers with high school diplomas…And Harvard sociologist Robert Putnam argues that thanks partly to the rise of two-income households, intermarriage between rich and poor has declined, choking off another historical upward path for the underprivileged.

“We’re becoming two societies, two Americas,” Putnam told me recently. “There’s a deepening class divide that shows up in many places. It’s not just a matter of income. Education is becoming the key discriminant in American life. Family structure is part of it too.”

Increasingly, college-educated Americans live in a different country from those who never made it out of high school.

This article only mentions a small bit of data and it would be interesting to see the mobility rates for all Americans.

But these findings present Americans with a contradiction: we talk about social mobility but reality is a lot harsher. What often happens is that certain cases of people who “made it” are trumpeted and held up as examples when really those people were exceptions rather than the rule.

Malcolm Gladwell’s book Outliers lays this out in a simple way: those born into more privileged positions accumulate advantages over time. One of these advantages in America today is a college education. For many in the middle and upper classes, college is a foregone conclusion: a young child is expected to accomplish this goal. But to get to this point, middle and upper class children have more financial resources, better schools, better health and nutrition, parental support (“concerted cultivation”), and more.

This gap between the college educated and those with less than a college education is an important one to watch in the coming decades.

Google offers tool to analyze texts going back to the 1500s

Among the other projects Google has been working on, the company recently opened a new online tool that allows users to search for certain words in texts going back to the 1500s:

With little fanfare, Google has made a mammoth database culled from nearly 5.2 million digitized books available to the public for free downloads and online searches, opening a new landscape of possibilities for research and education in the humanities.

The digital storehouse, which comprises words and short phrases as well as a year-by-year count of how often they appear, represents the first time a data set of this magnitude and searching tools are at the disposal of Ph.D.’s, middle school students and anyone else who likes to spend time in front of a small screen. It consists of the 500 billion words contained in books published between 1500 and 2008 in English, French, Spanish, German, Chinese and Russian…

“The goal is to give an 8-year-old the ability to browse cultural trends throughout history, as recorded in books,” said Erez Lieberman Aiden, a junior fellow at the Society of Fellows at Harvard…

“We wanted to show what becomes possible when you apply very high-turbo data analysis to questions in the humanities,” said Mr. Lieberman Aiden, whose expertise is in applied mathematics and genomics. He called the method “culturomics.”

The article mentions some interesting-sounding projects that use this database. And it sounds like the dataset can be downloaded and analyzed by users on their own computers.

But thinking about the methodology of this all, I would have some questions.

1. Do we know how well these digitized texts represent the full population of texts? This is a sampling issue – could there be some sort of bias in what kind of texts ended up in this database?

2. Studying word frequency by itself is tricky. Simply counting words and when they appear is one measurement, while trying to assess the importance placed on each word is another task. Do the three little “culturomics” graphs on the left side of the online story really tell us much?

3. It sounds like this would be best for looking at how language (grammar, word choices, structure, etc.) has changed over time.
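As a rough illustration of the second question, here is a sketch of what counting a word year by year involves. It does not use Google's actual data files or any Google tool; the tiny `corpus` below is invented and the function name is hypothetical. Dividing by the total number of words per year matters because the amount of digitized text differs from year to year.

```python
from collections import Counter

# Hypothetical toy corpus of (publication year, text) pairs. Not real data.
corpus = [
    (1900, "the war ended and the peace began"),
    (1900, "peace in our time"),
    (1950, "the war the war the long war"),
]

def relative_frequency(corpus, word):
    """Return {year: share of all words published that year that equal `word`}."""
    hits = Counter()
    totals = Counter()
    for year, text in corpus:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(word)
    return {year: hits[year] / totals[year] for year in sorted(totals)}

freq = relative_frequency(corpus, "war")
for year, share in freq.items():
    print(year, f"{share:.3f}")
```

A count like this tells us how often a word shows up in the sampled texts, not how much weight writers or readers placed on it.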

WikiLeaks cables as historical documents

How should the WikiLeaks cables be viewed as historical documents? One historian suggests caution:

In the short term, this is a potential gold mine for foreign-affairs scholarship. In the long term, however, what WikiLeaks wants to call “Cablegate” will very likely make life far more difficult for my profession.

For now, things certainly look very sweet. Timothy Garton Ash characterized the documents as “the historian’s dream.” Jon Western, a visiting professor of international relations at the University of Massachusetts at Amherst, blogged that WikiLeaks may allow scholars to “leapfrog” the traditional process of declassification, which takes decades. While the first wave of news reports focused on the more titillating disclosures (see: Col. Muammar el-Qaddafi’s Ukrainian nurse), the second wave has highlighted substantive and trenchant aspects of world politics and American foreign policy. The published memos reveal provocative Chinese perspectives on the future of the Korean peninsula, as well as American policy makers’ pessimistic perceptions of the Russian state.

Scholars will need to exercise care in putting the WikiLeaks documents in proper perspective. Some researchers suffer from “document fetishism,” the belief that if something appears in an official, classified document, then it must be true. Sophisticated observers are well aware, however, that these cables offer only a partial picture of foreign-policy decision-making. Remember, with Cablegate, WikiLeaks has published cables and memos only from the State Department. Last I checked, other bureaucracies—the National Security Council, the Defense Department—also shape U.S. foreign policy. The WikiLeaks cables are a source—they should not be the sole source for anything.

Seems like a reasonable argument to me. Much research, history included, involves collecting a variety of evidence from a variety of sources. Claiming that these cables represent THE view of the United States is naive. They do reveal something, particularly about how diplomatic cables and reports work, but not everything. How much one can generalize based on these cables is unclear.

As this article points out, how these cables have been portrayed in the media is interesting. Where are the historians and other scholars to put these cables in perspective?

If you want to model the world, look into these online databases

MIT’s Technology Review lists 70 online databases that one could look into in order to model our complex world.

Having used several of the social sciences databases listed here, I am impressed with several features of such databases:

1. The variety of data one can quickly find. (There is a lot of data being collected in the world today.)

2. The openness of this data to users rather than being restricted just to the people who collected the data.

3. The growing ability to do a quick analysis within the database websites.

To me, this is one of the primary functions of the Internet: making good data (information) on all sorts of subjects available to a wide cross-section of users.

Now, with all of this data out there and available, can we do complex modeling of all of social life or natural life or Earthly life? Helbing’s Earth Simulator, mentioned in this story, sounds interesting…

The methodology behind Money’s 2010 best places to live

Every year, Money magazine publishes a list of “the best places to live.” I’ve always enjoyed this list as it attempts to identify which communities best match what people desire in a community. The winner in 2010 (in the August issue) was Eden Prairie, Minnesota.

But one issue with this list is how the communities are selected. In 2009, the list was about small towns, communities with between 8,500 and 50,000 residents. In 2010, the list was restricted to “small cities,” places with 50,000 to 300,000 residents. Here is how the magazine selected its 2010 list of communities to grade and rank:

746
Start with all U.S. cities with a population of 50,000 to 300,000.

555
Exclude places where the median family income is more than 200% or less than 85% of the state median and those more than 95% white.

322
Screen out retirement communities, towns with significant job loss, and those with poor education and crime scores. Rank remaining places based on housing affordability, school quality, arts and leisure, safety, health care, diversity, and several ease-of-living criteria.

100
Factor in additional data on the economy (including fiscal strength of the government), jobs, housing, and schools. Weight economic factors most heavily.

30
Visit towns and interview residents, assessing traffic, parks, and gathering places and considering intangibles like community spirit.

1
Select the winner based on the data and reporting.
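The funnel above can be summarized with a little arithmetic. This sketch uses only the six counts the magazine lists and assumes each number is the count of communities still under consideration at that step.

```python
# Communities remaining at each step, in the order the magazine lists them.
counts = [746, 555, 322, 100, 30, 1]

def retention(counts):
    """For each step after the first, return (count, share of previous step, share of start)."""
    rows = []
    for prev, cur in zip(counts, counts[1:]):
        rows.append((cur, cur / prev, cur / counts[0]))
    return rows

rows = retention(counts)
for cur, step_share, overall_share in rows:
    print(f"{cur:>4} left: {step_share:6.1%} of previous step, {overall_share:6.1%} of all {counts[0]}")
```

Seen this way, the income and racial-composition screen alone removes about a quarter of the eligible cities before any quality-of-life scoring happens.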

A couple of questions I have:

1. I agree that it can be hard to compare communities with 10,000 people and 150,000 people. But can the list from each year be called “the best place to live” if the communities of interest change?

2. I wonder how they chose the median income cutoffs. These cut out places that might be “too exclusive” or “not exclusive enough.” Are these places not desirable to people?

3. Some measure of racial homogeneity is included in several steps. How many home buyers desire this? We know from a lot of research that whites tend to avoid neighborhoods with even moderate levels of African-Americans.

4. Weighting economic factors heavily seems to make sense. Jobs and economic opportunities are a good enticement for moving.

5. I would be interested to see what kind of information they collected on their 30 community visits. How many residents and leaders did they talk to? How does one measure “community spirit”? If a community says it has “community spirit,” how exactly do you check to see whether that is correct?

Overall, this is a complicated methodology that accounts for a number of factors. What I would like to know is how this list compares with how Americans make decisions about where to live. Do people want to move up to places like this and then stay there or is the dream for many to move on to more exclusive communities (if possible)? How many Americans could realistically afford to or possibly move into these communities?

(A side note: the four Chicago suburbs in the top 100 for 2010 were Bolingbrook at #43, Naperville at #54, Mount Prospect at #56, and Arlington Heights at #59. Naperville used to rank much higher earlier in the 21st century – I wonder why it has slipped in the rankings.)

New study on American church attendance: a 10-18 percent gap between what people say versus what they actually do

The United States is consistently cited as a religious nation. The contrast is often drawn with a number of European nations where church attendance is usually said to be significantly lower than the American rate of about 40-45% of Americans attending on a regular basis. These figures have driven several generations of sociologists to debate the secularization thesis and why the American religious landscape is different.

But what if Americans overstate their church attendance on surveys and, in reality, attend church at a rate similar to that of European nations? A new study based on time diary data suggests this is the case:

While conventional survey data show high and stable American church attendance rates of about 35 to 45 percent, the time diary data over the past decade reveal attendance rates of just 24 to 25 percent — a figure in line with a number of European countries.

America maintains a gap of 10 to 18 percentage points between what people say they do on survey questions, and what time diary data says they actually do, Brenner reports. The gaps in Canada resemble those in America, and in both countries, gaps are both statistically and substantively significant…

“The consistency and magnitude of the American gap in light of the multiple sources of conventional survey data suggests a substantive difference between North America and Europe in overreporting.”

Given these findings, Brenner notes, any discussion of exceptional American religious practice should be cautious in using terms like outlier and in characterizing American self-reported attendance rates from conventional surveys as accurate reports of behavior. Rather, while still relatively high, American attendance looks more similar to a number of countries in Europe, after accounting for over-reporting.

A couple of thoughts about this:

1. This is another example where the research method used to collect data matters. Ask people about something on a survey and then compare that data to what people report in a time diary and it is not unusual to get differing responses. What exactly is going on here? Surveys ask people to consult their memory, a notoriously faulty source of information. Diaries have their own issues but supposedly are better at capturing accurate information about daily or regular practices.

2. Even if church attendance data is skewed in the US, it doesn’t necessarily mean that America is not exceptional in terms of religion. Religiosity is made up of a number of factors including doctrinal beliefs, importance of religion in everyday life, membership in a religious congregation, the prevalence of other religious practices, and more. Church attendance is a common measure of religiosity but not the only one.

3. This is interesting data but it leads to another interesting question: why exactly would Americans overestimate their church attendance by this much? Since the time diary data from Europe showed a smaller gap, it suggests that Americans think they have something to gain by overestimating their church attendance. Perhaps Americans think they should say they attend church more – there is still social value and status attached to the idea that one attends church.

A reminder that information overload is not just limited to our particular era in history

There is an incredible amount of data one can access today through a computer and high-speed Internet connection: websites, texts, statistics, videos, music, and more. While it all may seem overwhelming, a Harvard history professor reminds us that facing a glut of information is not a problem that has been faced only by people in the Internet age:

information overload was experienced long before the appearance of today’s digital gadgets. Complaints about “too many books” echo across the centuries, from when books were papyrus rolls, parchment manuscripts, or hand printed. The complaint is also common in other cultural traditions, like the Chinese, built on textual accumulation around a canon of classics…

It’s important to remember that information overload is not unique to our time, lest we fall into doomsaying. At the same time, we need to proceed carefully in the transition to electronic media, lest we lose crucial methods of working that rely on and foster thoughtful decision making. Like generations before us, we need all the tools for gathering and assessing information that we can muster—some inherited from the past, others new to the present. Many of our technologies will no doubt rapidly seem obsolete, but, we can hope, not human attention and judgment, which should continue to be the central components of thoughtful information management.

As technology changes, people and cultures have to adapt. We need citizens who are able to sift through all the available information and make wise decisions. This should be a vital part of the educational system – it is no longer enough to know how to access information but rather we need to be able to make choices about which information is worthwhile, how to interpret it, and how to put it into use.

Take, for example, the latest WikiLeaks dump. The average Internet user no longer has to rely on news organizations to tell him or her how to interpret the information (though they would still like to fill that role). But simply having access to a bunch of secret material doesn’t necessarily lead to anything worthwhile.

Measuring the popularity of tiny houses

I enjoy looking at pictures of tiny houses, those abodes with around 100 square feet. Perhaps it has something to do with my interest in home designs or my liking of cozy places or thinking about how Americans are finding alternatives to buying large homes.

But it is difficult to get a handle on exactly how many people like these houses or actually decide to buy them. One thing is sure: it is a small number of people. But this story suggests the number of people interested is on the rise:

Tumbleweed’s business has grown significantly since the housing crisis began, Shafer said. He now sells about 50 blueprints, which cost $400 to $1,000 each, a year, up from 10 five years ago. The eight workshops he teaches around the country each year attract 40 participants on average, he said…

Since the housing crisis and recession began, interest in tiny homes has grown dramatically among young people and retiring Baby Boomers, said Kent Griswold, who runs the Tiny House Blog, which attracts 5,000 to 7,000 visitors a day…

Gregory Johnson, who co-founded the Small House Society with Shafer, said the online community now has about 1,800 subscribers, up from about 300 five years ago. Most of them live in their small houses full-time and swap tips on living simple and small.

Johnson, 46, who works as a computer consultant at the University of Iowa, said dozens of companies specializing in small houses have popped up around the country over the past few years…

He said his small houses, which sell for $20,000 to $50,000, are much cheaper than building a home addition and can be resold when the extra space is no longer needed. His company has sold 16 houses this year and aims to sell 20 next year.

These numbers are small – and anecdotal. Even with this rise in popularity, there are still few people interested in selling or buying tiny houses. Are there enough people here to declare that there is a “tiny house movement”? Why not include figures about how many people have joined Facebook groups having to do with tiny houses?
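The growth figures quoted in the story can be put on a common footing. This sketch computes the growth multiple and the implied average annual growth rate, assuming smooth compounding over exactly five years (the story only says "five years ago"), so the rates are approximations.

```python
def growth(start, end, years):
    """Return (growth multiple, implied average annual growth rate)."""
    multiple = end / start
    annual_rate = multiple ** (1 / years) - 1
    return multiple, annual_rate

# Figures as quoted in the story; the five-year span is taken at face value.
blueprints = growth(10, 50, 5)        # blueprints sold per year
subscribers = growth(300, 1800, 5)    # Small House Society subscribers

for label, (multiple, rate) in [("Blueprints", blueprints), ("Subscribers", subscribers)]:
    print(f"{label}: {multiple:.0f}x over five years, roughly {rate:.0%} per year")
```

Rapid percentage growth from a very small base still leaves a very small number: 50 blueprints a year and about 1,800 subscribers.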

While the popularity of these homes might indicate that more Americans are interested in downsizing, the better figure to look at is the average size of the new American single-family home. According to national data, this figure dropped this year, which suggests that houses across the country are becoming slightly smaller (or at least that the trend of always getting bigger is reversing).