Using a GRIM method to find unlikely published results

Discovering which published studies may be incorrect or fraudulent takes some work, and here is a newer tool for the job: GRIM.

GRIM is the acronym for Granularity-Related Inconsistency of Means, a mathematical method that determines whether an average reported in a scientific paper is consistent with the reported sample size and number of items. Here’s a less-technical answer: GRIM is a B.S. detector. The method is based on the simple insight that only certain averages are possible given certain sets of numbers. So if a researcher reports an average that isn’t possible, given the relevant data, then that researcher either (a) made a mistake or (b) is making things up.
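To make the granularity insight concrete, here is a minimal sketch of the check in Python. This is my own illustration, not Brown and Heathers’ actual code, and the function name grim_test and its interface are assumptions. With n integer-scored responses, every achievable mean is a whole-number sum divided by n, so a reported mean that no whole-number sum can reproduce at the published precision gets flagged:

```python
import math

def grim_test(reported_mean: str, n: int, items: int = 1) -> bool:
    """Return True if the reported mean is achievable given the sample
    size, False if it is mathematically impossible (a GRIM flag).

    reported_mean: the mean exactly as published, as a string ("3.27"),
                   so the number of reported decimal places is preserved
    n:             the reported sample size
    items:         number of integer-scored items averaged together
    """
    decimals = len(reported_mean.partition(".")[2])
    mean = float(reported_mean)
    grains = n * items  # every possible mean is (integer sum) / grains

    # The two integer sums whose means bracket the reported value are the
    # only candidates; if neither rounds back to the published figure, no
    # set of integer responses could have produced that mean.
    for total in (math.floor(mean * grains), math.ceil(mean * grains)):
        if f"{total / grains:.{decimals}f}" == reported_mean:
            return True
    return False
```

For example, with 21 people answering a single integer-scored question, grim_test("3.27", 21) returns False (the nearest achievable means are 68/21 ≈ 3.24 and 69/21 ≈ 3.29), while grim_test("3.29", 21) returns True. A fuller implementation would also account for the different rounding conventions papers use (round-half-up versus Python’s round-half-even).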

GRIM is the brainchild of Nick Brown and James Heathers, who published a paper last year in Social Psychological and Personality Science explaining the method. Using GRIM, they examined 260 psychology papers that appeared in well-regarded journals and found that, of the ones that provided enough data to check, half contained at least one mathematical inconsistency. One in five had multiple inconsistencies. The majority of those, Brown points out, are “honest errors or slightly sloppy reporting.”…

After spotting the Wansink post, Anaya took the numbers in the papers and — to coin a verb — GRIMMED them. The program found that the four papers based on the Italian buffet data were shot through with impossible math. If GRIM were an actual machine, rather than a humble piece of code, its alarms would have been blaring. “This lights up like a Christmas tree,” Brown said after highlighting on his computer screen the errors Anaya had identified…

Anaya, along with Brown and Tim van der Zee, a graduate student at Leiden University in the Netherlands, wrote a paper pointing out the 150 or so GRIM inconsistencies in those four Italian-restaurant papers that Wansink co-authored. They found discrepancies between the papers, even though they’re obviously drawn from the same dataset, and discrepancies within the individual papers. It didn’t look good. They drafted the paper using Twitter direct messages and titled it, memorably, “Statistical heartburn: An attempt to digest four pizza publications from the Cornell Food and Brand Lab.”

I wonder how long it will be before journals employ such methods on submitted manuscripts. Imagine Turnitin for academic studies. Then what would happen to authors if problems were found?

It also sounds like such a program could make it easy to do mass analysis of published studies, helping to answer questions such as how many findings are fraudulent.
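As a rough illustration of that kind of mass screening, one could run the grim_test sketch above over a table of reported means and sample sizes pulled from papers. The (mean, N) pairs below are invented placeholders, not real published values:

```python
# Hypothetical batch screen built on the grim_test sketch above.
# Every entry here is a made-up placeholder, not a real study.
reported_stats = [
    ("Study A", "3.27", 21),
    ("Study B", "4.50", 28),
    ("Study C", "2.66", 45),
]

for label, mean, n in reported_stats:
    verdict = "consistent" if grim_test(mean, n) else "impossible mean"
    print(f"{label}: M={mean}, N={n} -> {verdict}")
```

Of course, a tally of GRIM flags would only show inconsistency, not fraud; as Brown notes above, most flagged results turn out to be honest errors or sloppy reporting.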

Perhaps it is too easy to ask whether GRIM itself has been vetted by outside researchers…

Chicago innovation #14: Consumer preference research

A cousin of social science research, consumer preference research, got its start in Chicago:

It was 1928. Benton was working at Chicago’s Lord & Thomas advertising agency when owner Albert Lasker told him to land Colgate-Palmolive by impressing the outsized toiletry powerhouse with market research. Benton worked night and day for two months to record housewives’ preferences for the products of each company.

The firm used the pioneering survey in its initial Colgate-Palmolive campaign and landed the account before the survey was completed.

This drew criticism from an early sociologist:

Sociologist Gordon Hancock hated the idea. It was tantamount to cheating.

In a statement that must have brought grins to the faces of that up-and-coming generation of ad men, Hancock decried in 1926: “Excessive scientific advertising takes undue advantage of the public.”

This was, of course, the point.

This tension between marketing and sociology still exists today. The two areas use similar methods of collecting data, such as surveys, focus groups, interviews, and ethnographies or participant observation. However, they have very different purposes: marketing aims to sell products, while sociology tries to uncover how social life works. The tension reminds me of Morgan Spurlock’s documentary The Greatest Movie Ever Sold, which questions the marketing complex.

This comes in at #14 on a list of the top 20 innovations from Chicago; I highlighted the #5 innovation, balloon frame housing, in an earlier post.


More on limits of Census measures of race and ethnicity

Here is some more information about the limitations of measuring race with the current questions in the United States Census:

When the 2010 census asked people to classify themselves by race, more than 21.7 million — at least 1 in 14 — went beyond the standard labels and wrote in such terms as “Arab,” “Haitian,” “Mexican” and “multiracial.”

The unpublished data, the broadest tally to date of such write-in responses, are a sign of a diversifying America that’s wrestling with changing notions of race…

“It’s a continual problem to measure such a personal concept using a check box,” said Carolyn Liebler, a sociology professor at the University of Minnesota who specializes in demography, identity and race. “The world is changing, and more people today feel free to identify themselves however they want — whether it’s black-white, biracial, Scottish-Nigerian or American. It can create challenges whenever a set of people feel the boxes don’t fit them.”

In an interview, Census Bureau officials said they have been looking at ways to improve responses to the race question based on focus group discussions during the 2010 census. The research, some of which is scheduled to be released later this year, examines whether to include new write-in lines for whites and blacks who wish to specify ancestry or nationality; whether to drop use of the word “Negro” from the census form as antiquated; and whether to treat Hispanics as a group mutually exclusive of the four main race categories.

This highlights some of the recurring issues in social science research:

1. Social science categories change as people’s own understanding of the terms changes. Keeping up with these understandings can be difficult, and there is always a lag. For example, a sizable group of respondents in the 2010 Census didn’t like the categories, but the problem can’t be fixed until a future Census.

2. Adding write-in options or more questions means that the Census becomes longer, requiring more time to take and analyze. With all of the Census forms that are returned, this is no small matter.

3. Comparing results of repeated surveys like the Census can become quite difficult when the definitions change.

4. The Census Bureau is going to change things based on focus groups? I assume officials will also test permutations of the questions and possible categories in smaller-scale surveys before settling on what to do.

Ethics and social science: grad student gets 6-month sentence for studying animal rights groups

This is an update of a story I have been tracking for a while: a sociology graduate student who had studied animal rights groups was sentenced to six months in jail. Here is a brief summary of where the case now stands:

Scott DeMuth, a sociology graduate student at the University of Minnesota, was sentenced yesterday to 6 months in federal prison for his role in a 2006 raid on a Minnesota ferret farm. A judge in Davenport, Iowa, ordered that DeMuth be taken into custody immediately.

In 2009, DeMuth was charged with felony conspiracy in connection with a separate incident, a 2004 lab break-in at the University of Iowa that caused more than $400,000 in damage. DeMuth argued that anything he might know about the Iowa incident had been collected as part of his research on radical activist groups and was therefore protected by confidentiality agreements with his research subjects. A petition started by DeMuth’s graduate advisor, David Pellow, argued that the charges violated DeMuth’s academic freedom.

Last year, prosecutors offered to drop all charges related to the Iowa break-in if DeMuth would plead guilty to a lesser misdemeanor charge related to the ferret farm incident. DeMuth took the deal. No one has been convicted in the Iowa break-in.

This has been an interesting case to introduce when teaching ethics to sociology and anthropology majors in a research class. Just how far should participant observation go? Couple this with another story, like Venkatesh knowing about possible crimes in Gang Leader for a Day, and a good conversation typically ensues.

However, this case does bring up some larger questions about how protected researchers and their subjects should be when carrying out their research. Should researchers have shield laws? How exactly do courts define “academic freedom” in cases like this?

The trolley problem, race, and making decisions

The trolley problem is a classic vignette used in research studies: it asks under what conditions it is permissible to sacrifice one life for the lives of others (see an explanation here). Psychologist David Pizarro tweaked the trolley problem to include racial dimensions by using characters named Chip and Tyrone. Pizarro found that people’s opinions about race influenced which character they were more willing to sacrifice:

What did this say about people’s morals? Not that they don’t have any. It suggests that they had more than one set of morals, one more consequentialist than another, and chose the one that fit the situation…

Or as Pizarro told me on the phone, “The idea is not that people are or are not utilitarian; it’s that they will cite being utilitarian when it behooves them. People aren’t using these principles and then applying them. They arrive at a judgment and seek a principle.”

So we’ll tell a child on one day, as Pizarro’s parents told him, that ends should never justify means, then explain the next day that while it was horrible to bomb Hiroshima, it was morally acceptable because it shortened the war. We act — and then cite whichever moral system fits best, the relative or the absolute.

Some interesting findings from a different take on a classic research tool. It is always a worthwhile question to ask regarding many social issues: when does the end justify the means and when does it not?