The lack of variation in ordinal scales: color-coded terror alerts plus employee surveys and ratings

In the last few days, I ran into two stories that are related in an unusual way: both concern a lack of variation in an ordinal scale. Let’s start with the announcement from the Department of Homeland Security regarding the color-coded terror alert scale:

A government review determined that the five-tiered color-coded system instituted in 2002 had suffered from a lack of credibility and eroded public confidence. The color has not been changed since 2006 and has never gone below yellow, or “elevated,” risk. Setting the risk level to green, or “low,” was never even considered.

In the long run, the problem was that the scale didn’t change. Theoretically, there were five options, but in practice the alert stayed in nearly the same place. Since the alert was always “elevated” or above, the scale conveyed little about whether the risk had actually changed. (This also seems related to the argument some have made that a multi-decade “war on drugs” or “war on poverty” doesn’t make much sense because wars are supposed to have an end. Always being at war or on alert for terror erodes the sense of urgency.)

I also came across a human resources website that recommended businesses avoid five-point scales for certain questions asked of employees:

A typical ranking, called a Likert scale, runs from Strongly Agree to Strongly Disagree. And it’s fine for many psychological and sociological surveys. When you’re asking for ratings from 1,000 random people, you’ll get a wide variety of answers.

“But inside an organization, a 5-point scale loses its effectiveness,” Murphy says. “If you ask a group of employees at Acme Inc. to rate the statement, ‘Acme is a good place to work,’ you’re not going to get very many low responses (i.e., 1s and 2s). That’s because if you truly thought Acme was an awful place to work, you probably would have quit already.”…

But as with employee surveys, we don’t think 5-point scales are effective for performance evaluations. Many HR pros tell managers that only a very small percentage of their subordinates, say 10 percent, can be awarded the highest rating. And, managers are understandably reluctant to rate anyone as unsatisfactory—even when that’s the rating he or she deserves.

This is not just a hypothetical situation: I remember reading recently about the extremely high percentage of teachers in a large district who were given satisfactory or higher ratings. (One group suggests that 91% of Chicago teachers in 2007-2008 were rated “superior or excellent”.) If the ratings are actually measuring performance, it is difficult to believe that so many teachers truly perform at that level.

The lesson from these two cases? If you use an ordinal scale, make sure the responses will actually vary. Otherwise, the scale tells you very little.
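For anyone who wants to check this on their own data, here is a rough sketch in Python of one way to see how much of a scale is actually being used. The function name `scale_usage` and the ratings below are hypothetical, made up purely for illustration; they are not drawn from either story above.

```python
from collections import Counter


def scale_usage(responses, levels):
    """Summarize how much of an ordinal scale is actually used.

    responses: list of observed ratings
    levels: list of all possible ratings on the scale, in order
    """
    if not responses:
        raise ValueError("responses must not be empty")

    counts = Counter(responses)

    # Levels of the scale that received at least one response.
    used = [lvl for lvl in levels if counts[lvl] > 0]

    # The most common level and the share of responses it accounts for.
    modal_level = max(levels, key=lambda lvl: counts[lvl])
    modal_share = counts[modal_level] / len(responses)

    return {
        "levels_available": len(levels),
        "levels_used": len(used),
        "modal_level": modal_level,
        "modal_share": modal_share,
    }


# Hypothetical ratings on a 1-5 scale (invented for this example).
levels = [1, 2, 3, 4, 5]
ratings = [4, 4, 5, 4, 5, 4, 4, 5, 4, 4]

summary = scale_usage(ratings, levels)
print(summary)
```

If only one or two of the five levels ever show up, or a single level accounts for most of the responses, the scale is probably not distinguishing much. How high a share counts as "too high" is a judgment call that this sketch does not settle.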