The cover story in today’s Chicago Tribune on Chicago’s status as a global city includes a nice graphic showing how Chicago matches up on a variety of dimensions.
This sort of multi-dimensional graphic is becoming more common. Its biggest advantage is that it can display a lot of information at once: this graphic shows 10 different aspects of being a global city. It is also relatively easy to compare the ranks of cities, if you know what you are looking at: the further out a city extends on a dimension, and the more area it covers on the graphic, the higher its ranking. Of course, the biggest downside is that it takes a little bit of time to figure out how to read it. Is a city better if it is closer to the middle or further away on each dimension? (It is better to be further out; higher-ranking cities cover more area.) It can also be a lot to take in at once.
Including the seven comparison cities at the bottom, with Chicago’s mass overlaid on each diagram, is a nice addition. Just having Chicago’s rankings graphed would provide some information but do so without any context.
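For readers who want to see the "further out is better" rule spelled out, here is a minimal sketch in Python. It assumes a radar-style chart with equally spaced axes and a simple linear mapping from rank to distance from the center; the Tribune's actual scaling is not described, so both the mapping and the example ranks below are hypothetical.

```python
import math

def rank_to_radius(rank, n_ranked):
    """Map a rank (1 = best) to a radius in (0, 1].

    Assumed linear mapping: the best rank plots at the outer edge,
    the worst rank plots closest to the center.
    """
    return (n_ranked - rank + 1) / n_ranked

def radar_area(radii):
    """Area of the polygon traced on a radar chart with equally spaced axes.

    Each pair of neighboring axes forms a triangle with the center,
    and the polygon's area is the sum of those triangles.
    """
    k = len(radii)
    wedge = math.sin(2 * math.pi / k)
    return 0.5 * wedge * sum(radii[i] * radii[(i + 1) % k] for i in range(k))

# Hypothetical ranks (out of 10 cities) on 10 dimensions; not the Tribune's data.
city_a_ranks = [1, 2, 1, 3, 2, 1, 4, 2, 1, 3]
city_b_ranks = [6, 7, 5, 8, 6, 9, 7, 6, 8, 7]

area_a = radar_area([rank_to_radius(r, 10) for r in city_a_ranks])
area_b = radar_area([rank_to_radius(r, 10) for r in city_b_ranks])
print(f"City A area: {area_a:.3f}, City B area: {area_b:.3f}")
```

One caution about reading area this way: the area of the shape depends on the order in which the dimensions are arranged around the circle, so comparing individual spokes is safer than comparing overall size.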
Some graphs can be more difficult to interpret, particularly if the categories along one of the axes are not a consistent width. Here is an example of misreading a chart of income in the United States:
“When I was growing up in Canada,” says Jon Evans of Techcrunch, “I was taught that income distribution should and did look like a bell curve, with the middle class being the bulge in the middle. Oh, how naïve my teachers were. This is how income distribution looks in America today.”
“That big bulge up above? It’s moving up and to the left. America is well on the way towards having a small, highly skilled and/or highly fortunate elite, with lucrative jobs; a vast underclass with casual, occasional, minimum-wage service work, if they’re lucky; and very little in between.”…
Er, no. Look closely at those last two brackets. Now look at the brackets immediately to the right of them? What do you notice?
Probably, you notice the same thing that immediately struck me: the last two brackets cover a much, much wider income band than the rest of the brackets on the graph.
Each bar on that graph represents a $5,000 income band: Under $5,000, $5000 to $9,999, and so forth. Except for the last two. The penultimate band is $200,000 to $250,000, which is ten times as wide as the previous band. And the last bar represents all incomes over $250,000–a group that runs from some law associate who pulled down $251,000 last year, through A-Rod’s $27 million annual salary, all the way to some Silicon Valley superstar who just cashed out the company for a one time windfall of hundreds of millions of dollars. Unsurprisingly, much wider bands have more people in them than they would if you kept on extrapolating out in $5,000 increments…
To put it another way, the apparent clustering of income along the rich right tail of the distribution is just an artifact of the way that the Census presents the data. If they kept running through $5,000 brackets all the way out to A-Rod, the spreadsheet would be about a mile long, and there would only be a handful of people in each bracket. So at the high end, where there are few households, they summarize.
The Census likely has good reasons for reporting these higher-income categories in such a way. First, because there are relatively few people in each $5,000 increment at the high end, they are trying not to make the graph too wide. Second, I believe the Census topcodes income, meaning that incomes above a certain dollar threshold are recorded at that threshold rather than at their actual value. This is done to help protect the identity of respondents who might otherwise be easy to pick out of the data.
Still, this is a classic misinterpretation of a graph. As McArdle notes, this is a long-tail graph with very few people at the top end. The graph tries to alert readers to this by also marking some of the notable percentiles; above the $130,000 to $134,999 category, it reads “The top 10 percent reported incomes above $135,000” and above the top two categories, it reads, “approximately 4 percent of households.” Making the right interpretation depends not just on the relative shape of the graph, bell curve or otherwise, but also on looking closely at the axes and categories.
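The bracket-width problem can be made concrete with a little arithmetic. The sketch below, in Python, rescales each bracket's count to what it would average per $5,000 band, which is one common way to compare bars of unequal width. The household counts are made up for illustration and are not the Census figures.

```python
# Hypothetical household counts (in thousands); NOT the actual Census data.
# Each entry: (label, lower bound, upper bound, households in thousands)
bins = [
    ("$190,000 to $194,999", 190_000, 195_000, 400),
    ("$195,000 to $199,999", 195_000, 200_000, 350),
    ("$200,000 to $249,999", 200_000, 250_000, 2_500),
]

STANDARD_WIDTH = 5_000  # the width of the ordinary brackets on the chart

def per_standard_band(lower, upper, count, standard_width=STANDARD_WIDTH):
    """Average count per standard-width band inside the bracket [lower, upper)."""
    width = upper - lower
    return count * standard_width / width

normalized = [(label, per_standard_band(lo, hi, n)) for label, lo, hi, n in bins]
for label, value in normalized:
    print(f"{label}: {value:,.0f} thousand households per $5,000 band")
```

With these made-up numbers, the wide bracket looks like a spike when raw counts are plotted, but once it is spread across its ten $5,000 bands it falls below its neighbors and the tail keeps declining. The open-ended top bracket ($250,000 and over) cannot be rescaled this way because it has no upper bound, which is one more reason to read it separately from the rest of the bars.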
This particular graphic provides a look at how the United States stacks up against other developed nations on nine key measures, such as a Gini index, Gallup’s global wellbeing index, and life expectancy at birth.
As a graphic, this is both interesting and confusing. It is interesting in that one can take a quick glance at all of these measures at once, and the color shading helps mark the higher and lower values. This is the goal of graphics or charts: to condense a lot of information into an engaging format. However, there are a few problems: there is a lot of information to look at, it is unclear why the countries are listed in the order they are, and it takes some work to compare the countries marked with the different colors because they may be at the top or bottom of the list.
(By the way, the United States doesn’t compare well to some of the other countries on this list. Are there other overall measures in which the United States would compare more favorably?)
One of the key purposes of a chart or graph is to distill a lot of complicated information into a simple graphic so readers can quickly draw conclusions. In the midst of a crowded field of people who may (or may not) be vying to be the Republican candidate for president in 2012, one chart attempts to do just that.
This chart has two axes: moderate to conservative and insider to outsider. While these may be fuzzy concepts, creator Nate Silver suggests these axes give us some important information:
With that said, it is exceptionally important to consider how the candidates are positioned relative to one another. Too often, I see analyses of candidates that operate through what I’d call a checkbox paradigm, tallying up individual candidates’ strengths and weaknesses but not thinking deeply about how they will compete with one another for votes.
Silver then goes on to explain two other pieces of information encoded in the circle used to place each candidate on the graph: the color indicates the candidate’s region and the size of the circle represents the candidate’s relative stock on Intrade.
Based on this chart, it looks like we have a diagonal running from top left to bottom right, from moderate insider (Mitt Romney) to conservative outsider (Sarah Palin) with Tim Pawlenty and Mike Huckabee trying to straddle the middle. We will have to see how this plays out.
But as a statistics professor who is always on the lookout for cool ways of presenting information, I find this an interesting graphic.
In recent years, services like Twitter have exploded. Perhaps to bring some sense to the dizzying array of applications available for users, one sociologist has mapped the “Twitterverse.”
I like this graphic. However, I would be curious to hear the greater purpose of this chart. Does it indicate the popularity of particular applications? Does it reflect the date such applications were made available? Or is this simply an informational chart meant to display the broad range of services available on Twitter?
Additionally, this chart is a reminder of the dizzying array of apps Twitter users can download. Sorting through all of the possible apps on Twitter or in other places (iPhone, Droid, etc.) could be a time-consuming process.