Summarizing data visualization errors

Check out this good quick overview of visualization errors – here are a few good moments:

Everything is relative. You can’t say a town is more dangerous than another because the first one had two robberies and the other only had one. What if the first town has 1,000 times the population of the second? It is often more useful to think in terms of percentages and rates rather than absolutes and totals…

It’s easy to cherrypick dates and timeframes to fit a specific narrative. So consider history, what usually happens, and proper baselines to compare against…

When you see a three-dimensional chart that is three dimensions for no good reason, question the data, the chart, the maker, and everything based on the chart.

In summary: data visualizations can be very useful for highlighting a particular pattern but they can also be altered to advance an incorrect point. I always wonder with these examples of misleading visualizations whether the maker intentionally made the change to advance their point or whether there was a lack of knowledge about how to do good data analysis. Of course, this issue could arise with any data analysis as there are right and wrong ways to interpret and present data.
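To make the point about rates versus totals concrete, here is a minimal Python sketch. The robbery counts come from the quoted example; the populations and the helper name rate_per_100k are my own invention for illustration, not figures from the overview.

```python
def rate_per_100k(incidents: int, population: int) -> float:
    """Return the number of incidents per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return incidents * 100_000 / population


# Hypothetical towns: the counts mirror the quoted example,
# the populations are assumed (the first town is 1,000 times larger).
town_a = {"robberies": 2, "population": 1_000_000}
town_b = {"robberies": 1, "population": 1_000}

rate_a = rate_per_100k(town_a["robberies"], town_a["population"])
rate_b = rate_per_100k(town_b["robberies"], town_b["population"])

print(f"Town A: {rate_a:.1f} robberies per 100,000 residents")
print(f"Town B: {rate_b:.1f} robberies per 100,000 residents")
```

Under these assumed populations, the town with more robberies in absolute terms has by far the lower rate, which is exactly the trap a chart of raw totals would set.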

Visualizing immigration to the United States

Here are three interesting visualization options – an animated map and two infographics – to see immigration to the United States. Three quick thoughts:

  1. The map really does help illustrate the various stages of immigration. It starts from Western Europe, moves significantly to Eastern Europe in the late 1800s, and then opens to Mexico, East Asia, and other parts of the globe in the 1960s.
  2. It is unfortunate that the arrivals from Asia have to go over the “break” in the map since it has the Atlantic in the center. At first, I couldn’t figure out where the dots coming into the United States from the left were coming from.
  3. The second infographic provides some proportional context: even with the jump in migrants from Mexico, they represent a smaller proportion of the total U.S. population than the immigration spikes in the 1800s.

Presenting big data about Chicago

The Chicago Architecture Foundation has a new exhibit highlighting the use of big data in Chicago:

Architects, planners, engineers and citizens, it contends, are increasingly using massive amounts of data to analyze urban issues and shape innovative designs…

But data, the show argues, is useful as well as ubiquitous. We see some classically gritty Chicago stuff to back this up, though it’s not quite powerful or precise enough to be fully persuasive…

More convincing are the show’s examples of “digital visualization,” which is geekspeak for using digital technology to present and analyze urban planning data.

Take a monumental, crowd-pleasing map of Chicago, 15 feet high and 30 feet wide, which presents the footprints of thousands of buildings, even individual houses, and color-codes them by the era in which they were built. We see the impact of the city’s three great building booms, from Chicago’s earliest days to 1899, from 1900 to 1945, and from 1946 to 1979. The recent surges that filled downtown with new skyscrapers look puny by comparison.

Also worth seeing: Video monitors which display data for Divvy, the city’s bike-sharing program. They offer neat tidbits: Divvy’s most popular station, for example, is at Millennium Park.

Sounds interesting. Big cities are complex social entities that could benefit from large-scale and real-time data collection and analysis. Of course, as Kamin notes at the end, there still is a human side to cities that cannot be ignored, but getting a handle through data on what is happening could go a long way.

Another dimension to this is how to best present big data. While the online presentation of maps has grown popular, how can this be done best in person? I look forward to seeing this exhibit in person as I already like what the Chicago Architecture Foundation has done with this space. Here is part of the gallery a few years ago:

[Photo: CAFChicagoAug11]

This is a great free place to learn more about Chicago and then choose among the cool offerings in the gift shop or sign up for one of the architecture tours that cover all different aspects of Chicago.

The factors behind the rise of viral maps

Here is a short look at how viral maps (“graphic, easy to read, and they make a quick popular point”) are put together by one creator:

When I need to find a particular data set, it’s often as straightforward as a search for the topic with the word “shapefile” or “gis” attached. There’s so much data just sitting on servers that if you can imagine it, it’s probably out there somewhere (often for free). Sometimes though, finding data requires a deeper search. A lot of government-provided data sits inside un-indexed data portals or clearinghouses. Depending on the quality of the portal, these can be tedious to sort through…

Simplicity and ease-of-use: Interactive maps are great, but I want the maps I make to be straightforward to read and understand. I don’t want viewers to have to figure out how to use the map; they should just be able to look at it and figure out what’s going on.

Projections: Typical web maps are limited to the Web Mercator projection. I don’t have any objection to Mercator in principle (in fact it’s brilliant for what it does), but I can’t in good conscience use it for maps at a continental or global scale. Sticking to static maps allows me to choose more appropriate projections for the data and region I’m depicting.

Uniformity: I want everyone who visits my maps to be presented with the same information. I don’t want some algorithm deciding that one visitor is shown a particular view while another visitor gets a different one.

These principles sound similar to what one would expect for any sort of online chart or infographic. There is plenty of data available online but it takes some skill in order to present the data clearly and then market the map to the appropriate audience.
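The objection to Mercator at continental or global scale can be put in numbers. In the standard spherical Mercator projection, lengths at latitude φ are stretched by a factor of 1/cos(φ) and areas by the square of that. The sketch below is my own illustration of that textbook formula, not something from the mapmaker; the function names are hypothetical.

```python
import math


def mercator_linear_scale(lat_deg: float) -> float:
    """Length stretch factor of spherical Mercator at a given latitude."""
    if not -90 < lat_deg < 90:
        raise ValueError("latitude must be strictly between -90 and 90 degrees")
    return 1.0 / math.cos(math.radians(lat_deg))


def mercator_area_scale(lat_deg: float) -> float:
    """Area inflation factor: the square of the length stretch."""
    return mercator_linear_scale(lat_deg) ** 2


# Compare the equator with progressively higher latitudes.
for lat in (0, 30, 45, 60, 75):
    print(
        f"{lat:>2} degrees: lengths x{mercator_linear_scale(lat):.2f}, "
        f"areas x{mercator_area_scale(lat):.2f}"
    )
```

At city scale the stretch is nearly constant across the map, so it hardly matters. Across a continent or the globe, regions at 60 degrees appear roughly four times larger in area than equal-sized regions at the equator, which is why a static map with a more suitable projection is appealing.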

Now that I think about it, it is a little surprising that it took this long for viral maps to catch on. First, the Internet makes a lot of geographic data easily accessible. Second, it is a visual medium and maps are essentially graphics (audio is another story). Third, geographic data seems to feed into a lot of hot-button topics of conversation these days as people of different races (residential segregation), cultural viewpoints (think the American South or the Bible Belt), education (think the Creative Class looking for exciting urban neighborhoods), and other groupings tend to live in different places.

I wonder if the real story here isn’t the technology that makes mapping on a large scale relatively easy today. GIS software has been around for a while but it is generally pretty expensive and has a learning curve. Now, there are numerous websites that offer access to data and mapping capability (think the Census or Social Explorer). Shapefiles are used by a variety of local governments and researchers and can be downloaded. There are good freeware GIS programs like GeoDa. You need some bandwidth and computing power to get the data and crunch the numbers. All together, the pieces have now come together for more people to access, manipulate, and publish maps in a way that wasn’t possible even just 5 years ago.


Visualizing the migration flows in and out of DuPage County

The US Census recently released data on county-by-county migration flows. The tables that can be downloaded are huge but here is a look at the flows in and out of DuPage County:


Looks like a lot of movement to (and some from) warmer locales – southern California, Arizona, Florida – and lots of movement in the Midwest in an area roughly bounded by St. Louis, Detroit, and Minneapolis. You can also look at the migrations by education or income level.

Very cool all around. There is a lot of data to crunch here and these visualizations help make sense of it. At the same time, these aren’t necessarily huge movements of people. Take Harris County, home to Houston (the fourth-largest city in the United States): over this five-year span, there was a +88 flow from Harris to DuPage County.

Looking at inequality in NYC by translating wealth differences into building heights

It can be difficult to visualize inequality but here is an innovative way of doing so: imagining wealth as buildings in New York City.

In his most recent visualization project, the Pittsburgh-based artist and researcher re-imagines what the city’s skyline would look like if building height were a direct reflection of a neighborhood’s net household wealth. “I was inspired to create this project after standing atop Mt. Washington in my hometown of Pittsburgh and looking at the Pittsburgh skyline,” he explains. “I thought to myself, ‘What if you could actually see inequality?’ This relatively even landscape would look much different.”

Lamm, who is responsible for other viral visualizations like Normal Barbie, translated Esri’s map of median household net worth in New York City (based on 2010 Census data) into the bright green 3-D bars you’re looking at. Every $100,000 of net worth in a section on Esri’s map equals one centimeter in height on Lamm’s visualization. So if one section (which appears to consist of multiple blocks) had a net worth of $500,000, Lamm’s rendering would measure 5 cm high. Similarly, if another section had a net worth of $80,000, the green would appear at a much flatter 0.8 cm.
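The scaling rule described above is simple enough to write down. This is only a sketch of the stated conversion ($100,000 of net worth per centimeter of height); the constant and function names are mine, not Lamm’s.

```python
# Conversion stated in the article: every $100,000 of net worth is 1 cm of height.
DOLLARS_PER_CM = 100_000


def bar_height_cm(net_worth_dollars: float) -> float:
    """Convert a section's median household net worth into a bar height in cm."""
    return net_worth_dollars / DOLLARS_PER_CM


# The two worked examples from the article.
for net_worth in (500_000, 80_000):
    print(f"${net_worth:,} -> {bar_height_cm(net_worth):.1f} cm")
```

Because the mapping is linear, a section with ten times the net worth gets a bar ten times as tall, so the visual gaps between neighborhoods are proportional to the dollar gaps.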

Of the maps/visualizations available here, the best one is probably the first one that shows much of Manhattan from the northwest looking southeast.

Choosing to visualize wealth rather than income is a strategic choice. Much talk about inequality involves income but this may be the wrong metric. Income is more about short-term access to money but wealth may be more important for longer-term outcomes (purchasing a house, etc.) and the wealth differences between groups are quite a bit larger. For example, the differences in wealth between the top 5% and the rest of America are astounding, as are the differences between whites and blacks as well as Latinos.

Additionally, singling out New York, particularly Manhattan, is an interesting choice. The differences here are indeed stark. Manhattan is the seat of the financial sector. But few places in the United States would have this much wealth inequality.

The value of using maps to see the rise and fall of Detroit

Here is a series of maps that show both the growth and decline of Detroit over its history. When looking at these maps, I’m reminded that it is quite difficult to talk about either the rise or decline of a major city just by discussing raw numbers, such as population increases or losses or economic figures, or photographs. For example, we could talk about the rise of Houston in recent decades and contrast this to the sharp population decrease in Detroit. Moving past statistics, we could include photographs of a city. Detroit has been photographed many times in recent years with often bleak scenes illustrating economic and social decline.

In some middle ground between numbers and photos and in-depth analysis (of which there does not seem to be much about Detroit recently – the mainstream media has primarily focused on short snippets of information) are maps. A good map has sufficient information to provide a top-down approach to the city and give some indication of the city’s infrastructure. Additionally, it is much easier today to provide multiple layers of mapped information based on Census data and other sources. Growth is relatively easy to see as new streets and points of interest start showing up. On the other hand, decline might be harder to show as the streets may be empty and the points of interest might be decaying. Still, a current map shows the scope of the problem facing Detroit: it is population and economic decline plus a large chunk of land and structures that is difficult to maintain.

All together, I’m advocating for more widespread use of maps in reporting on and discussions about cities, whether they are struggling or thriving. Maps can help us move beyond seeing vacant houses or economic developments and take in the big picture all at once.

“A staggering migration” of hundreds of millions to Chinese cities

A New York Times video highlights the large number of Chinese residents the government intends to resettle to cities in the next two decades. Three quick thoughts on the video:

1. Yes, the scale of urbanization in China is astounding. As the video notes, China’s urbanization rate has approached Western levels in a matter of decades while it took centuries in the West.

2. The video argues that the rapid urbanization in recent years was more natural while the planned urbanization in the next 15 years is more forced by the government. I think this is an odd choice of words: “natural” versus “forced.” This seems to borrow from a typical US/Western explanation that people are free to make choices between urban, suburban, and rural areas. It may feel this way for those with money but it obscures that there are plenty of social forces, such as economic opportunities or race/ethnicity, that “push” and “pull” people toward or away from certain areas. “Forced” seems more correct for official government policy that will require people to move but as a sociologist, I would be very hesitant to suggest that social processes are inevitable or “natural” or that individuals are completely free agents who can live where they like.

3. The visual in the video is unique. I understand the purpose: to give people the sense of just how large this urban resettlement in China will be. And it is visually more interesting than a graph. At the same time, it is odd to put so many major metropolitan areas in a line. The cities are geographically disparate so why line them up?

Five experts weigh in on global flight-path maps

An art critic, environmentalist, aviation consultant, data visualization expert, and philosopher offer some interpretations of global flight-path maps.

From the art critic:

It’s almost like contemporary fractalisation – based on fractals, those beautiful divisions of science and nature. A number of artists have exploited them. Max Ernst based a lot of his surreal landscapes on fractalisation.

From the aviation consultant:

Europe looks so bright because it has so many short-haul flights. It’s also one of the busiest global markets and there are several hubs in relatively close proximity in Europe: Paris, Frankfurt, Amsterdam and London…

What we’re going to see in a few years is more connections between Asia and Africa, and South America and Africa, along with more “south-south” trade.

From the expert in data visualization:

You can see the density of the flights, but it doesn’t show you how many people are travelling on them. You could do that by colouring them differently.

From the philosopher:

We are not seeing the life of individual human beings, but the life of the species as a whole, as if the species was one organism, pulsating like a jellyfish. Maybe it represents our collective existence?

Interesting thoughts all around. The quote above from the philosopher is right on in that maps like these allow us to see larger patterns and how we are all connected. It is not just about the flow of passengers or cargo back and forth but also about how these flight paths connect us. The maps could also serve as a proxy for global power and business activity. I remember seeing work from sociologist Zachary Neal along these lines. Take a look at his publications involving cities, networks, and airplanes here.

Planning for the 7 billion person city

Two architects recently won an award for planning for a city that would include all the residents of the world:

This is the premise behind an ambitious research project, called “The City of 7 Billion,” for which the two recently won the $100,000 Latrobe Prize from the American Institute of Architects College of Fellows. With the geo-spatial model Mendis and Hsiang are creating – think a super-enhanced, zoomable Google Earth, Hsiang says – they’re hoping to study the impact of population growth and resource consumption at the scale of the whole world.

Every corner of the planet, they argue, is “urban” in some sense, touched by farming that feeds cities, pollution that comes out of them, industrialization that has made urban centers what they are today. So why not think of the world as a single urban entity?…

Now she and Mendis will be trying to do something similar – sew together disparate data sets, turn them into spatial models, then make those models accessible to the public – with a vastly more complex scenario. They want to connect not just land use with population density, but also income data, carbon dioxide levels, and geographical terrain. Their model of the whole world as one continuous urban terrain could then be used as a predictive tool for planning development into the future.

Hsiang and Mendis are hoping to communicate data and ideas that the political and scientific communities have had a hard time conveying to the public. This may sound like an odd job for architects – visualizing worldwide data about air quality – but Hsiang and Mendis argue that architects are precisely the professionals to do this…

More often, however, they have not been working at the same scale as policy-makers and scientists. “For too long, the architecture profession has been complicit in focusing on buildings and the scale of buildings,” Mendis says. “And I think that’s been detrimental to us.” The City of 7 Billion is an attempt to change that, to involve architects in big-picture questions more often debated by economists and geographers and social scientists.

This sounds like an interesting project on multiple levels:

1. Trying to imagine what a megacity of this size would look like. We are a long way from a megalopolis this size, yet there are parts of the world that might benefit from such thinking.

2. Putting together data in new ways. This is stretching some of the boundaries of data visualization by putting it in 3-D form.

3. Helping architects get involved in larger conversations about cities.

It will be worth watching where this goes.