The difficulties in finding out the most popular street name in the United States

FiveThirtyEight tries to find the most common street name in the US, which leads to a comparison of 1993 Census information with a Reddit user’s work:

The chart on Reddit that sparked your question looks very different from the 1993 list of most common street names from the Census Bureau.

Why, for example are there 3,238 extra Main streets in that chart compared with the census records in 1993? To find out, I got in touch with “darinhq,” whose name is Darin Hawley when he’s not producing charts on Reddit. After speaking to him, I think there are three explanations for the difference between his chart and the official data.

First, some new streets may have been built over the past 20 years (Hawley used 2013 census data to make his chart). Second, some streets may have changed their names: If a little town grows, it might change the name of its principal street from Tumbleweed Lane to Main Street.

Third, I don’t know how the Census Bureau produced its 1993 list (I asked, and a spokesperson told me the researcher who made it can’t recall his methodology), so Hawley might have simply used a different methodology to produce his chart. Because I wasn’t able to find any data on the frequency that American streets are renamed or the rate at which new streets are being built, I’m going to stake my money on this third explanation. Hawley told me that he counted “Main St N” and “N Main St” as two separate streets in his data. If the Census Bureau counted them as just one street, that could account for the difference.

That’s not the only executive decision Hawley made when he was summarizing this data. He set a minimum of how far away one Elm Street in Maine had to be from another Elm Street in Maine to qualify as two separate streets. That’s a problem because streets can break and resume in unexpected ways.

In other words, getting an answer requires making some judgment calls with the available data. While this is the sort of question that exemplifies the intriguing things we can all learn from the Internet, it likely isn’t important enough to spend a lot of time on. As an urban sociologist, I find this an interesting question, but what would I learn from the frequencies of street names? What hypothesis could I test? It might roughly tell us the names that Americans give to roads, and what we value may be reflected in those names. For example, the Census data suggests that numbered streets and references to nature dominate the top 20. Does this mean we like order (a pragmatic approach) and idyllic yet vague nature terms (park, view, lake, tree names) over other things? Yet the list has limitations: these communities and roads were built at different times, roads can be renamed, and we have to make judgment calls about what counts as a separate street.
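To make one of these judgment calls concrete, here is a minimal Python sketch of how the “Main St N” versus “N Main St” decision changes a tally. The town names, the records, and the helper functions are all hypothetical illustrations, not Hawley’s or the Census Bureau’s actual data or method (the Census method is unknown, as noted above).

```python
from collections import Counter

# Hypothetical (town, street name) records; not real Census data.
RECORDS = [
    ("Springfield", "Main St N"),
    ("Springfield", "N Main St"),
    ("Shelbyville", "Main St"),
    ("Springfield", "Elm St"),
    ("Shelbyville", "S Elm St"),
]

DIRECTIONS = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}


def strip_direction(name):
    """Drop a leading or trailing directional token: 'N Main St' -> 'Main St'."""
    tokens = name.split()
    if len(tokens) > 2 and tokens[0].upper() in DIRECTIONS:
        tokens = tokens[1:]
    if len(tokens) > 2 and tokens[-1].upper() in DIRECTIONS:
        tokens = tokens[:-1]
    return " ".join(tokens)


def count_streets(records, merge_directions):
    """Tally streets by base name.

    merge_directions=False: directional variants in one town are separate streets.
    merge_directions=True: directional variants in one town collapse into one street.
    """
    if merge_directions:
        distinct = {(town, strip_direction(name)) for town, name in records}
    else:
        distinct = set(records)
    return Counter(strip_direction(name) for _, name in distinct)


separate = count_streets(RECORDS, merge_directions=False)
merged = count_streets(RECORDS, merge_directions=True)
print("Variants counted separately:", separate["Main St"])  # 3
print("Variants merged per town:   ", merged["Main St"])    # 2
```

The same five records yield three Main Streets under one rule and two under the other, which is the kind of gap that could separate two otherwise honest counts.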

Two other thoughts:

1. The Census researcher who did this back in the early 1990s can’t remember the methodology. Why wasn’t it part of the report?

2. Is this something that would be best left up to marketers (who might find some advertising value in this) or GIS firms (who have access to comprehensive map data)?

The factors behind the rise of viral maps

Here is a short look at how viral maps (“graphic, easy to read, and they make a quick popular point”) are put together by one creator:

When I need to find a particular data set, it’s often as straightforward as a search for the topic with the word “shapefile” or “gis” attached. There’s so much data just sitting on servers that if you can imagine it, it’s probably out there somewhere (often for free). Sometimes though, finding data requires a deeper search. A lot of government-provided data sits inside un-indexed data portals or clearinghouses. Depending on the quality of the portal, these can be tedious to sort through…

Simplicity and ease-of-use: Interactive maps are great, but I want the maps I make to be straightforward to read and understand. I don’t want viewers to have to figure out how to use the map; they should just be able to look at it and figure out what’s going on.

Projections: Typical web maps are limited to the Web Mercator projection. I don’t have any objection to Mercator in principle (in fact it’s brilliant for what it does), but I can’t in good conscience use it for maps at a continental or global scale. Sticking to static maps allows me to choose more appropriate projections for the data and region I’m depicting.

Uniformity: I want everyone who visits my maps to be presented with the same information. I don’t want some algorithm deciding that one visitor is shown a particular view while another visitor gets a different one.

These principles sound similar to what one would expect for any sort of online chart or infographic. There is plenty of data available online, but it takes some skill to present the data clearly and then market the map to the appropriate audience.
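The projection point in the quoted principles can be put in numbers. As a rough sketch, assuming the standard spherical Mercator formulas, lengths at latitude φ are stretched by a factor of 1/cos(φ) relative to the equator, so areas are inflated by the square of that factor. The function names below are my own, chosen for illustration.

```python
import math


def mercator_scale(lat_deg):
    """Linear stretch of a spherical Mercator map at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))


def mercator_area_inflation(lat_deg):
    """Area exaggeration relative to the equator (square of the linear stretch)."""
    return mercator_scale(lat_deg) ** 2


for lat in (0, 30, 45, 60, 70):
    print(f"{lat:>2} degrees: lengths x{mercator_scale(lat):.2f}, "
          f"areas x{mercator_area_inflation(lat):.2f}")
```

At 60 degrees of latitude, lengths are doubled and areas roughly quadrupled, which helps explain the reluctance to use Mercator at continental or global scales.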

Now that I think about it, it is a little surprising that it took this long for viral maps to catch on. First, the Internet makes a lot of geographic data easily accessible. Second, it is a visual medium and maps are essentially graphics (audio is another story). Third, geographic data seems to feed into a lot of hot-button topics of conversation these days as people of different races (residential segregation), cultural viewpoints (think the American South or the Bible Belt), education (think the Creative Class looking for exciting urban neighborhoods), and other groupings tend to live in different places.

I wonder if the real story here isn’t the technology that makes mapping on a large scale relatively easy today. GIS software has been around for a while, but it is generally pretty expensive and has a learning curve. Now, there are numerous websites that offer access to data and mapping capabilities (think the Census or Social Explorer). Shapefiles are used by a variety of local governments and researchers and can be downloaded. There are good freeware GIS programs like GeoDa. You need some bandwidth and computing power to get the data and crunch the numbers. Altogether, the pieces have now come together for more people to access, manipulate, and publish maps in a way that wasn’t possible even just 5 years ago.


Using GIS to study Gettysburg, the Holocaust, and the American iron industry

Smithsonian takes a look at a historian who uses GIS to get a new perspective on important historical events:

Her principal tool is geographic information systems, or GIS, a name for computer programs that incorporate such data as satellite imagery, paper maps and statistics. Knowles makes GIS sound simple: “It’s a computer software that allows you to map and analyze any information that has a location attached.” But watching her navigate GIS and other applications, it quickly becomes obvious that this isn’t your father’s geography…

What emerges, in the end, is a “map” that’s not just color-coded and crammed with data, but dynamic rather than static—a layered re-creation that Knowles likens to looking at the past through 3-D glasses. The image shifts, changing with a few keystrokes to answer the questions Knowles asks. In this instance, she wants to know what commanders could see of the battlefield on the second day at Gettysburg. A red dot denotes General Lee’s vantage point from the top of the Lutheran Seminary. His field of vision shows as clear ground, with blind spots shaded in deep indigo. Knowles has even factored in the extra inches of sightline afforded by Lee’s boots. “We can’t account for the haze and smoke of battle in GIS, though in theory you could with gaming software,” she says…

Though she’s now been ensconced at Middlebury for a decade, Knowles continues to push boundaries. Her current project is mapping the Holocaust, in collaboration with the U.S. Holocaust Memorial Museum and a team of international scholars. Previously, most maps of the Holocaust simply located sites such as death camps and ghettos. Knowles and her colleagues have used GIS to create a “geography of oppression,” including maps of the growth of concentration camps and the movement of Nazi death squads that accompanied the German Army into the Soviet Union…

Aware of these pitfalls, Knowles is about to publish a book that uses GIS in the service of an overarching historical narrative. Mastering Iron, due out in January, follows the American iron industry from 1800 to 1868. Though the subject matter may not sound as grabby as the Holocaust or Gettysburg, Knowles has blended geographical analysis with more traditional sources to challenge conventional wisdom about the development of American industry.
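The Gettysburg question in the passage above, what a commander could see from a given spot, is a line-of-sight calculation. The sketch below is a deliberately simplified one-dimensional version along a single elevation profile; it is not Knowles’s method, and the elevations, spacing, and eye height are made-up values. Real GIS viewshed tools typically work over two-dimensional elevation grids.

```python
def visible_cells(elevations, spacing, eye_height):
    """Mark which cells along a profile are visible from the first cell.

    A cell is visible if the sightline slope from the observer's eye to it
    is at least as steep as the slope to every cell in between.
    """
    eye = elevations[0] + eye_height
    visible = [True]  # the observer's own cell
    max_slope = float("-inf")
    for i in range(1, len(elevations)):
        slope = (elevations[i] - eye) / (i * spacing)
        visible.append(slope >= max_slope)
        max_slope = max(max_slope, slope)
    return visible


# Hypothetical ground elevations (meters), sampled every 50 meters.
profile = [100, 98, 104, 101, 103, 110]
print(visible_cells(profile, spacing=50, eye_height=1.7))
# [True, True, True, False, False, True]
```

The rise at the third cell hides the two cells behind it, while the higher ground at the far end is visible again. Because the observer’s eye height enters the slope calculation directly, even small changes in it can shift which cells fall into a blind spot.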

Sounds pretty interesting. Having detailed geographic data can change one’s perspective. But two things need to happen before researchers can take advantage of such information:

1. Using GIS well requires a lot of training and then being able to find the right data for the analysis.

2. Using geographic data like this requires a change in mindset from the idea that geography is just a background variable. In sociology, analysis often controls for some geographic variation but doesn’t often consider the location or space as the primary factor.

While GIS is a hot method right now, I think these two issues will hold it back from being widely used for a while.