Countering gerrymandering in Pennsylvania with numerical models

Wired highlights a few academics who argued against gerrymandered political districts in Pennsylvania with models showing the low probability that the map is nonpartisan:

Then, Pegden analyzed the partisan slant of each new map compared to the original, using a well-known metric called the median versus mean test. In this case, Pegden compared the Republican vote share in each of Pennsylvania’s 18 districts. For each map, he calculated the difference between the median vote share across all the districts and the mean vote share across all of the districts. The bigger the difference, the more of an advantage the Republicans had in that map.

After conducting his trillion simulations, Pegden found that the 2011 Pennsylvania map exhibited more partisan bias than 99.999999 percent of maps he tested. In other words, making even the tiniest changes in almost any direction to the existing map chiseled away at the Republican advantage…

Like Pegden, Chen uses computer programs to simulate alternative maps. But instead of starting with the original map and making small changes, Chen’s program develops entirely new maps, based on a series of geographic constraints. The maps should be compact in shape, preserve county and municipal boundaries, and have equal populations. They’re drawn, in other words, in some magical world where partisanship doesn’t exist. The only goal, says Chen, is that these maps be “geographically normal.”

Chen generated 500 such maps for Pennsylvania, and analyzed each of them based on how many Republican seats they would yield. He also looked at how many counties and municipalities were split across districts, a practice the Pennsylvania constitution forbids “unless absolutely necessary.” Keeping counties and municipalities together, the thinking goes, keeps communities together. He compared those figures to the disputed map, and presented the results to the court…

Most of the maps gave Republicans nine seats. Just two percent gave them 10 seats. None even came close to the disputed map, which gives Republicans a whopping 13 seats.
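
To make the median-versus-mean metric concrete, here is a minimal sketch (not Pegden’s actual code; the district vote shares are invented for illustration):

```python
# Toy illustration of the median-versus-mean test described above.
# The district vote shares are invented; this is not Pegden's code.
from statistics import mean, median

def median_mean_gap(rep_shares):
    """Median minus mean of Republican vote share across districts.

    A larger positive gap suggests Democratic voters are packed into a few
    lopsided districts (dragging the mean down), i.e. a map that favors
    Republicans.
    """
    return median(rep_shares) - mean(rep_shares)

# Hypothetical Republican vote shares for 18 districts.
shares = [0.62, 0.58, 0.57, 0.55, 0.61, 0.54, 0.56, 0.59, 0.53,
          0.52, 0.60, 0.55, 0.33, 0.30, 0.28, 0.35, 0.41, 0.38]

print(round(median_mean_gap(shares), 3))  # about 0.052 for these made-up numbers
```

Pegden’s trillion-map test then asks what fraction of slightly perturbed maps show a smaller gap than the enacted map; the 99.999999 percent figure means almost any small change reduces the Republican advantage.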
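
Chen’s ensemble comparison boils down to asking where the disputed map’s seat count falls in the distribution produced by the simulated maps. In the sketch below, the simulated counts are invented to mirror the distribution described above, and the hard part, actually drawing 500 geographically normal maps, is assumed away:

```python
# Sketch of the ensemble comparison: how does the enacted map's seat count
# compare to seat counts from simulated, nonpartisan maps? The simulated
# counts below are invented to mirror the distribution described above.
from collections import Counter

simulated_seats = [9] * 490 + [10] * 10   # 500 hypothetical simulated maps
disputed_map_seats = 13

distribution = Counter(simulated_seats)
as_extreme = sum(1 for s in simulated_seats if s >= disputed_map_seats)

print(dict(distribution))                 # {9: 490, 10: 10}
print(as_extreme / len(simulated_seats))  # 0.0 -- nothing comes close to 13
```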

It takes a lot of work to develop these models, and they rest on particular assumptions and methods of calculation. Still, could a political side present a reasonable statistical counterargument?

Given both the innumeracy of the American population and some resistance to experts, I wonder how the public would view such models. On one hand, gerrymandering can be countered with simple arguments: the shapes drawn on the map are pretty strange and can’t truly represent any meaningful community. On the other hand, the models reinforce just how unlikely these particular maps are. It isn’t just that the shapes are unusual; the maps are highly improbable given the various inputs that go into creating meaningful districts. Perhaps all of these arguments are meaningless if your side is winning through the maps.

“Using a Real Life SimCity to Design a Massive Development”

As a massive SimCity fan, I find this use of predictive urban models intriguing:

596 acres, 50,000 residents, $4 billion dollars and even a 1,500-boat marina: Everything about the proposed Chicago Lakeside Development, developer Dan McCaffery’s massive micro-city being built at the former site of the U.S. Steel Southworks Plant, is on a different scale. It follows that the design process for this mixed-use project requires a different set of tools, in this case, LakeSim, an advanced computer modeling program. Developed as part of a collaboration between the University of Chicago, Argonne National Laboratory, Skidmore, Owings & Merrill and McCaffery Interests, this program functions like a customized SimCity, analyzing and simulating weather, traffic patterns and energy usage to help architects and designers plan for a site that may eventually contain more than 500 buildings.

“A lot of the Big Data approaches tend to be statistical in nature, looking at past data,” says Argonne scientist Jonathan Ozik. “We’re modeling a complex system of interactive components, running the data forward, so what we end up having is your SimCity analogy, energy systems interacting, vehicles and people moving. What we’re doing here is using a complex systems approach to tackle the problem.”…

The challenge for planners is predicting how so many different systems and variables will interact. LakeSim gives them a framework to analyze these systems over long timelines and run millions of scenarios much quicker than past models — hours as opposed to days — asking “hundreds of questions at once,” according to Ozik. The program is a step forward from similar modeling software, especially valuable at a site that in most respects is being built from scratch.
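
The contrast Ozik draws, fitting statistics to past data versus “running the data forward” with interacting components, is the heart of this kind of simulation. Here is a heavily simplified sketch of the idea; nothing in it reflects LakeSim’s actual models, and the schedules, loads, and numbers are all invented:

```python
# A toy "run the data forward" simulation: interacting components (residents'
# schedules and building loads) are stepped through time instead of fitting a
# curve to historical data. All names and numbers are invented, not LakeSim's.
import random

random.seed(0)

def simulate_day(n_residents, building_base_loads_kw, hours=24):
    """Step a crude occupancy-and-energy model forward one day; returns total kWh."""
    total_kwh = 0.0
    for hour in range(hours):
        # Coarse daily schedule: more residents at home overnight and in the evening.
        share_at_home = 0.9 if hour < 7 or hour > 18 else 0.4
        occupants = n_residents * share_at_home
        for base_kw in building_base_loads_kw:
            # Each building's hourly load = base load plus an occupancy-driven term.
            total_kwh += base_kw + 0.001 * occupants * random.uniform(0.8, 1.2)
    return total_kwh

# Hypothetical inputs: 50,000 residents and 500 buildings with varying base loads.
buildings = [random.uniform(20, 200) for _ in range(500)]
print(round(simulate_day(50_000, buildings)))  # total kWh for one simulated day
```

Scenario testing then amounts to rerunning a loop like this under different assumptions (building mix, schedules, weather) and comparing the outputs.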

This seems quite useful at this point, but it will be necessary to revisit it down the road once the site is developed. How much time did the model save? How accurate was it? Did relying on such a model lead to negative outcomes? If this is a predictive model, it may only be as good as the outcomes it ends up predicting.

Interesting to note that the commenters at the bottom are wondering where all the people who will live in this development are going to come from. I assume demand is appropriately accounted for in the model?

The Chicago School model of urban growth doesn’t quite fit…but neither do other models

Sociologist Andy Beveridge adds to the ongoing debate within urban sociology over the applicability of the Chicago School’s model of growth:

Ultimately, Beveridge’s interesting analysis found that the basic Chicago School pattern held for the early part of the 20th century and even into the heyday of American post-war suburbanization. But more recently, the process and pattern of urban development has diverged in ways that confound this classic model…

The pattern of urban growth and decline has become more complicated in the past couple of decades as urban centers, including Chicago, have come back. “When one looks at the actual spatial patterning of growth,” Beveridge notes, “one can find evidence that supports exponents of the Chicago, Los Angeles and New York schools of urban studies in various ways.” Many cities have vigorously growing downtowns, as the New York model would suggest, but outlying areas that are developing without any obvious pattern, as in the Los Angeles model.

The second set of maps (below) gets at this, comparing Chicago in the decades 1910-20 and 1990-2000. In the first part of the twentieth century, decline was correlated with decline in adjacent downtown areas, shown here in grey. Similarly, growth was correlated with growth in more outlying suburbs, shown here in black. In the earlier period growth radiated outwards, a close approximation of the Chicago school concentric zone model. But in the more recent map, growth and decline followed less clear patterns. Some growth concentrated downtown, while other areas outside the city continued to boom, in ways predicted more accurately by the New York and Los Angeles models. The islands of grey and black, which indicate geographic correlations of decline and growth, respectively, are far less systematic. As Beveridge writes, the 1990-2000 map shows very little patterning. There were “areas of clustered high growth (both within the city and in the suburbs), as well as decline near growth, growth near decline, and decline near decline.”
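
The “growth near growth, decline near decline” reading of those maps can be made concrete with a toy calculation: label each tract by its own change and the average change of its neighbors. This is not Beveridge’s method or data; the tracts, adjacencies, and growth rates below are invented.

```python
# Toy version of the "growth near growth / decline near decline" labels in the
# maps described above. Tracts, adjacency, and growth rates are invented.
tract_growth = {"A": 0.12, "B": 0.08, "C": -0.05, "D": -0.10, "E": 0.02}
neighbors = {"A": ["B", "C"], "B": ["A", "E"], "C": ["A", "D"],
             "D": ["C"], "E": ["B"]}

def classify(tract):
    """Label a tract by its own change and the average change of its neighbors."""
    own = "growth" if tract_growth[tract] > 0 else "decline"
    avg_nbr = sum(tract_growth[n] for n in neighbors[tract]) / len(neighbors[tract])
    near = "growth" if avg_nbr > 0 else "decline"
    return f"{own} near {near}"

for t in sorted(tract_growth):
    print(t, classify(t))
```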

Interesting research. It sounds like the issue is not necessarily the models of growth themselves but how widely they are applied within a metropolitan region. Assuming the same processes are at work across several hundred square miles may be too much of a leap. We might then need to look at smaller areas or types of areas, as well as micro-level processes.

This reminds me that when teaching urban sociology this past spring and reading as a class about the Chicago School, New York School, and Los Angeles School, students wanted to discuss why sociologists seem to want one theory to explain all cities. That isn’t necessarily the goal; we know cities are different, particularly once you get outside an American or Western context. At the same time, we are interested in trying to better understand the underlying processes of city change. Plus, Chicago, New York, and LA have had groups, sometimes tightly and sometimes loosely organized, based in important schools pushing theories, and we don’t have such schools in places like Miami, Atlanta, Dallas, or Portland.

Viewing cities as crosses between stars and social networks

A new paper from a physicist suggests cities are “social reactors,” somewhere between social networks and stars:

Others have suggested that cities look and operate like biological organisms, but that is not the case, says Bettencourt. “A city is a bunch of people, but more importantly, it’s a bunch of people interacting, so hence the social network,” he explains. “What’s important are the properties of this social network: the scaling was giving us clues. But then when you think of this superlinearity, which means the socioeconomic outputs are the result of those interactions, are expressed as growing superlinear functions of populations, the only system that I could think of in nature is a star. A star does have this property – it’s essentially a nuclear reactor sustained by gravity and shines brighter (has greater luminosity) the larger its mass. So there’s a sense that this behavior that is sustained by and created by attractive interactions and whose output is proportional to rate of interactions, is what a city is and a star is, and so in that sense they are analogous.”…

The result is this “special social reactor” that adheres to four main assumptions about city dynamics and scaling:

1) There are “mixing populations”: basically, cities have attractive interactions and social outputs are the results of those, which leads to more social interactions.

2) There is “incremental network growth”: notably, the networks themselves and the supporting infrastructure develop gradually as the city grows. The infrastructure is decentralized as are the networks themselves. This is very different from an organism, says Bettencourt, whose internal “infrastructure” (analogous to a vascular system for example) develops basically all at once and has a centralized node.

3) “Human effort is bounded”: as he writes in his paper, “The increasing mental and physical demand from their inhabitants has been a pervasive concern to social scientists. Thus this assumption is necessary to lift an important objection to any conceptualization of cities as scale-invariant systems.” In other words, “The costs imposed on people by living in the city do not scale up,” he says, because as the number of social interactions increase, one doesn’t have to necessarily travel more to get to these interactions. “The city comes to you as it becomes denser,” he notes.

4) “Socioeconomic outputs are proportional to local social interactions”: this gives us an interesting snapshot of exactly what a city is – not just a conglomeration of individuals, but rather a concentration of social interactions.
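
The “superlinearity” Bettencourt describes is usually summarized as a power law of the form Y = Y0 * N^beta, with beta a bit above 1 for socioeconomic outputs (and below 1 for infrastructure) in the urban scaling literature. A minimal sketch with an illustrative exponent, not a value taken from this particular paper:

```python
# Superlinear scaling sketch: Y = Y0 * N**beta with beta > 1 means doubling a
# city's population more than doubles its socioeconomic output. The exponent
# here is illustrative (roughly the range reported in the urban scaling
# literature), not a value from this particular paper.
Y0, beta = 1.0, 1.15

def socioeconomic_output(population):
    return Y0 * population ** beta

for n in (100_000, 200_000, 400_000):
    print(f"{n:>7} -> {socioeconomic_output(n):,.0f}")

# Each doubling multiplies output by 2**beta (about 2.22), i.e. roughly 11%
# more output per capita, which is the "social reactor" intuition in item 4.
```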

Sounds interesting. Cities are both agglomerations of social interactions and sets of unique infrastructures (physical and social) that give shape to, and are shaped by, those interactions.

Getting the data to model society like we model the natural world

A recent session at the American Association for the Advancement of Science included a discussion of how to model the social world:

Dirk Helbing was speaking at a session entitled “Predictability: from physical to data sciences”. This was an opportunity for participating scientists to share ways in which they have applied statistical methodologies they usually use in the physical sciences to issues which are more ‘societal’ in nature. Examples stretched from use of Twitter data to accurately predict where a person is at any moment of each day, to use of social network data in identifying the tipping point at which opinions held by a minority of committed individuals influence the majority view (essentially looking at how new social movements develop) through to reducing travel time across an entire road system by analysing mobile phone and GIS (Geographical Information Systems) data…

With their eye on the big picture, Dr Helbing and multidisciplinary colleagues are collaborating on FuturICT, a 10-year, 1 billion EUR programme which, starting in 2013, is set to explore social and economic life on earth to create a huge computer simulation intended to simulate the interactions of all aspects of social and physical processes on the planet. This open resource will be available to us all and particularly targeted at policy and decision makers. The simulation will make clear the conditions and mechanisms underpinning systemic instabilities in areas as diverse as finance, security, health, the environment and crime. It is hoped that knowing why and being able to see how global crises and social breakdown happen, will mean that we will be able to prevent or mitigate them.

Modelling so many complex matters will take time, but in the future we should be able to use tools to predict collective social phenomena as confidently as we predict physical phenomena such as the weather now.

This will require a tremendous amount of data. It may also require asking for a lot more data from individual members of society in a way that has not happened yet. To this point, individuals have been willing to volunteer information in places like Facebook and Twitter, but we will need much more consistent information than that to truly develop models like those suggested here. Additionally, once that minute-to-minute information is collected, it needs to be put in a central dataset or location so that all the possible connections can be seen. Who is going to keep and police this information? People might be convinced to participate if they could see the payoff. What exactly will a social model be able to do: limit or stop crime or wars? Help reduce discrimination? Thus, getting the data from people might be as much of a problem as knowing what to do with it once it is obtained.