Trying to count the people on the streets in Cairo

This problem comes up occasionally at American marches and rallies: how exactly should one estimate the number of people in a crowd? The question has been quite controversial at times, as rally organizers have produced larger figures than official government or media estimates. With the ongoing protests in Cairo, the same question has arisen: just how many Egyptians have taken to the streets? There is a more scientific process for this than a journalist simply making a guess:

To fact-check varying claims of Cairo crowd sizes, Clark McPhail, a sociologist at the University of Illinois and a veteran crowd counter, started by figuring out the area of Tahrir Square. McPhail used Google Earth’s satellite imagery, taken before the protest, and came up with a maximum area of 380,000 square feet that could hold protesters. He used a technique of area and density pioneered in the 1960s by Herbert A. Jacobs, a former newspaper reporter who later in his career lectured at the University of California, Berkeley, as chronicled in a Time Magazine article noting that “If the crowd is largely coeducational, he adds, it is conceivable that people might press closer together just for the fun of it.”

Such calculations of capacity say more about the size of potential gathering places than they do about the intensity of the political movements giving rise to the rallies. A government that wants to limit reported crowd sizes could cut off access to its cities’ biggest open areas.

From what I have read on this topic in the past, this is the common approach: calculate how much space is available to protesters or marchers, estimate how much space each individual occupies, and then look at photos to see how much of the total space is actually filled. The estimates can vary quite a bit depending on the assumed space per person. These days, the quest to count is aided by better photographs and satellite images:
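To make the arithmetic concrete, here is a rough sketch of the area-and-density calculation in Python. The 380,000 square feet figure is McPhail's maximum area from the article quoted above. The spacing values (10, 4.5, and 2.5 square feet per person) are illustrative assumptions of mine, roughly the rule-of-thumb figures often attributed to Jacobs's method, and the function name is hypothetical. This is not McPhail's actual procedure, only the basic idea.

```python
def estimate_crowd(area_sq_ft, fraction_occupied, sq_ft_per_person):
    """Area-and-density estimate: occupied area divided by space per person."""
    if not 0.0 <= fraction_occupied <= 1.0:
        raise ValueError("fraction_occupied must be between 0 and 1")
    if sq_ft_per_person <= 0:
        raise ValueError("sq_ft_per_person must be positive")
    occupied_area = area_sq_ft * fraction_occupied
    return round(occupied_area / sq_ft_per_person)


# Maximum usable area of Tahrir Square, as reported in the quoted article.
TAHRIR_MAX_AREA_SQ_FT = 380_000

# Assumed spacing per person (illustrative values, not from the article).
ASSUMED_SQ_FT_PER_PERSON = {
    "loose crowd": 10.0,
    "dense crowd": 4.5,
    "tightly packed": 2.5,
}

if __name__ == "__main__":
    # Assume the whole square is filled (fraction_occupied = 1.0) to get
    # an upper bound for each spacing assumption.
    for label, spacing in ASSUMED_SQ_FT_PER_PERSON.items():
        people = estimate_crowd(TAHRIR_MAX_AREA_SQ_FT, 1.0, spacing)
        print(f"{label}: about {people:,} people")
```

Even with the area held fixed, the spacing assumption alone moves the answer by a factor of four, which is a big part of why organizers and officials can look at the same square and report such different numbers.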

That is because to ensure an accurate count, some computerized systems require multiple cameras, to get high-resolution images of many parts of the crowd, in case density varies. “I don’t know of real technological solutions for this problem,” said Nuno Vasconcelos, associate professor of electrical and computer engineering at the University of California, San Diego. “You will have to go with the ‘photograph and ruler’ gurus right now. Interestingly, this stuff seems to be mostly of interest to journalists. The funding agencies for example, don’t seem to think that this problem is very important. For example, our project is more or less on stand-by right now, for lack of funding.”

Without any such camera setup, many have turned to some of the companies that collect terrestrial images using satellites, but these companies have collected images mostly before and after the peak of protests this week. “GeoEye and its regional affiliate e-GEOS tasked its GeoEye-1 satellite on Jan. 29, 2011 to collect half-meter resolution imagery showing central Cairo, Egypt,” GeoEye’s senior vice president of marketing, Tony Frazier, said in a written statement. “We provided the imagery to several customers, including Google Earth. GeoEye normally relies on our partners to provide their expert analysis of our imagery, such as counting the number of people in these protests.” This image was taken before the big midweek protests. DigitalGlobe, another satellite-imagery company, also didn’t capture images of the protests, according to a spokeswoman, but did take images later in the week.

Because such images are hard to come by in Egypt, it is difficult to make a precise estimate. As the article notes, this is why news stories offer vague crowd sizes like “thousands” or “tens of thousands.”

Since this problem comes up now and then, can’t someone put together a better method for estimating crowds? If the right kinds of images could be obtained, it seems an algorithm could be developed to scan an image, pick out the individual people, and count them.