Tree diagrams as an important tool in the human approach to big data

Big data may seem like a recent phenomenon, but for centuries tree diagrams have helped people make sense of new influxes of data:

The Book of Trees: Visualizing Branches of Knowledge catalogs a stunning diversity of illustrations and graphics that rely on arboreal models for representing information. It’s a visual metaphor that’s found across cultures throughout history–a data viz tool that has outlived empires and endured huge upheavals in the arts and sciences…

For the first several hundred years at least, the use of the tree metaphor is largely literal. A graphic from 1552 classifies parts of the Code of Justinian–a hugely important collection of a thousand years of Roman legal thought–as a trunk with a dense tangle of leafless branches. An illustration from Liber Floridus, one of the best-known encyclopedias from the Middle Ages, lays out virtues as fronds of a palm. In the early going, classifying philosophical knowledge and delineating the moral world were frequent use cases. In nearly every case, foliage abounds…

At some point in the 18th or 19th century, the tree model made the leap to abstraction. This led to much more sophisticated visuals, including complex organization charts and dense genealogies. One especially influential example arrived with Darwin’s On the Origin of Species, in 1859…

While the impulse to visualize is more alive today than ever, our increasingly technological society may be outgrowing this enduring representational model. “Trees are facing this paradigm shift,” Lima says. “The tree, as a representational hierarchy, cannot accommodate things like the web and Wikipedia–things with linkage. The network is replacing the tree as the new visual metaphor.” In fact, the idea to do a collection solely on trees was born during Lima’s research on his first book–a collection of visualizations based on the staggering complexity of networks.

A few quick thoughts:

1. We talk a lot now about being in a visual age (why can’t audio clips go viral?), yet humans have a long history of using visuals to help them understand the world.

2. We’ve seen big leaps forward in data dissemination in the past – think the invention of writing, the printing press, the telegraph, etc. The leap forward to the Internet may seem quite monumental but such shifts have been tackled before.

3. Designing infographics took skill in the past just as it does today. The tree is a widely understood symbol that lends itself to certain kinds of data. Throw in some color and flair and it can work well. Yet a tree diagram can also be designed poorly, which undercuts its ability to convey information quickly.

The intellectual bloodlines of Talcott Parsons

In response to a review of Robert Bellah’s new book, a sociologist writes to the New York Times to link Robert Bellah and Clifford Geertz to Talcott Parsons:

His contrast of Bellah’s theories of religious evolution with Clifford Geertz’s outlook was also illuminating, but I was surprised he did not mention that both Bellah and Geertz were students of Talcott Parsons, a towering figure of mid-20th-century sociology. Indeed, a fuller understanding of Bellah’s and Geertz’s intellectual trajectories demands appreciation of their continuity with Parsonsian theory as well as their breaks with it. Parsons struggled to provide a vision of human agency that makes a place for morality, reason, emotions and biology, and of social order as the product of both human initiative and pre-existing collective forces, which are themselves both cultural and coercive. As Wolfe points out, his two illustrious students continued to struggle with the complexities of how we can be agents as well as the product of external forces — and the unique role religion has played in how we struggle to manage these elements.

This seems like perceptive analysis to me. While undergraduate sociology majors hear in theory classes that Parsons was the end of functionalism and quickly faded from prominence, isn’t this intellectual bloodline a good measure of Parsons’s abilities? I never knew that Bellah and Geertz, both well-respected and well-known, were his students, and this puts Parsons in a slightly different light.

Has anyone ever put together a sociological genealogy where we could see how generations of scholars have emerged from others? While such a genealogy would no doubt be socially constructed and emphasize famous scholars, I think it would be fascinating to see.

Genealogies as “heavily curated social constructions”

Tracking genealogies is both a popular hobby and big business. A sociologist argues that these genealogies are actually social constructions of our past:

In Ancestors and Relatives: Genealogy, Identity, and Community, Eviatar Zerubavel, a sociologist at Rutgers, pulls back the curtain on the genealogical obsession. Genealogies, he argues, aren’t the straightforward, objective accounts of our ancestries we often presume them to be. Instead, they’re heavily curated social constructions, and are as much about our values as they are about the facts of who gave birth to whom…

“No other animals have ‘second cousins once removed,’” Zerubavel points out, “or are aware of having had great-great-great-grandparents”; only people have the more abstract sorts of relatives necessary for a real genealogy. In the meantime, as categories for relatives proliferate and family trees expand, we accrue large numbers of ‘optional’ relatives. We construct our genealogies by choosing, out of a nearly endless array of possibly important or interesting ancestors, the ones who matter to us.

Those choices are highly motivated, and often obviously artificial. Because we want to stretch our family lines far into the past, we often “cut and paste” different branches, claiming, for example, a great-great-grandmother’s stepfather as one of our own ancestors, and following his line into the past. We “braid” ancestral identities together, emphasizing, as President Obama has, that we come from two distinct lines of descent (“a mother from Kansas and a father from Kenya”). Sometimes, though, the opposite impulses take hold. We might deliberately “lump” our diverse ancestries together, aiming to consolidate them, using a label like “Eurasian,” to lower the contrast (as Tiger Woods does when he refers to himself as “Cablinasian” — a combination of Caucasian, black, American Indian, and Asian). Or we might “clip” our family trees, obscuring their origins so as to preserve coherence and purity. That, Zerubavel writes, is what the Nazis did with Jewish genealogies: “Going only two generations back when formally defining Jewishness… helped the Nazis avoid realizing how many ‘Aryan’ Germans actually also had Jewish ancestors.”…

The point, Zerubavel writes, is that genealogies don’t all follow the same rules. Depending on what you’re trying to emphasize, you accept, reject, combine, or contrast individuals, families, and even whole ethnic identities. The most objective point of view, as Richard Dawkins has written, would probably hold that “all living creatures are cousins.” But genealogies are partial, selective, subjective, and social. They are as much about the present as they are about the past.

This isn’t too surprising: humans commonly pick and choose what we want to believe and then display to others. Could we argue that genealogies are simply another tool of impression management where we show our best (past) side to others and cover up the people we aren’t as proud of? This doesn’t seem that different from communities that cover up infamous parts of their histories or patriotic narratives that emphasize only the positives.

This reminds me of a high school history project I had to do. For my American History class, we had to make a poster out of our genealogies and there was a prize handed out to the person who could go the farthest back. Several of my family lines didn’t go more than four or five generations back but one of them had been extensively researched back to 46 generations and Alfred the Great, king of the Anglo-Saxons in the late 800s. Several things struck me then as odd:

1. I ended up losing out to a girl who could trace her family back 47 generations. Is this a prize-worthy objective anyway?

2. Who has the time and money to spend on tracing one’s family back 46 generations? Perhaps this doesn’t require too many resources these days, given online tools plus what is often available at libraries, but it still requires time.

3. Some of the family line was strange: I think at one point it went through a cousin and at another through a daughter rather than a son. It seemed clearly set up to get back to people like Sir Francis Bacon and Alfred the Great.

But, for the day or two that my poster was up in the classroom, I could say that I could trace my family back 46 generations when most people could not.