Boom in Data Designer jobs in the future?

One designer argues that the proliferation of data means the job of data designer will be in demand in the coming years:

When I began my career 25 years ago, the notion of design in the software industry was still nascent. It was an engineer’s world, in which just making software function was the consuming focus. So the qualification for this design role was quite simple: do you know anything about software? Those of us trying to apply humanistic or artistic notions to the process faced fundamental technical challenges. It was actually quite exciting, but a constant uphill battle to effect change…The new design challenge is to use this data for the same humanistic outcomes that we have in mind when we shape products through the user interface or physical form. Even conceding that many interfaces are not changing much—we still use PCs, and the mobile experience still mirrors traditional PC software tropes—we can see the data that moves through these systems is becoming more interesting. Just having this data affords the possibility of exciting new products. And the kind of data we choose to acquire can begin to humanize our experiences with technology…

We might consider the Data Designer a hybrid of two existing disciplines. Right now, Data Analysts and Interaction Designers work at two ends of the spectrum, from technical to humanistic. Data Analysts offer the most expertise in the medium, which is a great place to start; but they are approaching the problem from a largely technical and analytical perspective, without the concentration we need in the humanistic aspects of the design problems they address. Interaction Designers today are expert in designing interfaces for devices with screens. They may encounter and even understand the data behind their interfaces; but for the most part, it’s too often left out of the design equation…

Sociological implications. Presented with new capabilities of new technology, the design problem is to determine not just if a certain capability can be used, but how and why it should be used. When systems take in data quietly, from behind the scenes, from more parts of our lives, and shape this data in radical new ways, then we find an emerging set of implications that design does not often face, with profound sociological and safety issues to consider.

Data doesn’t interpret itself; people need to make sense of it and then use it effectively. Simply having all of this data is a good start, but it takes skilled practitioners to do effective, useful, and aesthetically pleasing things with it.

My question would be how to make this happen. Is this best addressed top-down by organizations that have the foresight and/or resources to pull it off? Or is it best done by new startups and innovators who show others the way?

Watch for how Chicago’s new “Array of Things” signs communicate information

Big data about Chicago is to be communicated to the public in a few different ways, including from public signs:

But the information it gathers is only half of what the Array of Things does. It will communicate that data in a complete, machine-readable form online, for users to search, analyze, and adapt. The sensors, however, will also communicate the data to passers-by. And that presents an interesting design dilemma. Most public signage seems self-evident and intuitive, like stop signs and walk signals, but it tends not to change very much, and when it does, it’s iterative. What do you do when you’re designing a new form of public signage, on the cheap, and one that has the possibility to communicate a wide range of information? To find out, I spoke with the array’s designers, SAIC professor Douglas Pancoast and master’s student Satya Batsu.

The obvious approach would be to use a screen. But screens are fragile and expensive. “We knew we didn’t want to have screens,” says Pancoast. “We wanted it to be visible—it couldn’t be too small, it couldn’t be too big, and you couldn’t mistake it for traffic.”…

That also led the designers to the current design of the Array nodes. (Not final, necessarily—the 3D-printed screens are cheap, quickly produced, and replaceable in a few minutes with off-the-shelf hardware.) The hexagonal shape of the lights in a honeycomb pattern is meant to further distinguish the Array nodes from traffic signals—a simple, familiar shape that’s still different from the language of signage that will surround it on city streets…

From that, Pancoast and Batsu narrowed down the nodes to their current iteration, leaving open the question of what information they’ll communicate and how people will recognize it. And that’s where the community comes in. The Array of Things is “neighborhood asset mapping,” in Pancoast’s words; residents are likely to be interested in different data in different places. In one place, they might be interested in air quality, an “asymmetrical” issue across the city. In another, sound or temperature.

This could present some interesting opportunities to observe how residents interact with these public signs. Will they stand around them? Glance at them quickly as they walk by? Ignore them? I’m curious to know what information these signs could provide on a regular basis that would be better than what residents could gather on their smartphones or that would add value to their daily routine.

Using social media data to predict traits about users

Here is a summary of research that uses algorithms and “concepts from psychology and sociology” to uncover traits of social media users through what they make available:

One study in this space, published in 2013 by researchers at the University of Cambridge and their colleagues, gathered data from 60,000 Facebook users and, with their Facebook “likes” alone, predicted a wide range of personal traits. The researchers could predict attributes like a person’s gender, religion, sexual orientation, and substance use (drugs, alcohol, smoking)…

How could liking curly fries be predictive? The reasoning relies on a few insights from sociology. Imagine one of the first people to like the page happened to be smart. Once she liked it, her friends saw it. A social science concept called homophily tells us that people tend to be friends with people like themselves. Smart people tend to be friends with smart people. Liberals are friends with other liberals. Rich people hang out with other rich people…
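To make the prediction idea concrete, here is a minimal sketch of the general technique: a classifier trained on a user-by-like matrix. The data, the “signal” pages, and the model choice are all invented for illustration; this is not the Cambridge team’s actual pipeline.

```python
# Minimal sketch (not the Cambridge study's pipeline): predict a binary trait
# from a user-by-"like" matrix with logistic regression. All data is made up.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

n_users, n_pages = 1000, 50
likes = rng.integers(0, 2, size=(n_users, n_pages))   # 1 = user liked the page

# Hypothetical ground truth: the trait correlates with liking a few pages,
# standing in for the homophily signal the article describes.
signal_pages = [3, 7, 19]
trait = (likes[:, signal_pages].sum(axis=1) + rng.normal(0, 1, n_users) > 1.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(likes, trait, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Held-out AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```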

On the first site, YouAreWhatYouLike, the algorithms will tell you about your personality. This includes openness to new ideas, extraversion and introversion, your emotional stability, your warmth or competitiveness, and your organizational levels.

The second site, Apply Magic Sauce, predicts your politics, relationship status, sexual orientation, gender, and more. You can try it on yourself, but be forewarned that the data is in a machine-readable format. You’ll be able to figure it out, but it’s not as pretty as YouAreWhatYouLike.

These aren’t the only tools that do this. AnalyzeWords leverages linguistics to discover the personality you portray on Twitter. It does not look at the topics you discuss in your tweets, but rather at things like how often you say “I” vs. “we,” how frequently you curse, and how many anxiety-related words you use. The interesting thing about this tool is that you can analyze anyone, not just yourself.
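The counting approach AnalyzeWords describes is straightforward to sketch. This is only a rough illustration; the category word lists below are stand-ins I made up, not the tool’s actual dictionaries.

```python
# Sketch of the general technique behind linguistic profiling tools such as
# AnalyzeWords: count category words per 100 words of text. The category
# lists here are illustrative stand-ins, not the tool's actual dictionaries.
import re
from collections import Counter

CATEGORIES = {
    "first_person_singular": {"i", "me", "my", "mine"},
    "first_person_plural": {"we", "us", "our", "ours"},
    "anxiety": {"worried", "afraid", "nervous", "anxious"},
}

def profile(tweets):
    """Return counts of each category per 100 words across a list of tweets."""
    words = [w for t in tweets for w in re.findall(r"[a-z']+", t.lower())]
    counts = Counter()
    for w in words:
        for cat, vocab in CATEGORIES.items():
            if w in vocab:
                counts[cat] += 1
    total = max(len(words), 1)
    return {cat: 100 * counts[cat] / total for cat in CATEGORIES}

print(profile(["I am worried about my data", "We should set limits now"]))
```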

The author then goes on to say that she purges old content from her social media accounts so third parties can’t use the information against her. That is one response. However, before I go and do this, I would want to know a few things:

1. Just how good are these predictions? It is one thing to suggest they are 60% accurate but another to say they are 90% accurate. (A quick sketch after this list shows why a raw accuracy figure needs context.)

2. How much data do these algorithms need to make good predictions?

3. How are social media companies responding to such tools? While I’m sure they are doing some of this analysis themselves, what do they plan to do if someone wants to use this data in a harmful way (say, to affect people’s credit scores)? Why not set limits now rather than after the fact?
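On the first question: a raw accuracy figure means little without knowing the base rate of the trait being predicted. The numbers below are hypothetical, just to show the arithmetic.

```python
# Quick arithmetic on point 1: a raw accuracy number means little without the
# base rate. If 60% of users have a trait, a "predictor" that always guesses
# the majority class is already 60% accurate. The figures are hypothetical.
base_rate = 0.60          # share of users who actually have the trait
always_guess_majority = max(base_rate, 1 - base_rate)
print(f"Trivial baseline accuracy: {always_guess_majority:.0%}")   # 60%

reported_accuracy = 0.90
lift_over_baseline = reported_accuracy - always_guess_majority
print(f"Lift over blind guessing: {lift_over_baseline:.0%}")        # 30 points
```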

Chicago to collect big data via light pole sensors

Chicago is hoping to collect all sorts of information via a new system of sensors along main streets:

The smooth, perforated sheaths of metal are decorative, but their job is to protect and conceal a system of data-collection sensors that will measure air quality, light intensity, sound volume, heat, precipitation, and wind. The sensors will also count people by observing cell phone traffic…

While data-hungry researchers are unabashedly enthusiastic about the project, some experts said that the system’s flexibility and planned partnerships with industry beg to be closely monitored. Questions include whether the sensors are gathering too much personal information about people who may be passing by without giving a second thought to the amount of data that their movements—and the signals from their smartphones—may be giving off.

The first sensor could be in place by mid-July. Researchers hope to start with sensors at eight Michigan Avenue intersections, followed by dozens more around the Loop by year’s end and hundreds more across the city in years to come as the project expands into neighborhoods, Catlett said…

While the benefits of collecting and analyzing giant sets of data from cities are somewhat speculative, there is a growing desire from academic and industrial researchers to have access to the data, said Gary King, director of the Institute for Quantitative Social Sciences at Harvard University.

The sort of data collected here could be quite fascinating, even with the privacy concerns. I wonder if a way around this is for the city to make clear, both now and down the road, exactly how it will use the data to improve the city. To some degree, this may not be possible because this is a new source of data collection and it is not entirely known what might emerge. Yet collecting big data can be an opaque process that worries some because they are rarely told how the data improves their lives. If this simply becomes another source of data that the city doesn’t use or uses only behind the scenes, is it worth it?

A quick hypothetical: let’s say the air sensors along Michigan Avenue, one of Chicago’s prime tourist spots, show a heavy amount of car exhaust. In response to the data, the city announces a plan to limit congestion on Michigan Avenue or to introduce cleaner mass transit. This would be a clear demonstration that the big data helped improve the pedestrian experience.

But, I could also imagine that in a year or two the city hasn’t said much about this data and people are unclear what is collected and what happens to it. More transparency and clear action steps could go a long way here.

Facebook to hold pre-ASA conference

Last year’s ASA meetings included some special sessions on big data, and Facebook is hosting a pre-conference this year at the company’s headquarters.

VentureBeat has learned that Facebook is to hold an academics-only conference in advance of the American Sociological Association 2014 Annual Meeting this August in San Francisco.

Facebook will run shuttles from the ASA conference hotel to Facebook’s headquarters in Menlo Park, Calif. According to the company’s event description, the pre-conference focuses on “techniques related to data collection with the advent of social media and increased interconnectivity across the world.”…

According to the event schedule, Facebook will give a demo of its tools and software stack at the conference…

There seems to be a great demand for sociologists who can code. Corey now spends a lot of time hiring fellow sociologists, according to his article. It is also the case in other big companies. In one interview conducted with the London School of Economics, Google’s Vice President Prabhakar Raghavan claimed that he just couldn’t hire enough social scientists.

This is a growing area of employment for sociologists, who would gain access to proprietary yet amazing data but would also have to negotiate the different structures of the private technology world versus academia.

Presenting big data about Chicago

The Chicago Architecture Foundation has a new exhibit highlighting the use of big data in Chicago:

Architects, planners, engineers and citizens, it contends, are increasingly using massive amounts of data to analyze urban issues and shape innovative designs…

But data, the show argues, is useful as well as ubiquitous. We see some classically gritty Chicago stuff to back this up, though it’s not quite powerful or precise enough to be fully persuasive…

More convincing are the show’s examples of “digital visualization,” which is geekspeak for using digital technology to present and analyze urban planning data.

Take a monumental, crowd-pleasing map of Chicago, 15 feet high and 30 feet wide, which presents the footprints of thousands of buildings, even individual houses, and color-codes them by the era in which they were built. We see the impact of the city’s three great building booms, from Chicago’s earliest days to 1899, from 1900 to 1945, and from 1946 to 1979. The recent surges that filled downtown with new skyscrapers look puny by comparison.
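For those curious how such a map gets built, the era binning behind it can be sketched in a few lines. The column name, sample years, and colors below are my own assumptions, not details from the exhibit.

```python
# Hypothetical sketch of the era binning behind a map like the exhibit's:
# bucket each building's construction year into the boom periods the article
# mentions and assign a display color. The column name "year_built", the
# sample years, and the colors are assumptions, not details from the exhibit.
import pandas as pd

buildings = pd.DataFrame({"year_built": [1885, 1910, 1952, 1968, 2005]})

bins = [0, 1899, 1945, 1979, 2100]
labels = ["pre-1900", "1900-1945", "1946-1979", "1980-present"]
buildings["era"] = pd.cut(buildings["year_built"], bins=bins, labels=labels)

era_colors = {"pre-1900": "#7b3294", "1900-1945": "#c2a5cf",
              "1946-1979": "#a6dba0", "1980-present": "#008837"}
buildings["color"] = buildings["era"].map(era_colors)

print(buildings)
```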

Also worth seeing: Video monitors which display data for Divvy, the city’s bike-sharing program. They offer neat tidbits: Divvy’s most popular station, for example, is at Millennium Park.

Sounds interesting. Big cities are complex social entities that could benefit from large-scale and real-time data collection and analysis. Of course, as Kamin notes at the end, there is still a human side to cities that cannot be ignored, but getting a handle on what is happening through data could go a long way.

Another dimension to this is how to best present big data. While the online presentation of maps has grown popular, how can this be done best in person? I look forward to seeing this exhibit in person as I already like what the Chicago Architecture Foundation has done with this space. Here is part of the gallery a few years ago:

This is a great free place to learn more about Chicago and then choose among the cool offerings in the gift shop or sign up for one of the architecture tours that cover all different aspects of Chicago.

Tree diagrams as important tool in human approach to big data

Big data may seem like a recent phenomenon but for centuries tree diagrams have helped people make sense of new influxes of data:

The Book of Trees: Visualizing Branches of Knowledge catalogs a stunning diversity of illustrations and graphics that rely on arboreal models for representing information. It’s a visual metaphor that’s found across cultures throughout history–a data viz tool that has outlived empires and endured huge upheavals in the arts and sciences…

For the first several hundred years at least, the use of the tree metaphor is largely literal. A graphic from 1552 classifies parts of the Code of Justinian–a hugely important collection of a thousand years of Roman legal thought–as a trunk with a dense tangle of leafless branches. An illustration from Liber Floridus, one of the best-known encyclopedias from the Middle Ages, lays out virtues as fronds of a palm. In the early going, classifying philosophical knowledge and delineating the moral world were frequent use cases. In nearly every case, foliage abounds…

At some point in the 18th or 19th century, the tree model made the leap to abstraction. This led to much more sophisticated visuals, including complex organization charts and dense genealogies. One especially influential example arrived with Darwin’s On the Origin of Species, in 1859…

While the impulse to visualize is more alive today than ever, our increasingly technological society may be outgrowing this enduring representational model. “Trees are facing this paradigm shift,” Lima says. “The tree, as a representational hierarchy, cannot accommodate things like the web and Wikipedia–things with linkage. The network is replacing the tree as the new visual metaphor.” In fact, the idea to do a collection solely on trees was born during Lima’s research on his first book–a collection of visualizations based on the staggering complexity of networks.
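Lima’s contrast maps onto two familiar data structures: a hierarchy allows exactly one path to each node, while a network allows arbitrary cross-links. A rough sketch with made-up categories:

```python
# Rough sketch of Lima's contrast: a tree can only express containment
# (each node has one parent), while a network/graph can express linkage
# (Wikipedia-style cross-references). The example data is made up.

# A hierarchy: nested dicts, one parent per node.
tree = {
    "Knowledge": {
        "Natural sciences": {"Biology": {}, "Physics": {}},
        "Social sciences": {"Sociology": {}, "Economics": {}},
    }
}

# A network: an adjacency list, where any node may link to any other.
graph = {
    "Sociology": ["Economics", "Biology"],   # cross-links a tree cannot hold
    "Economics": ["Sociology"],
    "Biology": ["Physics", "Sociology"],
    "Physics": ["Biology"],
}

def depth(node, children=tree):
    """Walk the tree; every node is reachable by exactly one path."""
    for name, sub in children.items():
        if name == node:
            return 0
        d = depth(node, sub)
        if d is not None:
            return d + 1
    return None

print(depth("Sociology"))   # one unique path down from the root
print(graph["Sociology"])   # multiple links, no single parent
```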

A few quick thoughts:

1. We talk a lot now about being in a visual age (why can’t audio clips go viral?), yet humans have a long history of using visuals to help them understand the world.

2. We’ve seen big leaps forward in data dissemination in the past – think the invention of writing, the printing press, the telegraph, etc. The leap forward to the Internet may seem quite monumental but such shifts have been tackled before.

3. Designing infographics took skill in the past, just as it does today. The tree is a widely understood symbol that lends itself to certain kinds of data. Throw in some color and flair and it can work well. Yet it can also be done poorly, undercutting the diagram’s ability to convey information quickly.

Analyze big data better when computer scientists and social scientists share knowledge

Part of the “big-data struggle” is to have more computer scientists interacting with social scientists:

The emerging problems highlight another challenge: bridging the “Grand Canyon,” as Mr. Lazer calls it, between “social scientists who aren’t computationally talented and computer scientists who aren’t social-scientifically talented.” As universities are set up now, he says, “it would be very weird” for a computer scientist to teach courses to social-science doctoral students, or for a social scientist to teach research methods to information-science students. Both, he says, should be happening.

Both groups could learn quite a bit from each other. Arguably, programming skills would be very useful in a lot of disciplines in a world gaga over technology, apps, and big data. Arguably, more rigorous methodologies for finding and interpreting patterns are needed across a wide range of disciplines interested in human behavior and social interaction. Somebody has to be doing this already, perhaps individuals who have training in both areas. But joining the two academic bodies together on a more formal and institutionalized basis could take quite a bit of work.

Call for more comparative study of poor urban neighborhoods using new techniques

Urban sociologist Mario Small recently argued sociologists and others need to adopt some new approaches to studying poor urban neighborhoods:

Small, who is also dean of UChicago’s Division of the Social Sciences, studies urban neighborhoods and has studied the diversity of experiences for people living in poor neighborhoods in cities across the country.

Studying only a few neighborhoods extensively fails to capture important differences, he said in a talk, “Poverty and Organizational Density,” at a session Feb. 15 at the annual meeting of the American Association for the Advancement of Science in Chicago…

“The experience of poverty varies from city to city, influenced by neighborhood factors such as commercial activity, access to transportation and social services, and other facets of organizational density,” Small said.

He explained that new sources of information, ranging from open city data to detailed, high-resolution imagery from commercial mapping services, provide new opportunities to compare the experience of the poor among multiple cities, in turn pointing cities and service providers toward optimal decision-making about policies, investment, or other interventions.

One of these suggested approaches is driven by changes in technology: the ability to collect big data. This can help sociologists and others go beyond surveys and neighborhood observations. Robert Sampson does some of this in Great American City by mapping the social networks and neighborhood moves of residents from poorer neighborhoods. Big data will enable us to go even further.

The second suggestion, however, is something that sociologists could have been doing for decades. Poor neighborhoods in certain cities tend to get the lion’s share of attention, places like Chicago, New York City, Boston, and Philadelphia. In contrast, poor neighborhoods in places like Dallas, Miami, Seattle, Denver, and Las Vegas get a lot less attention. Perhaps I should return to a presentation I made years ago at the Society for the Study of Social Problems about this very topic where I suggested some key factors that led to this lack of comparative study…

Argument: businesses should use scientific method in studying big data

Sociologist Duncan Watts explains how businesses should go about analyzing big data:

A scientific mind-set takes as its inspiration the scientific method, which at its core is a recipe for learning about the world in a systematic, replicable way: start with some general question based on your experience; form a hypothesis that would resolve the puzzle and that also generates a testable prediction; gather data to test your prediction; and finally, evaluate your hypothesis relative to competing hypotheses.

The scientific method is largely responsible for the astonishing increase in our understanding of the natural world over the past few centuries. Yet it has been slow to enter the worlds of politics, business, policy, and marketing, where our prodigious intuition for human behavior can always generate explanations for why people do what they do or how to make them do something different. Because these explanations are so plausible, our natural tendency is to want to act on them without further ado. But if we have learned one thing from science, it is that the most plausible explanation is not necessarily correct. Adopting a scientific approach to decision making requires us to test our hypotheses with data.

While data is essential for scientific decision making, theory, intuition, and imagination remain important as well—to generate hypotheses in the first place, to devise creative tests of the hypotheses that we have, and to interpret the data that we collect. Data and theory, in other words, are the yin and yang of the scientific method—theory frames the right questions, while data answers the questions that have been asked. Emphasizing either at the expense of the other can lead to serious mistakes…

Even here, though, the scientific method is instructive, not for eliciting answers but rather for highlighting the limits of what can be known. We can’t help asking why Apple became so successful, or what caused the last financial crisis, or why “Gangnam Style” was the most viral video of all time. Nor can we stop ourselves from coming up with plausible answers. But in cases where we cannot test our hypothesis many times, the scientific method teaches us not to infer too much from any one outcome. Sometimes the only true answer is that we just do not know.

To summarize: the scientific method provides a way to ask questions and gather data to answer them. It is not perfect – it doesn’t always produce the answer or the answers people are looking for, it may only be as good as the questions asked, and it requires a rigorous methodology – but it can help push forward the development of knowledge.

While there are businesses and policymakers using such approaches, it strikes me that such an argument for the scientific method is especially needed in the midst of big data and gobs of information. In today’s world, getting information is not a problem. Individuals and companies can quickly find or measure lots of data. However, it still takes work, proper methodology, and careful interpretation to make sense of that data.
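As a concrete illustration of the “test your hypothesis with data” step Watts describes, here is a minimal sketch of one common approach, a two-sample proportion test on invented A/B numbers. The choice of test and the figures are mine, not Watts’s.

```python
# Minimal sketch of the "test your hypothesis with data" step: a two-sided
# test comparing conversion rates in a hypothetical A/B experiment.
# The counts below are invented for illustration.
from statsmodels.stats.proportion import proportions_ztest

conversions = [120, 150]   # successes in variant A and variant B
visitors = [2400, 2500]    # trials in variant A and variant B

stat, p_value = proportions_ztest(conversions, visitors)
print(f"z = {stat:.2f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Reject the null: the variants' conversion rates differ.")
else:
    print("Insufficient evidence that the variants differ.")
```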