The difficulty of collecting, interpreting, and acting on data quickly in today’s world

I do not think the issue is just limited to the problems with data during COVID-19:

If, after reading this, your reaction is to say, “Well, duh, predictions are difficult. I’d like to see you try it”—I agree. Predictions are difficult. Even experts are really bad at making them, and doing so in a fast-moving crisis is bound to lead to some monumental errors. But we can learn from past failures. And even if only some of these miscalculations were avoidable, all of them are instructive.

Here are four reasons I see for the failed economic forecasting of the pandemic era. Not all of these causes speak to every failure, but they do overlap…

In a crisis, credibility is extremely important to garnering policy change. And failed predictions may contribute to an unhealthy skepticism that much of the population has developed toward expertise. Panfil, the housing researcher, worries about exactly that: “We have this entire narrative from one side of the country that’s very anti-science and anti-data … These sorts of things play right into that narrative, and that is damaging long-term.”

My sense as a sociologist is that the world is in a weird position: people expect relatively quick solutions to complex problems, there is plenty of data to think about (even as the quality of the data varies widely), and there are a lot of actors interpreting and acting on data or evidence. Put this all together and it can be difficult to collect good data, make sound interpretations of data, and make good choices regarding acting on those interpretations.

In addition, making predictions about the future is already difficult even with good information, interpretation, and policy options.

So, what should social scientists take from this? I would hope we can continue to improve our abilities to respond quickly and well to changing conditions. Typical research cycles take years but this is not possible in certain situations. There are newer methodological options that allow for quicker data collection and new kinds of data; all of this needs to be evaluated and tested. We need better processes of reaching consensus at quicker rates.

Will we ever be at a point where society is predictable? This might be the ultimate dream of social science if only we had enough data and the correct models. I am skeptical but certainly our methods and interpretation of data can always be improved.

Thinking about probabilistic futures

When looking to predict the future, one historian of science suggests we need to think probabilistically:

The central message sent from the history of the future is that it’s not helpful to think about “the Future.” A much more productive strategy is to think about futures; rather than “prediction,” it pays to think probabilistically about a range of potential outcomes and evaluate them against a range of different sources. Technology has a significant role to play here, but it’s critical to bear in mind the lessons from World3 and Limits to Growth about the impact that assumptions have on eventual outcomes. The danger is that modern predictions with an AI imprint are considered more scientific, and hence more likely to be accurate, than those produced by older systems of divination. But the assumptions underpinning the algorithms that forecast criminal activity, or identify potential customer disloyalty, often reflect the expectations of their coders in much the same way as earlier methods of prediction did.

Social scientists have long hoped to contribute to accurate predictions. We want both to better understand what is happening now and to provide insights into what will come after.

The idea of thinking probabilistically is a key part of the Statistics course I teach each fall semester. We can easily fall into using language that suggests we “prove” things or relationships. This implies certainty and we often think science leads to certainty, laws, and cause and effect. However, when using statistics we are usually making estimates about the population from the samples and information we have in front of us. Instead of “proving” things, we can speak to the likelihood of something happening or the degree to which one variable affects another. Our certainty of these relationships or outcomes might be higher or lower, depending on the information we are working with.
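To make the contrast between "proving" and estimating concrete, here is a minimal sketch in Python of reporting a sample proportion along with a margin of error. The survey numbers are hypothetical, the function name `proportion_estimate` is my own, and the normal approximation with z = 1.96 (roughly 95 percent confidence) is an assumption that works best with reasonably large samples.

```python
import math


def proportion_estimate(successes: int, n: int, z: float = 1.96):
    """Return (estimate, margin_of_error) for a sample proportion.

    Assumes a simple random sample and uses the normal approximation;
    z = 1.96 corresponds to roughly 95 percent confidence.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    # Standard error of a proportion, scaled by z to get the margin of error.
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin


# Hypothetical survey (made-up numbers): 540 of 1,000 respondents say "yes".
p_hat, margin = proportion_estimate(540, 1000)
print(f"Estimate: {p_hat:.1%}, plus or minus {margin:.1%}")
```

The output is an estimate with a range around it rather than a single "proven" number, and a smaller sample would widen that range.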

All of this relates to predictions. We can work to improve our current models to better understand current or past conditions but the future involves changes that are harder to know. Like inferential statistics, making predictions involves using certain information we have now to come to conclusions.

Thinking both (1) probabilistically and (2) in terms of plural futures can help us understand our limitations in considering the future. With regard to probabilities, we can assign higher or lower likelihoods to our predictions of what will happen. In thinking of plural futures, we can work with multiple options or pathways that may occur. All of this should be accompanied by humility and creativity, as it is difficult to predict the future even with great information today.
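One way to picture the combination of probabilities and plural futures is a small table of scenarios, each with a likelihood attached. The sketch below is only an illustration: the scenarios, the probabilities, and the function name `summarize_futures` are all hypothetical, not drawn from any forecast.

```python
def summarize_futures(scenarios):
    """Summarize a set of possible futures.

    `scenarios` maps a label to (probability, outcome). Returns the
    probability-weighted outcome, the most likely label, and the
    (lowest, highest) outcomes considered.
    """
    total = sum(p for p, _ in scenarios.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("scenario probabilities should sum to 1")
    expected = sum(p * outcome for p, outcome in scenarios.values())
    most_likely = max(scenarios, key=lambda label: scenarios[label][0])
    outcomes = [outcome for _, outcome in scenarios.values()]
    return expected, most_likely, (min(outcomes), max(outcomes))


# Hypothetical one-year change in home prices, in percent (made-up numbers).
scenarios = {
    "prices fall": (0.20, -5.0),
    "prices stay flat": (0.50, 0.0),
    "prices rise": (0.30, 4.0),
}

expected, most_likely, spread = summarize_futures(scenarios)
print(f"Weighted outcome: {expected:+.1f}%")
print(f"Most likely future: {most_likely}")
print(f"Range considered: {spread[0]:+.1f}% to {spread[1]:+.1f}%")
```

The weighted outcome is not a prediction of what will happen. It sits alongside the most likely future and the full range as three different ways of describing the same uncertainty.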

Zillow sought pricing predictability in the supposedly predictable market of Phoenix

With Zillow stopping its iBuyer initiative, here are more details about how the Phoenix housing market was key to the plan:

Tech firms chose the Phoenix area because of its preponderance of cookie-cutter homes. Unlike Boston or New York, the identikit streets make pricing properties easier. iBuyers’ market share in Phoenix grew from around 1 percent in 2015—when tech companies first entered the market—to 6 percent in 2018, says Tomasz Piskorski of Columbia Business School, who is also a member of the National Bureau of Economic Research. Piskorski believes iBuyers—Zillow included—have grown their share since, but are still involved in less than 10 percent of all transactions in the city…

Barton told analysts that the premise of Zillow’s iBuying business was being able to forecast the price of homes accurately three to six months in advance. That reflected the time to fix and sell homes Zillow had bought…

In Phoenix, the problem was particularly acute. Nine in 10 homes Zillow bought were put up for sale at a lower price than the company originally bought them, according to an October 2021 analysis by Insider. If each of those homes sold for Zillow’s asking price, the company would lose $6.3 million. “Put simply, our observed error rate has been far more volatile than we ever expected possible,” Barton admitted. “And makes us look far more like a leveraged housing trader than the market maker we set out to be.”…

To make the iBuying program profitable, however, Zillow believed its estimates had to be more precise, within just a few thousand dollars. Throw in the changes brought in by the pandemic, and the iBuying program was losing money. One such factor: In Phoenix and elsewhere, a shortage of contractors made it hard for Zillow to flip its homes as quickly as it hoped.

It sounds like the rapid sprawling growth of Phoenix in recent decades made it attractive for trying to estimate and predict prices. The story above highlights cookie-cutter subdivisions and homes – they are newer and similar to each other – and I imagine this is helpful for models compared to older cities where there is more variation within and across neighborhoods. Take that, critics of suburban ticky-tacky houses and conformity!

But when conditions changed – COVID-19 hit and altered the behavior of buyers and sellers, contractors and the building trades, and other actors in the housing industry – that uniformity in housing was not enough to profit easily.

As the end of the article suggests, the algorithms could be changed or improved and other institutional buyers are also interested. Is this just a matter of having more data and/or better modeling? Could it all work for these companies outside of really unusual times? Or, perhaps there really are housing markets in the US or around the globe that are more predictable than others?

If suburban areas and communities are the places where this really takes off, the historical patterns of people making money off what are often regarded as havens for families and the American Dream may continue. Sure, homeowners may profit as their housing values increase over time but the bigger actors including developers, lenders, and real estate tech companies may be the ones who really benefit.

Models are models, not perfect predictions

One academic summarizes how we should read and interpret COVID-19 models:

Every time the White House releases a COVID-19 model, we will be tempted to drown ourselves in endless discussions about the error bars, the clarity around the parameters, the wide range of outcomes, and the applicability of the underlying data. And the media might be tempted to cover those discussions, as this fits their horse-race, he-said-she-said scripts. Let’s not. We should instead look at the calamitous branches of our decision tree and chop them all off, and then chop them off again.

Sometimes, when we succeed in chopping off the end of the pessimistic tail, it looks like we overreacted. A near miss can make a model look false. But that’s not always what happened. It just means we won. And that’s why we model.

Five quick thoughts in response:

  1. I would be tempted to say that the perilous times of COVID-19 lead more people to see models as certainty but I have seen this issue plenty of times in more “normal” periods.
  2. It would help if the media had less innumeracy and more knowledge of how science, natural and social, works. I know the media leans towards answers and sure headlines but science is often messier and takes time to reach consensus.
  3. Making models that include social behavior is difficult. This particular phenomenon has both a physical and social component. Viruses act in certain ways. Humans act in somewhat predictable ways. Both can change.
  4. Models involve data and assumptions. Sometimes, the model might fit reality. At other times, models do not fit. Either way, researchers are looking to refine their models so that we better understand how the world works. In this case, perhaps models can become better on the fly as more data comes in and/or certain patterns are established.
  5. Predictions or proof can be difficult to come by with models. The language of “proof” is one we often use in regular conversation but is unrealistic in numerous academic settings. Instead, we might talk about higher or lower likelihoods or provide the best possible estimate and the margins of error.

Using humorists to predict the future because they can push beyond plausibility

Predictions made by experts are often not very good so why not let humorists try their hand at looking at the future?

This is not because “Simpsons” creator Matt Groening and his teams of writers through the decades are sinister geniuses. They are, of course, but the phenomenon of jokes coming uncannily true is not at all unique to “The Simpsons.” So at this time of year, when lots of people are making forecasts or looking back at how last year’s predictions went, I’d like to make the case that humorists may make the best futurists of all.

The writers of “The 80s” would not have won one of Philip Tetlock’s forecasting competitions: The great majority of their “predictions” were wildly wrong. Congress didn’t ban the consumption of meat, Muhammad Ali didn’t become chairman of the Joint Chiefs of Staff, Disney didn’t buy the United Kingdom, a musical version of “1984” starring Leif Garrett, Tracy Austin and Marlon Brando (as “Big Brother”) did not become the movie of the decade, cancer was not cured with “a substance secreted in the cranium of the baby harp seal when its head was struck repeatedly.” But given that the aim of the book was not to make predictions but to entertain, that was OK. It’s like with “The Simpsons”: You’re not watching it to get a rundown on the world to come; the fact that you sometimes do is a happy bonus…

The humorist’s approach to looking into the future bears some resemblance to scenario planning, a practice developed in the 1950s and 1960s at the Rand Corp. and Hudson Institute. Scenario planning involves coming up with alternative story lines of how things might plausibly develop in the future, and thinking about how a business or other organization can adapt to them. It’s not about picking the right scenario, but about opening your mind to different possibilities.

To make stories about the future funny, they usually have to be pushed beyond the bounds of plausibility. If they’re not pushed too far beyond, though, they can sometimes come true — with the advantage that few “serious” forecasters will have predicted them. The Trump presidency is a classic case of this. He had been talking about running since the late 1980s, but those in the media and political circles had learned over the years not to take him seriously. So it was left to the jokers.

Looking into the future is a difficult task since the future is a complex system with many variables at play. Even with all the data we have at our disposal these days, future trends do not necessarily have to follow in line with past results. This reminds me of Nassim Taleb’s writings from The Black Swan and onward: there are certain parts of reality that are fairly predictable, other areas that are complex but more knowable, and other areas where we do not even know what we do not know. See this chart adapted from Taleb by Garry Peterson for an overview:

Taleb's quadrants

This also gets at an important aspect of creativity: being able to think beyond existing realities.

Another bonus of looking to humorists to think about the future: you might get some extra laughs along the way.

The new suburban crisis is…

According to Richard Florida, the era of cheap growth is over and suburbs will struggle to address important issues:

Suburban sprawl is extremely costly to the economy broadly. Infrastructure and vital services such as water and energy can be 2.5 times more expensive to deliver in the suburbs than in compact urban centers. In total, sprawl costs the U.S. economy roughly $600 billion a year in direct costs related to inefficient land usage and car dependency, and another $400 billion in indirect costs from traffic congestion, pollution, and the like, according to a 2015 study from the London School of Economics. The total bill: a whopping $1 trillion a year…

When all is said and done, the suburban crisis reflects the end of a long era of cheap growth. Building roads and infrastructure and constructing houses on virgin land was and is an incredibly inexpensive way to provide an American Dream to the masses, certainly when compared to what it costs to build new subway lines, tunnels, and high-rise buildings in mature cities. For much of the 1950s, 1960s, and 1970s, and on into the 1980s and 1990s, suburbanization was the near-perfect complement to America’s industrial economy. More than the great mobilization effort of World War II or any of the Keynesian stimulus policies that were applied during the 1930s, it was suburban development that propelled the golden era of economic growth in the 1950s and 1960s. As working- and middle-class families settled into suburban houses, their purchases of washers, dryers, television sets, living-room sofas, and automobiles stimulated the manufacturing sector that employed so many of them, creating more jobs and still more homebuyers. Sprawl was driver of the now-fading era of cheap economic growth.

But today, clustering, not dispersal, powers innovation and economic growth. Many people still like living in suburbs, of course, but suburban growth has fallen out of sync with the demands of the urbanized knowledge economy. Too much of our precious national productive capacity and wealth is being squandered on building and maintaining suburban homes with three-car garages, and on the infrastructure that supports them, rather than being invested in the knowledge, technology, and density that are required for sustainable growth. The suburbs aren’t going away, but they are no longer the apotheosis of the American Dream and the engine of economic growth.

Florida is right on a number of counts: (1) many suburbs are long past their period of growth and now have aging infrastructure as well as changing populations; (2) sprawl can be very inefficient for providing basic services (from water to roads to social services); and (3) we are in a different economic era.

At the same time, it is not necessarily clear where the suburbs will go after this. At least a few outcomes are possible:

  1. A decline in interest in suburbs (either a plateauing in population or even decreasing) due to inefficiencies, costs to the environment, and a resurgent interest in urban life (particularly among younger adults). Suburban critics have predicted movement in this direction for several decades.
  2. A retooling of suburbia. This could include: older suburbs adapting to the lack of greenfield growth opportunities; an increase in retrofitting older suburban developments and making them new and exciting; and denser suburban development (from row houses to New Urbanism).
  3. The status quo: enough Americans continue to express a desire for the suburban life despite what critics say. Technology may even help as driverless cars could make commutes more bearable.

There are indeed real issues facing suburbs, the suburban life was never as idyllic as it was portrayed, and suburban communities and outcomes today are varied. But, I believe it is hard to bet against an ongoing interest among Americans in the suburbs.

Bad predictions: actively managed equity funds

An article about diversity in ETFs includes this figure about the prediction abilities of those who pick stocks:

A study by S&P Dow Jones Indices found that from 2006 to mid-2016, 87 percent of all actively managed U.S. equity funds underperformed the market.

In other words: not good. There is plenty of other evidence about this; see the work of Philip Tetlock. Hence, the rise of ETFs.

One thing that this article on ETFs does not address: if more business has moved to different financial instruments, what has happened to all of those stock pickers and hedge fund managers?

Trying to predict the 2017 housing market

This summary of predictions for housing in 2017 includes 17 different estimates from various groups. Here is the one I’m most interested in:

Most observers expect home sales and prices to moderate in the coming year. They say suburbs will make a comeback while the days of low mortgage rates are over.

Suburbs will make a comeback, you say? Perhaps there will indeed be a Donald Trump effect for suburbs. Here is one more specific suggestion that might contribute to this:

The percentage of people who drive to work will rise for the first time in a decade as homeowners move farther into the suburbs seeking affordable housing.

Cheaper gas probably doesn’t hurt either.

Looking through these 17 predictions, few explicitly apply to suburbs. Most are about two things: millennials (with some help from baby boomers) are driving the housing market and there will be a slow rise in housing values.

One bonus summary statement:

One prediction you can always count on: No matter what’s happening with the economy, NAR is always going to say it’s a great time to buy. Its fourth quarter Housing Opportunities and Market Experience survey found that 70 percent of people say now is a good time to buy a home. NAR also predicts the rate on a 30-year fixed mortgage will rise to 4.6 percent by the end of 2017.

Perhaps there is one prediction missing: will the homeownership rate rise after dropping in previous quarters?

And who is going to check to see if these predictions for 2017 were successful?

Whether driverless cars will benefit suburbs or cities

Some are wondering what kinds of places will benefit most from driverless cars:

Two op-eds published Thursday make the case one way and the other for the driverless car and the American settlement. In Bloomberg View, the economist Tyler Cowen argues that new technology—not just cars, but also virtual reality and the Internet of Things—has advantages that favor the suburbs. In the Wall Street Journal, Uber CEO Travis Kalanick posits that new technology will create “a more livable and less congested” city.

Cowen’s argument is in some ways convincing. He’s right that driverless cars and on-demand delivery could bring perks to the suburbs—a commute spent reading a book, say, or the quick purchase of that one-percent pint—that have traditionally belonged to urbanites. It’s also true that new technologies, like a smart home heating system, are more readily installed in the modern, spacious suburban home than the older urban apartment. (Ask a New Yorker if she’s ever had a garbage disposal.)…

But Kalanick makes a great point in his piece: autonomous transportation is actually the less important component in creating “a city that lives and breathes more easily.” The more important concept is… sharing. Not the bullshit low-paid menial labor that has long characterized the sharing economy, but actual sharing, where two people get in the same car together.

The most radical future is one where self-driving cars are shared, both on a single trip and between trips. A slightly less radical future is one in which individuals are willing to use a car someone else has just used, but prefer to ride alone.

All interesting points. But, I have two larger concerns with either argument:

  1. What if driverless cars allow both suburbs and cities to thrive? In other words, it would allow some to live outside major cities and others to further enjoy city life.
  2. Point #1 is connected to another: transportation technology alone does not dictate choices about where people live and work. It can certainly open up new possibilities. But, the American suburbs in general are not solely the result of the automobile; suburbs were growing before this, partly due to newer technologies like trains and streetcars but also due to solidifying cultural ideas about cities, suburbs, and social life. I could see driverless cars giving justification to those who want to live a car-sharing life in the big city while also leading others to buy a cheaper yet bigger home further away and let the car handle the longer commute.

It is difficult to make predictions in this case. As the article notes in the final paragraph, regulations and policies could help tilt the scales one way or another. We have seen this before: a variety of policies in the early to mid 1900s helped make suburban living more affordable and palatable to many Americans. The results included white flight, disinvestment in major cities, the creation of new infrastructure such as interstate highways, and the development of the suburban American Dream accessible to many (whites).

The first publicly available “pre-crime” map

A think tank in Rio will soon maintain an online map predicting future crime:

With data from 42 police precincts on crimes committed between January 2010 to March 2016, CrimeRadar tracks some 14 million different crime events. But the app goes beyond mapping historical crimes: Through machine learning and predictive analysis, CrimeRadar will also map out future crime trends—like an open-gov pre-crime heat map…

Muggah says that Igarapé struck a deal with the Institute for Public Security, a state government agency, to build a public-facing mobile app that would show the distribution, intensity, and typologies of crimes across metro Rio. The researchers analyzed data centralized with the ISP along with data from Rio’s 190 system (like 911 in the U.S.) and created 812 categories for crimes. Those break down into capital crimes and violent crimes (like armed assault or intentional homicide), less-intense crimes (thefts, burglaries), and “victimless” crimes (loitering, prostitution).

“We built out a model that uses three data points—the time, the location, and the event—by discriminating in geospatial polygons using these three tiers,” Muggah says. “This algorithm creates a score, a risk score, based on those three data points, for every 250-meter-by-250-meter square unit in the state. You group some of the hundreds of thousands of scores for each sector into deciles to create a simplified, color-coded risk rating, on a scale of 1 to 10.”…

“We have over an 85 percent accuracy of mirroring risk against actual events. The beauty of machine learning is that this improves over time,” Muggah says. “The more data, the more information you feed into it, the higher-resolution your risk projections are going to be.”

Two things strike me as interesting:

  1. The claim that this is for the good of individuals who will be able to then make decisions. What about promoting the public good? This reminds me of apps in the United States that identified tougher neighborhoods but then received backlash.
  2. I’m not sure that 85% accuracy is good or bad. Obviously, such models strive to be much better than that. At the same time, making predictions (and with increasing levels of accuracy regarding times, locations, and actors) in a large city with many variable factors (particularly humans) is difficult. It will be interesting to see how accurate these models can be.
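The decile grouping Muggah describes, where raw risk scores for grid cells are collapsed into a 1 to 10 rating, can be sketched in a few lines of Python. This is not CrimeRadar's algorithm, which the article does not spell out; it is a simple rank-based version under my own assumptions, with made-up scores and a hypothetical function name `decile_ratings`.

```python
def decile_ratings(scores):
    """Map each raw risk score to a 1-10 rating by rank-based decile.

    The lowest tenth of cells gets a 1 and the highest tenth gets a 10.
    Tied scores may land in adjacent deciles because cells are split by
    rank, which is a simplification made for this sketch.
    """
    n = len(scores)
    if n == 0:
        return []
    # Indices of the cells, ordered from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])
    ratings = [0] * n
    for rank, idx in enumerate(order):
        ratings[idx] = rank * 10 // n + 1
    return ratings


# Hypothetical raw scores for ten 250-meter grid cells (made-up numbers).
cell_scores = [0.02, 0.91, 0.15, 0.40, 0.33, 0.78, 0.05, 0.61, 0.27, 0.49]
for score, rating in zip(cell_scores, decile_ratings(cell_scores)):
    print(f"score {score:.2f} -> rating {rating}")
```

Because the ratings are relative, a cell rated 10 is only riskier than the other cells in the same data set. The rating by itself says nothing about how likely a crime is there in absolute terms, which is one reason an accuracy figure like 85 percent is hard to interpret.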