The perils of analyzing big real estate data

Two leaders of Zillow recently wrote Zillow Talk: The New Rules of Real Estate, a sort of Freakonomics look at all the real estate data they have. While it is an interesting book, it also illustrates the difficulties of analyzing big data:

1. The key to the book is all the data Zillow has harnessed to track real estate prices and make predictions on current and future prices. They don’t say much about their models. This could be for two good reasons: this is aimed at a mass market and the models are their trade secrets. Yet, I wanted to hear more about all the fascinating data – at least in an appendix?

2. Problems of aggregation: the data is usually analyzed at a metro area or national level. There are hints at smaller markets – a chapter on NYC, for example, and another looking at some unusual markets like Las Vegas – but there are no separate chapters on cheaper/starter homes or luxury homes. An unanswered question: is real estate more similar within markets or across them? Put another way, are the features of the Chicago market unique and patterned, or are cheaper homes in the Chicago region more like similar homes in Atlanta or Los Angeles than they are like more expensive homes in any market?

3. Most provocative argument: in Chapter 24, the authors suggest that pushing homeownership for lower-income Americans is a bad idea as it can often trap them in properties that don’t appreciate. This was a big problem in the 2000s: Presidents Clinton and Bush pushed homeownership but after housing values dropped in the late 2000s, poorer neighborhoods were hit hard, leaving many homeowners in default or seriously underwater. Unfortunately, unless demand picks up in these neighborhoods (and gentrification is pretty rare), these homes are not good investments.

4. The individual chapters often discuss small effects that may be statistically significant but are not substantively large. For example, there is a section on male vs. female real estate agents. The effects for each gender are small: at most, a few percentage points difference in selling price as well as slight variations in speed of sale. (Women are better in both categories: higher prices, faster sales.)

5. The authors are pretty good at repeatedly pointing out that correlation does not mean causation. Yet, they don’t catch all of these moments and at other times present patterns in charts with distorted axes. For example, here is a chart from page 202:

[Chart from Zillow Talk, p. 202]

These two things may be correlated (as one goes up so does the other and vice versa) but why fix the axes so you are comparing half percentages to five percentage increments?

6. Continuing #4, I suppose a buyer and seller would want to use all the tricks they can but the tips here mean that those in the real estate market are supposed to string together all of these small effects to maximize what they get. On the final page, they write: “These are small actions that add up to a big difference.” Maybe. With margins of error on the effects, some buyers and sellers aren’t going to get the effects outlined here: some will benefit more but some will benefit less.

7. The moral of the whole story? Use data to your advantage even as it is not a guarantee:

In the new realm of real estate, everyone faces a rather stark choice. The operative question now is: Do you wield the power of data to your advantage? Or do you ignore the data, to your peril?

The same is true of the housing market writ large. Certainly, many macro-level dynamics are out of any one person’s control. And yet, we’re better equipped than ever before to choose wisely in the present – to make the kinds of measured judgments that can prevent another coast-to-coast bubble and calamitous burst. (p.252)

In the end, this book is aimed at the mass market where a buyer or seller could hope to string together a number of these small advantages. Yet, there are no guarantees and the effects are often small. Having more data may be good for markets and may make participants feel more knowledgeable (or perhaps more overwhelmed) but not everyone can take advantage of this information.
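Point 4 above, that an effect can be statistically significant without being substantively large, can be illustrated with a rough sketch. Everything numeric here is hypothetical and not taken from the book: a 1 percent price gap ($202,000 vs. $200,000), a $60,000 standard deviation, and the two sample sizes. The function `two_sample_z` is an illustrative helper that runs a plain two-sample z-test assuming equal group sizes and a known, shared standard deviation.

```python
import math


def two_sample_z(mean_a, mean_b, sd, n_per_group):
    """Two-sided z-test for a difference in means (equal n, shared known sd)."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = (mean_a - mean_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: the same $2,000 (1 percent) gap at two sample sizes.
for n in (100, 1_000_000):
    z, p = two_sample_z(202_000, 200_000, sd=60_000, n_per_group=n)
    print(f"n per group = {n:>9,}: z = {z:.2f}, p = {p:.3g}")
```

Under these assumptions the gap is indistinguishable from noise with 100 sales per group and overwhelmingly "significant" with a million, even though the size of the gap never changes. That is the sense in which big data can make small effects look more important than they are.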

Three reasons Millennials are driving less and going fewer places overall

A new study attributes less driving among Millennials to three factors:

The truth might be a little of this, a little of that, and even some of the other. That’s the takeaway from a new analysis of Millennial driving habits from transport scholar Noreen McDonald of the University of North Carolina. Writing in the Journal of the American Planning Association, McDonald attributes 10 to 25 percent of the driving decline to changing demographics, 35 to 50 percent to attitudes, and another 40 percent to the general downward shift in U.S. driving habits…

What makes McDonald’s work especially useful and compelling is that she compared the travel patterns of Millennials (born between 1979 and 1990, by her definition) with those of Generation X (born 1967-1978) at the same age. So she looked at driving data (both trips and miles) from tens of thousands of individuals in 1995, 2001, and 2009 alike.

But, it isn’t just that Millennials are driving less – they are going fewer places overall.

This analysis provides evidence of a long-term decrease in automobility that started in the late 1990s with younger members of Gen X and has continued with the Millennial generation. The decrease in driving has not been accompanied by an increase in other modes of travel or a decline in average trip length, meaning that younger Americans are increasingly going fewer places.

Those smartphones and media gadgets are pretty compelling and make accessing the rest of the world easier. Perhaps there is less need to wander and display independence by leaving the house. Maybe all those fears about crime out there have crept in for a whole generation.

If local mobility is reduced, does this mean this newer generation of Americans will have less geographic mobility within the United States (fewer moves or significant moves throughout their lives)?
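As a quick check on the three shares quoted above, here is a small sketch of the arithmetic. It assumes, as my own reading rather than anything stated in the excerpt, that the three factors are meant to account for 100 percent of the decline; `implied_attitudes` is an illustrative helper name.

```python
# Shares of the driving decline, in percent, as quoted in the excerpt above.
DEMOGRAPHICS = (10, 25)
ATTITUDES = (35, 50)
GENERAL_SHIFT = 40  # quoted as a single figure, not a range


def implied_attitudes(demographics_share):
    """Attitudes share left over if the three factors total 100 percent (assumption)."""
    return 100 - GENERAL_SHIFT - demographics_share


low_total = DEMOGRAPHICS[0] + ATTITUDES[0] + GENERAL_SHIFT
high_total = DEMOGRAPHICS[1] + ATTITUDES[1] + GENERAL_SHIFT
print(f"All low ends: {low_total}%, all high ends: {high_total}%")

for share in DEMOGRAPHICS:
    print(f"demographics {share}% -> attitudes {implied_attitudes(share)}%")
```

The low ends and high ends cannot all hold at once, and the attitudes range appears to be simply what remains once the demographics range and the fixed 40 percent are subtracted. So the more of the decline that is demographic, the less is left to explain by changing attitudes.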

New Federal website shows complaints about mortgage lenders

Thanks to the Consumer Financial Protection Bureau, there is a new website for narratives of consumer complaints regarding mortgage lenders:

The bureau logs each complaint by category in a publicly viewable database and gives the company that is the subject of a complaint time to respond via a nonpublic online portal connecting it with the consumer through a bureau intermediary. In the past three years, according to the bureau, it has received and worked on more than 627,000 complaints. They range from alleged harassment by debt-collection attorneys, to foreclosures, student loan defaults and poor treatment of customers by loan servicers. Roughly 28 percent of all complaints filed to date have been about mortgage issues — the largest single category. What’s been missing, though, has been any real detail about the troubling circumstances that triggered the complaint in the first place expressed in the customer’s own words.

Starting in late June, that all changed. The bureau began posting what it calls “narratives” that name the bank or company involved and go into sometimes excruciating detail. Allegations get pretty serious — charges of lending fraud, violations of federal regulations and illegal overcharges. Some are heartfelt, such as one from a Virginia homebuyer whose closing was repeatedly delayed by the bank: “Who compensates us for the loss of income for the days taken off from work (to attend closings)? For the movers that have been scheduled? For the pre-move-in renovations that cannot now be done because the contractors are fully scheduled for the rest of the summer?” (To see the narratives, go to http://tinyurl.com/phnkq99)

The first batch of 7,700-plus narratives was posted June 25, including hundreds of mortgage complaints. The consumer’s name and address – other than state of residence – are redacted, as are all details the bureau or the consumer considers private.

Lenders are not permitted to post their own narratives, but instead must use one of several stock responses, such as “company can’t verify or dispute the facts in the complaint” or “company believes it acted appropriately as authorized by contract or law.” Lenders can also decline to participate in the narratives process by saying, “Company chooses not to provide a public response.”

The article suggests two large threads emerge from the complaints: dislike of being placed in customer service hell without getting answers from anyone and problems with escrow accounts.

Not surprisingly, lenders are not happy with this information on the website. The issue is similar to that which plagues many online reviews: how can businesses or readers be sure that the story or review is credible? Yet, this certainly puts more information on the side of consumers and this is needed in an industry that holds so much debt for so many people.

These narratives posted online would make for some good coding opportunities for social scientists…

NYC Council to Google: mark truck routes, no left turns

Two members of the New York City council have two recommendations for the routes provided by Google Maps:

Council members Brad Lander, deputy leader of policy for the council, and Ydanis Rodriguez, who chairs the council’s transportation committee, wrote a letter to Google on July 1 suggesting two enhancements to the company’s maps. One would create a “stay on truck routes” option for truck drivers. The other, which has a much broader application, would allow users to select “reduce left turns,” minimizing the number of such turns required on a given trip.

Why reduce left turns? In their letter, Lander and Rodriguez cited an extensive report from WNYC reporter Kate Hinds about the danger of left turns by motor vehicles in an urban environment where lots of people travel on foot and by bicycle. According to data compiled by Hinds and her colleagues, 17 pedestrians and three bicyclists were killed in New York by left-turning vehicles last year. The fatality rate for pedestrians struck by drivers making lefts in the city is the highest in the nation, according to Hinds’s report…

The city’s department of transportation has been redesigning intersections to make left turns safer by changing signals and incorporating other design measures. But Lander and Rodriguez got the idea to ask Google to help by giving its map users the chance to request a “reduce left turns” routing option. “We haven’t heard back yet,” says Rodriguez. “But we hope, knowing that Google is one of those good private entities, that Google can look at this.”…

Nationally, a quarter of motor-vehicle crashes involving pedestrians occur during left turns. A 2013 study found that when drivers make “permitted” left turns—in which they do not have the protection of a left-turn green arrow—they are not even looking to see if there is a pedestrian in their path as much as 9 percent of the time. Such turns, the study found, pose an “alarming” level of risk to pedestrians.

Generally, I would be in favor of Google Maps and other programs offering more route options for those who have particular routes they might want to choose. Routes with late night gas stations? Routes that are more scenic? Routes that avoid long stretches of strip malls? Routes that involve driving near fewer semis? Routes with more interesting sights along the way? Just like Google Mail has lab features you can turn on and off, why not do some of this for driving routes?

Even if Google makes the left turn information available as an option, how much of an effect would it have on safety? The average driver probably doesn’t think much about reducing left turns. So, Google could help by suggesting people might want this but I could also imagine a public campaign advising against left turns. Now, if Google started eliminating left turns without telling people, that could get interesting…
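To make the "reduce left turns" idea concrete, here is a toy sketch of how a router could penalize left turns. It is not how Google Maps works; the square grid of two-way streets, the one-unit cost per block, the ban on U-turns, and the function name `route_cost` are all assumptions for illustration. It runs Dijkstra's algorithm over (position, heading) pairs so that the cost of a move can depend on the turn taken.

```python
import heapq


def route_cost(size, start, heading, goal, left_penalty):
    """Cheapest trip on a size x size street grid, charging extra per left turn.

    Positions are (x, y) with y increasing northward; heading is a unit step
    such as (1, 0) for east. Each block costs 1. U-turns are not allowed.
    """
    moves = [(1, 0), (0, 1), (-1, 0), (0, -1)]
    queue = [(0, start, heading)]
    seen = set()
    while queue:
        cost, pos, facing = heapq.heappop(queue)
        if pos == goal:
            return cost
        if (pos, facing) in seen:
            continue
        seen.add((pos, facing))
        for dx, dy in moves:
            if (dx, dy) == (-facing[0], -facing[1]):
                continue  # skip U-turns
            nxt = (pos[0] + dx, pos[1] + dy)
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size):
                continue  # off the grid
            # Positive cross product means a counterclockwise (left) turn.
            cross = facing[0] * dy - facing[1] * dx
            step = 1 + (left_penalty if cross > 0 else 0)
            heapq.heappush(queue, (cost + step, nxt, (dx, dy)))
    return None  # goal unreachable


# A driver facing east who wants to end up two blocks north.
for penalty in (0, 3, 10):
    cost = route_cost(5, (2, 1), (1, 0), (2, 3), penalty)
    print(f"left-turn penalty {penalty}: trip cost {cost}")
```

With no penalty the router turns left immediately. With a large enough penalty it prefers three right turns around the block, which is longer in distance but avoids the left. A real option would presumably have to weigh that extra distance and time against the safety benefit.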

The ongoing mystery of counting website visitors

The headline says it all: “It’s 2015 – You’d Think We’d Have Figured Out How to Measure Web Traffic By Now.”

ComScore was one of the first businesses to take the approach Nielsen uses for TV and apply it to the Web. Nielsen comes up with TV ratings by tracking the viewing habits of its panel — those Nielsen families — and taking them as stand-ins for the population at large. Sometimes they track people with boxes that report what people watch; sometimes they mail them TV-watching diaries to fill out. ComScore gets people to install the comScore tracker onto their computers and then does the same thing.

Nielsen gets by with a panel of about 50,000 people as stand-ins for the entire American TV market. ComScore uses a panel of about 225,000 people to create their monthly Media Metrix numbers, Chasin said – the numbers have to be much higher because Internet usage is so much more particular to each user. The results are just estimates, but at least comScore knows basic demographic data about the people on its panel, and, crucial in the cookie economy, knows that they are actually people.

As Chasin noted, though, the game has changed. Mobile users are more difficult to wrangle into statistically significant panels for a basic technical reason: Mobile apps don’t continue running at full capacity in the background when not in use, so comScore can’t collect the constant usage data that it relies on for its PC panel. So when more and more users started going mobile, comScore decided to mix things up…

Each measurement company comes up with different numbers each month, because they all have different proprietary models, and the data gets more tenuous when they start to break it out into age brackets or household income or spending habits, almost all of which is user-reported. (And I can’t be the only person who intentionally lies, extravagantly, on every online survey that I come across.)…

And that’s assuming that real people are even visiting your site in the first place. A study published this year by a Web security company found that bots make up 56 percent of all traffic for larger websites, and up to 80 percent of all traffic for the mom-and-pop blogs out there. More than half of those bots are “good” bots, like the crawlers that Google uses to generate its search rankings, and are discounted from traffic number reports. But the rest are “bad” bots, many of which are designed to register as human users — that same report found that 22 percent of Web traffic was made up of these “impersonator” bots.

This is an interesting data problem to solve with multiple interested parties from measurement firms, website owners, people who create search engines, and perhaps, most important of all, advertisers who want to quantify exactly which advertisements are seen and by whom. And the goalposts keep moving: new technologies like mobile devices change how visits are tracked and measured.

How long until we get an official number from a reputable organization? Could some of these measurement groups and techniques merge? Consolidation to cut costs seems to be popular in the business world these days. In the end, it might not be good measurement that wins out but rather which companies can throw their weight around most effectively to eliminate their competition.
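The panel sizes quoted above can be put in rough perspective with the textbook margin of error for a proportion. This is only a sketch: it assumes a simple random sample, which real panels are not, and the 500-person subgroup is a hypothetical figure chosen to show what happens when numbers are broken out by age or income. `margin_of_error` is an illustrative helper, not any firm's method.

```python
import math


def margin_of_error(panel_size, share=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sample)."""
    return z * math.sqrt(share * (1 - share) / panel_size)


panels = [
    ("Nielsen panel (about 50,000)", 50_000),
    ("comScore panel (about 225,000)", 225_000),
    ("hypothetical subgroup of 500", 500),
]
for label, size in panels:
    points = margin_of_error(size) * 100
    print(f"{label}: +/- {points:.2f} percentage points")
```

Under these assumptions both full panels come in well under one percentage point, while the small subgroup is off by several points. That fits the article's observation that the data gets more tenuous once it is sliced into brackets, and none of this accounts for bots or self-reported answers.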

Use Airbnb to try a neighborhood before you buy a home

The neighborhood is an important part of purchasing a new home so I’m surprised it has taken so long to get to a solution like this: use Airbnb to try out the neighborhood before you buy.

Realtor.com and Airbnb have teamed up to show visitors to the realty website what Airbnb rentals are near properties listed for sale, so potential buyers can test-drive a neighborhood.

“This collaboration with Airbnb reinforces our commitment to giving consumers unparalleled insight to make informed real estate decisions,” Ryan O’Hara, chief executive officer of Realtor.com, said in a statement Wednesday. “Our relationship with Airbnb … allows us to reduce some of the unknown factors associated with relocating to a new community.”

I wonder how many people will take advantage of this. Even though Airbnb might make it easier to try the neighborhood yourself, it still requires the effort of signing up, actually staying, and then looking around and/or talking to people. And if you are going to go to the trouble to walk around and talk to people, do you actually need to stay the night? Remember, many neighborhood members may just be trying to avoid each other (examples here and here).

Perhaps the next step in all of this is to find a way for people to stay at the prospective home itself before they buy. Perhaps you could get two days and one night to stay there but have to put down a hefty fee. This may not work if the homeowners are still living there but it would offer an unparalleled look at a major purchase.

Mark Zuckerberg encouraging people to read sociological material

Mark Zuckerberg has been recommending an important book every two weeks in 2015 and his list thus far includes a number of works that touch on sociological material:

Zuckerberg’s book club, A Year of Books, has focused on big ideas that influence society and business. His selections so far have been mostly contemporary, but for his eleventh pick he’s chosen “The Muqaddimah,” written in 1377 by the Islamic historian Ibn Khaldun…

Ibn Khaldun’s revolutionary scientific approach to history has established him as one of the foundational thinkers of modern sociology and historiography…

The majority of Zuckerberg’s book club selections have been explorations of issues through a sociological lens, so it makes sense that he is now reading the book that helped create the field.

A Year of Books so far:

  • “The End of Power: From Boardrooms to Battlefields and Churches to States, Why Being In Charge Isn’t What It Used to Be” by Moisés Naím
  • “The Better Angels of Our Nature: Why Violence Has Declined” by Steven Pinker
  • “Gang Leader for a Day: A Rogue Sociologist Takes to the Streets” by Sudhir Venkatesh
  • “On Immunity: An Inoculation” by Eula Biss
  • “Creativity, Inc.: Overcoming the Unseen Forces That Stand in the Way of True Inspiration” by Ed Catmull and Amy Wallace
  • “The Structure of Scientific Revolutions” by Thomas S. Kuhn
  • “Rational Ritual: Culture, Coordination, and Common Knowledge” by Michael Chwe
  • “Dealing with China: An Insider Unmasks the New Economic Superpower” by Henry M. Paulson
  • “Orwell’s Revenge: The 1984 Palimpsest” by Peter Huber
  • “The New Jim Crow: Mass Incarceration in the Age of Colorblindness” by Michelle Alexander
  • “The Muqaddimah” by Ibn Khaldun

An interesting set of selections. At the least, it suggests Zuckerberg is broadly interested in social issues and not just the success of Facebook (whether through gaining users or producing sky-high profits). More optimistically, perhaps Zuckerberg has a sociological perspective and can take a broader view of society. This could be very helpful given that his company is a sociological experiment in the making – not the first social networking site but certainly a very influential one that has helped pioneer new kinds of interactions as well as changed behaviors from news gathering to impression management.

The more cynical take here is that this book list is itself an impression management tool intended to bolster his reputation. Look, I do really want the best for our users and society! However, would this be the set of books that would most impress the public or investors? Listing sociology books as well as books regarding sociological topics may only impress some.

Autonomous vehicles to intensify motion sickness

A new study suggests self-driving cars will make motion sickness worse:

For adults, motion sickness will be more of an issue in self-driving vehicles than in conventional vehicles. Some are expected to experience motion sickness often, while others may actually feel sick every time they’re riding in an autonomous vehicle, a study by researchers at The University of Michigan’s Transportation Research Institute revealed…

Mr. Sivak and his co-researcher Brandon Schoettle looked at the three main factors that cause motion sickness (conflict between vestibular and visual inputs; inability to anticipate the direction of motion; and lack of control over the direction of motion) and determined that they are elevated in self-driving vehicles…

It’s become evident that self-driving cars will replace traditional cars in the future, and when this happens, all adults (who are most prone to motion sickness) will be passengers at all times. Mr. Sivak clarified that being a passenger in an autonomous vehicle will be quite different than riding along in a train or other mode of public transportation, for, unlike trains, self-driving cars will be subject to more lateral acceleration/deceleration as well as longitudinal acceleration/deceleration that is drastically less smooth. The small windows won’t help either.

The other major factor in the increased prevalence of motion sickness is what adults will do whilst in cars instead of driving. In an opinion survey of 3,255 adults from the U.S., China, India, Japan, Australia and the U.K., respondents named reading, talking/texting, sleeping, watching movies/TV, working and playing games as the activities they’ll engage in while riding in self-driving cars. According to the study, almost all of the activities mentioned worsen the frequency and severity of motion sickness.

An interesting side effect of a new technology. But, this means that automakers can/should include virtual reality devices to ease the ride – they won’t just be cars but rather entertainment pods! Everyone can be like the kids of today who ride in the expensive minivans and SUVs watching their split screen entertainment systems while holding their tablets and smartphones…

“Why Is My Smart Home So Stupid?”

A marketing professor gives an answer to this simple question:

One popular answer is that the Internet of Things is still in its infancy and that better technology and standards are within reach and will lead to greater integration, and thus, greater smartness in the not too distant future.

There is some value in this explanation. Everyone who has ever tried to get an IP camera to work on a cell phone will probably agree. But this answer is also entirely steeped in a technological mindset and the naive belief that better technology will automatically improve our lives.

An alternative explanation may be that popular tropes such as the “Internet of Things” not only inspire but also constrain our imagination as innovators and as consumers. Designing greater customer experiences and, thus, extracting greater economic value may be a matter of avoiding this trope altogether…

One managerial implication we can derive from Epp, Schau, and Price is that different smart home definitions are possible. And Nest’s definition seems much more powerful than Plum’s. Plum adds yet another layer to the Internet of Things, and the result is often a home where everything is connected but nothing adds up. In sharp contrast, Nest succeeds by putting its technology in service of a much higher sociological goal: the age-old quest to create and sustain a happy family.

The suggestion here is that new technology is only as good as the improvements in social interactions that it brings. Way before the smart home, modern consumers have been promised all sorts of benefits from new technology but the created items don’t always lead to the desired social outcomes. Cars enabled easier transportation but led to more private existences and increasing sprawl. Similarly, more single-family homes gave people space but helped spread them out. The radio and later television delivered mass media, theoretically connecting people, but also led to people sitting around these items. Modern appliances were to save labor. The Internet allows unprecedented customized access to information yet can lead to echo chambers and isolated interactions. Will autonomous vehicles create more free time or more time to work?

Perhaps this should be a challenge for smart home innovators: how can new devices both help in their particular area (say heating or lighting or saving energy) and foster social interaction? This may actually be the harder part.

How to pronounce McMansion (courtesy of YouTube)

The Internet might bring some wonderful things but it can also make you scratch your head. Here is a YouTube video for how to pronounce McMansion. The video doesn’t exactly have a lot of views – 1 after I watched it! – and comes courtesy of DictionaryVoice.com.

Two quick thoughts:

1. I admit that I have looked up pronunciations through online dictionaries and had the site read to me. This can be a very handy Internet tool.

2. The next video YouTube plays after this one is the song “Jesusland” from Ben Folds. This is one of the few pop songs I know that mention McMansions and, fitting the common use of the word, Folds uses it as part of this critique of Middle America. Here is the portion of the song where it comes up:

Down the tracks, beautiful McMansions on a hill
That overlook a highway with riverboat casinos
And you still have yet to see a soul

Not too different from those depictions in Gone Girl.