The most common words found in American real estate listings

A new analysis looks at the 100 most common words found in American real estate listings with “beautiful” sitting at #1:

Point2Homes, which provides marketing services to real estate agents, ran the numbers on 300,000 active listings in the United States in the first half of 2012 to see which household features and characteristics were thought by their listing agents to attract buyer interest. Though such chestnuts as “must see” and “spacious” pervade the listing verbiage, it was interesting to note which other specifics appear to merit singling out, according to Roxana Baiceanu, a spokesman for the company…

But she said the analysts were a bit surprised to see the emphasis on such specifics as hardwood floors and stainless steel, which placed second and third, respectively, on the overall frequency list.

In all, the top 100 terms aren’t particularly surprising — nearly every listing in the history of American real estate would have you believe that there’s no such thing as an unappealing home. That list includes such predictables as “stunning,” “sunny,” “finest,” “perfect,” “super” and “spectacular,” along with more concrete features such as “home office,” “soaking tub” and “dishwasher.” But when Point2 started breaking the findings into geographic regions and price segments, it was a little more revealing…

Geographically, homes for sale in the Midwest and along the East Coast seem stuck in that “beautiful” rut, where that word held the No. 1 spot. But on the West Coast and in the South, “stainless steel appliances” went to the top of the heap, she said. Midwesterners also liked “fireplaces,” which showed up with two variations in the top 10; Eastern states placed a premium on “move-in condition;” the South was the only region to put the legendarily coveted “granite countertops” in its top tier of listing terms.

It is interesting to see stainless steel and hardwood floors rank so highly. While these may be desirable features, they are relatively quick fixes; other features, such as an “open concept” layout, are much harder to change.

This list suggests several things to me:

1. Selling a home involves a lot of marketing. That is obvious, but seeing a list full of vague, positive words is an extra reminder.

2. This list reads like a set of code words. If you aren’t familiar with real estate listings, these terms may strike you at face value, but if you see them regularly, you can read between the lines.

3. I wonder what happens to homes whose listings don’t feature these common words. Is there a penalty? Would this help the home stand out to a particular kind of buyer?
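The kind of frequency count behind this top-100 list is straightforward to reproduce. A minimal sketch, with hypothetical listing texts and a made-up stopword list standing in for Point2Homes’ 300,000 listings:

```python
from collections import Counter
import re

# Hypothetical listing descriptions standing in for the real dataset
listings = [
    "Beautiful sunny home with hardwood floors and stainless steel appliances",
    "Spacious move-in condition home, beautiful granite countertops",
    "Must see! Beautiful open concept with hardwood floors",
]

# Words too generic to be informative, dropped before counting
stopwords = {"with", "and", "a", "the", "in"}

words = (w for text in listings
           for w in re.findall(r"[a-z]+", text.lower())
           if w not in stopwords)

top = Counter(words).most_common(3)
print(top)  # "beautiful" tops this toy sample, just as it does the real list
```

The real analysis presumably also counted multi-word phrases (“stainless steel appliances,” “move-in condition”), which a single-word tokenizer like this one would split apart.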

Pew again asks for one-word survey responses regarding budget negotiations

I highlighted this survey technique in April but here it is again: Pew asked Americans to provide a one-word response to Congress’ debt negotiations.

Asked for single-word characterizations of the budget negotiations, the top words in the poll — conducted in the days before an apparent deal was struck — were “ridiculous,” “disgusting” and “stupid.” Overall, nearly three-quarters of Americans offered a negative word; just 2 percent had anything nice to say.

“Ridiculous” was the most frequently mentioned word among Democrats, Republicans and independents alike. It was also No. 1 in an April poll about the just-averted government shutdown. In the new poll, the top 27 words are negative ones, with “frustrating,” “poor,” “terrible,” “disappointing,” “childish,” “messy” and “joke” rounding out the top 10.

And then we are presented with a word cloud.

On the whole, I think this technique can suggest that Americans have generally unfavorable responses. But the reliance on particular terms is better for headlines than it is for collecting data. What would happen if public responses were split more evenly: which words would then be used to summarize the data? The Washington Post headline (and Pew Research as well) can now use forceful, emotional words like “ridiculous” and “disgusting” rather than the more accurate numerical figure that about “three-quarters of Americans offered a negative word.” Why not also include an ordinal question (strongly disapprove to strongly approve) about Americans’ general opinion of the debt negotiations in order to corroborate this open-ended question?

This is a potentially interesting technique for taking advantage of open-ended questions without allowing respondents to give lengthy responses. Open-ended questions can produce a lot of data: there were over 330 responses in this survey alone. I’ll be interested to see whether other organizations adopt this approach.
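Summarizing open-ended answers numerically, as suggested above, amounts to hand-coding each word and tallying shares. A toy sketch, with a hypothetical coding dictionary and responses (not Pew’s actual data):

```python
from collections import Counter

# Hypothetical hand-coded sentiment for each one-word response
coding = {"ridiculous": "negative", "disgusting": "negative",
          "stupid": "negative", "necessary": "neutral", "good": "positive"}

responses = ["ridiculous", "ridiculous", "disgusting", "stupid",
             "necessary", "good", "ridiculous"]

# Tally responses by coded category and convert to percentage shares
tally = Counter(coding[r] for r in responses)
share = {k: round(100 * v / len(responses)) for k, v in tally.items()}
print(share)
```

The aggregate shares (here, “negative” dominates the toy sample) are the kind of figure that survives an even split in responses, unlike the single most frequent word.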

Pew using word frequencies to describe public’s opinion of budget negotiations

In the wake of the standoff over a federal government shutdown last week, Pew conducted a poll of Americans regarding their opinions on this event. One of the key pieces of data that Pew is reporting is a one-word opinion of the proceedings:

The public has an overwhelmingly negative reaction to the budget negotiations that narrowly avoided a government shutdown. A weekend survey by the Pew Research Center for the People & the Press and the Washington Post finds that “ridiculous” is the word used most frequently to describe the budget negotiations [29 respondents], followed by “disgusting,” [22 respondents] “frustrating,” [14 respondents] “messy,” [14 respondents] “disappointing” [13 respondents] and “stupid.” [13 respondents]

Overall, 69% of respondents use negative terms to describe the budget talks, while just 3% use positive words; 16% use neutral words to characterize their impressions of the negotiations. Large majorities of independents (74%), Democrats (69%) and Republicans (65%) offer negative terms to describe the negotiations.

The full survey was conducted April 7-10 among 1,004 adults; people were asked their impressions of the budget talks in interviews conducted April 9-10, following the April 8 agreement that averted a government shutdown.

I would hesitate to lead an article or headline (“Budget Negotiations in a Word – ‘Ridiculous’”) with these word frequencies, since each was used by relatively few respondents: the most common response, “ridiculous,” was given by only 2.9% of survey respondents (based on the figure here of 1,004 total respondents). The better figures are the broader ones about negative responses: 69% used negative terms, and a majority of all political stripes used a negative descriptor.

You also have to dig into the complete report for some more information. Here is the exact wording of the question:

PEW.2A If you had to use one single word to describe your impression of the budget negotiations in Washington, what would that one word be? [IF “DON’T KNOW” PROBE ONCE: It can be anything, just the first word that comes to mind…] [OPEN END: ENTER VERBATIM RESPONSE]

Additionally, the full report says that this descriptor question was only asked of 427 respondents on April 9-10 (so my percentage above should be revised: it should be 29/427 = 6.8%). So a smaller sample answered this particular question; how generalizable are the results? And the most common response to this question is actually the “other” category, with 202 respondents. Presumably the “others” are mostly negative, since we are told 69% used negative terms. (As a side note, why not separate out the “don’t know” and “refused” responses? There are 45 people in this combined category, but these seem like different answers.)
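The correction above is just a change of denominator; as a quick check:

```python
# "Ridiculous" count from the Pew report
ridiculous = 29

# Naive share using the full survey sample
full_sample = 1004
print(round(100 * ridiculous / full_sample, 1))  # 2.9

# Corrected share using only those asked the one-word question
asked = 427
print(round(100 * ridiculous / asked, 1))  # 6.8
```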

One additional thought I have: at least this wasn’t put into a word cloud in order to display the data.

Modeling “wordquakes”

Several researchers suggest that certain words on the Internet are used in patterns similar to those of earthquakes:

News tends to move quickly through the public consciousness, noted physicist Peter Klimek of the Medical University of Vienna and colleagues in a recently posted paper. Readers usually absorb a story, discuss it with their friends, and then forget it. But some events send lasting reverberations through society, changing opinions and even governments.

“It is tempting to see such media events as a human, social excitable medium,” wrote Klimek’s team. “One may view them as a social analog to earthquakes.”…

Events that came from outside the blogosphere also seemed to exhibit aftershocks that line up with Omori’s law for the frequency of earthquake aftershocks.

“We show that the public reception of news reports follow a similar statistic as earthquakes do,” the researchers conclude. “One might also think of a ‘Richter scale’ for media events.”

“I always think it’s interesting when people exploit the scale of online media to try to understand human behavior,” said Duncan Watts, a researcher at Yahoo! Research who describes himself as a “reformed physicist who has become a sociologist.”

But he notes that drawing mathematical analogies between unrelated phenomena doesn’t mean there’s any deeper connection. Many systems, including views on YouTube, activity on Facebook, the number of tweets on Twitter, avalanches, forest fires, power outages and hurricanes, show frequency graphs similar to those of earthquakes.

“But they’re all generated by different processes,” Watts said. “To suggest that the same mechanism is at work here is kind of absurd. It sort of can’t be true.”
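For reference, Omori’s law, which the researchers invoke, says the aftershock rate decays roughly as a power law in time since the main shock: n(t) = K / (c + t)^p. A minimal sketch with made-up parameter values (the paper’s fitted values are not given here):

```python
def omori(t, K=100.0, c=1.0, p=1.1):
    """Modified Omori law: aftershock rate t days after the main shock.

    K, c, and p are empirical constants fit to each aftershock sequence;
    the defaults here are illustrative, not fitted values.
    """
    return K / (c + t) ** p

# The rate drops off sharply at first, then in a long, slow tail --
# the same shape the paper reports for mentions of a news event.
for t in [0, 1, 7, 30]:
    print(t, round(omori(t), 1))
```

Watts’s objection fits this sketch exactly: many unrelated processes can be fit by a curve of this shape, so a good fit alone does not establish a shared mechanism.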

A couple of things are of note:

1. One advantage of the Internet as a medium is that people can fairly easily track these sorts of social phenomena. The data is often right in front of our eyes, and once collected and put into a spreadsheet or statistics program, it is like any other dataset.

2. An interesting quote from the story: the “reformed physicist who has become a sociologist.” The pattern’s resemblance to an earthquake is interesting, but sociologists would also want to know why this is the case and what factors affect the initial “wordquake” and its subsequent aftershocks. (It is also interesting that the paper came from physicists: how many sociologists would look at this word frequency data and think of an earthquake pattern?)

2a. Just thinking about these word frequencies, how does this earthquake model differ from other options for looking at this sort of data? For example, researchers have used diffusion models to examine the spread of riots. Is a diffusion model better than an earthquake model for this phenomenon?

3. Does this model offer any predictive power? That is, does it give us any insights into what words may set off “wordquakes” in the future?