Finding the mean, median, and modal Walmart shopper

Numerator found that Walmart’s typical shopper in the US is a white woman between 55 and 64 years old, who is married and living in the suburbs of the Southeast. She typically has an undergraduate degree and earns about \$80,000 per year.

She visits Walmart at least once per week — about 63 trips per year — and picks up 13 products for a total cost of about \$54 per trip. 13.5% of her spending takes place at Walmart, while she spends about 11% at Amazon.

Her primary shopping categories in-store are groceries, including chicken, fruit, snacks and sweets, but she also gets a lot of fast food. Her favorite five brands at Walmart are Turkey Knob, Cheetos, Betty Crocker, Dole, and Tyson.

I am always looking for examples to help illustrate the differences between the three primary measures of central tendency: mean, median, and mode. When an article or report says something is “typical,” what exactly do they mean? Here is my guess at which data above is which measure of central tendency:

-mean: age, education level, visits to Walmart, money spent per trip

-median: income

-mode: race/ethnicity, marital status, place of residence, what is purchased

Some of these are harder to guess or do not fit these three options well. For example, is the \$54 per visit a mean or a median? And the five favorite brands are not a single mode: they may lead the list of brands without accounting for much of the total share of purchases.
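To make the guessing game concrete, here is a minimal Python sketch using a tiny made-up sample of shoppers. The values are hypothetical and are not Numerator's data; I picked them so the mean spend lands on \$54. The sketch shows that a category like region can only have a mode, while a dollar amount can have a mean and a median that disagree.

```python
from statistics import mean, median, multimode

# Hypothetical shoppers, invented purely for illustration (not Numerator data).
regions = ["Southeast", "Midwest", "Southeast", "West", "Southeast", "Midwest"]
spend_per_trip = [22, 35, 41, 48, 54, 124]  # dollars per trip


def summarize(regions, spend):
    """Return the modal region(s) plus the mean and median spend."""
    return {
        "modal_regions": multimode(regions),  # the mode works on categories
        "mean_spend": mean(spend),            # pulled up by the 124 trip
        "median_spend": median(spend),        # middle of the sorted amounts
    }


summary = summarize(regions, spend_per_trip)
print(summary)
```

In this invented sample the mean is 54 but the median is 44.5, so a report that only says "about \$54 per trip" leaves open which of the two it is describing. `multimode` returns a list because a category can have ties, which is the situation with the five favorite brands.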

Additionally, it would be interesting to add measures of variability. How much variation is there in the age and education level of Walmart shoppers? I would guess the company wants to know more about the \$54 spent per trip; how many spend more and what could be done to increase the number of people who spend more? Throw in a standard deviation or some other measure of dispersion and the numbers above become much more interesting.
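As a rough sketch of why dispersion matters, the snippet below compares two hypothetical sets of per-trip totals. Both lists are invented for illustration and both average \$54, yet they describe very different groups of shoppers. It uses the population standard deviation (`pstdev`); a sample standard deviation would be the other reasonable choice.

```python
from statistics import mean, pstdev

# Two hypothetical sets of per-trip totals in dollars (invented for illustration).
steady = [50, 52, 54, 56, 58]       # everyone spends close to the average
spread_out = [10, 20, 54, 88, 98]   # small top-up trips mixed with big stock-ups

for name, trips in [("steady", steady), ("spread out", spread_out)]:
    # Same mean in both cases; only the standard deviation tells them apart.
    print(f"{name}: mean = {mean(trips)}, std dev = {pstdev(trips):.1f}")
```

The two lists share a mean of 54, but the second has a standard deviation more than ten times larger, which is the kind of detail a single "typical" number hides.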

In the end, the report above does not mean that someone visiting a Walmart will find that most shoppers fit that profile. The measures of central tendency here tell us something, but using multiple measures plus some measures of variability would reveal more about who is at Walmart.

The problem with using averages as illustrated by the average salaries of NBA players

In negotiations between NBA owners and players, the topic of the “average player salary” has come up. This discussion illustrates some of the issues involved with using averages and medians:

Here is the “average player salary” for each of the major U.S. professional team sports, based on a variety of sources using the most recent data available:

NBA: \$5.15 million (2010-11)

MLB: \$3.34 million (2010)

NHL: \$2.4 million (2010-11)

NFL: \$1.9 million (2010)

From the public’s view, these numbers are high in all four sports. But players and agents argue that these averages obscure important distinctions including the value of certain positions over others (the quarterback in the NFL versus the punter) and the size of the roster (fewer NBA players, more NFL players).

One common solution to problems with averages is to instead use a median. Here is how this might change the discussion in the NBA:

“It’s the median salary that’s more important,” NBA agent Bill Duffy said. “Look at the Miami Heat as an analogy here: You’ve got three guys making \$17 million and probably six guys making \$1.2 [million]. So that’s a little misguided, that average salary.”…

It is not unlike, Duffy said, news stories that cite the “average” U.S. household income as opposed to the median. The latter figure, according to the most recent U.S. census, was \$50,233. If you were to average in the dollar amounts pulled down by Wall Street bankers, Ivy League lawyers, certain public-union employees and yes, professional athletes, that number would jump considerably.

Curiously, neither the NBA nor the NBPA seems to make much use of a median player salary.

“We use [average] because it’s the most commonly used measure and best reflects the amount of compensation that the NBA provides to players across the league,” an NBA spokesman said this week. “In addition, it’s the measure that both we and the union agreed upon in the CBA.”

In the NFL, the median salary is approximately \$770,000 — about 40 percent of the average.

In the NBA, using USA Today salary figures for the 2009-10 season, the estimated median salary was about \$2.33 million. That’s still about 46 times what the median U.S. household earns, but it is less than half what the max-salary-bloated “average” is.

What happens in these sports is this: a small number of star athletes make huge amounts of money, pulling the average for all athletes up. If you use the median instead, the point where 50% of the players make more and 50% make less, it shows that most athletes in each sport make well below the average. Particularly in the NFL, which has bigger rosters, the gap between the average and the median shows that many players make far less than the average suggests.
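Duffy's Miami Heat analogy can be checked with a few lines of Python. This is only a sketch built from his round numbers (three players at \$17 million and "probably" six at \$1.2 million); it is not the actual Heat payroll.

```python
from statistics import mean, median

# Duffy's rough illustration, in millions of dollars: three stars at 17
# and six teammates at 1.2. These are his round numbers, not real payroll data.
salaries = [17.0] * 3 + [1.2] * 6

avg = mean(salaries)    # total payroll divided by nine players
mid = median(salaries)  # the fifth-highest salary of the nine

print(f"mean = {avg:.2f} million, median = {mid:.2f} million")
# prints: mean = 6.47 million, median = 1.20 million
```

On this toy roster the average is more than five times the median, even though six of the nine players earn exactly the median. Three large salaries are enough to move the mean while leaving the median untouched.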

It is interesting that the NBA spokesman said the two sides had agreed in their Collective Bargaining Agreement that they would use the average salary figure. Was this really a point of contention in negotiations, or did no one really think about the consequences? What was the thinking behind this for the players? If the union was focused on helping all of its members, perhaps it would focus on the median, suggesting that the union is strongest when all of its members are well taken care of. This lower figure might also look more palatable to the public, though it is unclear whether public perceptions have any influence on such negotiations. However, if the union was more interested in making sure that individual athletes could receive the biggest possible payouts because of their athletic exploits, then perhaps the average is better.

Two takeaway points:

1. Averages and medians are both measures of central tendency, but they are open to different interpretations. People need to be clear about which measure they are using and which interpretation their number supports.

2. It will be interesting to see if the new CBA is based on average or median salaries.