When I teach Statistics and Research Methods, we talk a little about how researchers decide to create and use categories for their data. As this discussion of fertility recommendations shows, creating categories can be a tricky process:
Being 35 or older is labeled by the medical community as “advanced maternal age.” In diagnosis code speak, these patients are “elderly,” or in some parts of the world, “geriatric.” In addition to being offensive to most, these terms—so jarringly at odds with what is otherwise considered a young age—instill a sense that one’s reproductive identity is predominantly negative as soon as one reaches age 35. But the number 35 itself, not to mention the conclusions we draw from it, has spun out of our collective control…
The 35-year-old threshold is not only known by patients, it is embraced by doctors as a tool that guides the care of their patients. It’s used bimodally: If you’re under 35, you’re fine; if you’re 35 or older, you have a new host of problems. This interpretation treats the issue at hand as what is known as a “threshold effect.” Cross the threshold of age 35, it implies, and the intrinsic nature of a woman’s body has changed; she falls off a cliff from one category into another. (Indeed, many of my patients speak of crossing age 35 as exactly this kind of fall, with their fertility “plummeting” suddenly.) As I’ve already stated, though, the age-related concerns are gradual and exist along a continuum. Even if the rate of those risks accelerates at a certain point, it’s still not a quantum leap from one risk category to another.
This issue comes up frequently in science and medicine. In order to categorize things that fall along a continuum, things that nature itself doesn’t necessarily distinguish as being separable into discrete groups, we have to create cutoffs. Those work very well when comparing large groups of patients, because that’s what the studies were designed to do, but to apply those to individual patients is more difficult. To a degree, they can be useful. For example, when we are operating far from those cutoffs—counseling a 25-year-old versus a 45-year-old—the conclusions to draw from that cutoff are more applicable. But operate close to it—counseling a 34-year-old trying to imagine her future 36-year-old self—and the distinction is so subtle as to be almost superfluous.
The trade-offs seem clear. A single cutoff where the data shifts from one category to another, here age 35, simplifies the research findings (though the article suggests the findings may not actually point to 35) and allows doctors and others to offer clear guidance. The number is easy to remember.
A continuum, on the other hand, might better fit the data if there is no clear drop-off near age 35. A continuum also gives doctors and patients more flexibility to develop an individualized approach.
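For students who like to see the contrast concretely, here is a minimal Python sketch of the two ways of treating age. Everything in it is an assumption made for illustration: the function names are hypothetical, and the smooth curve is an invented logistic function with made-up parameters, not an estimate from any fertility data. Only the cutoff of 35 comes from the article.

```python
import math

CUTOFF_AGE = 35  # the threshold discussed in the article


def risk_category(age: float) -> str:
    """Threshold approach: one cutoff splits everyone into two labels."""
    return "advanced maternal age" if age >= CUTOFF_AGE else "standard"


def illustrative_risk(age: float) -> float:
    """Continuum approach: a smooth score between 0 and 1.

    Hypothetical logistic curve; the midpoint (40) and scale (4) are
    invented for illustration and carry no medical meaning.
    """
    return 1 / (1 + math.exp(-(age - 40) / 4))


# Compare ages far from the cutoff (25, 45) with ages close to it (34, 36).
for age in (25, 34, 36, 45):
    print(age, risk_category(age), round(illustrative_risk(age), 2))
```

In this sketch, the 34-year-old and the 36-year-old receive different labels even though their scores on the smooth curve are close, while the 25-year-old and the 45-year-old differ substantially under both approaches. That mirrors the article's point that a cutoff is more informative far from the threshold than near it.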
Deciding which is better requires thinking about the advantages of each, the purpose of the categories, and who wants what information. The “easy” answer is that both approaches can coexist; people could keep in mind a rough threshold of 35 while doctors and researchers discuss why that particular age may or may not matter for a given person.
More broadly, learning more about continuums and considering when they are worth deploying could benefit our society. I realize I am comfortable with them; sociologists suggest many social phenomena fall along a continuum, with many cases falling between the extremes. But this tendency toward continuums, spectrums, and more nuanced or complex results may not always be helpful. We can decry black-and-white thinking, and yet we all regularly need to make quick decisions based on a limited number of categories (I am thinking of the System 1 thinking described by behavioral economists and others). Even as we strive to collect good data, we also need to pay attention to how we organize and communicate that data.