During discussions in grant review panels, I’ve often heard
an investigator described as “passionate” about their proposed work. That “passionate” label is usually spoken as
praise and endorsement. But that word
sometimes worries me.
We mental health researchers should certainly be passionate
about problems we hope to solve. Passion
helps us to stay focused on our real goal: improving the lives of people with
mental health conditions. Without that
motivated focus, we can be distracted by bureaucratic frustrations or academic
politics. We certainly have plenty of unsolved
problems to passionately focus on, including the stigmatization of people who
live with mental health conditions, the disappointing effectiveness of our
current treatments and the sorry state of our mental health care system.
But passion about research questions must be distinguished
from passion about any specific answers.
Loyalty to any particular belief or theory is a problematic motivation
for research. Scientific theories and
beliefs are made to be tested and then refined or discarded. Rather than asking what study could prove I’m
right, I should ask what study could show me where my beliefs or theories don’t
hold up to evidence. We do research to
learn where we are wrong, not to confirm what we already believe.
And we should certainly not be passionate about specific programs
or treatments, especially those we helped to create. While we may hope that some new treatment or
program will be helpful, we must be open to the possibility that it will
fail. Studies of “failed” treatments or
programs are often more informative than so-called successes, but only if we are
willing to see them clearly. We should
be especially skeptical when testing treatments or programs in which we have a
vested interest – even if that interest is just personal pride rather than financial gain.
Nor should we be passionate about defending the findings of our
past research. It’s inevitable that much
of my past research will turn out to be incorrect or incomplete. What’s not inevitable is whether I’ll be the
first to discover something new or the last to acknowledge progress.
That’s the challenging but essential balance: remaining both passionate about why we do mental health research and skeptical while we are doing it. It’s a New Year’s resolution that requires renewal every year.
Over the last two months, I’ve helped at some of KP
Washington’s weekend COVID-19 vaccination clinics. After a year of pandemic disruption and worry,
giving vaccine shots is really a joy.
I’ll definitely remember the day we vaccinated over 1,100 people.
A few weeks ago, I was assigned to the post-vaccination
observation area. Watching for allergic
reactions is not as gratifying as giving shots.
But spending several hours watching over the recently vaccinated did
give me time to think about that type of work.
Fortunately, we didn’t see any severe or dangerous reactions to the vaccine. Several people did feel lightheaded or dizzy after vaccination, but those reactions were more ordinary anxiety than allergy. My job was to respond quickly to any serious reaction and not overreact to reactions that weren’t dangerous. Epi-pens, IVs, and stretchers were just out of sight; we’d rather not bring them out unnecessarily. Those interventions are pretty intrusive – and alarming to everyone else watching.
Looking out over the post-vaccine observation area, I was
reminded of my days as a lifeguard at the city pool during high school
summers. That lifeguard job was actually
pretty similar to post-vaccine observation.
We often watched over several hundred people in an afternoon. My job was to react quickly to any real
danger while not diving in to rescue anyone who didn’t need that kind of help. Grabbing someone to drag them to the side of
the pool is also pretty intrusive – and alarming to everyone watching.
And that got me thinking about my current day job in mental
health care – especially our work regarding outreach and population-based
care. There as well, we hope to respond quickly
to people with urgent needs while avoiding any over-reaction that’s intrusive
or alarming. The help we offer shouldn’t
be overly restrictive or overly medicalized.
But how can we know who needs more urgent – or even more intrusive – help? I decided that my post-vaccine observation work could borrow a practice from my lifeguard days. Back then, whenever I fished someone out of the deep end, I’d ask them to sit next to me on the lifeguard bench for a while. And we’d talk a bit. Sitting and talking together was both diagnosis (Is this swimmer ready to go back in the water?) and treatment (“I think you’ll be fine – now enjoy the rest of your day”). So I tried that out in the vaccination clinic. When one of our vaccinated felt dizzy or faint, I’d ask them to come sit next to me so I could take their hand to feel their pulse. We would make small talk so I could see their face and hear their voice. And after a while I could say “I think you’ll be fine – now enjoy the rest of your day.”
Three events in the last month got me thinking about the role that
clinicians and researchers (like me) play in the popular media.
A group of Stanford faculty recently published an opinion
piece in JAMA criticizing Scott Atlas for spreading misinformation and
recommending misguided policies regarding the COVID-19 pandemic. They argued (and I agree) that Atlas, a
non-practicing neuroradiologist, misused the credibility and authority attached
to his medical training. That criticism
of Atlas cited the American Medical Association’s Code of
Medical Ethics Opinion that physicians “…should
ensure that the medical information they provide is…commensurate with their
medical expertise and based on valid scientific evidence.”
Around the same time, I did an interview for a local
TV station regarding effects of the pandemic on mental health and risk of
suicide. I talked about all the ways
that the pandemic and related social and economic disruption could increase
risk for mental health problems and suicidal behavior. And, because the interviewer asked, I gave my
advice about self-care: keeping a regular sleep schedule, establishing a new
daily routine, finding new ways to create or maintain social connection,
avoiding over-use of alcohol or cannabis.
But I was careful to be honest about the limits of current knowledge and
my expertise. We don’t yet know if or how
the pandemic affected rates of suicidal behavior. Emphasizing the limits of our knowledge
probably reduces the chances that my face will appear on television, and that’s
fine with me.
My advice about self-care during the pandemic was
reasonably commensurate with my medical expertise. I am a practicing mental health clinician
rather than a non-practicing neuroradiologist.
So I know something about self-management of common mental health
problems. But I have no greater expertise
about coping with the pandemic than most mental health clinicians. And my expertise about coping with widespread
fear and loss was mostly common sense rather than new knowledge gained through research.
It’s not wrong to promote or reinforce common-sense
messages about self-care. Even if I have
no special expertise, my status as a clinician and researcher may lend those useful
messages more credibility. But I should
be careful not to claim special authority or credibility that I don’t actually
own. Media appearances are just a tool to support public
health messages rather than an end unto themselves. Aiming to become an “influencer” can lead
down a slippery slope away from medical expertise and valid scientific evidence.
And I must be especially careful about confusing medical
or scientific expertise with my personal values. All researchers and clinicians are certainly entitled
to hold and express personal values. Scott
Atlas is entitled to his values, and I am entitled to mine. And I suspect our two sets of values have
just about zero overlap. In my opinion,
Atlas repeatedly failed to separate his values (personal “liberty” above all
else) from his medical expertise or scientific evidence. Having seen that, I try to be more careful
about the distinction. I practice
starting statements about my values with “I believe that…” and statements about
research (mine or others’) with “We have evidence that…”. My values are central to my motivation, but
they are not evidence.
I’ve written before that healthcare research needs a Journal of Silly Mistakes We
Almost Made. Until that journal is
established, I’ll have to share my examples with readers of this blog. It’s a small audience, but I like to think
it’s a discerning one. Here’s the latest
example for that hypothetical journal:
The COVID-19 pandemic has been called the “perfect
storm” of increased risk for suicide.
Concerns about increasing suicide mortality have even entered the political
debate about the negative effects of restrictions on businesses, public
gatherings, and in-person schooling. But
that discussion has been a largely data-free zone. Official US national statistics on suicide
deaths in 2020 will not be released until the fall of 2021.
We hoped that interim mortality data from state departments of health might allow an early look at the effects of the pandemic on suicide mortality. Our first look at Washington state data for January through June of 2020 seemed to indicate a decrease in suicide deaths during April and May. But the number of suicide deaths recorded for June was implausibly low, so we thought we should wait for the next quarterly update before trusting data for April and May.
New state data for deaths recorded through October told a
very different story. After including the
additional suicide deaths in the new data, the appearance of decreased suicide
mortality between April and June almost completely disappeared. And the new data for deaths officially recorded
during the fall added significantly to suicide deaths occurring as far back as
February and March. So we suspect that
suicide death statistics for April, May, and June may still not be complete.
Using the most recent data, we can probably say that there
was no dramatic decrease in suicide mortality in Washington during the first
three months of the pandemic. But we
can’t yet rule out the possibility of a significant increase.
Although the most recent state mortality data added only a
few deaths overall during the first quarter of 2020, most of those delayed
reports were suicide deaths. It makes
sense that suicide deaths would be classified and recorded more slowly than
“medically attended” deaths occurring in a hospital. We’re now looking at delays in reporting of
suicide deaths across MHRN health systems.
It’s possible that the delays we’ve seen are unique to Washington state
or unique to the chaos of 2020. We’ll
only know that if we look at trends in multiple states, checking carefully for
signs that data are not complete. I
expect we’ll soon have a clearer picture about the spring and early
summer. But we may need to wait a few
more months to have accurate data about the fall.
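The kind of completeness check described above can be sketched in a few lines. This is a hypothetical illustration, not our actual data pipeline, and the counts below are made up: the idea is simply to compare monthly counts across two provisional data releases and flag months whose counts are still growing, since growth between releases suggests reporting is not yet complete.

```python
def flag_incomplete_months(earlier, later, min_growth=0.05):
    """Return months whose counts grew by more than `min_growth`
    between two data releases -- a sign of incomplete reporting."""
    flagged = []
    for month, early_count in earlier.items():
        late_count = later.get(month, early_count)
        if early_count == 0 or (late_count - early_count) / early_count > min_growth:
            flagged.append(month)
    return flagged

# Made-up counts for illustration only
july_release = {"2020-02": 80, "2020-03": 78, "2020-04": 70, "2020-05": 65, "2020-06": 30}
october_release = {"2020-02": 86, "2020-03": 84, "2020-04": 79, "2020-05": 74, "2020-06": 68}

print(flag_incomplete_months(july_release, october_release))
```

With these invented numbers, every month in the first half of 2020 would be flagged as possibly incomplete – which matches our real-world conclusion that we just couldn’t trust the early data yet.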
Because we’ve learned from previous mistakes we almost made, we didn’t rush to press with the early news that suicide mortality in Washington state appeared to decrease during the COVID-19 pandemic. Our new message – that we just can’t know the answer yet – will not make any headlines. Others have used early 2020 mortality records to report that overall suicide mortality decreased during the first months of the pandemic. I hope they are right. And I certainly hope they were careful.
One of my favorite Seattle bike rides goes south along the
shore of Lake Washington to Seward Park and then back north to Capitol
Hill. That route was especially pleasant
during the past summer when Lake Washington Boulevard was temporarily closed to car traffic.
On sunny days when biking is most inviting, the prevailing winds usually blow from the north. That means I’ve got the wind behind me traveling south and against me when I’m heading home. Unless the wind is especially strong, though, I rarely notice the tailwind pushing me south down the lake. I only notice the headwind when I turn around to head back north.
During the past summer, failing to notice the wind behind me became my personal metaphor for privilege. When I’m getting help from my usual tailwind, I still feel like I’m pedaling hard. And I usually am pedaling hard. My legs get at least some of the credit (maybe even most of the credit) for speeding south down the lake. But the wind behind me is certainly an advantage, whether I notice it or not. It’s easy not to notice the help I’ve been getting until it disappears.
Coincidentally, the prevailing wind direction mirrors the social geography of Seattle. People who would hope to move from South Seattle up to Capitol Hill will usually face more of a headwind.
I’m trying to frequently remind myself that the prevailing winds have more often blown in my direction. It’s not as if I’ve never been treated unfairly or faced any headwind. But when that happens, I usually take notice – because it’s not very common. Noticing my tailwind takes a more conscious effort since I’m much more accustomed to a gentle breeze at my back.
Controversy regarding potential treatments and vaccines for
COVID-19 has brought arguments about randomized trial methods into the public
square. Disagreements about interim
analyses and early stopping of vaccine trials have moved from the methods
sections of medical journals to the editorial
section of the Wall Street Journal. I never imagined a rowdy public debate about
the more sensitive Pocock
stopping rule versus the more patient O’Brien-Fleming rule. A randomized trial nerd like me should be heartened
that anyone even cares about the difference.
I imagine it’s how Canadians feel during Winter Olympic years when the
rest of the world actually watches curling.
But the debates have grown so heated that choice of a statistical
stopping rule has become a test of political allegiance. And that drove me to some historical reading
about the concept of equipoise.
As originally understood, the ethical requirement for equipoise in randomized trials applied
to the individual clinician and the individual patient. Each clinician was expected to place their
duty to an individual patient over any regard for public health or scientific
knowledge. By that understanding, a
clinician would only recommend or participate in a randomized trial if they had
absolutely no preference. If they
preferred one of the treatments being compared, they were obligated to
recommend that treatment and recommend against joining a randomized trial.
In 1987, Benjamin
Freedman suggested an alternative concept of “clinical equipoise”, applied
at the community level rather than to the individual patient. A clinician might have a general preference for
one treatment over another. Or the
clinician might believe that one treatment would be preferred for a specific
patient. But that clinician might still
suggest that their patient participate in a randomized trial if the overall
clinical or scientific community was uncertain about the best choice.
Freedman’s discussion of equipoise offers some useful words
for navigating current controversies regarding COVID-19 clinical trials. He suggested that a clinician could ethically
participate in a randomized trial or recommend participation to their
patients “if their less-favored treatment is preferred by colleagues whom they
consider to be responsible and competent.”
A clinician is not expected to be ignorant or indifferent (“I have no
idea what’s best for you.”). Instead,
they are expected to be honest regarding uncertainty (“I believe A is probably
better, but many experts I respect believe that B is best.”). Freedman
proposed that a randomized trial evaluating some new treatment is ethically
appropriate if success of that treatment “appears to be a reasonable
gamble.” That language accurately
communicates both the hope and uncertainty central to randomized trials of new
treatments. It also communicates that
odds of success can change with time. A
gamble that appeared reasonable when a randomized trial started may become less
reasonable as new evidence – from within the trial or outside of it –
accumulates. Reasonable investigators
must periodically reconsider their assessments of whether the original gamble
remains reasonable. That’s where those
stopping rules come in.
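For readers who haven’t met these rules before, a small sketch shows why one is “sensitive” and the other “patient.” The constants below are commonly tabulated values for five equally spaced interim looks at two-sided alpha of 0.05; a real trial would derive its boundaries from the full design, for example via an alpha-spending function.

```python
import math

POCOCK_CONSTANT = 2.413   # same critical z-value at every look
OBF_FINAL_Z = 2.040       # O'Brien-Fleming critical z at the final look

def pocock_boundary(look, n_looks=5):
    """Pocock: equally easy to stop at every interim look."""
    return POCOCK_CONSTANT

def obrien_fleming_boundary(look, n_looks=5):
    """O'Brien-Fleming: early looks demand much stronger evidence,
    following z_k = z_K * sqrt(K / k)."""
    return OBF_FINAL_Z * math.sqrt(n_looks / look)

for k in range(1, 6):
    print(k, round(pocock_boundary(k), 3), round(obrien_fleming_boundary(k), 3))
```

At the first look, the O’Brien-Fleming boundary sits near z = 4.6 – almost impossible to cross early – while Pocock would already stop for z = 2.4. That is the whole argument in two numbers: one rule is quick to declare a winner, the other waits for the evidence to mature.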
The concept of clinical equipoise also helps to
frame our MHRN randomized trials of new mental health services or programs. We are often studying treatments or programs
that we have developed or adapted, so we are neither ignorant nor indifferent. We certainly care about the trial outcome,
and we usually hope the new treatment we are testing will be found helpful. But we do randomized trials because our
beliefs and hopes are not evidence.
Projections of the course of the COVID-19 pandemic prompted vigorous arguments about which disease model performed the best. As competing models delivered divergent predictions of morbidity and mortality, arguments in the epidemiology and medical twitterverse grew as rowdy as the crowd in that male model walk-off from the movie Zoolander. Given the rowdy disagreement among experts, how can we evaluate the accuracy of competing models yielding divergent predictions? With all respect to the memory of David Bowie (the judge of that movie modeling competition), isn’t there some objective way of judging model performance? I think that question leads to a key distinction between types of mathematical models.
Most models predicting individual-level events (such as our MHRN models predicting suicidal behavior) follow an empirical or inductive approach. An inductive model begins with “big data”, usually including a large number of events and a large number of potential predictors. The data then tell us which predictors are useful and how much weight each should be given. Theory or judgment may be involved in assembling the original data, but the data then make the key decisions. Regardless of our opinions about what predictors might matter, the data dictate what predictors actually matter.
In contrast, models predicting population-level change (including many competing models of COVID-19 morbidity and mortality) often follow a mechanistic or deductive approach. A deductive model assumes a mechanism of the underlying process, such as the susceptible-infected-recovered model of infectious disease epidemics. We then attempt to estimate key rates or probabilities in that mechanistic model, such as the now-famous reproduction number or R0 for COVID-19. “Mass action” models apply those rates or probabilities to groups, while “agent-based” models apply them to simulated individuals. In either case, though, an underlying structure or mechanism is assumed. And key rates or probabilities are usually estimated from multiple sources – often involving at least some expert opinion or interpretation.
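The assumed mechanism in a deductive model can be surprisingly compact. The discrete-time susceptible-infected-recovered loop below is purely illustrative – the parameter values are arbitrary, not estimates for COVID-19 – but it shows how a handful of assumed rates generates an entire epidemic curve.

```python
def sir_simulate(population, initial_infected, r0, recovery_days, n_days):
    """Simulate daily susceptible/infected/recovered counts under a
    simple mass-action SIR mechanism. Illustrative only."""
    gamma = 1.0 / recovery_days   # assumed daily recovery probability
    beta = r0 * gamma             # daily transmission rate implied by R0
    s, i, r = population - initial_infected, float(initial_infected), 0.0
    history = [(s, i, r)]
    for _ in range(n_days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

# Arbitrary inputs for illustration: a million people, ten initial cases
history = sir_simulate(1_000_000, 10, r0=2.5, recovery_days=10, n_days=100)
```

Notice that everything downstream hangs on beta and gamma. Nudge the assumed R0 from 2.5 to 2.0 and the projected epidemic changes dramatically – which is exactly why small differences in expert-opinion estimates produce such divergent predictions.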
Judging the performance of empirical or inductive prediction models follows a standard path. At a minimum, we randomly divide our original data into one portion for developing a model and a separate portion for testing or validating it. Before using a prediction model to inform practice or policy, we would often test how well it travels – by testing or validating it in data from a later time or a different place. So far, our MHRN models predicting individual-level suicidal behavior have held up well in all of those tests.
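The first step of that standard path – randomly partitioning records into a development set and a held-out validation set – can be sketched as below. This is a minimal illustration with made-up record IDs, not our actual validation code.

```python
import random

def split_records(record_ids, validation_fraction=0.3, seed=42):
    """Randomly partition records into a development set (for fitting
    the model) and a held-out validation set (for testing it)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    ids = list(record_ids)
    rng.shuffle(ids)
    n_validation = int(len(ids) * validation_fraction)
    return ids[n_validation:], ids[:n_validation]  # development, validation

development, validation = split_records(range(1000))
```

The essential property is that the two sets never overlap, so performance on the validation set reflects how the model handles data it has never seen. Testing how well a model “travels” to a later time or different place is the same idea taken one step further: the validation data come from outside the development setting entirely.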
That empirical validation process is usually neither feasible nor reasonable with mechanistic or deductive models, especially in the case of an emerging pandemic. If our observations are countries rather than people, we lack the sample size to divide the world into a model development sample and a validation sample. And it makes no sense to validate COVID-19 predictions based on how well they travel across time or place. We already know that key factors driving the pandemic vary widely over time and place. We could wait until the fall to see which COVID-19 model made the best predictions for the summer, but that answer will arrive too late to be useful. Because we lack the data to judge model performance, the competition can resemble a beauty contest.
I am usually skeptical of mechanistic or deductive models. Assumed mechanisms are often too simple. In April, some reassuring models used Farr’s Law to predict that COVID-19 would disappear as quickly as it erupted. Unfortunately, COVID-19 didn’t follow that law. Even when presumed mechanisms are correct, estimates of key rates or probabilities in mechanistic models often depend on expert opinion rather than data. Small differences in those estimates can lead to marked differences in final results. In predicting the future of the COVID-19 pandemic, small differences in expectations regarding reproduction number or case fatality rate lead to dramatic differences in expected morbidity and mortality. When we have the necessary data, I’d rather remove mechanistic assumptions and expert opinion estimates from the equation.
But we sometimes lack the data necessary to develop empirical or inductive models – especially when predicting the future of an evolving epidemic. So we will have to live with uncertainty – and with vigorous arguments about the performance of competing models. Rather than trying to judge which COVID-19 model performs best, I’ll stick to what I know I need to do: avoid crowds (especially indoors), wash my hands, and wear my mask!
The controversy around COVID-19 research from the
Surgisphere database certainly got my attention. Not because I have any expertise regarding
treatments for COVID-19. But because I
wondered how that controversy would affect trust in research like ours using
“big data” from health records.
In case you didn’t follow the story, here’s my brief
summary: Surgisphere is a data analytics company that claims to gather health
records data from 169 hospitals across the world. In early May, Surgisphere researchers and
academic collaborators published a paper in NEJM showing that drugs used to
treat cardiovascular disease did not increase severity of COVID-19. That drew some attention. In late May, a second paper in Lancet
reported that hydroxychloroquine treatment of COVID-19 had no benefit and could
increase risk of abnormal heart rhythms and death. Given the scientific/political controversy
around hydroxychloroquine, that finding drew lots of attention. Intense scrutiny across the academic internet
and twitterverse revealed that many of the numbers in both the Lancet paper and
the earlier NEJM paper didn’t add up. After
several days of controversy and “expressions
of concern”, Surgisphere’s academic collaborators retracted
the article, reporting that they were not able to access the original data
to examine discrepancies and replicate the main results.
Subsequent discussions questioned why journal reviewers and
editors didn’t recognize the signs that Surgisphere data were not
credible. Spotting the discrepancies,
however, would have required detailed knowledge regarding the original health system
records and the processes for translating those records data into
research-ready datasets. I suspect most
reviewers and editors rely more on the reputation of the research team than on
detailed inspection of the technical processes.
Our MHRN research also gathers records data from hundreds –
or even thousands – of facilities across 15 health systems. We double-check the quality of those data
constantly, looking for oddities that might indicate some technical problem in
health system databases or some error in our processes. We can’t expect that journal reviewers and
editors will triple-check all of our double-checking.
I hope MHRN has established a reputation for trustworthiness, but I’m
ambivalent about the scientific community relying too much on individual or
institutional reputation. I do believe
that the quality of MHRN’s prior work is relevant when evaluating both the
integrity of MHRN data and the capability of MHRN researchers. Past performance does give some indication of
future performance. But relying
on reputation to assess new research will lead to the same voices being
amplified. And those loudest voices will
tend to be older white men (insert my picture here).
I’d prefer to rely on transparency rather than reputation,
but there are limits to what mental health researchers can share. Our MHRN projects certainly share detailed
information about the health systems that contribute data to our research. The methods we use to process original health
system records into research-ready databases are well-documented and widely
imitated. Our individual research
projects publish the detailed
computer code that connects those research databases to our published
results. But we often cannot share patient-level
data for others to replicate our findings.
Our research nearly always considers sensitive topics like mental health
diagnoses, substance use, and suicidal behavior. The more we learn about risk of re-identifying
individual patients, the more careful we get about sharing sensitive data. At some point, people who rely on our
research findings will need to trust the veracity of our data and the integrity
of our analyses.
We can, however, be transparent regarding shortcomings of our
data and mistakes in our interpretation.
I can remember some of our own “incredible” research that could have
turned into Surgisphere-style embarrassments.
For example: We found an alarmingly high rate of death by suicide in the
first few weeks after people lost health insurance coverage. But then we discovered we were using a
database that registered months of complete insurance coverage – so people were
counted as not insured during the month they died. I’m grateful we didn’t publish that dramatic
news. Another example: We found a
disturbingly high number of people seen in emergency department visits for a
suicide attempt had another visit with a diagnosis of self-harm during the next
few days. That was alarming and
disturbing. But then we discovered that the
bad news was actually good news. Most of
those visits that looked like repeat suicide attempts turned out to be timely
follow-up visits after a suicide attempt.
Another embarrassing error avoided.
While we try to publicize those lessons, the audience for those stories is narrow. There is no Journal of Silly Mistakes We Almost Made, but there ought to be. If that journal existed, research groups could establish their credibility by publishing detailed accounts of their mistakes and near-mistakes. As I’ve often said in our research team meetings: If we’re trying anything new, we are bound to get some things wrong. Let’s try to find our mistakes before other people point them out to us.
Even in our MHRN health systems, however, actual uptake of telehealth and virtual care lagged far behind the research. Reimbursement policies were partly to blame; telephone visits and online messaging were usually not “billable” services. But economic barriers were only part of the problem. Even after video visits were permitted as “billable” substitutes for in-person visits, they accounted for only 5 to 10 percent of mental health visits for either psychotherapy or medication management. Only a few of our members seemed to prefer video visits, and our clinicians didn’t seem to be promoting them.
Then the COVID-19 pandemic changed everything almost overnight. At Kaiser Permanente Washington, virtual visits accounted for fewer than 10% of all mental health visits during the week of March 9th. That increased to nearly 60% the following week and over 95% the week after that. I’ve certainly never seen anything change that quickly in mental health care. Discussing this with our colleague Rinad Beidas from U Penn, I asked if the implementation science literature recognizes an implementation strategy called “My Hair is on Fire!” Whatever we call it, it certainly worked in this case.
My anecdotal experience was that the benefits of telehealth or virtual care included the expected and the unexpected. As expected, even my patients who had been reluctant to schedule video visits enjoyed the convenience of avoiding a trip to our clinic through Seattle traffic (or the way Seattle traffic used to be). Also as expected, there were no barriers to serving patients in Eastern Washington where psychiatric care is scarce to nonexistent. It was as if the Cascade mountains had disappeared. One unexpected bonus was meeting some of the beloved pets I’d only heard about during in-person visits. And I could sometimes see the signs of those positive activities we mental health clinicians try to encourage, like guitars and artwork hanging on the walls. It’s good to ask, “Have you been making any art?”. But it’s even better to ask, “Could you show me some of that art you’ve been making?”
As is often the case in health care, the possible negative consequences of this transformation showed up more in data than in anecdotes. My clinic hours still seemed busy, but the data showed that our overall number of mental health visits had definitely decreased. Virtual visits had not replaced all of the in-person visits. That’s concerning, since we certainly don’t expect that the COVID-19 pandemic has decreased the need for mental health care. So we’re now digging deeper into those data to understand who might be left behind in the sudden transition to virtual care. Some of our questions: Is the decrease in overall visits greater in some racial or ethnic groups? Are we now seeing fewer people with less severe symptoms or problems – or are the people with more severe problems more likely to be left behind?
As our research group was starting to investigate these questions, I got a message from one of our clinical leaders asking about a new use for our suicide risk prediction models. They were also thinking about people with the greatest need being left behind. And they were thinking about remedies. Before the COVID-19 outbreak, we were testing use of suicide risk prediction scores in our clinics, prompting our clinicians to assess and address risk of self-harm during visits. But some people at high risk might not be making visits, even telephone or video visits, during these pandemic times. So our clinical leaders proposed reaching out to our members at highest risk rather than waiting for them to appear (virtually) in our clinics.
Collaborating with our clinical leaders on this new outreach program prompted me to look again at our series of studies on telehealth and virtual care. Those studies weren’t about simply offering telehealth as an option. Persistent outreach was a central element of each of those programs. Telehealth is about more than avoiding Seattle traffic or Coronavirus infection. Done right, telehealth and virtual care enable a fundamental shift from passive response to active outreach.
Outreach is especially important in these chaotic times. During March and April, video and telephone visits were an urgent work-around for doing the same work we’ve always done. Now in May, we’re designing and implementing the new work that these times call for.
Adrian Hernandez, Rich Platt, and I recently published a Perspective in New England Journal of Medicine about the pressing need for pragmatic clinical trials to answer common clinical questions. We started writing that piece last summer, long before any hint of the COVID-19 pandemic. But the need for high-quality evidence to address common clinical decisions is now more urgent than we could have imagined.
Leaving aside heated debates regarding the effectiveness of hydroxychloroquine or azithromycin, we can point to other practical questions regarding use of common treatments by people at risk for COVID-19. Laboratory studies suggest that ibuprofen could increase virus binding sites. Should we avoid ibuprofen and recommend acetaminophen for anyone with fever and respiratory symptoms? Acetaminophen toxicity is not benign. Laboratory studies suggest that ACE inhibitors, among the most common medications for hypertension, could also increase virus binding sites. Should we recommend against ACE inhibitors for the 20% of older Americans now using them daily? Stopping or changing medication for hypertension certainly has risks. Laboratory studies can raise those questions, but we need clinical trials to answer them with any certainty.
Pragmatic or real-world clinical trials are usually the best method for answering those practical questions. Pragmatic trials are embedded in everyday practice, involve typical patients and typical clinicians, and study the way treatments work under real-world conditions. Compared to traditional clinical trials, done in specialized research centers under highly controlled conditions, pragmatic trials are both more efficient (we can get answers faster and cheaper) and more generalizable (the answers apply to real-world practice).
Pragmatic trials are especially helpful when alternative interventions could have different balances of benefits and risks for different people – like ibuprofen and acetaminophen for reducing fever. Laboratory studies – or even highly controlled traditional clinical trials – can’t sort out how that balance plays out in the real world.
In our perspective piece, we pointed out financial, regulatory, and logistical barriers to faster and more efficient pragmatic clinical trials. But the most important barrier is cultural. It’s unsettling to acknowledge our lack of evidence to guide common and consequential clinical decisions. Clinicians want to inspire hope and confidence. Patients and families making everyday decisions about healthcare might be dismayed to learn that we lack clear answers to important clinical questions. We all must do the best we can with whatever evidence we have, but we should certainly not be satisfied with current knowledge. If we hope to activate our entire health care system to generate better evidence, we’ll probably need to provoke more discomfort with the quality of evidence we have now.
Inadequate evidence can also lead to endless conflict. My colleague Michael Von Korff used to use the term “German Argument” to describe people preferring to argue about a question when the answer is readily available to those willing to look. Michael was fully entitled to use that expression, since his last name starts with “Von.” Germany, however, now stands out for success in mitigating the impact of the COVID-19 pandemic. Even if Michael’s ethnic joke no longer applies, the practice of arguing rather than examining evidence is widespread. The best way to end those arguments is to say “I really don’t know the answer. How could we find out as quickly as possible?”