All That Glitters

With respect for the evidence that randomized clinical trials afford, their value is inflated in many ways. And their authority is always worth questioning.

Randomized clinical trials (RCTs) have achieved a kind of sanctity in the health sciences. The bigger, the better. Trials are performed to provide evidence about which tests, preventive regimens, prognostic markers, or treatments work and which don’t. Randomized study results are the primary drivers of Food and Drug Administration (FDA) approval for new medications and medical devices, and so they sit at our commercial crossroads. RCTs provide what is considered the “gold standard” of evidence, the most convincing and trustworthy experimental comparison. With due respect for the evidence they afford, the value of RCTs is inflated in all sorts of ways, and their authority is always worth questioning.

Let’s start with a description of a typical RCT. In the case of a medication trial, a collection of study patients is randomly assigned (like the flip of a coin) to one of two treatments. By randomizing, researchers are doing their best to construct two groups that are extremely similar before treatment. This likely similarity between groups (randomization does not guarantee identity, one of its limitations) can then support a causal claim: when, say, medication A has reduced death (or some other less severe outcome of interest) more than a placebo, we can fairly say this difference is caused by A. RCTs offer a study design that “controls” for other variables that could influence death (like age) because persons in the two groups are, as noted, about the same age on average. So why should we be cautious?
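
To make the coin-flip idea concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the ages are invented (drawn uniformly between 30 and 80), and the function names `randomize`, `mean_age_gap`, and `typical_gap` are hypothetical. It suggests that coin-flip assignment tends to balance age between the two groups, and that the balance is only approximate, particularly in small trials.

```python
import random
import statistics


def randomize(patients, rng):
    """Assign each patient to medication A or placebo by a fair coin flip."""
    arms = {"A": [], "placebo": []}
    for patient in patients:
        arm = "A" if rng.random() < 0.5 else "placebo"
        arms[arm].append(patient)
    return arms


def mean_age_gap(n_patients, seed):
    """Absolute difference in mean age between arms for one simulated trial."""
    rng = random.Random(seed)
    # Invented ages, uniform between 30 and 80; not data from any real trial.
    ages = [rng.randint(30, 80) for _ in range(n_patients)]
    arms = randomize(ages, rng)
    return abs(statistics.mean(arms["A"]) - statistics.mean(arms["placebo"]))


def typical_gap(n_patients, n_trials=200):
    """Average age gap across many simulated trials of the same size."""
    return statistics.mean(mean_age_gap(n_patients, seed) for seed in range(n_trials))


small_trial_gap = typical_gap(40)
large_trial_gap = typical_gap(4000)
print(f"typical age gap, 40 patients:   {small_trial_gap:.2f} years")
print(f"typical age gap, 4000 patients: {large_trial_gap:.2f} years")
```

Under these assumptions the gap shrinks as the trial grows, but it does not vanish: randomization makes the groups similar, not identical.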

First, while a clinical trial’s results may provide evidence for FDA approval or for pharmaceutical companies to proceed with their manufacturing plans for medication A, they don’t give perfect answers for the treatment of future patients. The findings of a clinical trial apply to an “average” patient who met the criteria for participation in that trial. Yet many RCTs enroll restricted populations, focusing only on persons expected to be highly responsive to treatment. The trial purports to answer questions about the treatment in the world beyond the trial, yet patients want answers about themselves. Maybe medication A has different beneficial and detrimental effects depending on whether you are young or old, but if age-related analyses are not performed or reported by the RCT investigators, a patient can only extrapolate from the overall results. You are young, not an “average”-aged patient, and so your personal risk/benefit appraisal is not apparent. And what if you are older or younger than anyone in the trial? How do you decide whether the results apply to you? A trial may be well done and indisputable, but there are always limitations to its available evidence.
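
A small worked example may help show how an average can hide what matters to an individual. The numbers below are entirely hypothetical, invented for illustration and not taken from any real trial; the names `subgroups`, `risk_difference`, and `average_effect` are likewise made up. The sketch assumes a trial that is three-quarters older patients, in whom medication A lowers the risk of death, and one-quarter younger patients, in whom it slightly raises that risk.

```python
# Hypothetical numbers, invented for illustration only; not from any real trial.
# Each entry: (share of trial participants, death risk on placebo, death risk on A)
subgroups = {
    "younger": (0.25, 0.04, 0.05),
    "older": (0.75, 0.20, 0.12),
}


def risk_difference(placebo_risk, treated_risk):
    """Absolute risk reduction; a positive value means A lowered the risk."""
    return placebo_risk - treated_risk


def average_effect(groups):
    """Trial-wide effect: subgroup effects weighted by each subgroup's share."""
    return sum(
        share * risk_difference(placebo, treated)
        for share, placebo, treated in groups.values()
    )


overall = average_effect(subgroups)
by_group = {
    name: risk_difference(placebo, treated)
    for name, (share, placebo, treated) in subgroups.items()
}

print(f"overall risk reduction: {overall:+.4f}")
for name, effect in by_group.items():
    print(f"{name:>8} risk reduction: {effect:+.4f}")
```

In this invented scenario the headline result favors medication A, yet a younger patient reading only that headline would be misled about their own prospects.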

And this is the best case. Many RCTs are given a “gold standard” status without deserving it. Some trials are poorly conducted. Unless there are large numbers of rigorously performed and reviewed trials addressing the same condition, we should resist thinking the evidence of any particular trial is authoritative.

There are other limitations as well. RCTs often recruit atypical patients who are treated by atypically good clinicians. RCTs are well suited to testing medications, but for behavioral treatments the processes are more complex and the outcomes are harder to measure reliably. Some trial participants drop out partway through and their data are lost, a problem that is mitigated only in part by statistical techniques that are difficult to follow. RCTs require confirmation by other RCTs, but how many is enough? RCTs are expensive to perform. RCTs take up questions of efficacy, but not mechanism; we don’t really learn how medication A works along the way. And of course, not all clinical questions can be addressed by trial; for rare diseases, RCTs are often not feasible.

We have to remind ourselves that there are other kinds of evidence, including non-randomized methods, that are convincing in health science. When insulin was first used to reduce diabetic ketoacidosis, an RCT was not needed: observation was enough.

We should value RCTs, but they are not made of gold, solid and fixed. Their results offer flecks of gold, guidance about the use of a finite number of tests and treatments. RCTs enlighten us, but tomorrow’s trial may make today’s recommendation incorrect or misleading.
