Confounding bias (part 2 in a series)
(Click here for part 1)
Panacea was a goddess known for having a cure-all, which is why her name now means “a solution to every problem.” This series comprises a critique of an RCT, recently published in JAMA, in which the authors concluded that the treatment was something of a panacea.
Confounding bias
Confounding bias happens when researchers ignore factors that are likely causes of the outcome and then mistakenly attribute the whole of any difference between groups to the treatment. In theory, randomized controlled trials (RCTs) are supposed to be free of this type of bias, because randomization should distribute any confounders (or alternative causes) equally between the two groups. In practice, RCTs usually have at least a little bit of this bias, and sometimes more than a little. One way this happens is that the randomization process fails to produce two similar groups because, you know, a butterfly flapped its wings. That kind of chance imbalance is more likely when the study sample size is small, like in this study of Lexapro.1
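To see why small samples are more prone to chance imbalance, here is a rough simulation sketch. Everything in it is hypothetical and chosen only for illustration: the function name `imbalance_rate`, a baseline trait carried by 30% of patients, a 5-percentage-point gap as the definition of "imbalanced," and the trial sizes of 300 and 3,000. None of these numbers come from the study being critiqued.

```python
import random


def imbalance_rate(n_total, prevalence=0.30, threshold=0.05,
                   n_trials=500, seed=0):
    """Estimate how often 1:1 randomization leaves the two arms differing
    by more than `threshold` in the share of patients with a binary
    baseline trait (e.g. living alone). All defaults are illustrative."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    half = n_total // 2
    exceeded = 0
    for _ in range(n_trials):
        # Draw a hypothetical patient pool, then randomize by shuffling.
        patients = [rng.random() < prevalence for _ in range(n_total)]
        rng.shuffle(patients)
        treated, placebo = patients[:half], patients[half:]
        gap = abs(sum(treated) / len(treated) - sum(placebo) / len(placebo))
        if gap > threshold:
            exceeded += 1
    return exceeded / n_trials


small_trial = imbalance_rate(300)
large_trial = imbalance_rate(3000)
print(f"Share of simulated trials with a >5-point gap, n=300:  {small_trial:.2f}")
print(f"Share of simulated trials with a >5-point gap, n=3000: {large_trial:.2f}")
```

Under these assumptions, the smaller trial should end up with a noticeable gap far more often than the larger one, even though the randomization procedure is identical in both.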
Most RCTs will provide a summary of some important “baseline characteristics” (usually “Table 1”) so that the reader can see if the two treatment groups are similar at baseline. The figure shows a picture from this study’s “Table 1.” As you can see, the two groups differed noticeably on some key “demographics,” most of which are important social determinants of health (or disease). In this elderly population with underlying heart disease, patients in the placebo group were 4% more likely to be male, 59% more likely to be single, 31% more likely to live alone, and 60% more likely to be renters (a marker of socioeconomic status [SES]).

All of these are powerful determinants of disease. At the ages sampled in this study, male sex alone is associated with a higher rate of death. When you add on being unmarried, living alone, and having lower SES, it’s possible that these baseline imbalances alone could account for the whole difference reported by the study authors.
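As a back-of-the-envelope sketch of how baseline imbalance alone can produce an apparent treatment effect, consider a made-up example in which the treatment does nothing at all. The numbers are assumptions for illustration only, not figures from the study: high-risk patients have a 40% event rate, low-risk patients 20%, and the placebo arm happens to contain 30% high-risk patients versus 20% in the treated arm.

```python
def expected_event_rate(share_high_risk, rate_high=0.40, rate_low=0.20):
    """Expected event rate in an arm when the outcome depends only on a
    baseline risk factor and the treatment has zero effect.
    All rates are hypothetical."""
    return share_high_risk * rate_high + (1 - share_high_risk) * rate_low


# Hypothetical imbalance: more high-risk patients landed in the placebo arm.
placebo_rate = expected_event_rate(share_high_risk=0.30)
treated_rate = expected_event_rate(share_high_risk=0.20)
apparent_benefit = placebo_rate - treated_rate

print(f"Placebo arm event rate: {placebo_rate:.2f}")
print(f"Treated arm event rate: {treated_rate:.2f}")
print(f"Apparent 'benefit' of an inert treatment: {apparent_benefit:.2f}")
```

In this toy setup the treated arm looks better by two percentage points even though the treatment is inert. The whole gap comes from who ended up in which group.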
Statistical and clinical significance with confounders
Some readers who have a background in science might like to know if these differences are statistically significant. It has largely gone out of vogue to use p-values to compare baseline characteristics in RCTs.2,3 This is primarily because such probability estimates require a much more nuanced interpretation compared to traditional inferential hypothesis tests. If you’d like to know more, read about it here.
References
- Kim J-M, Stewart R, Lee Y-S, et al. Effect of escitalopram vs placebo treatment for depression on long-term cardiac outcomes in patients with acute coronary syndrome: A randomized clinical trial. JAMA. 2018;320(4):350-7.
- Senn S. Testing for baseline balance in clinical trials. Statistics in Medicine. 1994;13(17):1715-26.
- Austin PC, Manca A, Zwarenstein M, et al. A substantial and confusing variation exists in handling of baseline covariates in randomized controlled trials: A review of trials published in leading medical journals. Journal of Clinical Epidemiology. 2010;63(2):142-53.