Clinical Epidemiology (Mar 2017)
Control of confounding in the analysis phase – an overview for clinicians
Abstract
Johnny Kahlert,1 Sigrid Bjerge Gribsholt,1,2 Henrik Gammelager,1,3 Olaf M Dekkers,1,4,5 George Luta1,6
1 Department of Clinical Epidemiology, Institute of Clinical Medicine, 2 Department of Endocrinology and Internal Medicine, 3 Department of Anaesthesiology and Intensive Care Medicine, Aarhus University Hospital, Aarhus, Denmark; 4 Department of Clinical Epidemiology, 5 Department of Medicine, Section Endocrinology, Leiden University Medical Center, Leiden, the Netherlands; 6 Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, DC, USA
Abstract: In observational studies, confounding can be controlled in both the design and the analysis phases. Using examples from large health care database studies, this article provides clinicians with an overview of standard methods in the analysis phase, such as stratification, standardization, multivariable regression analysis and propensity score (PS) methods, together with the more advanced high-dimensional propensity score (HD-PS) method. We describe the progression from simple stratification, which is confined to a few potential confounders, to complex modeling procedures such as the HD-PS approach, by which hundreds of potential confounders are extracted from large health care databases. Stratification and standardization help in understanding the data at a detailed level while accounting for potential confounders. Incorporating several potential confounders in the analysis typically implies a choice between multivariable analysis and PS methods. Although PS methods have gained remarkable popularity in recent years, there is an ongoing discussion on their advantages and disadvantages compared with those of multivariable analysis. Furthermore, the HD-PS method, despite its generous inclusion of potential confounders, is also associated with potential pitfalls.
All methods depend on the assumption of no unknown, unmeasured or residual confounding, and all suffer from the difficulty of identifying true confounders. Even in large health care databases, insufficient or poor data may contribute to these challenges. The trend in data collection is to compile more fine-grained data on lifestyle and severity of diseases, based on self-reporting and modern technologies. This will surely improve our ability to incorporate relevant confounders or their proxies. However, despite the remarkable development of methods that account for confounding and despite new data opportunities, confounding will remain a serious issue. Considering the advantages and disadvantages of the different methods, we emphasize the importance of clinical input and of the interplay between clinicians and analysts to ensure a proper analysis.
Keywords: observational studies, confounding, adjustment, stratification, multivariable analysis, propensity score
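As a rough illustration of the simplest method named in the abstract, stratification, the sketch below compares a crude risk ratio with stratum-specific and Mantel-Haenszel pooled risk ratios. It is not taken from the article: the counts, the single binary confounder (age group) and all function names are hypothetical and chosen only to make the effect of confounding visible.

```python
# Hypothetical counts, invented purely for illustration (not data from the article).
# Each stratum of the confounder holds:
# (exposed_events, exposed_total, unexposed_events, unexposed_total)
strata = {
    "younger": (10, 100, 40, 400),
    "older": (120, 400, 30, 100),
}


def risk_ratio(a, n1, c, n0):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / n1) / (c / n0)


def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata, i.e. ignoring the confounder."""
    a, n1, c, n0 = (sum(col) for col in zip(*strata.values()))
    return risk_ratio(a, n1, c, n0)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
    return num / den


crude_rr = crude_risk_ratio(strata)
stratum_rrs = {name: risk_ratio(*counts) for name, counts in strata.items()}
mh_rr = mantel_haenszel_risk_ratio(strata)

print(f"crude RR: {crude_rr:.2f}")
for name, rr in stratum_rrs.items():
    print(f"{name} RR: {rr:.2f}")
print(f"Mantel-Haenszel RR: {mh_rr:.2f}")
```

With these invented counts the crude risk ratio is about 1.86, whereas both stratum-specific ratios and the pooled ratio equal 1.00. The crude association here arises only because the exposed group is older, which is the kind of distortion stratification is meant to expose.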