Clinical Epidemiology (Jul 2018)

Automated data-adaptive analytics for electronic healthcare data to study causal treatment effects

  • Schneeweiss S

Journal volume & issue
Vol. 10
pp. 771 – 788

Abstract


Sebastian Schneeweiss (1,2)

(1) Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital; (2) Harvard Medical School, Boston, MA, USA

Background: Decision makers in health care increasingly rely on nonrandomized database analyses to assess the effectiveness, safety, and value of medical products. Health care data scientists use data-adaptive approaches that automatically optimize confounding control to study causal treatment effects. This article summarizes relevant experiences and extensions.

Methods: The literature was reviewed on the uses of high-dimensional propensity score (HDPS) and related approaches for health care database analyses, including methodological articles on their performance and improvement. Articles were grouped into applications, comparative performance studies, and statistical simulation experiments.

Results: The HDPS algorithm has been referenced frequently with a variety of clinical applications and data sources from around the world. The appeal of HDPS for database research rests in 1) its superior performance in situations of unobserved confounding through proxy adjustment, 2) its predictable efficiency in extracting confounding information from a given data source, 3) its ability to automate estimation of causal treatment effects to the extent achievable in a given data source, and 4) its independence of data source and coding system. Extensions of the HDPS approach have focused on improving variable selection when exposure is sparse, using free text information and time-varying confounding adjustment.

Conclusion: Semiautomated and optimized confounding adjustment in health care database analyses has proven successful across a wide range of settings. Machine-learning extensions further automate its use in estimating causal treatment effects across a range of data scenarios.

Keywords: high-dimensional data, confounding (epidemiology), health care databases, real-world data, confounding adjustment, propensity scores, automation, causal conclusions, artificial intelligence, machine learning
