Can supervised deep learning architecture outperform autoencoders in building propensity score models for matching?

Mohammad Ehsanul Karim

doi:10.1186/s12874-024-02284-5

BMC Medical Research Methodology (Aug 2024)

Affiliations

Mohammad Ehsanul Karim: School of Population and Public Health, University of British Columbia

DOI: https://doi.org/10.1186/s12874-024-02284-5
Journal volume & issue: Vol. 24, no. 1
pp. 1 – 10

Abstract

Purpose: Propensity score matching is vital in epidemiological studies using observational data, yet its estimates rely on correct model specification. This study assesses supervised deep learning models and unsupervised autoencoders for propensity score estimation, comparing them with traditional methods on the bias and variance of the resulting treatment effect estimates.

Methods: Using a plasmode simulation based on the Right Heart Catheterization dataset, under a variety of settings, we evaluated (1) a supervised deep learning architecture and (2) an unsupervised autoencoder, alongside two traditional methods, logistic regression and a spline-based method, for estimating propensity scores for matching. Performance metrics included bias, standard errors, and coverage probability. The analysis was also extended to real-world data, with estimates compared to those obtained via a doubly robust approach.

Results: Supervised deep learning models outperformed unsupervised autoencoders in variance estimation while maintaining comparable levels of bias. These results were supported by analyses of real-world data, where the supervised model’s estimates closely matched those derived from conventional methods. Additionally, deep learning models performed well compared to traditional methods in settings where exposure was rare.

Conclusion: Supervised deep learning models hold promise for refining propensity score estimation in epidemiological research, offering nuanced confounder adjustment, especially in complex datasets. We endorse integrating supervised deep learning into epidemiological research and share reproducible code for widespread use and methodological transparency.
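The abstract names bias, standard errors, and coverage probability as the performance metrics. As a rough illustration of how such metrics are commonly computed across simulation replicates, here is a minimal sketch. It is not the author's code: the function name `simulation_metrics`, the 1.96 normal-approximation multiplier, and the toy numbers are all assumptions made for illustration only.

```python
import statistics


def simulation_metrics(estimates, std_errors, true_effect, z=1.96):
    """Summarise per-replicate treatment effect estimates (hypothetical helper).

    estimates   -- one point estimate per simulation replicate
    std_errors  -- the model-based standard error reported in each replicate
    true_effect -- the known effect built into the simulation
    z           -- normal quantile for the interval (assumed 95% Wald interval)
    """
    # Bias: average estimate minus the known true effect.
    bias = statistics.fmean(estimates) - true_effect
    # Empirical SE: spread of the estimates across replicates.
    empirical_se = statistics.stdev(estimates)
    # Model-based SE: average of the reported standard errors.
    model_se = statistics.fmean(std_errors)
    # Coverage: share of replicates whose interval contains the true effect.
    covered = sum(
        1
        for est, se in zip(estimates, std_errors)
        if est - z * se <= true_effect <= est + z * se
    )
    return {
        "bias": bias,
        "empirical_se": empirical_se,
        "model_se": model_se,
        "coverage": covered / len(estimates),
    }


# Toy, made-up replicate results; NOT values from the study.
estimates = [0.04, 0.06, 0.05, 0.09, 0.01]
std_errors = [0.02, 0.02, 0.02, 0.015, 0.015]
metrics = simulation_metrics(estimates, std_errors, true_effect=0.05)
print(metrics)
```

Comparing `empirical_se` with `model_se` is one common way to judge whether a method's reported standard errors reflect the actual variability of its estimates, which is the kind of variance comparison the Results paragraph refers to.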

Published in BMC Medical Research Methodology

ISSN: 1471-2288 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General)
Website: http://bmcmedresmethodol.biomedcentral.com
