Comparison of standard and penalized logistic regression in risk model development

Yan Yan, MD, PhD; Zhizhou Yang, BA; Tara R. Semenkovich, MD, MPHS; Benjamin D. Kozower, MD, MPH; Bryan F. Meyers, MD, MPH; Ruben G. Nava, MD; Daniel Kreisel, MD, PhD; Varun Puri, MD, MSCI

JTCVS Open (Mar 2022)

Affiliations

Yan Yan, MD, PhD: Division of Public Health Sciences, Washington University School of Medicine, St Louis, Mo
Zhizhou Yang, BA: Division of Cardiothoracic Surgery, Washington University School of Medicine, St Louis, Mo
Tara R. Semenkovich, MD, MPHS: Division of Cardiothoracic Surgery, Washington University School of Medicine, St Louis, Mo
Benjamin D. Kozower, MD, MPH: Division of Cardiothoracic Surgery, Washington University School of Medicine, St Louis, Mo
Bryan F. Meyers, MD, MPH: Division of Cardiothoracic Surgery, Washington University School of Medicine, St Louis, Mo
Ruben G. Nava, MD: Division of Cardiothoracic Surgery, Washington University School of Medicine, St Louis, Mo
Daniel Kreisel, MD, PhD: Division of Cardiothoracic Surgery, Washington University School of Medicine, St Louis, Mo
Varun Puri, MD, MSCI: Division of Cardiothoracic Surgery, Washington University School of Medicine, St Louis, Mo
Address for reprints: Varun Puri, MD, MSCI, 660 S Euclid Ave, Campus Box 8234, St Louis, MO 63110.

Journal volume & issue: Vol. 9, pp. 303–316

Abstract

Objective: Regression models are ubiquitous in thoracic surgical research. We aimed to compare the value of standard logistic regression with that of the more complex but increasingly used penalized regression models, using a recently published risk model as an example.

Methods: Using a standardized data set of patients with clinical T1-3N0 esophageal cancer, we created models to predict the likelihood of unexpected pathologic nodal disease after surgical resection. Models were fitted using standard logistic regression or penalized regression (ridge, lasso, elastic net, and adaptive lasso). We compared the performance (Brier score, calibration slope, C statistic, and overfitting) of the standard regression model with that of the penalized regression models.

Results: Among 3206 patients with clinical T1-3N0 esophageal cancer, 668 (22%) had unexpected pathologic nodal disease. Of the 15 candidate variables considered in the models, the key predictors of nodal disease included clinical tumor stage, tumor size, grade, and presence of lymphovascular invasion. The standard regression model and all 4 penalized logistic regression models had virtually identical performance, with Brier score ranging from 0.138 to 0.141, concordance index (C statistic) ranging from 0.775 to 0.788, and calibration slope ranging from 0.965 to 1.05.

Conclusions: For predictive modeling in surgical outcomes research, when the data set is large and the outcome of interest is relatively frequent, standard regression models and the more complicated penalized models are very likely to have similar predictive performance. The choice of statistical method for risk model development should be based on the nature of the data at hand and good statistical practice, rather than on the novelty or complexity of the statistical model.
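The three performance measures named in the abstract (Brier score, C statistic, and calibration slope) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy data are hypothetical, and the calibration slope is assumed here to be the slope from a logistic regression of the observed outcome on the logit of the predicted probability, fitted by Newton-Raphson.

```python
import math

def brier_score(y, p):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def c_statistic(y, p):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))

def calibration_slope(y, p, iterations=25):
    """Slope b in logit(P(y=1)) = a + b * logit(p), fitted by Newton-Raphson.

    Assumes every p is strictly between 0 and 1 and the outcomes are not
    perfectly separated by p (otherwise the fit does not converge).
    """
    x = [math.log(pi / (1 - pi)) for pi in p]
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(iterations):
        mu = [1 / (1 + math.exp(-(a + b * xi))) for xi in x]
        w = [m * (1 - m) for m in mu]
        # Gradient of the log-likelihood with respect to (a, b)
        g0 = sum(yi - m for yi, m in zip(y, mu))
        g1 = sum((yi - m) * xi for yi, m, xi in zip(y, mu, x))
        # Entries of the 2x2 information matrix
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: (a, b) += inverse(information) @ gradient
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b

# Made-up outcomes and predicted probabilities, for illustration only
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
p_toy = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print("Brier score:", brier_score(y_toy, p_toy))
print("C statistic:", c_statistic(y_toy, p_toy))
print("Calibration slope:", calibration_slope(y_toy, p_toy))
```

Under these definitions, a lower Brier score and a higher C statistic indicate better predictions, and a calibration slope close to 1 indicates little overfitting, which is how the near-identical ranges reported in the Results can be read.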

Published in JTCVS Open

ISSN: 2666-2736 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Internal medicine: Specialties of internal medicine: Diseases of the circulatory (Cardiovascular) system; Medicine: Surgery
Website: https://www.journals.elsevier.com/jtcvs-open