Machine learning models predict the emergence of depression in Argentinean college students during periods of COVID-19 quarantine

Lorena Cecilia López Steinmetz; Margarita Sison; Rustam Zhumagambetov; Juan Carlos Godoy; Stefan Haufe

doi:10.3389/fpsyt.2024.1376784

Frontiers in Psychiatry (Apr 2024)

Affiliations

Lorena Cecilia López Steinmetz: Inverse Modeling and Machine Learning, Chair of Uncertainty, Institute of Software Engineering and Theoretical Computer Science, Faculty IV Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
Lorena Cecilia López Steinmetz: Instituto de Investigaciones Psicológicas (IIPsi), Facultad de Psicología, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad Nacional de Córdoba (UNC), Córdoba, Argentina
Margarita Sison: Berlin Center for Advanced Neuroimaging (BCAN), Charité – Universitätsmedizin Berlin, Berlin, Germany
Rustam Zhumagambetov: Working Group 8.44 Machine Learning and Uncertainty, Mathematical Modelling and Data Analysis Department, Physikalisch-Technische Bundesanstalt Braunschweig und Berlin, Berlin, Germany
Juan Carlos Godoy: Instituto de Investigaciones Psicológicas (IIPsi), Facultad de Psicología, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad Nacional de Córdoba (UNC), Córdoba, Argentina
Stefan Haufe: Inverse Modeling and Machine Learning, Chair of Uncertainty, Institute of Software Engineering and Theoretical Computer Science, Faculty IV Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
Stefan Haufe: Berlin Center for Advanced Neuroimaging (BCAN), Charité – Universitätsmedizin Berlin, Berlin, Germany
Stefan Haufe: Working Group 8.44 Machine Learning and Uncertainty, Mathematical Modelling and Data Analysis Department, Physikalisch-Technische Bundesanstalt Braunschweig und Berlin, Berlin, Germany
Stefan Haufe: Institute for Medical Informatics, Charité – Universitätsmedizin Berlin, Berlin, Germany

DOI: https://doi.org/10.3389/fpsyt.2024.1376784
Journal volume & issue: Vol. 15

Abstract

Introduction: The COVID-19 pandemic has exacerbated mental health challenges, particularly depression among college students. Detecting at-risk students early is crucial but remains challenging, particularly in developing countries. Utilizing data-driven predictive models presents a viable solution to address this pressing need.

Aims: 1) To develop and compare machine learning (ML) models for predicting depression in Argentinean students during the pandemic. 2) To assess the performance of classification and regression models using appropriate metrics. 3) To identify key features driving depression prediction.

Methods: A longitudinal dataset (N = 1492 college students) captured T1 and T2 measurements during the Argentinean COVID-19 quarantine. ML models, including linear logistic regression classifiers/ridge regression (LogReg/RR), random forest classifiers/regressors, and support vector machines/regressors (SVM/SVR), are employed. Assessed features encompass depression and anxiety scores (at T1), mental disorder/suicidal behavior history, quarantine sub-period information, sex, and age. For classification, models' performance on test data is evaluated using Area Under the Precision-Recall Curve (AUPRC), Area Under the Receiver Operating Characteristic curve, Balanced Accuracy, F1 score, and Brier loss. For regression, R-squared (R²), Mean Absolute Error, and Mean Squared Error are assessed. Univariate analyses are conducted to assess the predictive strength of each individual feature with respect to the target variable. The performance of multivariate vs. univariate models is compared using the mean AUPRC score for classifiers and the R² score for regressors.

Results: The highest performance is achieved by SVM and LogReg (e.g., AUPRC: 0.76, 95% CI: 0.69, 0.81) and SVR and RR models (e.g., R² for SVR and RR: 0.56, 95% CI: 0.45, 0.64 and 0.45, 0.63, respectively). Univariate models, particularly LogReg and SVM using depression (AUPRC: 0.72, 95% CI: 0.64, 0.79) or anxiety scores (AUPRC: 0.71, 95% CI: 0.64, 0.78) and RR using depression scores (R²: 0.48, 95% CI: 0.39, 0.57) exhibit performance levels close to those of the multivariate models, which include all features.

Discussion: These findings highlight the relevance of pre-existing depression and anxiety conditions in predicting depression during quarantine, underscoring their comorbidity. ML models, particularly SVM/SVR and LogReg/RR, demonstrate potential in the timely detection of at-risk students. However, further studies are needed before clinical implementation.
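
The classification and regression metrics named in the abstract can be illustrated with a minimal Python sketch that uses only the standard library. The toy labels, probabilities, and scores below are invented for illustration and are not data from the study. The 0.5 decision threshold and the step-wise (average precision) estimate of AUPRC are assumptions, since the abstract does not state how the authors computed these quantities.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    Assumes at least one positive label and no tied scores.
    """
    # Rank cases from the highest to the lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at each newly recalled positive
    return total / sum(y_true)


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def brier_loss(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]):
    """Return (R-squared, mean absolute error, mean squared error)."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return 1 - ss_res / ss_tot, mae, ss_res / n


# Hypothetical classification example: 1 = depression at T2, 0 = no depression.
labels = [1, 0, 1, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
preds = [1 if p >= 0.5 else 0 for p in probs]  # assumed 0.5 threshold

ap = average_precision(labels, probs)
tp, fp, tn, fn = confusion_counts(labels, preds)
bal_acc = 0.5 * (tp / (tp + fn) + tn / (tn + fp))  # mean of sensitivity and specificity
f1 = 2 * tp / (2 * tp + fp + fn)
brier = brier_loss(labels, probs)

# Hypothetical regression example: observed vs. predicted depression scores.
observed = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
r2, mae, mse = regression_metrics(observed, predicted)

print(f"AUPRC={ap:.3f} BalAcc={bal_acc:.3f} F1={f1:.3f} Brier={brier:.3f}")
print(f"R2={r2:.3f} MAE={mae:.3f} MSE={mse:.3f}")
```

AUPRC and Brier loss are computed from the predicted probabilities, whereas Balanced Accuracy and F1 depend on the thresholded predictions, so the latter two change with the chosen cut-off.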

Published in Frontiers in Psychiatry

ISSN: 1664-0640 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Internal medicine: Neurosciences. Biological psychiatry. Neuropsychiatry: Neurology. Diseases of the nervous system: Psychiatry
Website: https://www.frontiersin.org/journals/psychiatry
