Interactive Journal of Medical Research (Dec 2024)
Predicting Depressive Symptoms Using GPS-Based Regional Data in Germany With the CORONA HEALTH App During the COVID-19 Pandemic: Cross-Sectional Study
Abstract
Background: Numerous studies have been conducted to predict depressive symptoms using passive smartphone data, mostly integrating the GPS signal as a measure of mobility. Environmental factors have been identified as correlated with depressive symptoms in specialized studies both before and during the pandemic.

Objective: This study combined a data-based approach using passive smartphone data to predict self-reported depressive symptoms with a wide range of GPS-based environmental factors as predictors.

Methods: The CORONA HEALTH app was developed for the purpose of data collection, and this app enabled the collection of both survey and passive data via smartphone. After obtaining informed consent, we gathered GPS signals at the time of study participation and evaluated depressive symptoms in 249 Android users with the Patient Health Questionnaire-9. The only GPS-based data collected were the participants’ location at the time of the questionnaire, which was used to assign participants to the nearest district for linking regional sociodemographic data. Data collection took place from July 2020 to February 2021, coinciding with the COVID-19 pandemic. Using GPS data, each dataset was linked to a wide variety of data on regional sociodemographic, geographic, and economic characteristics describing the respondent’s environment, which were derived from a publicly accessible database from official German statistical offices. Moreover, pandemic-specific predictors such as the current pandemic phase or the number of new regional infections were matched via GPS. For the prediction of individual depressive symptoms, we compared 3 models (ie, ridge, lasso, and elastic net regression) and evaluated the models using 10-fold cross-validation.
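The Methods state that each participant's single GPS fix was assigned to the nearest district so that regional data could be linked. The abstract does not say how "nearest" was computed. The following is a minimal sketch, assuming great-circle (haversine) distance to district centroids. The centroid table, its approximate coordinates, and the function names are illustrative assumptions, not the study's data or code.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical lookup table: district name -> approximate centroid (lat, lon).
# A real linkage would use official district geometries or identifiers.
DISTRICT_CENTROIDS = {
    "Wuerzburg": (49.79, 9.93),
    "Berlin": (52.52, 13.40),
    "Muenchen": (48.14, 11.58),
}

def nearest_district(lat, lon, centroids=DISTRICT_CENTROIDS):
    """Return the name of the district whose centroid is closest to (lat, lon)."""
    return min(centroids, key=lambda name: haversine_km(lat, lon, *centroids[name]))

if __name__ == "__main__":
    # A single GPS fix recorded when the questionnaire was answered (made-up values).
    print(nearest_district(49.9, 10.0))
```

The returned district name would then serve as the join key for regional indicators and district-level infection counts. Centroid distance is only one possible reading of "nearest"; a point-in-polygon lookup against district boundaries would be an alternative.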
Results: The final elastic net regression model showed the highest explained variance (R²=0.06) and reduced the dataset from 121 to 9 variables, the 3 main predictors being current COVID-19 infections in the respective district, the number of places in nursing homes, and the proportion of fathers receiving parental benefits. The number of places in nursing homes refers to the availability of care facilities for the elderly, which may indicate regional population characteristics that influence mental health. The proportion of fathers receiving parental benefits reflects family structure and work-life balance, which could impact stress and mental well-being during the pandemic.

Conclusions: Passive data describing the environment contributed to the prediction of individual depressive symptoms and revealed regional risk and protective factors that may be of interest without their inclusion in routine assessments being costly.
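The reduction from 121 to 9 variables is the kind of behavior an L1 penalty produces: ridge regression only shrinks coefficients, whereas lasso and elastic net can set weak ones exactly to zero. The sketch below illustrates this on a tiny made-up design with one strong and one weak predictor. It assumes the common parameterization with an overall penalty strength `alpha` and a mixing weight `l1_ratio` (0 gives ridge, 1 gives lasso), fitted by coordinate descent on centered data without an intercept. The data, parameter values, and function names are illustrative assumptions and do not reproduce the study's models or its 10-fold cross-validation.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def elastic_net_fit(X, y, alpha, l1_ratio, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1 - l1_ratio)*||w||^2.
    Assumes centered columns and centered y (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1 - l1_ratio))
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w

def count_nonzero(w):
    """Number of predictors the model keeps."""
    return sum(1 for coef in w if coef != 0.0)

# Toy data: column 0 is strongly related to y, column 1 only weakly.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]

if __name__ == "__main__":
    for name, l1_ratio in [("ridge", 0.0), ("elastic net", 0.5), ("lasso", 1.0)]:
        w = elastic_net_fit(X, y, alpha=0.2, l1_ratio=l1_ratio)
        print(name, count_nonzero(w), w)
    # Ridge keeps both predictors; elastic net and lasso drop the weak one.
```

In practice, `alpha` and `l1_ratio` would be chosen by cross-validation rather than fixed by hand, and the number of retained variables depends on those choices.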