Preventive Medicine Reports (Feb 2022)
Joint risk prediction for hazardous use of alcohol, cannabis, and tobacco among adolescents: A preliminary study using statistical and machine learning
Abstract
For some, substance use during adolescence may be a stepping stone toward substance use disorders in adulthood. Risk prediction models may help identify adolescent users at elevated risk for hazardous substance use. This preliminary analysis used cross-sectional data (n = 270, ages 13–18) from the baseline dataset of a randomized controlled trial of an intervention for adolescent alcohol and/or cannabis use. Models were developed to jointly predict quantitative scores on three measures of hazardous substance use (Rutgers Alcohol Problems Index, Adolescent Cannabis Problem Questionnaire, and Hooked on Nicotine Checklist) from personal risk factors, using two statistical and machine learning methods: multivariate covariance generalized linear models (MCGLM) and penalized multivariate regression with a lasso penalty. The predictive accuracy of each model was evaluated using root mean squared error computed via leave-one-out cross-validation. The final proposed model was an MCGLM with eleven risk factors: age, early life stress, age of first tobacco use, age of first cannabis use, lifetime use of other substances, age of first use of other substances, maternal education, parental attachment, family cigarette use, family history of hazardous alcohol use, and family history of hazardous cannabis use. Different subsets of these risk factors feature in the three outcome-specific components of this joint model. The quantitative risk estimate provided by the proposed model may help identify adolescent users of cannabis, alcohol, and tobacco who may be at elevated risk of developing hazardous substance use.
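The evaluation criterion named above, root mean squared error under leave-one-out cross-validation, can be illustrated with a minimal sketch. This is not the study's code: ordinary least squares stands in for the MCGLM and lasso-penalized models, the data are synthetic, and the function name `loocv_rmse` and all variable names are hypothetical.

```python
import numpy as np


def loocv_rmse(X, Y):
    """Per-outcome RMSE from leave-one-out cross-validation.

    X: (n, p) matrix of risk factors; Y: (n, k) matrix of outcome scores.
    ASSUMPTION: ordinary least squares with an intercept is used only as a
    placeholder for the models compared in the study.
    """
    n = X.shape[0]
    X1 = np.column_stack([np.ones(n), X])  # prepend an intercept column
    errors = np.empty(Y.shape, dtype=float)
    for i in range(n):
        keep = np.arange(n) != i  # hold out participant i
        coef, *_ = np.linalg.lstsq(X1[keep], Y[keep], rcond=None)
        errors[i] = Y[i] - X1[i] @ coef  # prediction error on the held-out row
    return np.sqrt(np.mean(errors ** 2, axis=0))


# Synthetic example (not study data): 60 participants, 4 predictors, 3 outcomes.
rng = np.random.default_rng(0)
n, p, k = 60, 4, 3
X = rng.normal(size=(n, p))
B = rng.normal(size=(p, k))
Y = X @ B + rng.normal(scale=0.5, size=(n, k))

rmse = loocv_rmse(X, Y)
print(rmse)  # one RMSE value per outcome
```

Each participant is held out once, the model is refit on the remaining n − 1, and the squared prediction errors are averaged per outcome, so a joint model yields one RMSE for each of its outcomes.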