Journal of Clinical Medicine (Mar 2021)

Machine Learning-Based Nicotine Addiction Prediction Models for Youth E-Cigarette and Waterpipe (Hookah) Users

  • Jeeyae Choi,
  • Hee-Tae Jung,
  • Anastasiya Ferrell,
  • Seoyoon Woo,
  • Linda Haddad

DOI
https://doi.org/10.3390/jcm10050972
Journal volume & issue
Vol. 10, no. 5
p. 972

Abstract

Despite its harmful effects on health, e-cigarette and hookah smoking among youth in the U.S. has increased. Developing tailored e-cigarette and hookah cessation programs for youth is imperative. The aim of this study was to identify predictor variables, such as social, mental, and environmental determinants, that cause nicotine addiction in youth e-cigarette or hookah users and to build nicotine addiction prediction models using machine learning algorithms. A total of 6511 participants were identified as ever having used e-cigarettes or hookah from the National Youth Tobacco Survey (2019) datasets. Prediction models were built by Random Forest with ReliefF and Least Absolute Shrinkage and Selection Operator (LASSO). ReliefF identified important predictor variables, and the Davies–Bouldin clustering evaluation index selected the optimal number of predictors for Random Forest. A total of 193 predictor variables were included in the final analysis. Performance of prediction models was measured by Root Mean Square Error (RMSE) and Confusion Matrix. The results suggested high prediction performance. The identified predictor variables were aligned with previous research. The novel predictors found, such as ‘witnessed e-cigarette use in their household’ and ‘perception of their tobacco use’, could inform public awareness campaigns, targeted e-cigarette and hookah education for youth, and policymaking.
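
As an illustration of the two evaluation measures named in the abstract, the following minimal Python sketch computes an RMSE and a confusion matrix for a handful of made-up scores. It is not the authors' code: the 0 to 2 addiction scale, the sample values, and the function names `rmse` and `confusion_matrix` are assumptions introduced here for illustration only, and rounding a continuous prediction to the nearest level is just one possible way to obtain class labels.

```python
import math
from collections import Counter


def rmse(y_true, y_pred):
    """Root Mean Square Error between observed and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def confusion_matrix(y_true, y_pred, labels):
    """Rows are observed classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical data: observed addiction levels on an assumed 0-2 scale
# and continuous model predictions for six made-up participants.
observed = [0, 1, 2, 1, 0, 2]
predicted = [0.2, 0.9, 1.4, 1.4, 0.1, 2.3]

# RMSE is computed on the continuous predictions.
error = rmse(observed, predicted)

# For the confusion matrix, predictions are rounded to the nearest level
# (an assumption of this sketch, not a step stated in the abstract).
predicted_levels = [round(p) for p in predicted]
matrix = confusion_matrix(observed, predicted_levels, labels=[0, 1, 2])

print(f"RMSE: {error:.3f}")
for row in matrix:
    print(row)
```

A lower RMSE indicates predictions closer to the observed scores, while the off-diagonal cells of the confusion matrix show which levels the model confuses with one another.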

Keywords