ITM Web of Conferences (Jan 2024)

Statistical Data Mining Methods in Predicting Happiness and Habits

  • Sulaiman Sazan Kamal,
  • Jghef Yousif Sufyan,
  • Abdullah Abdulqadir Ismail,
  • Ahmed Saadaldeen Rashid

DOI
https://doi.org/10.1051/itmconf/20246401019
Journal volume & issue
Vol. 64
p. 01019

Abstract

The objective of this study is to apply statistical data mining methods to a survey conducted among young people in order to build a model capable of predicting overall happiness. The model considers more than a hundred characteristics, including lifestyle choices and musical tastes. We used boosted trees, subset selection, and generalized additive models (GAMs). All of the approaches identified several lifestyle variables as significant determinants of happiness, including energy levels, loneliness, the desire to change the past, eating properly, and spending time with friends. We also generated real test data to validate the model, using rigorous testing protocols to evaluate its predictive accuracy and its applicability across demographic groups. In our investigation, gradient boosting produced improved predictions. Evaluation of the technique with a confusion matrix showed an accuracy of 97.1% for training and 100% for validation. The training phase achieved an accuracy of 62.5%, as shown by the confusion matrix, while the overall confusion matrix showed 92.0% accuracy in predicting happiness. The support vector machine, trained incrementally, showed promise for future investigation.

Keywords