Statistical Data Mining Methods in Predicting Happiness and Habits

Sulaiman Sazan Kamal; Jghef Yousif Sufyan; Abdullah Abdulqadir Ismail; Ahmed Saadaldeen Rashid

doi:10.1051/itmconf/20246401019

ITM Web of Conferences (Jan 2024)

Affiliations

Sulaiman Sazan Kamal: College of Engineering, Department of Computer Engineering, Knowledge University
Jghef Yousif Sufyan: College of Engineering, Department of Computer Engineering, Knowledge University
Abdullah Abdulqadir Ismail: College of Engineering, Department of Computer Engineering, Knowledge University
Ahmed Saadaldeen Rashid: Computer Science Department, Bayan University

DOI: https://doi.org/10.1051/itmconf/20246401019
Journal volume & issue: Vol. 64
p. 01019

Abstract

The objective of this study is to employ statistical data mining methods, together with a survey conducted among young individuals, to construct a model capable of forecasting overall happiness. The model considers over a hundred characteristics, including lifestyle choices and musical tastes. We utilized boosting trees, subset selection, and GAM (Generalized Additive Models) techniques. We generated authentic test data to validate the model, using rigorous testing protocols to evaluate its predictive precision and applicability across various demographics. All of the approaches identified several lifestyle variables, including energy levels, loneliness, desire to alter the past, eating properly, and spending time with friends, as significant determinants of happiness. Based on our investigation, the use of the gradient boost technique resulted in improved picture projections. The evaluation of the technique using a confusion matrix revealed an accuracy of 97.1% for training and a perfect accuracy of 100% for validation. The training phase achieved an accuracy of 62.5%, as shown by the confusion matrix, while the overall confusion matrix demonstrated a 92.0% accuracy in predicting happiness. The support vector machine, trained incrementally, demonstrated encouraging prospects for future investigation.
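For readers unfamiliar with how an accuracy figure is read off a confusion matrix, the following is a minimal sketch, not the authors' code. It assumes the usual definition (correctly classified cases on the diagonal divided by all cases). The function name `accuracy` and the counts in `toy` are hypothetical and are not taken from the study.

```python
def accuracy(confusion):
    """Return accuracy as (sum of diagonal cells) / (sum of all cells).

    `confusion` is a square list of lists: rows are actual classes,
    columns are predicted classes.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no observations")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical counts for illustration only (NOT the study's data):
# 40 + 45 cases on the diagonal out of 100 in total.
toy = [[40, 10],
       [5, 45]]

print(f"accuracy = {accuracy(toy):.1%}")  # prints "accuracy = 85.0%"
```

Computing this separately for the training and validation sets gives one accuracy figure per set, which is how per-phase percentages such as those quoted in the abstract are typically obtained.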

Published in ITM Web of Conferences

ISSN: 2271-2097 (Online)
Publisher: EDP Sciences
Country of publisher: France
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.itm-conferences.org/
