Information (Oct 2022)

Statistical Machine Learning Regression Models for Salary Prediction Featuring Economy Wide Activities and Occupations

  • Yasser T. Matbouli
  • Suliman M. Alghamdi

DOI
https://doi.org/10.3390/info13100495
Journal volume & issue
Vol. 13, no. 10
p. 495

Abstract

A holistic occupational and economy-wide framework for salary prediction is developed and tested using statistical machine learning (ML). Predictive models are built on occupational features and organizational characteristics. Five supervised ML algorithms are trained on survey data from the Saudi Arabian labor market to estimate mean annual salary across economic activities and major occupational groups. In predicting mean salary across economic activities, Bayesian Gaussian process regression markedly improved R² over multiple linear regression (from 0.50 to 0.98). It also yielded lower errors: root-mean-square error was reduced by 80% and mean absolute error by almost 90% compared to multiple linear regression. For salary prediction across major occupational groups, however, artificial neural networks performed best in terms of both R² and error: R² improved from 0.62 with multiple linear regression to 0.94, and errors were reduced by approximately 60%. The proposed framework can help estimate annual salary levels across different types of economic activities and organization sizes, as well as different occupations.
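
The abstract compares models by R², root-mean-square error (RMSE), and mean absolute error (MAE), and reports relative error reductions. The sketch below shows how these quantities are commonly computed, using the standard textbook definitions. It is not the authors' code. The function names and the salary figures are hypothetical and chosen only for illustration; they are not data from the paper.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline, improved):
    """Fraction by which an error metric fell relative to a baseline."""
    return (baseline - improved) / baseline


# Hypothetical mean salaries (in thousands); NOT values from the study.
actual = [50.0, 60.0, 70.0, 80.0]
baseline_pred = [58.0, 54.0, 78.0, 70.0]   # stand-in for a weaker model
improved_pred = [51.0, 59.0, 71.0, 79.0]   # stand-in for a stronger model

for name, pred in [("baseline", baseline_pred), ("improved", improved_pred)]:
    print(name, r2_score(actual, pred), rmse(actual, pred), mae(actual, pred))

print("RMSE reduction:",
      relative_reduction(rmse(actual, baseline_pred), rmse(actual, improved_pred)))
print("MAE reduction:",
      relative_reduction(mae(actual, baseline_pred), mae(actual, improved_pred)))
```

A statement such as "RMSE was reduced by 80%" corresponds to `relative_reduction` returning 0.80 when the baseline is the multiple linear regression error.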

Keywords