Multiple imputation methods: a case study of daily gold price

Ala Alrawajfi; Mohd Tahir Ismail; Sadam Al Wadi; Saleh Atiewi; Ahmad Awajan

doi:10.7717/peerj-cs.2337

PeerJ Computer Science (Sep 2024)

Affiliations

Ala Alrawajfi: School of Mathematical Science, Universiti Sains Malaysia, Penang, Penang, Malaysia
Mohd Tahir Ismail: School of Mathematical Science, Universiti Sains Malaysia, Penang, Penang, Malaysia
Sadam Al Wadi: College of Business, The University of Jordan, Amman, Amman, Jordan
Saleh Atiewi: Department of Computer Science, Al-Hussein Bin Talal University, Maan, Maan, Jordan
Ahmad Awajan: Department of Mathematics, Al-Hussein Bin Talal University, Maan, Maan, Jordan

DOI: https://doi.org/10.7717/peerj-cs.2337
Journal volume & issue: Vol. 10, p. e2337

Abstract

Data imputation strategies are needed to address the common problem of missing values that arises when data are observed and recorded. This work applies several imputation methods to estimate and fill in missing values in a financial time-series dataset, specifically the daily prices of gold. To assess robustness, the accuracy of the imputed data is measured against the original complete dataset. The imputation methods are validated using actual closing price data obtained from a daily gold price website. The examined approaches include mean imputation, k-nearest neighbor (KNN), hot deck, random forest, support vector machine (SVM), and spline imputation. Their performance is evaluated using several metrics: mean error (ME), mean absolute error (MAE), root mean square error (RMSE), mean percentage error (MPE), and mean absolute percentage error (MAPE). The results indicate that the KNN approach consistently outperforms the other methods on all accuracy measures. Nevertheless, the accuracy of all techniques decreases as the proportion of missing data rises. The KNN approach is therefore recommended because of its strong performance and dependability in imputation tasks.
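
As an illustration of the evaluation described above, the following minimal Python sketch computes the five metrics named in the abstract for values that were masked and then imputed. It is not the authors' code. The function names `imputation_errors` and `mean_impute`, the sign convention (error = actual minus imputed), and the five-point toy series are all assumptions made for this example, and the numbers are not gold price data from the study. Mean imputation is used only because it is the simplest of the methods listed.

```python
import math


def imputation_errors(actual, imputed):
    """Return ME, MAE, RMSE, MPE and MAPE for paired actual/imputed values."""
    if len(actual) != len(imputed) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumed sign convention: error = actual - imputed.
    errors = [a - p for a, p in zip(actual, imputed)]
    # Percentage errors are relative to the actual value, which must be nonzero.
    pct = [100.0 * e / a for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MPE": sum(pct) / n,
        "MAPE": sum(abs(p) for p in pct) / n,
    }


def mean_impute(series):
    """Replace each None with the mean of the observed (non-None) values."""
    observed = [x for x in series if x is not None]
    fill = sum(observed) / len(observed)
    return [fill if x is None else x for x in series]


# Hypothetical toy series, not data from the study.
complete = [100.0, 102.0, 101.0, 105.0, 104.0]
masked = [100.0, None, 101.0, None, 104.0]  # two values removed on purpose

filled = mean_impute(masked)
missing_idx = [i for i, x in enumerate(masked) if x is None]

# Score only the positions that were actually imputed.
scores = imputation_errors(
    [complete[i] for i in missing_idx],
    [filled[i] for i in missing_idx],
)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Scoring only the masked positions keeps the observed values, which are reproduced exactly, from diluting the error. Repeating the procedure with a larger share of masked values would be one way to examine how accuracy changes as the proportion of missing data rises.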

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
