Journal of Agrometeorology (Sep 2024)
Univariate and multivariate imputation methods evaluation for reconstructing climate time series data: A case study of Mosul station-Iraq
Abstract
Comprehensive climate time series data are indispensable for monitoring the impacts of climate change. However, observational datasets often contain gaps, which must be imputed before the data can be analyzed further. This study evaluated six univariate and multivariate imputation methods for infilling missing values. The methods were applied to subsets of the precipitation, temperature, and relative humidity time series from Mosul station spanning 1980–2013. Artificial gaps of 5%, 10%, 20%, and 30% missing observations were introduced under three scenarios: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). Performance was evaluated using the root mean square error (RMSE) and the Kling-Gupta Efficiency (KGE). Results revealed that seasonal decomposition was the most effective univariate imputation method across all variables. Among the multivariate methods, kNN performed best in infilling missing precipitation data under MCAR, while norm.predict performed best for missing temperature data under all missingness scenarios. Moreover, missForest was identified as the most suitable method for infilling missing relative humidity data. The methodology of this study offers guidance on selecting appropriate imputation methods for other climate stations, thereby improving the accuracy of analyses of climate change effects.
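The evaluation design described above (remove known values, impute them, then score the imputed values against the withheld ones) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the study's own code: the series is synthetic, only the MCAR scenario is shown, linear interpolation stands in as a simple univariate imputer, the KGE is computed in its original 2009 form, and the function names are hypothetical.

```python
import math
import random


def rmse(obs, sim):
    """Root mean square error between withheld and imputed values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def kge(obs, sim):
    """Kling-Gupta Efficiency (assumed 2009 form); 1 means a perfect match."""
    n = len(obs)
    mu_o = sum(obs) / n
    mu_s = sum(sim) / n
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n
    r = cov / (sd_o * sd_s)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def make_mcar_gaps(series, fraction, rng):
    """Pick indices to remove completely at random (first and last kept)."""
    k = round(fraction * len(series))
    return sorted(rng.sample(range(1, len(series) - 1), k))


def interpolate(series, missing):
    """Fill each gap linearly between its nearest observed neighbours."""
    gaps = set(missing)
    filled = list(series)
    for i in missing:
        lo = i - 1
        while lo in gaps:
            lo -= 1
        hi = i + 1
        while hi in gaps:
            hi += 1
        w = (i - lo) / (hi - lo)
        filled[i] = series[lo] + w * (series[hi] - series[lo])
    return filled


# Synthetic monthly "temperature" series, 34 years x 12 months (not real data).
rng = random.Random(42)
truth = [20 + 12 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 1)
         for t in range(408)]

results = {}
for fraction in (0.05, 0.10, 0.20, 0.30):
    missing = make_mcar_gaps(truth, fraction, rng)
    filled = interpolate(truth, missing)
    obs = [truth[i] for i in missing]    # withheld true values
    sim = [filled[i] for i in missing]   # imputed values at the same positions
    results[fraction] = (rmse(obs, sim), kge(obs, sim))
    print(f"{fraction:.0%} missing: RMSE={results[fraction][0]:.2f}, "
          f"KGE={results[fraction][1]:.2f}")
```

Scoring only the withheld positions keeps the metrics from being inflated by the untouched observations. MAR and MNAR gaps would need a different gap generator, for example one whose removal probability depends on another variable or on the value itself.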
Keywords