Atmosphere (Dec 2023)
Research on Missing Value Imputation to Improve the Validity of Air Quality Data Evaluation on the Qinghai-Tibetan Plateau
Abstract
In the Qinghai-Tibet Plateau region, operational deficiencies and limited maintenance capacities often impair automatic air quality monitoring stations. This results in frequent data omissions, compromising the reliability of environmental assessment data. Therefore, an effective data imputation method is required to address the gaps in observational records. Utilizing a Sequence-to-Sequence framework, we introduce a model termed Bidirectional Recurrent Imputation for Time Series-Attention-based Long Short-Term Memory (BRITS-ALSTM). The encoder of BRITS-ALSTM applies BRITS to integrate single-station historical characteristics with multi-station correlation features. Concurrently, the decoder employs LSTM within an attention mechanism to capitalize on previously observed data, thereby generating hourly imputations for missing air quality data values. The model was trained using six types of air quality data from 16 stations across Qinghai Province. Through localized testing and parameter optimization, BRITS-ALSTM achieved a reduction in mean relative error (MRE) by 74.88% compared to the baseline mean-filling approach. Additionally, ablation studies demonstrated an improvement in the coefficient of determination R-squared (R2) from 0.67 to 0.76, outperforming the standalone BRITS. Consequently, BRITS-ALSTM enhances the accuracy of air quality data evaluations in the Tibetan Plateau and offers an efficacious strategy for data imputation in elevated terrains.
Keywords