Scientific Reports (Nov 2024)
A hybrid model for missing traffic flow data imputation based on clustering and attention mechanism optimizing LSTM and AdaBoost
Abstract
Reliable traffic flow data is crucial for traffic management and planning, and it is also the foundation for many intelligent applications. In practice, however, traffic flow data is often missing, so we propose an imputation model for missing traffic flow data that copes with the randomness and instability of traffic flow. First, k-means clustering groups road segments whose traffic flow follows the same pattern, so that the spatial characteristics of the roads are fully used. Then, LSTM networks optimized with an attention mechanism serve as base learners to extract the temporal dependence of the traffic flow. Finally, the AdaBoost algorithm integrates all the LSTM-attention networks into a reinforced learner that imputes the missing data. To validate the proposed model, we use the PeMS dataset, impute data with missing rates from 10% to 60% under three missing modes, and compare against multiple baseline models. The results confirm that the proposed model improves the stability and accuracy of traffic flow imputation across these scenarios.
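The clustering stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans_labels`, `impute_by_cluster`), the synthetic morning and evening profiles, the 30% random-missing mask, and the use of a complete `history` matrix for clustering are all assumptions made for illustration. In particular, the LSTM-attention base learners and the AdaBoost ensemble are replaced here by a plain per-cluster mean at each time step, so the example only shows how grouping road segments by flow pattern lets a gap be filled from similar segments.

```python
import numpy as np

def kmeans_labels(profiles, k, n_iter=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [profiles[0]]
    for _ in range(1, k):
        # Squared distance of every profile to its nearest chosen center.
        dist = np.min([((profiles - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(profiles[dist.argmax()])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(profiles), dtype=int)
    for _ in range(n_iter):
        d = ((profiles[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = profiles[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels

def impute_by_cluster(observed, history, k):
    """Fill NaNs in `observed` with the mean of same-cluster segments.

    Stand-in for the LSTM-attention + AdaBoost stage (assumption):
    clusters are learned on complete historical profiles.
    """
    labels = kmeans_labels(history, k)
    filled = observed.copy()
    for j in range(k):
        rows = np.where(labels == j)[0]
        block = observed[rows]
        counts = (~np.isnan(block)).sum(axis=0)   # observed values per time step
        sums = np.nansum(block, axis=0)
        cluster_mean = sums / np.maximum(counts, 1)
        for r in rows:
            gaps = np.isnan(observed[r])
            # Fall back to the segment's own history if the whole cluster is missing.
            fallback = np.where(counts > 0, cluster_mean, history[r])
            filled[r, gaps] = fallback[gaps]
    return filled, labels

# Synthetic example: three morning-peak and three evening-peak segments.
t = np.arange(24)
morning = 100 * np.exp(-0.5 * ((t - 8) / 2.0) ** 2)
evening = 100 * np.exp(-0.5 * ((t - 18) / 2.0) ** 2)
history = np.array([morning * s for s in (0.9, 1.0, 1.1)]
                   + [evening * s for s in (0.9, 1.0, 1.1)])

observed = history.copy()
rng = np.random.default_rng(0)
mask = rng.random(observed.shape) < 0.3   # 30% missing completely at random
observed[mask] = np.nan

filled, labels = impute_by_cluster(observed, history, k=2)
```

In the model summarized above, the per-cluster mean would be replaced by the boosted LSTM-attention learners trained on each group; the sketch keeps only the grouping step so that it stays short and dependency-light.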
Keywords