Scientific Reports (Dec 2023)
Selection and prediction of metro station sites based on spatial data and random forest: a study of Lanzhou, China
Abstract
The layout of urban metro stations strongly affects urban economic development, congestion relief, and traffic efficiency. Taking the urban area of Lanzhou as an example, this study evaluates the site suitability of the built stations on rail transit lines 1 and 2 using multi-source heterogeneous spatial data, through data collection, feature matrix construction, random forest modelling, and K-fold cross-validation. After the accuracy of the model was verified, the mean decrease in Gini impurity was used to examine the contribution rate of each feature indicator. The main findings are as follows: (1) The random forest model, built from the existing metro stations and the selected factors, was tested with K-fold cross-validation. Under tenfold cross-validation, the average accuracy on the test folds and on the out-of-bag (OOB) data was 89.62% and 91.285%, respectively, and the area under the ROC curve (AUC) was 0.9823. This indicates that the 19 factors selected from the perspectives of urban functional structure, social economy, and natural environment are closely associated with the locations of metro stations in the study area, and that the prediction results are reliable; (2) Superimposing the built station sites on the prediction results and summing the predicted probability values within a 300 m buffer zone shows that more than half of the built station sites agree well with the predicted sites in geographical location; (3) Based on the contribution rate of each indicator to the model, transport facilities, companies, population density, night lighting, science, education and culture, residential communities, and road network density are the primary influencing factors, each accounting for over 6.6%. Land use, elevation, and slope contribute relatively less. The results provide important information for the site selection and planning of the local metro.
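The two evaluation ideas named in the abstract, Gini reduction and K-fold cross-validation, can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' implementation: the function names, the toy labels, and the sample counts below are all hypothetical. In a random forest, the importance of a feature is typically obtained by accumulating impurity decreases of this kind over every node that splits on the feature and averaging across trees.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds (no shuffling)."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical toy labels: 1 = station site, 0 = non-station site.
parent = [1, 1, 1, 0, 0, 0, 0, 1]
left = [1, 1, 1, 1]   # e.g. cells above a (hypothetical) road-density threshold
right = [0, 0, 0, 0]  # cells below that threshold

perfect_split_gain = gini_decrease(parent, left, right)

# Tenfold split of a hypothetical set of 25 samples.
folds = list(k_fold_indices(n_samples=25, k=10))
```

Here `perfect_split_gain` is the largest decrease possible for a balanced two-class parent, because both children are pure. Each sample appears in exactly one test fold, which is what allows the per-fold accuracies to be averaged into a single figure.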