Geo-spatial Information Science (Jun 2024)

Research on multiscale OpenStreetMap in China: data quality assessment with EWM-TOPSIS and GDP modeling

  • Chuqiao Han
  • Binbin Lu
  • Jianghua Zheng
  • Danlin Yu
  • Shudan Zheng

DOI
https://doi.org/10.1080/10095020.2024.2356238

Abstract

OpenStreetMap (OSM) is a volunteer-driven platform designed to provide free, up-to-date geographic data. Because OSM is built from multisource geographic data contributed by the public, the quality of the data has become a concern for researchers. In this study, a unified measure for evaluating OSM data quality was constructed, and the entropy weight method (EWM) and the Technique for Order Performance by Similarity to Ideal Solution (TOPSIS) model were used to evaluate the quality of multiscale OSM data in China from 2014 to 2020. In addition to evaluating data quality, the use of OSM data quality index factors in economic modeling at different spatial scales in China was explored with a geographic information system (GIS) analysis method and a geographically and temporally weighted regression (GTWR) model. Four machine learning models, support vector machine (SVM), random forest (RF), XGBoost and CatBoost, were used to simulate grid-scale GDP, and the effectiveness of these simulations was discussed. The results showed that (1) the weights of the OSM data quality indicator factors vary across spatial scales. (2) From 2014 to 2020, the quality of national-scale OSM data first increased, then decreased and then gradually stabilized. In addition, the quality of OSM data differs significantly at the provincial and municipal scales, and its distribution is affected by population and geographical environment. (3) Over time, the spatial clustering characteristics of OSM data quality at different spatial scales in China have continuously strengthened. In addition, the quality of Chinese OSM data displays obvious local spatial autocorrelation characteristics, which are dominated by H-H clustering and L-L clustering. (4) The GTWR model performs well in predicting and revealing the spatiotemporal correlation characteristics between GDP and OSM data quality indicators at the provincial and municipal scales in China. The correlations increase with decreasing spatial scale (provincial to municipal). Moreover, GDP modeling performance is better in economically underdeveloped Northwest China and economically developed East China. (5) The four machine learning models, coupled with road network length completeness, relative linear density, road name attribute completeness, POI name attribute completeness, road network accuracy, road network update frequency, POI update frequency, topological consistency and directional similarity, yielded the best grid-scale GDP simulation values. Notably, the CatBoost model provided the best accuracy, and the results further verified that it is feasible to use the proposed OSM data quality index system to predict the regional economic development level in China.
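
To illustrate the EWM-TOPSIS evaluation named in the abstract, the following is a minimal sketch of the generic textbook procedure, not the authors' implementation. It assumes min-max normalization and treats every indicator as benefit-type (larger is better); the abstract does not state which normalization or indicator directions the study used. The function names (`min_max`, `entropy_weights`, `topsis_scores`) and the sample values in `indicators` are hypothetical and chosen only for illustration.

```python
import math


def min_max(rows):
    """Min-max normalize each indicator column to [0, 1] (assumed scheme)."""
    cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to zeros.
        cols.append([(x - lo) / span if span else 0.0 for x in col])
    return [list(row) for row in zip(*cols)]


def entropy_weights(norm):
    """Entropy weight method: indicators that vary more get larger weights."""
    n = len(norm)
    divergences = []
    for col in zip(*norm):
        total = sum(col)
        if total == 0 or n < 2:
            divergences.append(0.0)
            continue
        entropy = 0.0
        for value in col:
            p = value / total
            if p > 0:  # convention: 0 * ln(0) = 0
                entropy -= p * math.log(p)
        entropy /= math.log(n)
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0:
        # No indicator discriminates; fall back to equal weights.
        return [1.0 / len(divergences)] * len(divergences)
    return [d / spread for d in divergences]


def topsis_scores(norm, weights):
    """Relative closeness of each row to the ideal solution (0 = worst, 1 = best)."""
    weighted = [[w * x for w, x in zip(weights, row)] for row in norm]
    best = [max(col) for col in zip(*weighted)]
    worst = [min(col) for col in zip(*weighted)]
    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_worst = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        denom = d_best + d_worst
        scores.append(d_worst / denom if denom else 0.0)
    return scores


# Made-up indicator values for three regions (rows) and three indicators (columns).
indicators = [
    [0.9, 0.8, 0.5],
    [0.6, 0.4, 0.5],
    [0.3, 0.1, 0.5],
]
normalized = min_max(indicators)
weights = entropy_weights(normalized)
scores = topsis_scores(normalized, weights)
```

In this sketch the third indicator is identical for every region, so it receives zero weight, which reflects the idea behind entropy weighting: an indicator that does not differ between regions cannot help rank them.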

Keywords