IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2020)
Disaggregating County-Level Census Data for Population Mapping Using Residential Geo-Objects With Multisource Geo-Spatial Data
Abstract
Accurate spatialization of socioeconomic data helps in understanding the spatial and temporal distribution of human social development and thus effectively supports future scientific decision-making. This study focuses on population mapping, a classical case of spatializing macro-level socioeconomic statistics. Traditional population mapping based on coarse grids or administrative divisions such as townships often falls short in both the accuracy of the spatial pattern and the accuracy of prediction. In this article, we therefore employ residential geo-objects as the basic mapping units and formalize the problem as a spatial prediction process using machine-learning (ML) methods with high-spatial-resolution (HSR) satellite remote sensing images and multisource geospatial data. Indicators of population spatial density, including the area of each residential geo-object, building existence index, terrain slope, night light intensity, the density of points of interest (POIs) and of the road network from Internet electronic maps, and locational factors such as the distances from roads and rivers, are jointly used to establish the relationship between these factors and a quantitative index of population density using ML algorithms such as Random Forests and XGBoost. The predicted population density values from the learned nonlinear regression relation are then used to calculate the disaggregation weight of each unit, and the population distribution at the scale of residential geo-objects is obtained under the constraint of the total population statistics. Experiments in a county area show that the methodology can achieve better results than traditional deterministic methods by reproducing a more accurate and finer geographic population distribution pattern. Meanwhile, it is found that the mapping results may benefit from multisource geospatial data, and the methodological framework can therefore be recommended for extension to the spatialization of other socioeconomic data.
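The disaggregation step described in the abstract, in which predicted densities become weights and the county total is preserved, can be illustrated with a minimal sketch. The abstract does not give the exact weight formula, so the sketch assumes that each unit's weight is its predicted density multiplied by its area. The function name `disaggregate_population` and its parameters are hypothetical and are not taken from the article.

```python
from typing import List, Sequence


def disaggregate_population(
    predicted_density: Sequence[float],
    areas: Sequence[float],
    county_total: float,
) -> List[float]:
    """Allocate a county total to units in proportion to assumed weights.

    Assumption (not from the article): weight_i = density_i * area_i.
    """
    if len(predicted_density) != len(areas):
        raise ValueError("predicted_density and areas must have equal length")

    # Clip negative predictions to zero so that no unit gets a negative share.
    raw_weights = [max(d, 0.0) * a for d, a in zip(predicted_density, areas)]
    total_weight = sum(raw_weights)
    if total_weight <= 0:
        raise ValueError("at least one unit must have a positive weight")

    # Normalise the weights so the allocated values sum to the county total.
    return [county_total * w / total_weight for w in raw_weights]


# Three hypothetical residential geo-objects sharing a county total of 1000.
allocated = disaggregate_population(
    predicted_density=[2.0, 1.0, 1.0],
    areas=[100.0, 100.0, 200.0],
    county_total=1000.0,
)
```

Because the weights are normalised, the allocated values always sum to the census total regardless of the absolute scale of the regression output. Only the relative magnitudes of the predicted densities affect the result.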
Keywords