The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (Jun 2024)
Comparison of machine learning and statistical approaches for Digital Elevation Model (DEM) correction: interim results
Abstract
Digital elevation models (DEMs) can be corrected using a variety of techniques. Machine learning and statistical methods are broadly applicable to DEM correction case studies in different landscapes. However, a literature survey did not reveal any research that compared the effectiveness or performance of the two approaches. In this study, we comparatively evaluate three gradient boosted decision tree implementations (XGBoost, LightGBM and CatBoost) and multiple linear regression for the correction of two publicly available global DEMs, Copernicus GLO-30 and ALOS World 3D (AW3D), in Cape Town, South Africa. The training datasets comprise eleven predictor variables: elevation, slope, aspect, surface roughness, topographic position index, terrain ruggedness index, terrain surface texture, vector ruggedness measure, percentage bare ground, urban footprints, and percentage forest cover as an indicator of the overland forest distribution. The target variable (elevation error) was derived with respect to highly accurate airborne LiDAR data. The results presented in this study represent urban/industrial and grassland/shrubland/dense bush landscapes. Although the accuracy of the original DEMs had been degraded by several anomalies, the corrections improved the vertical accuracy across large areas of the landscape. In the urban/industrial and grassland/shrubland landscapes, correction reduced the root mean square error (RMSE) of the original AW3D DEM by more than 70%. The corrections also improved the accuracy of the Copernicus DEM, with RMSE reductions of more than 44% in the urban area and more than 32% in the grassland/shrubland landscape. The gradient boosted decision trees outperformed multiple linear regression in most of the tests.
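The workflow summarised above (derive an elevation error against a reference surface, fit a model of that error from terrain predictors, subtract the predicted error, and report the percentage RMSE reduction) can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, only two of the eleven predictors are mimicked, the error is assumed to be defined as DEM minus reference, and all function names are hypothetical. Multiple linear regression is fitted by ordinary least squares with NumPy; a gradient boosted regressor could be substituted at the fitting step.

```python
import numpy as np


def rmse(error):
    """Root mean square of an array of elevation errors (metres)."""
    return float(np.sqrt(np.mean(np.square(error))))


def rmse_reduction_percent(rmse_before, rmse_after):
    """Percentage reduction in RMSE after correction."""
    return 100.0 * (rmse_before - rmse_after) / rmse_before


def fit_mlr(X, y):
    """Fit multiple linear regression (with intercept) by least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict elevation error from predictors using fitted coefficients."""
    return coef[0] + X @ coef[1:]


# Synthetic illustration only: these values are NOT from the study.
rng = np.random.default_rng(0)
n = 2000
slope = rng.uniform(0, 30, n)         # degrees (assumed predictor)
forest = rng.uniform(0, 100, n)       # percentage forest cover (assumed predictor)
reference = rng.uniform(0, 500, n)    # stands in for LiDAR elevations
true_error = 1.0 + 0.05 * slope + 0.03 * forest + rng.normal(0, 0.5, n)
dem = reference + true_error          # stands in for a global DEM

X = np.column_stack([slope, forest])
train, test = slice(0, 1500), slice(1500, n)

# Target variable: elevation error, assumed here to be DEM minus reference.
coef = fit_mlr(X[train], dem[train] - reference[train])

# Correct the held-out DEM cells by subtracting the predicted error.
corrected = dem[test] - predict_mlr(coef, X[test])

rmse_before = rmse(dem[test] - reference[test])
rmse_after = rmse(corrected - reference[test])
reduction = rmse_reduction_percent(rmse_before, rmse_after)
print(f"RMSE before: {rmse_before:.2f} m, after: {rmse_after:.2f} m, "
      f"reduction: {reduction:.1f}%")
```

The reduction is evaluated on cells held out from fitting, so the reported figure reflects how the correction generalises rather than how well it fits the training cells.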