Ecological Indicators (Oct 2023)
AGTML: A novel approach to land cover classification by integrating automatic generation of training samples and machine learning algorithms on Google Earth Engine
Abstract
The timely, accurate, and automatic acquisition of land cover (LC) information is a prerequisite for detecting LC dynamics and performing ecological analyses. Cloud computing platforms, such as Google Earth Engine, have substantially improved the efficiency and scale of LC classification. However, the lack of sufficient and representative training samples hinders automatic and accurate LC classification. In this study, we propose a new approach that integrates the automatic generation of training samples and machine learning algorithms (AGTML) for LC classification in Heilongjiang Province, China. First, optimal focal radii were determined for different LC types from Landsat 8 imagery, based on focal statistics and unique phenology. Target training samples were then automatically generated using an improved distance measure, SED, a composite of spectral angle distance (SAD) and Euclidean distance (ED). Finally, LC classification was performed using four feature combinations and three machine learning algorithms. According to independent validation data, the automatically generated training samples demonstrated good representativeness and stability across all three classifiers, with an overall accuracy (OA) higher than 86% in every case, and the resulting maps showed highly consistent landscape patterns. Random forest (RF) yielded the highest classification accuracy (92.99% OA). AGTML outperformed GLC-FCS30 in identifying highly fragmented and small-patch regions of the landscape. The AGTML approach was subsequently applied to the Guanzhong Plain using different satellite imagery. The results were consistent and accurate (>96.50% OA), demonstrating that the AGTML approach can be applied to various regions and sensors and has considerable potential for automated LC classification at regional and global scales.
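The two distances that SED combines can be illustrated with a minimal sketch. SAD and ED below follow their standard definitions; the abstract does not state how they are combined into SED, so the `composite_distance` function (a simple product) is a hypothetical stand-in and not the authors' formula. All function names are illustrative.

```python
import numpy as np

def spectral_angle_distance(x, ref):
    """Angle (radians) between a pixel spectrum and a reference spectrum."""
    cos = np.dot(x, ref) / (np.linalg.norm(x) * np.linalg.norm(ref))
    # Clip guards against round-off pushing the cosine outside [-1, 1].
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def euclidean_distance(x, ref):
    """Straight-line distance between the two spectra."""
    return float(np.linalg.norm(x - ref))

def composite_distance(x, ref):
    """ASSUMPTION: product of ED and SAD, used only for illustration.
    The actual SED definition is not given in the abstract."""
    return euclidean_distance(x, ref) * spectral_angle_distance(x, ref)

# A pixel that is a brighter copy of the reference has the same spectral
# shape (SAD close to 0) but a nonzero ED, which is why the two measures
# carry complementary information.
reference = np.array([0.1, 0.2, 0.3])
brighter = 2.0 * reference
print(spectral_angle_distance(brighter, reference))
print(euclidean_distance(brighter, reference))
```

Under this sketch, candidate pixels with a small composite distance to a class reference spectrum would be retained as training samples; the threshold and the reference spectra themselves are left unspecified here because the abstract does not give them.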