Remote Sensing (Sep 2020)

Comparison of Machine Learning Methods for Mapping the Stand Characteristics of Temperate Forests Using Multi-Spectral Sentinel-2 Data

  • Kourosh Ahmadi,
  • Bahareh Kalantar,
  • Vahideh Saeidi,
  • Elaheh K. G. Harandi,
  • Saeid Janizadeh,
  • Naonori Ueda

DOI
https://doi.org/10.3390/rs12183019
Journal volume & issue
Vol. 12, no. 18
p. 3019

Abstract

The estimation and mapping of forest stand characteristics are vital because this information is necessary for sustainable forest management. The present study considers the use of a Bayesian additive regression trees (BART) algorithm as a non-parametric classifier, using Sentinel-2A data and topographic variables to estimate three forest stand characteristics: basal area (m²/ha), stem volume (m³/ha), and stem density (number/ha). The results were compared with those of three other popular machine learning (ML) algorithms, namely the generalised linear model (GLM), K-nearest neighbours (KNN), and support vector machine (SVM). Feature selection was performed on 28 variables, including the multi-spectral bands of the Sentinel-2 satellite, related vegetation indices, and ancillary data (elevation, slope, and a topographic solar-radiation index derived from a digital elevation model (DEM)); the least significant variables were then removed from the datasets by recursive feature elimination (RFE). The study area was a mountainous forest with high biodiversity and an elevation gradient from 26 to 1636 m. An inventory dataset of 1200 sample plots was used for training and testing the algorithms, and the selected predictors were fed into the ML models to predict the forest stand characteristics. The accuracies and certainties of the ML models were assessed by their root mean square error (RMSE), mean absolute error (MAE), and R-squared (R²) values. The results demonstrated that BART generated the best basal area and stem volume predictions, followed by GLM, SVM, and KNN. BART obtained the best RMSE values for both basal area (8.12 m²/ha) and stem volume (29.28 m³/ha) estimation, which establishes the suitability of the BART model for forestry applications. In contrast, KNN produced the highest RMSE values for all stand variable predictions, making it the least accurate method for this specific application. Moreover, the narrow Sentinel-2 bands around the red edge, together with elevation, proved effective for predicting the forest stand characteristics. Therefore, we concluded that the combination of the Sentinel-2 products and topographic variables derived from the PALSAR data used in this study improved the estimation of the forest attributes in temperate forests.
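
The three accuracy measures named in the abstract can be illustrated with a minimal Python sketch. The function names and the plot values below are hypothetical and are not taken from the study. R² is computed here as the coefficient of determination (1 minus the ratio of residual to total sum of squares), which is an assumption; the abstract does not state which definition the authors used.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    absolute_errors = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute_errors) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition of R-squared)."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total


# Hypothetical basal-area values (m2/ha) for five plots; not data from the study.
observed = [20.0, 30.0, 40.0, 50.0, 60.0]
predicted = [22.0, 28.0, 43.0, 47.0, 60.0]

print(f"RMSE = {rmse(observed, predicted):.2f} m2/ha")
print(f"MAE  = {mae(observed, predicted):.2f} m2/ha")
print(f"R2   = {r_squared(observed, predicted):.3f}")
```

RMSE and MAE are expressed in the units of the stand variable, so a lower value indicates a closer fit, whereas R² is unitless and a higher value indicates a closer fit.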

Keywords