Remote Sensing (May 2024)

Chlorophyll-a Estimation in 149 Tropical Semi-Arid Reservoirs Using Remote Sensing Data and Six Machine Learning Methods

  • Victor Oliveira Santos,
  • Bruna Monallize Duarte Moura Guimarães,
  • Iran Eduardo Lima Neto,
  • Francisco de Assis de Souza Filho,
  • Paulo Alexandre Costa Rocha,
  • Jesse Van Griensven Thé,
  • Bahram Gharabaghi

DOI
https://doi.org/10.3390/rs16111870
Journal volume & issue
Vol. 16, no. 11
p. 1870

Abstract

Monitoring algal blooms in freshwater reservoirs through chlorophyll-a (Chla) concentrations is crucial, as these concentrations indicate the trophic condition of the waterbodies. Traditional monitoring methods, however, are expensive and time-consuming. To address this limitation, we conducted a comprehensive investigation of several machine learning models for Chla modeling. We combined in situ water sample data with remote sensing data from the Sentinel-2 satellite, including spectral bands and indices, for large-scale coverage. This approach allowed us to analyze and characterize Chla concentrations across 149 freshwater reservoirs in Ceará, a semi-arid region of Brazil. The implemented machine learning models included k-nearest neighbors, random forest, extreme gradient boosting, the least absolute shrinkage and selection operator (LASSO), and the group method of data handling (GMDH); the GMDH approach in particular has not previously been explored in this context. The forward stepwise approach was used to determine the best subset of input parameters. Using a 70/30 split for the training and testing datasets, the best-performing model was GMDH, which achieved an R² of 0.91, a MAPE of 102.34%, and an RMSE of 20.4 μg/L, values consistent with those reported in the literature. The predicted Chla concentrations were most sensitive to the red, green, and near-infrared bands.
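
The three reported metrics can be read side by side with a small sketch. It assumes the common textbook definitions of RMSE, MAPE, and the coefficient of determination; the abstract does not state the exact formulas used in the paper. The observed and predicted values below are synthetic numbers invented for illustration, not data from the study. They show how a MAPE above 100% can coexist with a high R² when low concentrations are overpredicted by a few μg/L.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error, in the units of the data (e.g. ug/L)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mape(observed, predicted):
    """Mean absolute percentage error, in percent. Observed values must be nonzero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n

def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Synthetic Chla values in ug/L (hypothetical, not from the paper).
observed = [1.0, 2.0, 50.0, 100.0, 200.0]
predicted = [4.0, 6.0, 45.0, 110.0, 190.0]

print(f"RMSE = {rmse(observed, predicted):.2f} ug/L")
print(f"MAPE = {mape(observed, predicted):.2f} %")
print(f"R2   = {r2(observed, predicted):.4f}")
```

In this toy case the two smallest observations contribute relative errors of 300% and 200%, which dominate the MAPE, while their absolute errors are too small to affect R² or RMSE much.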

Keywords