Water (Dec 2022)

Estimating Chlorophyll-a Concentration from Hyperspectral Data Using Various Machine Learning Techniques: A Case Study at Paldang Dam, South Korea

  • GwangMuk Im,
  • Dohyun Lee,
  • Sanghun Lee,
  • Jongsu Lee,
  • Sungjong Lee,
  • Jungsu Park,
  • Tae-Young Heo

DOI
https://doi.org/10.3390/w14244080
Journal volume & issue
Vol. 14, no. 24
p. 4080

Abstract

Algal blooms have been observed worldwide and seriously affect industries that rely on water resources, harming both people and the environment. For this reason, algae warning systems monitor cyanobacterial cell counts and chlorophyll-a concentration. Several recent studies have used multispectral or hyperspectral data to estimate chlorophyll concentration. In the present study, a comparative approach was applied to estimate the chlorophyll-a concentration at Paldang Dam, South Korea, using hyperspectral data. We developed a framework for estimating chlorophyll-a that combines dimension reduction methods, such as principal component analysis and partial least squares, with various machine learning algorithms. We analyzed hyperspectral data collected during a field survey to locate peaks in the chlorophyll-a spectrum. The framework that used support vector regression achieved the highest R² of 0.99 and a mean square error (MSE) of 1.299 μg/cm³, and showed a smaller discrepancy between predicted and observed values than the other frameworks. These findings suggest that combining hyperspectral data with dimension reduction and a machine learning algorithm can provide an accurate estimate of chlorophyll-a. With the trained model, chlorophyll-a concentration can be estimated in real time from hyperspectral sensor data collected by drones or unmanned aerial vehicles.
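
The abstract compares frameworks by R² and MSE. As a minimal sketch of how these two scores are commonly computed, the Python below uses the standard definitions (MSE as the mean of squared residuals, R² as 1 minus the ratio of residual to total sum of squares). The function names and the sample concentrations are hypothetical illustrations, not values or code from the study, and the paper may have computed its metrics differently.

```python
def mean_squared_error(observed, predicted):
    """Mean of squared residuals between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared) / len(squared)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical chlorophyll-a concentrations (illustrative only, not study data).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 31.0, 39.0]

mse = mean_squared_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"MSE = {mse:.3f}, R^2 = {r2:.3f}")
```

An R² close to 1 together with a low MSE indicates that the predicted concentrations track the observed ones closely, which is the sense in which the frameworks are ranked.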

Keywords