Applied Sciences (Dec 2021)
A Transfer Learning Technique for Inland Chlorophyll-a Concentration Estimation Using Sentinel-3 Imagery
Abstract
Chlorophyll-a (Chla) concentration, which serves as a proxy for phytoplankton in inland waters, is one of the leading indicators of water quality. Typically, water samples are analyzed in professional laboratories, and Chla concentrations are measured regularly for water quality monitoring. However, the limited spatial coverage of water sampling and the labor-intensive nature of data collection make global and long-term monitoring difficult. Advances in remote-sensing optical sensors and technologies make long-term monitoring of Chla concentrations over an entire water body more achievable. Many studies based on machine learning techniques, such as regression and artificial neural network (ANN) methods, have recently been proposed for estimating Chla concentration from optical satellite images. These methods can achieve accurate estimation. However, overfitting may arise because the in situ Chla dataset is generally too small to train a complicated machine learning model, which limits the applicability of the trained models. In this study, an ANN model containing three convolutional and two fully connected layers with 4953 trainable parameters is designed. A transfer learning method, consisting of model pretraining, main-training, and fine-tuning stages, is proposed to ease the problem of insufficient in situ samples. In the model pretraining stage, the ANN model is pretrained and initialized using samples derived from an existing Chla concentration model. The pretrained ANN model is then fine-tuned using the proposed transfer learning technique with in situ samples collected from Laguna Lake, the Philippines, in five campaigns carried out during early 2019. Before transfer learning, data augmentation and rebalancing are applied to the in situ samples to enrich their variability and to distribute them near-uniformly in Chla concentration space, respectively.
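The rebalancing idea mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes equal-width bins over the Chla range and random oversampling with replacement, and the function name `rebalance_by_bins`, the `n_bins` parameter, and the `(features, chla)` sample layout are hypothetical choices made for illustration only.

```python
import random


def rebalance_by_bins(samples, n_bins=5, seed=0):
    """Oversample (features, chla) pairs so every occupied Chla bin has equal size.

    Assumption: equal-width bins and oversampling with replacement; the paper's
    actual rebalancing scheme is not specified in the abstract.
    """
    rng = random.Random(seed)
    values = [chla for _, chla in samples]
    lo, hi = min(values), max(values)
    # Fall back to a unit width when all Chla values are identical.
    width = (hi - lo) / n_bins or 1.0

    # Group samples by the Chla bin they fall into.
    bins = {}
    for sample in samples:
        idx = min(int((sample[1] - lo) / width), n_bins - 1)
        bins.setdefault(idx, []).append(sample)

    # Bring every occupied bin up to the size of the largest bin.
    target = max(len(members) for members in bins.values())
    balanced = []
    for idx in sorted(bins):
        members = bins[idx]
        balanced.extend(members)
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced
```

Empty bins stay empty in this sketch, so the result is only near-uniform over the concentration ranges that were actually sampled.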
To evaluate how well overfitting was alleviated, the ANN model trained on the in situ dataset from Laguna Lake was tested on an in situ dataset obtained in 2019 from Lake Victoria, Uganda, which has a trophic state similar to that of Laguna Lake. The experimental results from Sentinel-3 imagery indicated that the overfitting problem was significantly alleviated and that the trained ANN model outperformed related models in terms of the root-mean-squared error of the estimated Chla concentrations.
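For reference, the root-mean-squared error used as the comparison metric can be computed as in the following sketch. The function name `rmse` and its argument names are illustrative assumptions; the inputs stand for model-estimated and in situ Chla concentrations in the same units.

```python
import math


def rmse(estimated, measured):
    """Root-mean-squared error between estimated and measured Chla values."""
    if not estimated or len(estimated) != len(measured):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((e - m) ** 2 for e, m in zip(estimated, measured))
    return math.sqrt(sum(squared_errors) / len(estimated))
```

A lower value on a held-out lake, rather than on the training lake, is what indicates reduced overfitting in the evaluation described above.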
Keywords