Ecological Informatics (Mar 2025)
Effect of phosphorus fractions on benthic chlorophyll-a: Insight from the machine learning models
Abstract
The relationship between total phosphorus (TP) and benthic chlorophyll-a (chl-a), a vital indicator of algal biomass in freshwater ecosystems, has been well established since the 1960s. Various machine learning models have been used to predict benthic chl-a from TP and dissolved P (DP); however, to the best of our knowledge, colloidal P (CP) and particulate P (PP) have never been used in predictive models for benthic chl-a. To address this gap, we applied two machine learning algorithms, random forest (RF) and artificial neural networks (ANN), to predict benthic chl-a concentrations by incorporating these specific P fractions as separate variables. Additionally, support vector regression (SVR) was used to predict chl-a concentrations across the upstream, midstream, and downstream sections. A total of 125 freshwater samples were collected from these sections of the Thousand Island Lake (TIL) watershed for analysis. The RF model (R² = 0.88, RMSE = 2.20) outperformed the ANN (R² = 0.37, RMSE = 4.78). SHapley Additive exPlanations (SHAP) were used to interpret the RF model, revealing CP as the most influential predictor of benthic chl-a levels. Lower concentrations of PP and DP significantly contributed to benthic chl-a predictions, suggesting possibly rapid biogeochemical transformations among these P fractions. The SVR analysis demonstrated the dominance of DP upstream and CP downstream, while PP was more influential in the midstream areas. This study highlights the differential impacts of P fractions on benthic chl-a and offers new insights into aquatic health assessments in the TIL watershed.
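
For readers comparing the two reported scores, the following is a minimal sketch of how the coefficient of determination (R²) and root mean square error (RMSE) are conventionally computed from observed and predicted values. The abstract does not state the exact formulas or software the authors used, so the function names and the sample values here are hypothetical illustrations, not data from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared gaps between observations and predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared gaps between observations and their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observations."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse)


# Hypothetical chl-a values for illustration only (not from the study).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(round(r_squared(observed, predicted), 2))  # 0.95
print(round(rmse(observed, predicted), 2))       # 0.5
```

Under these definitions, a higher R² and a lower RMSE both indicate closer agreement between predictions and observations, which is the sense in which the RF scores above are better than the ANN scores.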