Ecosphere (Mar 2016)
Modeling lake trophic state: a random forest approach
Abstract
Productivity of lentic ecosystems is well studied, and it is widely accepted that as nutrient inputs increase, productivity increases and lakes transition from lower trophic states (e.g., oligotrophic) to higher trophic states (e.g., eutrophic). These broad trophic state classifications are good predictors of ecosystem condition, services (e.g., recreation and esthetics), and disservices (e.g., harmful algal blooms). While the relationship between nutrients and trophic state provides reliable predictions, it requires in situ water quality data to parameterize the model. This limits the application of these models to lakes with existing and, more importantly, available water quality data. To address this, we take advantage of the availability of a large national lakes water quality database (i.e., the National Lakes Assessment), land‐use/land‐cover data, lake morphometry data, and other universally available data, and we apply data‐mining approaches to predict trophic state. Using these data and random forests, we first model chlorophyll a and then classify the resultant predictions into trophic states. The full model estimates chlorophyll a with both in situ and universally available data. The mean‐squared error and adjusted R² of this model were 0.09 and 0.8, respectively. The second model uses universally available GIS data only. Its mean‐squared error was 0.22, and its adjusted R² was 0.48. The Kappa coefficients of the trophic state classifications derived from the chlorophyll a predictions were 0.57 for the full model and 0.29 for the “GIS‐only” model. Random forests extend the usefulness of the class predictions by providing prediction probabilities for each lake. This allows us to make trophic state predictions and also indicate the level of uncertainty around those predictions. For the full model, these predicted class probabilities ranged from 0.42 to 1. For the GIS‐only model, they ranged from 0.33 to 0.96.
We conclude that in situ data are required for better predictions, yet GIS and universally available data provide trophic state predictions, with estimated uncertainty, that still have the potential for a broad array of applications. The source code and data for this manuscript are available from https://github.com/USEPA/LakeTrophicModelling.
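As a rough illustration of the classification and agreement step described above, the following Python sketch bins chlorophyll a values into trophic states and computes Cohen's Kappa between observed and predicted classes. It is not the authors' code (see the repository linked above for that). The cut-offs of 2, 7, and 30 µg/L, the function names, and the sample values are assumptions made for illustration only; none of them are given in the abstract.

```python
from collections import Counter

# Assumed chlorophyll a upper bounds (ug/L) for each class; illustrative only,
# these cut-offs are not stated in the abstract.
CUTOFFS = [(2.0, "oligotrophic"), (7.0, "mesotrophic"), (30.0, "eutrophic")]


def trophic_state(chla):
    """Map a chlorophyll a concentration (ug/L) to a trophic state label."""
    for upper, label in CUTOFFS:
        if chla <= upper:
            return label
    return "hypereutrophic"


def cohen_kappa(observed, predicted):
    """Cohen's Kappa: agreement between two label lists, corrected for chance."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("label lists must be non-empty and of equal length")
    # Observed agreement: share of lakes assigned the same class in both lists.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    # Chance agreement: product of the marginal class frequencies, summed.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: only one class present in both lists
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical measured and model-predicted chlorophyll a values (ug/L).
measured = [1.2, 3.5, 6.0, 12.0, 45.0, 8.0]
predicted = [1.8, 2.5, 9.0, 15.0, 28.0, 7.5]

observed_classes = [trophic_state(v) for v in measured]
predicted_classes = [trophic_state(v) for v in predicted]
kappa = cohen_kappa(observed_classes, predicted_classes)
print(round(kappa, 2))  # 0.52 for these made-up values
```

In this toy example four of six lakes receive the same class, but the Kappa of 0.52 is lower than the raw agreement of 0.67 because part of that agreement is expected by chance.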
Keywords