PLoS ONE (Jan 2017)
A Random Forest approach to predict the spatial distribution of sediment pollution in an estuarine system.
Abstract
Modeling the magnitude and distribution of sediment-bound pollutants in estuaries is often limited by incomplete knowledge of the site and inadequate sample density. To address these modeling limitations, a decision-support tool framework was conceived that predicts sediment contamination from the sub-estuary to broader estuary extent. For this study, a Random Forest (RF) model was implemented to predict the distribution of a model contaminant, triclosan (5-chloro-2-(2,4-dichlorophenoxy)phenol) (TCS), in Narragansett Bay, Rhode Island, USA. TCS is an unregulated contaminant used in many personal care products. The RF explanatory variables were associated with TCS transport and fate (proxies) and direct and indirect environmental entry. The continuous RF TCS concentration predictions were discretized into three levels of contamination (low, medium, and high) for three different quantile thresholds. The RF model explained 63% of the variance with a minimum number of variables. Total organic carbon (TOC) (transport and fate proxy) was a strong predictor of TCS contamination causing a mean squared error increase of 59% when compared to permutations of randomized values of TOC. Additionally, combined sewer overflow discharge (environmental entry) and sand (transport and fate proxy) were strong predictors. The discretization models identified a TCS area of greatest concern in the northern reach of Narragansett Bay (Providence River sub-estuary), which was validated with independent test samples. This decision-support tool performed well at the sub-estuary extent and provided the means to identify areas of concern and prioritize bay-wide sampling.