Ecological Informatics (May 2025)
Integrating direct observation and environmental DNA data to enhance species distribution models in riverine environments
Abstract
Recent advances in theoretical and modeling approaches (species distribution models) and in molecular techniques (environmental DNA, eDNA) offer new opportunities to improve the assessment of biodiversity. This is particularly true for riverine environments, whose biodiversity is disproportionately imperiled, and whose dendritic connectivity allows a spatial interpretation of eDNA samples, which reflect a biodiversity signal averaged over a certain upstream area. In contrast, traditional direct observation surveys provide localized information on taxon density. Here, I propose a framework that leverages both data types to improve estimates of a taxon’s spatial distribution. Specifically, I expand the eDITH model (which estimates the spatial distribution of taxa from spatially replicated stream eDNA data) to include direct observations, and upgrade the eDITH R package to enable broad implementation of this method. Moreover, I propose optimized sampling strategies for both eDNA and direct sampling, with algorithms (included in the upgraded eDITH package) that mathematically translate rule-of-thumb criteria for maximizing the spatial coverage of sampling sites in a riverscape, based on the specific features of each data type. Finally, I test this framework by means of an in-silico experiment, in which I show that optimized sampling strategies outperform random strategies in reconstructing a taxon’s spatial distribution. When eDNA and direct sampling sites are spatially arranged in an optimized fashion, the highest prediction skill for a fixed total number of sampling sites is reached when both data types are included in the model fitting. The optimal trade-off between eDNA and direct sampling observations depends on both the characteristics of the investigated taxon (e.g., the spatial heterogeneity of its distribution) and the level of uncertainty in the observed data.
These results will contribute to designing efficient strategies for integrated biomonitoring in river networks.