Research Ideas and Outcomes (Mar 2024)

A lab-centric, workflow-based data management system for environmental DNA research

  • Alex Borisenko,
  • Robert Young,
  • Robert Hanner

DOI
https://doi.org/10.3897/rio.10.e120483
Journal volume & issue
Vol. 10
pp. 1 – 39

Abstract

The adoption of environmental DNA (eDNA) approaches as a standard tool for biodiversity monitoring is leading to an increase in the number of eDNA-based species occurrence records; however, considerable disparity remains in the nature and quality of associated information, much of it unpublished and/or poorly parametrised. A robust system for tracking biological materials from their point of origin through laboratory analyses is required to connect inferred taxon occurrences with analytical history and provenance data. The bulk of eDNA research is currently driven by small-scale operations where the tasks of digitisation, organisation and cross-referencing of field records with laboratory analytical data and biomaterial sample location are often performed manually and in a disconnected manner. We present an integrative, full-stack data management solution that provides a structured ontological concept, a minimalist data schema for eDNA research and a software application prototype designed to facilitate real-time digitisation, parsing, annotation and archival of eDNA data. The system tracks the provenance and analytical history of biological samples through a structured hierarchy of events, linked with associated digital file attachment archives, such as images and raw sequence files, and with inferred taxonomic occurrence records. The data entry process is compartmentalised and incorporated into the corresponding stages of standard operations used in fieldwork, biological collection management and laboratory analysis. Resulting data records can be integrated into various output formats required for large-scale analytics, publication and/or submission to global data aggregators. The prototype is implemented on the Microsoft 365 platform as a relational database (Access) linked to cloud-based data tables (SharePoint) and a set of associated data conversion spreadsheets (Excel).
The system is designed primarily around the data management needs of small research labs; however, it is scalable to larger institutions and inter-institutional academic networks.
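
As an illustration only, the event hierarchy described in the abstract might be sketched as follows. This is not the authors' schema: the class names, field names, event types, file names and the example taxon are all hypothetical, and the actual prototype is built on Access, SharePoint and Excel rather than Python. The sketch only shows how an inferred occurrence record could be traced back through linked events to its point of origin.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    """One step in a sample's history (hypothetical field names)."""
    event_id: str
    event_type: str                     # e.g. "field_collection", "dna_extraction"
    parent: Optional[Event] = None      # the event this one derives from
    attachments: list[str] = field(default_factory=list)  # file names only


@dataclass
class Occurrence:
    """An inferred taxon occurrence linked to the event that produced it."""
    taxon: str
    source_event: Event


def provenance(occurrence: Occurrence) -> list[Event]:
    """Follow parent links back to the origin; return events origin-first."""
    chain: list[Event] = []
    event: Optional[Event] = occurrence.source_event
    while event is not None:
        chain.append(event)
        event = event.parent
    chain.reverse()
    return chain


# Illustrative records; none of these values come from the paper.
field_event = Event("EV1", "field_collection", attachments=["site_photo.jpg"])
extraction = Event("EV2", "dna_extraction", parent=field_event)
sequencing = Event("EV3", "sequencing", parent=extraction,
                   attachments=["run01.fastq"])
record = Occurrence("Salvelinus fontinalis", sequencing)

for step in provenance(record):
    print(step.event_id, step.event_type, step.attachments)
```

Storing a single parent link per event keeps the structure close to a self-referencing table in a relational database, which is one plausible way such a hierarchy could be held; the paper itself should be consulted for the actual design.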

Keywords