Ecological Informatics (Nov 2024)

Traits.build: A data model, workflow and R package for building harmonised ecological trait databases

  • Elizabeth Wenk
  • Payal Bal
  • David Coleman
  • Rachael Gallagher
  • Sophie Yang
  • Daniel Falster

Journal volume & issue
Vol. 83
p. 102773

Abstract

Trait databases have proliferated over the past decades, facilitating research on the ecology, evolution, and conservation of taxa across the Tree of Life. Typically, teams of independent researchers build these databases, and each team must develop its own workflow and output structure. This diverts research hours from downstream tasks such as trait-based analysis and interpretation, and the resultant datasets are often difficult to integrate due to disparate database structures. Here we introduce the {traits.build} R package, which offers a generalised workflow for building trait databases. {traits.build} contains bespoke functions for propagating metadata files, extensive tutorials, and sample configuration files, allowing researchers to efficiently build a new trait database using open-source tools. In addition, the {traits.build} output structure is fully documented by a data model, ensuring that the meaning of each variable and the semantic relationships between variables are transparent and consistent. The data standard links to terms in previously published data standards, drawing strongly on DarwinCore and the Ecological Trait-data Standard, but also includes the ability to fully map location and context properties absent from these vocabularies. It is the first published database-building workflow that adheres to the Extensible Observation Ontology. Simultaneously developing a generalised workflow and publishing a data standard for that workflow gives {traits.build} users a straightforward pathway to build a new trait database that meets the FAIR principles. The meaning of every variable in a {traits.build} database is already documented, allowing further integration with other {traits.build} databases or with any other database that has a documented data model. This follows the vision of the Open Traits Network to build trait databases whose data can be easily integrated for further analysis.

Keywords