Modeling community standards for metadata as templates makes data FAIR

Mark A. Musen; Martin J. O’Connor; Erik Schultes; Marcos Martínez-Romero; Josef Hardi; John Graybeal

doi:10.1038/s41597-022-01815-3

Scientific Data (Nov 2022)

Affiliations

Mark A. Musen: Stanford Center for Biomedical Informatics Research, Stanford University
Martin J. O’Connor: Stanford Center for Biomedical Informatics Research, Stanford University
Erik Schultes: GO FAIR Foundation
Marcos Martínez-Romero: Stanford Center for Biomedical Informatics Research, Stanford University
Josef Hardi: Stanford Center for Biomedical Informatics Research, Stanford University
John Graybeal: Stanford Center for Biomedical Informatics Research, Stanford University

Journal volume & issue: Vol. 9, no. 1
pp. 1 – 15

Abstract

It is challenging to determine whether datasets are findable, accessible, interoperable, and reusable (FAIR) because the FAIR Guiding Principles refer to highly idiosyncratic criteria regarding the metadata used to annotate datasets. Specifically, the FAIR principles require metadata to be “rich” and to adhere to “domain-relevant” community standards. Scientific communities should be able to define their own machine-actionable templates for metadata that encode these “rich,” discipline-specific elements. We have explored this template-based approach in the context of two software systems. One system is the CEDAR Workbench, which investigators use to author new metadata. The other is the FAIRware Workbench, which evaluates the metadata of archived datasets for their adherence to community standards. Benefits accrue when templates for metadata become central elements in an ecosystem of tools to manage online datasets, both because the templates serve as a community reference for what constitutes FAIR data, and because they embody that perspective in a form that can be distributed among a variety of software applications to assist with data stewardship and data sharing.

Published in Scientific Data

ISSN: 2052-4463 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Science
Website: https://www.nature.com/sdata/
