Earth and Space Science (Aug 2021)

A Guide to Using GitHub for Developing and Versioning Data Standards and Reporting Formats

  • Robert Crystal‐Ornelas,
  • Charuleka Varadharajan,
  • Ben Bond‐Lamberty,
  • Kristin Boye,
  • Madison Burrus,
  • Shreyas Cholia,
  • Michael Crow,
  • Joan Damerow,
  • Ranjeet Devarakonda,
  • Kim S. Ely,
  • Amy Goldman,
  • Susan Heinz,
  • Valerie Hendrix,
  • Zarine Kakalia,
  • Stephanie C. Pennington,
  • Emily Robles,
  • Alistair Rogers,
  • Maegen Simmonds,
  • Terri Velliquette,
  • Helen Weierbach,
  • Pamela Weisenhorn,
  • Jessica N. Welch,
  • Deborah A. Agarwal

DOI
https://doi.org/10.1029/2021EA001797
Journal volume & issue
Vol. 8, no. 8

Abstract


Data standardization combined with descriptive metadata facilitates data reuse, which is the ultimate goal of the Findable, Accessible, Interoperable, and Reusable (FAIR) principles. Community data or metadata standards are increasingly created through an approach that emphasizes collaboration between various stakeholders. Such an approach requires platforms that support a collaborative development process centered on sharing information and receiving feedback. Our objective in this study was to conduct a systematic review to identify data standards and reporting formats that were developed using version control and to summarize common practices, particularly in Earth and environmental sciences. Out of 108 data standards and reporting formats identified in our review, 32 used GitHub as the version control platform, and no other platforms were used. We found no universally accepted methodology for developing and publishing data standards. Many GitHub repositories did not use key features that could help developers gather user feedback or create and revise standards that build on previous work. We provide guidance for community‐driven standard development and associated documentation on GitHub based on a systematic review of existing practices.

Keywords