PLoS Computational Biology (Jan 2023)

Ten simple rules for using public biological data for your research.

  • Vishal H Oza,
  • Jordan H Whitlock,
  • Elizabeth J Wilk,
  • Angelina Uno-Antonison,
  • Brandon Wilk,
  • Manavalan Gajapathy,
  • Timothy C Howton,
  • Austyn Trull,
  • Lara Ianov,
  • Elizabeth A Worthey,
  • Brittany N Lasseigne

DOI
https://doi.org/10.1371/journal.pcbi.1010749
Journal volume & issue
Vol. 19, no. 1
p. e1010749

Abstract

With an increasing amount of biological data available publicly, there is a need for a guide on how to successfully download and use this data. The 10 simple rules for using public biological data are: (1) use public data purposefully in your research; (2) evaluate data for your use case; (3) check data reuse requirements and embargoes; (4) be aware of ethics for data reuse; (5) plan for data storage and compute requirements; (6) know what you are downloading; (7) download programmatically and verify integrity; (8) properly cite data; (9) make reprocessed data and models Findable, Accessible, Interoperable, and Reusable (FAIR) and share; and (10) make pipelines and code FAIR and share. These rules are intended as a guide for researchers wanting to make use of available data and to increase data reuse and reproducibility.
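
As an illustration of rule 7 (download programmatically and verify integrity), the following is a minimal Python sketch, not the authors' own method. It assumes the data provider publishes a SHA-256 checksum alongside each file. The function names `file_checksum` and `download_and_verify` are hypothetical helpers introduced here for illustration, and the URL and expected checksum would come from the repository being used.

```python
import hashlib
import urllib.request
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_and_verify(url, destination, expected_checksum, algorithm="sha256"):
    """Download `url` to `destination`; raise ValueError if the checksum differs.

    Hypothetical helper: `expected_checksum` is assumed to be the hex digest
    published by the data provider for this exact file.
    """
    destination = Path(destination)
    urllib.request.urlretrieve(url, str(destination))
    observed = file_checksum(destination, algorithm)
    if observed.lower() != expected_checksum.strip().lower():
        destination.unlink()  # do not leave a possibly corrupt file behind
        raise ValueError(
            f"Checksum mismatch for {destination.name}: "
            f"expected {expected_checksum}, got {observed}"
        )
    return destination
```

Recording the URL, the download date, and the verified checksum next to the file also supports rule 8 (properly cite data), since it documents exactly which version of a dataset was used.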