Frontiers in Bioengineering and Biotechnology (Apr 2019)

Defending Our Public Biological Databases as a Global Critical Infrastructure

  • Jacob Caswell
  • Jason D. Gans
  • Nicholas Generous
  • Corey M. Hudson
  • Eric Merkley
  • Curtis Johnson
  • Christopher Oehmen
  • Kristin Omberg
  • Emilie Purvine
  • Karen Taylor
  • Christina L. Ting
  • Murray Wolinsky
  • Gary Xie

DOI
https://doi.org/10.3389/fbioe.2019.00058
Journal volume & issue
Vol. 7

Abstract

Progress in modern biology is being driven, in part, by the large amounts of freely available data in public resources such as the International Nucleotide Sequence Database Collaboration (INSDC), the world's primary database of biological sequence (and related) information. INSDC and similar databases have dramatically increased the pace of fundamental biological discovery and enabled a host of innovative therapeutic, diagnostic, and forensic applications. However, as high-value, openly shared resources with a high degree of assumed trust, these repositories bear compelling similarities to the Internet in its early days. Consequently, as public biological databases continue to increase in size and importance, we expect that they will face the same threats as undefended cyberspace. There is a unique opportunity, before a significant breach and loss of trust occurs, to ensure that they evolve with quality and security as a design philosophy rather than relying on costly “retrofitted” mitigations. This Perspective surveys some potential quality assurance and security weaknesses in existing open genomic and proteomic repositories, describes methods to mitigate the likelihood of both intentional and unintentional errors, and offers recommendations for risk mitigation based on lessons learned from cybersecurity.

Keywords