Scientific Data (Sep 2023)

AmeriFlux BASE data pipeline to support network growth and data sharing

  • Housen Chu,
  • Danielle S. Christianson,
  • You-Wei Cheah,
  • Gilberto Pastorello,
  • Fianna O’Brien,
  • Joshua Geden,
  • Sy-Toan Ngo,
  • Rachel Hollowgrass,
  • Karla Leibowitz,
  • Norman F. Beekwilder,
  • Megha Sandesh,
  • Sigrid Dengel,
  • Stephen W. Chan,
  • André Santos,
  • Kyle Delwiche,
  • Koong Yi,
  • Christin Buechner,
  • Dennis Baldocchi,
  • Dario Papale,
  • Trevor F. Keenan,
  • Sébastien C. Biraud,
  • Deborah A. Agarwal,
  • Margaret S. Torn

DOI
https://doi.org/10.1038/s41597-023-02531-2
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 13

Abstract

AmeriFlux is a network of research sites that measure carbon, water, and energy fluxes between ecosystems and the atmosphere using the eddy covariance technique to study a variety of Earth science questions. AmeriFlux’s diversity of ecosystems, instruments, and data-processing routines creates challenges for data standardization, quality assurance, and sharing across the network. To address these challenges, the AmeriFlux Management Project (AMP) designed and implemented the BASE data-processing pipeline. The pipeline begins with data uploaded by the site teams, followed by the AMP team’s quality assurance and quality control (QA/QC), ingestion of site metadata, and publication of the BASE data product. The semi-automated pipeline enables us to keep pace with the rapid growth of the network. As of 2022, the AmeriFlux BASE data product contains 3,130 site years of data from 444 sites, with standardized units and variable names for more than 60 common variables, representing the largest long-term data repository for flux-met data in the world. The standardized, quality-ensured data product facilitates multisite comparisons, model evaluations, and data syntheses.