Biogeosciences (Apr 2023)

Using machine learning and Biogeochemical-Argo (BGC-Argo) floats to assess biogeochemical models and optimize observing system design

  • A. Mignot,
  • H. Claustre,
  • G. Cossarini,
  • F. D'Ortenzio,
  • E. Gutknecht,
  • J. Lamouroux,
  • P. Lazzari,
  • C. Perruche,
  • S. Salon,
  • R. Sauzède,
  • V. Taillandier,
  • A. Teruzzi

DOI
https://doi.org/10.5194/bg-20-1405-2023
Journal volume & issue
Vol. 20
pp. 1405 – 1422

Abstract

Numerical models of ocean biogeochemistry are becoming the major tools used to detect and predict the impact of climate change on marine resources and to monitor ocean health. However, with the continuous improvement of model structure and spatial resolution, incorporation of these additional degrees of freedom into fidelity assessment has become increasingly challenging. Here, we propose a new method to provide information on the model predictive skill in a concise way. The method is based on the conjoint use of a k-means clustering technique, assessment metrics, and Biogeochemical-Argo (BGC-Argo) observations. The k-means algorithm and the assessment metrics reduce the number of model data points to be evaluated. The metrics evaluate either the model state accuracy or the skill of the model with respect to capturing emergent properties, such as deep chlorophyll maxima and oxygen minimum zones. The use of BGC-Argo observations as the sole evaluation data set ensures the accuracy of the data, as it is a homogeneous data set with strict sampling methodologies and data quality control procedures. The method is applied to the Global Ocean Biogeochemistry Analysis and Forecast system of the Copernicus Marine Service. The model performance is evaluated using the model efficiency statistical score, which compares the model–observation misfit with the variability in the observations and, thus, objectively quantifies whether the model outperforms the BGC-Argo climatology. We show that, overall, the model surpasses the BGC-Argo climatology in predicting pH, dissolved inorganic carbon, alkalinity, oxygen, nitrate, and phosphate in the mesopelagic and the mixed layers as well as silicate in the mesopelagic layer. However, there are still areas for improvement with respect to reducing the model–data misfit for certain variables such as silicate, pH, and the partial pressure of CO2 in the mixed layer as well as chlorophyll-a-related, oxygen-minimum-zone-related, and particulate-organic-carbon-related metrics. The method proposed here can also aid in refining the design of the BGC-Argo network, in particular regarding the regions in which BGC-Argo observations should be enhanced to improve the model accuracy via the assimilation of BGC-Argo data or process-oriented assessment studies. We strongly recommend increasing the number of observations in the Arctic region while maintaining the existing high density of observations in the Southern Ocean. The model error in these regions is only slightly less than the variability observed in BGC-Argo measurements. Our study illustrates how the synergic use of modeling and BGC-Argo data can both provide information about the performance of models and improve the design of observing systems.
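
The abstract describes the model efficiency score only in words: the model–observation misfit is compared with the variability in the observations. A minimal sketch of one way to compute such a score follows. It assumes the common Nash-Sutcliffe form (one minus the ratio of squared misfit to squared deviation from the observation mean), which the abstract does not spell out, and it uses the sample mean of the observations as a stand-in for a climatology. The function names and the cluster labels are hypothetical and are not taken from the paper; the labels simply stand for whatever regional grouping a k-means step would produce.

```python
from collections import defaultdict
from statistics import fmean


def model_efficiency(obs, model):
    """Return 1 - misfit / variability (Nash-Sutcliffe form, assumed here).

    A value above 0 means the model beats the observation mean;
    a value of 1 means a perfect match.
    """
    if len(obs) != len(model) or len(obs) < 2:
        raise ValueError("need two or more paired values")
    obs_mean = fmean(obs)
    # Sum of squared model-observation differences.
    misfit = sum((m - o) ** 2 for o, m in zip(obs, model))
    # Sum of squared deviations of the observations from their mean.
    variability = sum((o - obs_mean) ** 2 for o in obs)
    if variability == 0:
        raise ValueError("observations have no variability")
    return 1.0 - misfit / variability


def efficiency_by_cluster(labels, obs, model):
    """Score each group of paired values sharing a (hypothetical) cluster label."""
    groups = defaultdict(lambda: ([], []))
    for label, o, m in zip(labels, obs, model):
        groups[label][0].append(o)
        groups[label][1].append(m)
    return {label: model_efficiency(o, m) for label, (o, m) in groups.items()}
```

Under this assumed form, a score close to zero corresponds to the situation the abstract reports for the Arctic and Southern Ocean regions, where the model error is only slightly less than the observed variability.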