Earth System Science Data (Feb 2023)

OceanSODA-MDB: a standardised surface ocean carbonate system dataset for model–data intercomparisons

  • P. E. Land,
  • H. S. Findlay,
  • J. D. Shutler,
  • J.-F. Piolle,
  • R. Sims,
  • H. Green,
  • V. Kitidis,
  • A. Polukhin,
  • I. I. Pipko

DOI
https://doi.org/10.5194/essd-15-921-2023
Journal volume & issue
Vol. 15
pp. 921 – 947

Abstract

In recent years, large datasets of in situ marine carbonate system parameters (partial pressure of CO2 (pCO2), total alkalinity, dissolved inorganic carbon and pH) have been collated, quality-controlled and made publicly available. These carbonate system datasets have highly variable data density in both space and time, especially in the case of pCO2, which is routinely measured at high frequency using underway measuring systems. This variation in data density can create biases when the data are used, for example, for algorithm assessment, favouring datasets or regions with high data density. A common way to overcome data density issues is to bin the data into cells of equal latitude and longitude extent. This leads to bins with spatial areas that are latitude- and projection-dependent (e.g. become smaller and more elongated as the poles are approached). Additionally, as bin boundaries are defined without reference to the spatial distribution of the data or to geographical features, data clusters may be divided sub-optimally (e.g. a bin covering a region with a strong gradient). To overcome these problems and to provide a tool for matching surface in situ data with satellite, model and climatological data, which often have very different spatiotemporal scales both from the in situ data and from each other, a methodology has been created to group in situ data into “regions of interest”: spatiotemporal cylinders consisting of circles on the Earth's surface extending over a period of time. These regions of interest are optimally adjusted to contain as many in situ measurements as possible. All surface in situ measurements of the same parameter contained in a region of interest are collated, including estimated uncertainties and regional summary statistics. The same grouping is applied to each of the non-in situ datasets in turn, producing a dataset of coincident matchups that are consistent in space and time. 
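The latitude dependence of equal-angle bins mentioned above can be illustrated with a short calculation. This is a minimal sketch assuming a spherical Earth with a mean radius of 6371 km; the function name `cell_area_km2` and the 1° cell size are illustrative choices, not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def cell_area_km2(lat_south_deg, dlat_deg=1.0, dlon_deg=1.0):
    """Area of a latitude-longitude cell on a sphere.

    Uses A = R^2 * dlon * (sin(lat_north) - sin(lat_south)),
    with angles converted to radians.
    """
    phi_south = math.radians(lat_south_deg)
    phi_north = math.radians(lat_south_deg + dlat_deg)
    dlon = math.radians(dlon_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(phi_north) - math.sin(phi_south))


# Compare a 1 x 1 degree cell at the equator with one starting at 80 N.
equator_area = cell_area_km2(0.0)
polar_area = cell_area_km2(80.0)
print(f"equator: {equator_area:.0f} km^2, 80N: {polar_area:.0f} km^2")
```

Under these assumptions the cell near 80° N covers several times less area than the equatorial cell, even though both span the same angular extent.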
About 35 million in situ data points were matched with data from five satellite sources and five model and reanalysis datasets to produce a global matchup dataset of carbonate system data, consisting of ∼286 000 regions of interest spanning 54 years from 1957 to 2020. Each region of interest is 100 km in diameter and 10 d in duration. An example application, the reparameterisation of a global total alkalinity algorithm, is presented. This matchup dataset can be updated as and when in situ and other datasets are updated, and similar datasets at finer spatiotemporal scale can be constructed, for example, to enable regional studies. The matchup dataset provides users with a large multi-parameter carbonate system dataset containing data from different sources, in one consistent, collated and standardised format suitable for model–data intercomparisons and model evaluations. The OceanSODA-MDB data can be downloaded from https://doi.org/10.12770/0dc16d62-05f6-4bbe-9dc4-6d47825a5931 (Land and Piollé, 2022).
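The idea of a region of interest as a spatiotemporal cylinder can be sketched as a simple membership test. The code below is an illustration only, not the authors' implementation: the class `RegionOfInterest`, the helper names, the spherical-Earth haversine distance, and the choice to centre the 10 d window on a central time are all assumptions made for the example. Only the 100 km diameter and 10 d duration come from the abstract.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


@dataclass(frozen=True)
class RegionOfInterest:
    """Hypothetical container for one spatiotemporal cylinder."""
    centre_lat: float
    centre_lon: float
    centre_time: datetime
    diameter_km: float = 100.0                 # from the abstract
    duration: timedelta = timedelta(days=10)   # from the abstract


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def contains(roi, lat, lon, time):
    """True if a measurement lies inside the cylinder in space and time."""
    distance = great_circle_km(roi.centre_lat, roi.centre_lon, lat, lon)
    in_circle = distance <= roi.diameter_km / 2
    # Assumption: the time window is centred on roi.centre_time.
    in_window = abs(time - roi.centre_time) <= roi.duration / 2
    return in_circle and in_window


roi = RegionOfInterest(50.0, -4.0, datetime(2015, 6, 15))
print(contains(roi, 50.3, -4.0, datetime(2015, 6, 17)))  # near centre, 2 d later
print(contains(roi, 50.6, -4.0, datetime(2015, 6, 17)))  # beyond the 50 km radius
```

Applying the same test to satellite or model values, using their own coordinates and timestamps, would give matchups grouped in the same way as the in situ data, which is the consistency the abstract describes.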