Ocean Science (Jul 2020)
An approach to the verification of high-resolution ocean models using spatial methods
Abstract
The Met Office currently runs two operational ocean forecasting configurations for the North West European Shelf: an eddy-permitting model with a resolution of 7 km (AMM7) and an eddy-resolving model at 1.5 km (AMM15). Whilst qualitative assessments have demonstrated the benefits brought by the increased resolution of AMM15, particularly in the ability to resolve finer-scale features, it has been difficult to show this quantitatively, especially in forecast mode. Applications of typical assessment metrics such as the root mean square error have been inconclusive, because the high-resolution model tends to be penalised more severely as a result of the double-penalty effect. This effect occurs in point-to-point comparisons, whereby features that are correctly forecast but misplaced with respect to the observations are penalised twice: once for not occurring at the observed location, and again for occurring at the forecast location, where they have not been observed. An exploratory assessment of sea surface temperature (SST) has been made at in situ observation locations using a single-observation neighbourhood-forecast (SO-NF) spatial verification method known as the High-Resolution Assessment (HiRA) framework. The primary focus of the assessment was to capture the aspects of methodology that are important to consider when applying the HiRA framework. Forecast grid points within neighbourhoods centred on the observing location are treated as pseudo-ensemble members, so that typical ensemble and probabilistic forecast verification metrics such as the continuous ranked probability score (CRPS) can be used. It is found that, through the application of HiRA, it is possible to identify improvements in the higher-resolution model which were not apparent using typical grid-scale assessments.
This work suggests that future comparative assessments of ocean models with different resolutions would benefit from using HiRA as part of the evaluation process, as it gives a more equitable and appropriate reflection of model performance at higher resolutions.
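The neighbourhood idea described above can be illustrated with a minimal sketch. It is not the HiRA implementation: the square neighbourhood, the toy 5 x 5 SST field, the function names `neighbourhood_members` and `crps_ensemble`, and the choice of the standard empirical (energy-form) CRPS estimator are all assumptions made for illustration. The abstract does not specify neighbourhood shapes, the treatment of land or missing points, or the exact estimator used.

```python
def neighbourhood_members(field, row, col, half_width):
    """Collect forecast values in a square neighbourhood centred on (row, col).

    half_width=0 gives the single nearest grid point; half_width=1 gives 3x3.
    The window is clipped at the domain edges (an assumption of this sketch).
    """
    rows = range(max(row - half_width, 0), min(row + half_width + 1, len(field)))
    cols = range(max(col - half_width, 0), min(col + half_width + 1, len(field[0])))
    return [field[r][c] for r in rows for c in cols]


def crps_ensemble(members, obs):
    """Empirical CRPS: mean|x_i - y| - 0.5 * mean|x_i - x_j| (lower is better)."""
    n = len(members)
    accuracy = sum(abs(x - obs) for x in members) / n
    spread = sum(abs(x - y) for x in members for y in members) / (2 * n * n)
    return accuracy - spread


# Toy forecast SST field (deg C): uniform 10.0 with one warm feature of 12.0
# displaced by one grid point from where it was observed.
field = [[10.0] * 5 for _ in range(5)]
field[2][3] = 12.0

obs_row, obs_col, obs_value = 2, 2, 12.0  # hypothetical in situ observation

# Single grid point: CRPS reduces to the absolute error, so the misplaced
# feature is fully penalised.
crps_point = crps_ensemble(neighbourhood_members(field, obs_row, obs_col, 0), obs_value)

# 3x3 neighbourhood: the displaced feature is now one of nine pseudo-ensemble
# members, so the forecast receives partial credit.
crps_3x3 = crps_ensemble(neighbourhood_members(field, obs_row, obs_col, 1), obs_value)

print(f"CRPS, nearest grid point: {crps_point:.3f}")
print(f"CRPS, 3x3 neighbourhood:  {crps_3x3:.3f}")
```

With a single member the CRPS equals the absolute error (2.0 here), whereas the 3 x 3 neighbourhood scores 128/81, roughly 1.58, because the displaced warm feature falls inside the neighbourhood. When comparing models of different resolution, neighbourhoods would presumably need to be matched in physical extent rather than in grid-point count; that choice is left out of this sketch.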