Atmospheric Chemistry and Physics (Oct 2023)
Quantifying stratospheric ozone trends over 1984–2020: a comparison of ordinary and regularized multivariate regression models
Abstract
Accurate quantification of long-term trends in stratospheric ozone is challenging because the estimates are sensitive to natural variability, the quality of the observational datasets, non-linear changes in forcing processes, and the choice of statistical methodology. Multivariate linear regression (MLR) is the most commonly used tool for ozone trend analysis; however, the complex coupling in many atmospheric processes can make it prone to over-fitting when using the conventional ordinary-least-squares (OLS) approach. To overcome this issue, we adopt a regularized (ridge) regression method to estimate ozone trends and quantify the influence of individual processes. We use the Stratospheric Water and OzOne Satellite Homogenized (SWOOSH) merged dataset (v2.7) to derive stratospheric ozone profile trends for the period 1984–2020. Besides SWOOSH, we also analyse a machine-learning-based satellite-corrected gap-free global stratospheric ozone profile dataset from a chemical transport model (ML-TOMCAT) and output from a chemical transport model (TOMCAT) simulation forced with European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis. For 1984–1997, we observe smaller negative trends in the SWOOSH stratospheric ozone profile using ridge regression compared to OLS. Except for the tropical lower stratosphere, the largest differences arise in the mid-latitude lowermost stratosphere (>4 % per decade difference at 100 hPa). From 1998 and the onset of ozone recovery in the upper stratosphere, the positive trends estimated using the ridge regression model (∼1 % per decade near 2 hPa) are smaller than those using OLS (∼2 % per decade). In the lower stratosphere, post-1998 negative trends with large uncertainties are observed and ridge-based trend estimates are somewhat smaller and less variable in magnitude compared to the OLS regression.
Aside from the tropical lower stratosphere, the largest difference is around 2 % per decade at 100 hPa (with ∼3 % per decade uncertainties for individual trends) in northern mid-latitudes. For both time periods the SWOOSH data produce large negative trends in the tropical lower stratosphere with a correspondingly large difference between the two trend methods. In both cases the ridge method produces a smaller trend. The regression coefficients from both OLS and ridge models, which represent ozone variations associated with natural processes (e.g. the quasi-biennial oscillation, solar variability, El Niño–Southern Oscillation, Arctic Oscillation, Antarctic Oscillation, and Eliassen–Palm flux), highlight the dominance of dynamical processes in controlling lower-stratospheric ozone concentrations. Ridge regression generally yields smaller regression coefficients because the explanatory variables are correlated, and care must be exercised when comparing fit coefficients and their statistical significance across different regression methods. Comparing the ML-TOMCAT-based trend estimates with the ERA5-forced model simulation, we find that ML-TOMCAT is significantly improved, showing much better consistency with the SWOOSH dataset, despite the ML-TOMCAT training period overlapping with SWOOSH only for the Microwave Limb Sounder (MLS) measurement period. The largest inconsistencies with respect to SWOOSH-based trends post-1998 appear in the lower stratosphere where the ERA5-forced model simulation shows positive trends for both the tropics and the mid-latitudes. The large differences between satellite-based data and the ERA5-forced model simulation confirm significant uncertainties in ozone trend estimates, especially in the lower stratosphere, underscoring the need for caution when interpreting results obtained with different regression methods and datasets.
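The contrast between OLS and ridge regression described above can be illustrated with a minimal sketch on synthetic data. It is not the regression model used in this study: the single linear trend term, the two strongly correlated proxies (`proxy_a`, `proxy_b`), the noise level, and the penalty `lam=50.0` are all assumptions chosen for illustration. The sketch fits both models in closed form on standardized regressors and shows that the ridge penalty shrinks the coefficient vector relative to OLS.

```python
import numpy as np

def standardize(X):
    """Centre each column and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def fit_ols(X, y):
    """OLS on centred data, so the intercept drops out of the fit."""
    coef, *_ = np.linalg.lstsq(X, y - y.mean(), rcond=None)
    return coef

def fit_ridge(X, y, lam):
    """Closed-form ridge: solve (X^T X + lam * I) beta = X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ (y - y.mean()))

# Synthetic monthly series: 37 years (1984-2020) gives 444 months.
rng = np.random.default_rng(0)
n_months = 444
months = np.arange(n_months)

trend = months / 120.0                              # time in decades
proxy_a = np.sin(2 * np.pi * months / 28.0)         # hypothetical QBO-like cycle
proxy_b = proxy_a + 0.1 * rng.standard_normal(n_months)  # nearly collinear copy

X = standardize(np.column_stack([trend, proxy_a, proxy_b]))

# Hypothetical "ozone anomaly": a negative trend, one proxy signal, and noise.
y = -1.0 * X[:, 0] + 2.0 * X[:, 1] + rng.standard_normal(n_months)

beta_ols = fit_ols(X, y)
beta_ridge = fit_ridge(X, y, lam=50.0)

print("OLS coefficients:  ", beta_ols)
print("Ridge coefficients:", beta_ridge)
print("Coefficient norms (OLS, ridge):",
      np.linalg.norm(beta_ols), np.linalg.norm(beta_ridge))
```

Because `proxy_a` and `proxy_b` are nearly collinear, OLS can split the shared signal between them in an unstable way, whereas the penalty pulls the ridge coefficients toward smaller values. In practice the penalty would be chosen by a procedure such as cross-validation rather than fixed by hand, and coefficients from the two methods should not be compared as if they were on the same footing.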