Atmosphere (Feb 2025)

Random Forest-Based Retrieval of XCO<sub>2</sub> Concentration from Satellite-Borne Shortwave Infrared Hyperspectral

  • Wenhao Zhang,
  • Zhengyong Wang,
  • Tong Li,
  • Bo Li,
  • Yao Li,
  • Zhihua Han

DOI
https://doi.org/10.3390/atmos16030238
Journal volume & issue
Vol. 16, no. 3
p. 238

Abstract

As carbon dioxide (CO<sub>2</sub>) concentrations continue to rise, climate change, characterized by global warming, presents a significant challenge to global sustainable development. Currently, most global shortwave infrared CO<sub>2</sub> retrievals rely on fully physical retrieval algorithms, which require complex calculations. This paper proposes a method to predict the column-averaged CO<sub>2</sub> concentration (XCO<sub>2</sub>) from shortwave infrared hyperspectral satellite data, using machine learning to avoid the iterative computations of the physical method. The training dataset is constructed using the Orbiting Carbon Observatory-2 (OCO-2) spectral data, XCO<sub>2</sub> retrievals from OCO-2, surface albedo data, and aerosol optical depth (AOD) measurements for 2019. This study employs several machine learning algorithms, including Random Forest, XGBoost, and LightGBM. The results show that Random Forest outperforms the other models, achieving a correlation of 0.933 with satellite products, a mean absolute error (MAE) of 0.713 ppm, and a root mean square error (RMSE) of 1.147 ppm. This model was then applied to retrieve CO<sub>2</sub> column concentrations for 2020. The results showed a correlation of 0.760 with Total Carbon Column Observing Network (TCCON) measurements, which is higher than the correlation of 0.739 with satellite product data, verifying the effectiveness of the retrieval method.
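The three scores quoted in the abstract (correlation, MAE, RMSE) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `evaluation_metrics` is hypothetical, the sample values are invented for illustration, and the correlation is assumed to be the Pearson coefficient, which the abstract does not state.

```python
import math


def evaluation_metrics(predicted, reference):
    """Return (pearson_r, mae, rmse) for paired XCO2 values in ppm.

    Assumption: "correlation" means the Pearson coefficient.
    """
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(predicted)
    errors = [p - r for p, r in zip(predicted, reference)]

    # Mean absolute error and root mean square error of the differences.
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation from the centered sums of products and squares.
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant input")
    pearson_r = cov / math.sqrt(var_p * var_r)

    return pearson_r, mae, rmse


# Invented example values (ppm), not data from the paper.
reference = [410.0, 411.0, 412.0, 413.0]
predicted = [410.5, 410.5, 412.5, 412.5]
r, mae, rmse = evaluation_metrics(predicted, reference)
print(f"r={r:.3f}  MAE={mae:.3f} ppm  RMSE={rmse:.3f} ppm")
```

Because RMSE squares each difference before averaging, it is never smaller than MAE and grows faster when a few retrievals are far off, which is consistent with the reported RMSE (1.147 ppm) exceeding the MAE (0.713 ppm).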

Keywords