Remote Sensing (May 2024)
Hybrid Machine Learning and Geostatistical Methods for Gap Filling and Predicting Solar-Induced Fluorescence Values
Abstract
Sun-induced chlorophyll fluorescence (SIF) has proven advantageous for estimating gross primary production, although the relationship between the two is not stable. Satellite-based Level 2 SIF measurements offer comprehensive global coverage and are available in near real time. However, these measurements are often spatially and temporally sparse and discontinuous. These limitations arise primarily from incomplete satellite trajectories. Additionally, variability in cloud cover and periodic instrument-specific issues can compromise data quality. Two families of methods have been developed to address data discontinuity: (1) machine learning (ML)-based gap-filling techniques and (2) geostatistical techniques (various forms of kriging). The former exploit the relationships between ancillary data and SIF, while the latter usually rely on the available SIF records and their covariance structure to provide estimates at unsampled locations. In this study, we develop a SIF gap-filling approach that hybridizes the two families under the umbrella of kriging with external drift. We performed leave-one-out cross-validation on the OCO-2 SIF retrieval aggregates for the entire year of 2019, comparing three methods: ordinary kriging, ML-based estimation using ancillary data, and kriging with external drift. The mean absolute error (MAE) for ML, ordinary kriging, and the hybrid approach was 0.1399, 0.1318, and 0.1183 mW m−2 sr−1 nm−1, respectively. We demonstrate that the performance of the hybrid approach exceeds that of both parent techniques because it incorporates information from multiple sources. This use of multiple datasets enriches the hybrid model, making it more robust and accurate in handling the spatio-temporal variability and discontinuity of SIF data.
The developed framework is portable and can be applied to SIF retrievals at various resolutions and from various sources (satellites), as well as extended to other satellite-measured variables.
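To make the comparison described above concrete, the following is a minimal sketch of ordinary kriging, kriging with external drift, and a leave-one-out MAE, written with NumPy only. It is not the authors' implementation. It assumes an exponential covariance model with fixed, hand-picked parameters (sill, corr_range, nugget), and it assumes that the drift covariate stands in for an ML-based SIF estimate. All function and parameter names (exp_cov, krige, loo_mae) are hypothetical and chosen for illustration.

```python
import numpy as np


def exp_cov(h, sill=1.0, corr_range=1.0):
    """Exponential covariance model (an assumed choice, not taken from the paper)."""
    return sill * np.exp(-h / corr_range)


def krige(coords, values, target, drift=None, target_drift=None,
          sill=1.0, corr_range=1.0, nugget=1e-6):
    """Predict the value at `target` from scattered observations.

    With drift=None this is ordinary kriging. With a drift covariate
    (here a stand-in for an ML estimate) it is kriging with external drift.
    """
    n = len(values)
    # Pairwise distances between observations, and from observations to target.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    C = exp_cov(dists, sill, corr_range) + nugget * np.eye(n)
    c0 = exp_cov(np.linalg.norm(coords - target, axis=1), sill, corr_range)

    # Unbiasedness constraints: weights sum to 1 and, if a drift is given,
    # the weighted drift of the observations equals the drift at the target.
    F = np.ones((n, 1))
    f0 = np.ones(1)
    if drift is not None:
        F = np.column_stack([F, drift])
        f0 = np.array([1.0, target_drift])

    p = F.shape[1]
    A = np.block([[C, F], [F.T, np.zeros((p, p))]])
    b = np.concatenate([c0, f0])
    weights = np.linalg.solve(A, b)[:n]  # drop the Lagrange multipliers
    return float(weights @ values)


def loo_mae(coords, values, drift=None, **cov_params):
    """Leave-one-out cross-validation: mean absolute error over held-out points."""
    errors = []
    for i in range(len(values)):
        keep = np.arange(len(values)) != i
        pred = krige(
            coords[keep], values[keep], coords[i],
            None if drift is None else drift[keep],
            None if drift is None else drift[i],
            **cov_params,
        )
        errors.append(abs(pred - values[i]))
    return float(np.mean(errors))


# Synthetic illustration only; these are not SIF data.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(40, 2))
ml_estimate = rng.normal(size=40)  # hypothetical ML output used as drift
sif_like = 0.5 + 2.0 * ml_estimate + 0.1 * rng.normal(size=40)

print("ordinary kriging LOO MAE:", loo_mae(coords, sif_like))
print("external drift LOO MAE:  ", loo_mae(coords, sif_like, ml_estimate))
```

In practice the covariance parameters would be estimated from an empirical variogram rather than fixed by hand, and the drift would come from a trained model applied to ancillary data. The sketch only shows how the drift enters the kriging system as one extra constraint row and column.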
Keywords