Science of Remote Sensing (Jun 2024)
Testing temporal transferability of remote sensing models for large area monitoring
Abstract
Applying remote sensing models outside the temporal range of their training data, referred to as temporal model transfer, has become common practice for large area monitoring projects that extrapolate models for hindcasting or forecasting to time periods lacking reference data. However, the development of appropriate validation methods for temporal transfer has lagged behind its rapid adoption. Temporal transfer assumes temporal consistency in the remote sensing data, the reference data, and the relationship between them. Violating this assumption could lead to biased pixel-level predictions and small area estimators, compromising the operational validity of large area monitoring products. Few studies using temporal transfer have evaluated its effects on model accuracy at the pixel/plot level, and the propensity for biased small area estimators and trends resulting from temporal transfer remains unexplored. We present a framework for evaluating temporal transferability by combining temporal cross-validation with a multiscale map assessment to aid in identifying where and when biased model predictions could scale to small area estimates and affect predicted trends. This validation framework is demonstrated in a case study of annual percent tree canopy cover mapping in Michigan. We tested and compared the temporal transferability of random forest models of canopy cover derived from 2010 to 2016 systematic dot-grid photo-interpretations at Forest Inventory and Analysis plots, with Landsat spectral indices fit with the LandTrendr temporal segmentation algorithm serving as the primary predictor variables. The temporal cross-validation error (mean RMSE = 13.9% cover) was higher than the error from the common validation approach of considering all time periods of testing data together (RMSE = 13.6% cover) and lower than that of models trained and tested within the same year (mean RMSE = 14.2% cover).
However, the bias of model predictions and small area estimators for individual years was higher with temporal transfer models than when applying models within the same year as their training data. We also evaluated how training models using different temporal subsets, with and without LandTrendr fitting, affected Michigan's 1984–2020 predicted annual mean cover. The mean cover from LandTrendr-based models followed expected and consistent trends and differed less between models trained with different temporal subsets (max difference = 5.8% cover), while the mean cover from Landsat models without LandTrendr fitting had high interannual variation and differed more between temporal models (max difference = 11.2% cover). The results of this case study demonstrate that evaluation of temporal transferability is necessary for establishing the operational validity of large area monitoring products, even when using time series algorithms that improve temporal consistency.
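The temporal cross-validation idea described in the abstract can be illustrated with a minimal leave-one-year-out sketch. This is not the authors' implementation: the one-predictor least-squares line stands in for the random forest, and the function names (`fit_line`, `temporal_cv`) and the demo records are hypothetical. The sketch holds out each year in turn, trains on the remaining years, and reports RMSE and bias (mean of predicted minus observed) for the held-out year.

```python
import math
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (assumed stand-in for a random forest)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def temporal_cv(records):
    """Leave-one-year-out cross-validation.

    records: iterable of (year, predictor, observed_cover) tuples.
    Returns {year: {"rmse": ..., "bias": ...}} for each held-out year.
    """
    by_year = defaultdict(list)
    for year, x, y in records:
        by_year[year].append((x, y))

    results = {}
    for held_out in sorted(by_year):
        # Train only on plots from the other years.
        train = [row for yr, rows in by_year.items() if yr != held_out for row in rows]
        intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])
        # Error = predicted minus observed on the held-out year.
        errors = [(intercept + slope * x) - y for x, y in by_year[held_out]]
        rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
        bias = sum(errors) / len(errors)
        results[held_out] = {"rmse": rmse, "bias": bias}
    return results


# Hypothetical demo data: (year, spectral index value, percent canopy cover).
demo_records = [
    (2010, 10.0, 21.0), (2010, 30.0, 59.0),
    (2011, 20.0, 41.0), (2011, 40.0, 78.0),
    (2012, 15.0, 36.0), (2012, 35.0, 74.0),
]

for year, stats in temporal_cv(demo_records).items():
    print(year, round(stats["rmse"], 2), round(stats["bias"], 2))
```

Reporting bias per held-out year, rather than a single pooled RMSE, mirrors the abstract's point that pooled error can look acceptable while individual years are systematically over- or under-predicted.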