Methods in Ecology and Evolution (Jul 2023)

Effects of outliers on remote sensing‐assisted forest biomass estimation: A case study from the United States national forest inventory

  • Jonathan A. Knott,
  • Greg C. Liknes,
  • Courtney L. Giebink,
  • Sungchan Oh,
  • Grant M. Domke,
  • Ronald E. McRoberts,
  • Valquiria F. Quirino,
  • Brian F. Walters

DOI
https://doi.org/10.1111/2041-210X.14084
Journal volume & issue
Vol. 14, no. 7
pp. 1587 – 1602

Abstract

Large‐scale ecological sampling networks, such as national forest inventories (NFIs), collect in situ data to support biodiversity monitoring, forest management and planning, and greenhouse gas reporting. Data harmonization aims to link auxiliary remotely sensed data to field‐collected data to expand beyond field sampling plots, but outliers that arise in data harmonization (observations that are questionable because their values differ substantially from the rest) are rarely addressed. In this paper, we review the sources of commonly occurring outliers, including random chance (statistical outliers), definitions and protocols set by sampling networks, and temporal and spatial mismatch between field‐collected and remotely sensed data. We illustrate different types of outliers and the effects they have on estimates of above‐ground biomass population parameters using a case study of 292 NFI plots paired with airborne laser scanning (ALS) and Sentinel‐2 data from Sawyer County, Wisconsin, United States. Depending on the criteria used to identify outliers (sampling year, plot location error, nonresponse, presence of zeros and model residuals), as many as 53 of the 292 Forest Inventory and Analysis plot observations (18%) were identified as potential outliers using a single criterion and 111 plot observations (38%) if all criteria were used. Inclusion or removal of potential outliers led to substantial differences in estimates of the mean biomass per unit area and of its standard error. The simple expansion estimator, which does not rely on ALS or other auxiliary data, was more sensitive to outliers than model‐assisted approaches that incorporated ALS and Sentinel‐2 data. Including Sentinel‐2 predictors produced only minimal increases in the precision of our estimates relative to models with ALS predictors alone. Outliers arise from many causes and can be pervasive in data harmonization workflows.
Our review and case study serve as a note of caution to researchers and practitioners that the inclusion or removal of potential outliers can have unintended consequences on population parameter estimates. When NFI and remotely sensed data are used to inform large‐scale biomass mapping, carbon markets, greenhouse gas reporting and environmental policy, it is necessary to ensure their proper use in geospatial data harmonization.
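
The contrast the abstract draws between the simple expansion estimator and model‐assisted estimation can be illustrated with a minimal sketch. The abstract gives no formulas, so the code below assumes simple random sampling and uses common textbook forms of the two estimators, with a single hypothetical ALS‐style predictor (canopy height) fitted by ordinary least squares. The function names and all plot values are invented for illustration; this is not the authors' implementation or data.

```python
import math


def simple_expansion(y):
    """Sample mean and its standard error; uses no auxiliary data.

    Assumes simple random sampling and ignores any finite population correction.
    """
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n * (n - 1))
    return mean, math.sqrt(var)


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def model_assisted(x_sample, y_sample, x_population):
    """Regression-type model-assisted estimate of the population mean.

    Mean of model predictions over every population unit, plus the mean
    sample residual as a bias adjustment; the standard error is computed
    from the spread of the sample residuals.
    """
    intercept, slope = fit_line(x_sample, y_sample)
    predictions = [intercept + slope * x for x in x_population]
    synthetic_mean = sum(predictions) / len(predictions)
    residuals = [y - (intercept + slope * x) for x, y in zip(x_sample, y_sample)]
    n = len(residuals)
    rbar = sum(residuals) / n
    var = sum((e - rbar) ** 2 for e in residuals) / (n * (n - 1))
    return synthetic_mean + rbar, math.sqrt(var)


# Hypothetical data: canopy height (m) for every map unit, and for the plots.
height_all = [2.0, 5.0, 8.0, 11.0, 14.0, 17.0, 20.0, 23.0, 26.0, 29.0]
height_plots = [5.0, 11.0, 17.0, 23.0, 29.0]
biomass_plots = [22.0, 48.0, 70.0, 95.0, 118.0]  # Mg/ha, invented values

# A potential outlier: a plot measured before a harvest, so field biomass is
# high while the later ALS acquisition sees a low canopy (temporal mismatch).
height_with_outlier = height_plots + [3.0]
biomass_with_outlier = biomass_plots + [110.0]

for label, hx, by in [
    ("without outlier", height_plots, biomass_plots),
    ("with outlier", height_with_outlier, biomass_with_outlier),
]:
    print(label)
    print("  simple expansion:", simple_expansion(by))
    print("  model-assisted:  ", model_assisted(hx, by, height_all))
```

Running the loop with and without the extra plot shows how a single questionable observation shifts each estimator's mean and standard error. Which estimator moves more depends on the data and the model, so the toy numbers should not be read as reproducing the case study's results.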

Keywords