Remote Sensing (Jul 2019)

A Decision Tree Approach for Spatially Interpolating Missing Land Cover Data and Classifying Satellite Images

  • Jacinta Holloway
  • Kate J. Helmstedt
  • Kerrie Mengersen
  • Michael Schmidt

DOI
https://doi.org/10.3390/rs11151796
Journal volume & issue
Vol. 11, no. 15
p. 1796

Abstract

Sustainable Development Goals (SDGs) are a set of priorities the United Nations and World Bank have set for countries to reach in order to improve quality of life and the environment globally by 2030. Free satellite images have been identified as a key resource for producing official statistics and analysis to measure progress towards SDGs, especially those concerned with the physical environment, such as forest, water, and crops. Satellite images are often unusable due to missing data from cloud cover, particularly in tropical areas where deforestation rates are high. Existing methods for filling in image gaps are often computationally expensive in image classification or not effective at pixel scale. To address this, we use two machine learning methods, gradient boosted machine and random forest algorithms, to classify the observed and simulated ‘missing’ pixels in satellite images as either grassland or woodland. We also predict a continuous biophysical variable, Foliage Projective Cover (FPC), which was derived from satellite images, and perform accurate binary classification and prediction using only the latitude and longitude of the pixels. We compare the performance of these methods against each other and against inverse distance weighted interpolation, a well-established spatial interpolation method. We find that both machine learning methods, particularly random forest, perform fast and accurate classifications of both observed and missing pixels, with up to 0.90 accuracy for the binary classification of pixels as grassland or woodland. The results show that the random forest method is more accurate than inverse distance weighted interpolation and gradient boosted machine for prediction of FPC for observed and missing data. Based on the case study results from a sub-tropical site in Australia, we show that our approach provides an efficient alternative for interpolating images and performing land cover classifications.
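
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of inverse distance weighted interpolation applied to pixel coordinates. It is not the authors' implementation and does not cover the tree-based methods. The function name `idw_predict`, the power parameter of 2, the use of plain Euclidean distance in degrees, and the sample coordinates and FPC values are all assumptions made for illustration.

```python
import math


def idw_predict(known, query, power=2.0):
    """Inverse distance weighted estimate at `query` from observed pixels.

    known: list of (lat, lon, value) tuples for observed pixels.
    query: (lat, lon) of the pixel to fill.
    power: distance exponent (assumed 2.0 here; not taken from the paper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, value in known:
        # Plain Euclidean distance in degrees: a simplification for small areas.
        dist = math.hypot(lat - query[0], lon - query[1])
        if dist == 0.0:
            return float(value)  # query coincides with an observed pixel
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("need at least one observed pixel")
    return weighted_sum / weight_total


# Hypothetical observed pixels: (latitude, longitude, FPC value).
observed = [
    (-25.00, 148.00, 10.0),
    (-25.00, 148.02, 30.0),
    (-25.02, 148.00, 10.0),
    (-25.02, 148.02, 30.0),
]

# A simulated "missing" pixel at the centre of the four observed ones.
fpc_estimate = idw_predict(observed, (-25.01, 148.01))
print(round(fpc_estimate, 2))
```

Each observed pixel contributes in proportion to the inverse of its distance raised to `power`, so nearby pixels dominate the estimate. A larger exponent would make the interpolation more local.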

Keywords