Remote Sensing (Oct 2020)
Classification Strategies for Unbalanced Binary Maps: Finding Ponderosa Pine (<i>Pinus ponderosa</i>) in the Willamette Valley
Abstract
Forest species classifications are becoming increasingly automated as advances are made in machine learning. Complex algorithms can reach high accuracies but are not always suitable for small-scale classifications, which may benefit from simpler conventional methods. The goal of this classification was to identify contiguous stands of ponderosa pine (Pinus ponderosa Douglas ex Lawson) against a mix of forest and non-forest background in the southern Willamette Valley, Oregon. The study area is approximately 816,600 ha, considerably larger than most study areas used for presenting techniques for tree species classification. To achieve the objective, we used two classification procedures, one parametric and one non-parametric. For the parametric method, we selected the maximum likelihood (ML) algorithm, whereas for the non-parametric method we chose the random forest (RF) algorithm. To identify ponderosa pine, we used 1 m spatial resolution red-green-blue-infrared (RGBI) aerial images supplied by the U.S. National Agriculture Imagery Program (NAIP) and 1 m spatial resolution canopy height models (CHMs) provided by the Oregon Department of Geology and Mineral Industries (DOGAMI). We tested four data variations for each method: aerial imagery, CHM-masked aerial imagery, aerial imagery with an additional CHM band, and CHM-masked aerial imagery with a CHM band. The parametric classifications of aerial imagery alone reached an average kappa coefficient of 0.29, which increased to 0.51 when masked with CHM data. The incorporation of CHM data as a fifth band resulted in a similar improvement in kappa (0.47), but the most effective parametric method was the incorporation of CHM data as both a fifth band and a post-classification mask, resulting in a kappa coefficient of 0.89.
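The two ways of incorporating CHM data described above (as a fifth band and as a post-classification mask) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy 2 x 2 rasters, and in particular the 5 m height threshold are assumptions made for illustration, since the abstract does not state a threshold.

```python
# Hypothetical minimum canopy height in metres; the abstract gives no threshold.
MIN_HEIGHT_M = 5.0


def add_chm_band(rgbi, chm):
    """Append the CHM height to every 4-band RGBI pixel, giving 5 bands."""
    return [
        [pixel + (height,) for pixel, height in zip(pixel_row, chm_row)]
        for pixel_row, chm_row in zip(rgbi, chm)
    ]


def apply_chm_mask(class_map, chm, min_height=MIN_HEIGHT_M):
    """Post-classification mask: pixels below min_height become background (0)."""
    return [
        [label if height >= min_height else 0
         for label, height in zip(label_row, chm_row)]
        for label_row, chm_row in zip(class_map, chm)
    ]


# Toy 2 x 2 tile: RGBI digital numbers, canopy heights (m), and a
# made-up classifier output (1 = ponderosa pine, 0 = background).
rgbi = [[(90, 120, 80, 200), (60, 95, 70, 180)],
        [(130, 140, 120, 90), (55, 90, 65, 190)]]
chm = [[22.0, 1.5],
       [0.0, 30.0]]
predicted = [[1, 1],
             [0, 1]]

five_band = add_chm_band(rgbi, chm)      # input for a five-band classifier
masked = apply_chm_mask(predicted, chm)  # low-vegetation "pine" pixels removed
```

In this toy tile, the pixel with a 1.5 m canopy is labelled pine by the classifier but is reset to background by the mask, which is the kind of correction the masked variations rely on.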
The non-parametric classification of aerial imagery achieved a mean validation kappa coefficient of 0.85 for collective and 0.90 for individual tile classifications, which increased by only approximately 0.01 or less when the CHM masks were applied. The addition of the CHM band increased the kappa value to 0.91 for both individual and collective tile classifications. The highest kappa of all methods was achieved through five-band non-parametric classification with the addition of the CHM mask (0.94) for both collective and individual classifications. Our results suggest that parametric methods, when enhanced with a CHM mask, could be suitable for large-area, small-scale classifications based on RGBI imagery, but a non-parametric classification of fused spectral and height data will generally achieve the highest accuracy for large, unbalanced datasets.
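Because the results are reported as kappa coefficients, a short sketch of Cohen's kappa for a binary map may help show why it is informative for unbalanced classes. The confusion-matrix counts below are invented for illustration and are not taken from the study; the function name is likewise an assumption.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa from the four cells of a binary confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present in both maps
    return (observed - expected) / (1.0 - expected)


def overall_accuracy(tp, fp, fn, tn):
    """Fraction of pixels labelled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Invented example: 1000 validation pixels, of which 60 are pine.
# A map that labels every pixel as background still looks accurate...
all_background = dict(tp=0, fp=0, fn=60, tn=940)
# ...whereas a map that finds most of the pine is rewarded by kappa.
finds_pine = dict(tp=40, fp=20, fn=20, tn=920)

for name, cells in [("all background", all_background),
                    ("finds pine", finds_pine)]:
    print(name,
          "accuracy=%.3f" % overall_accuracy(**cells),
          "kappa=%.3f" % cohens_kappa(**cells))
```

With a rare target class, overall accuracy stays high even when no pine is detected, while kappa drops to zero because all of the agreement is explained by chance.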
Keywords