Data in Brief (Feb 2024)

The first generation of a regional-scale 1-m forest canopy cover dataset using machine learning and google earth engine cloud computing platform: A case study of Arkansas, USA

  • Hamdi A. Zurqani

Journal volume & issue
Vol. 52
p. 109986

Abstract

Read online

Forest canopy cover (FCC) is essential in forest assessment and management, affecting ecosystem services such as carbon sequestration, wildlife habitat, and water regulation. Ongoing advancements in techniques for accurately and efficiently mapping and extracting FCC information require a thorough evaluation of their validity and reliability. The primary objectives of this study are to: (1) create a large-scale forest FCC dataset with a 1-meter spatial resolution, (2) assess the regional spatial distribution of FCC at a regional scale, and (3) investigate differences in FCC areas among the Global Forest Change (Hansen et al., 2013) and U.S. Forest Service Tree Canopy Cover products at various spatial scales in Arkansas (i.e., county and city levels). This study utilized high-resolution aerial imagery and a machine learning algorithm processed and analyzed using the Google Earth Engine cloud computing platform to produce the FCC dataset. The accuracy of this dataset was validated using one-third of the reference locations obtained from the Global Forest Change (Hansen et al., 2013) dataset and the National Agriculture Imagery Program (NAIP) aerial imagery with a 0.6-m spatial resolution. The results showed that the dataset successfully identified FCC at a 1-m resolution in the study area, with overall accuracy ranging between 83.31% and 94.35% per county. Spatial comparison results between the produced FCC dataset and the Hansen et al., 2013 and USFS products indicated a strong positive correlation, with R2 values ranging between 0.94 and 0.98 for county and city levels. This dataset provides valuable information for monitoring, forecasting, and managing forest resources in Arkansas and beyond. The methodology followed in this study enhances efficiency, cost-effectiveness, and scalability, as it enables the processing of large-scale datasets with high computational demands in a cloud-based environment. It also demonstrates that machine learning and cloud computing technologies can generate high-resolution forest cover datasets, which might be helpful in other regions of the world.

Keywords