Earth System Science Data (Sep 2024)

A globally sampled high-resolution hand-labeled validation dataset for evaluating surface water extent maps

  • R. Mukherjee,
  • F. Policelli,
  • R. Wang,
  • E. Arellano-Thompson,
  • B. Tellman,
  • P. Sharma,
  • Z. Zhang,
  • J. Giezendanner

DOI
https://doi.org/10.5194/essd-16-4311-2024
Journal volume & issue
Vol. 16
pp. 4311 – 4323

Abstract

Effective monitoring of global water resources is increasingly critical due to climate change and population growth. Advancements in remote sensing technology, specifically in spatial, spectral, and temporal resolutions, are revolutionizing water resource monitoring, leading to more frequent and high-quality surface water extent maps using various techniques such as traditional image processing and machine learning algorithms. However, satellite imagery datasets contain trade-offs that result in inconsistencies in performance, such as disparities in measurement principles between optical (e.g., Sentinel-2) and radar (e.g., Sentinel-1) sensors and differences in spatial and spectral resolutions among optical sensors. Therefore, developing accurate and robust surface water mapping solutions requires independent validations from multiple datasets to identify potential biases within the imagery and algorithms. However, high-quality validation datasets are expensive to build, and few contain information on water resources. For this purpose, we introduce a globally sampled, high-spatial-resolution dataset labeled using 3 m PlanetScope imagery (Planet Team, 2017). Our surface water extent dataset comprises 100 images, each with a size of 1024×1024 pixels, which were sampled using a stratified random sampling strategy covering all 14 biomes. We highlighted urban and rural regions, lakes, and rivers, including braided rivers and coastal regions. We evaluated two surface water extent mapping methods using our dataset – Dynamic World (Brown et al., 2022), based on Sentinel-2, and the NASA IMPACT model (Paul and Ganju, 2021), based on Sentinel-1. Dynamic World achieved a mean intersection over union (IoU) of 72.16 % and F1 score of 79.70 %, while the NASA IMPACT model had a mean IoU of 57.61 % and F1 score of 65.79 %. 
Performance varied substantially across biomes, highlighting the importance of evaluating models on diverse landscapes to assess their generalizability and robustness. Our dataset offers a unique tool for analyzing satellite products and methods, providing insights into their advantages and drawbacks and aiding the development of more accurate and robust surface water monitoring solutions. The dataset can be accessed via https://doi.org/10.25739/03nt-4f29 (Mukherjee et al., 2024).
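
The two metrics reported in the abstract, intersection over union (IoU) and F1 score, can be illustrated with a minimal sketch for binary water masks. This is not the authors' evaluation code. The function names, the mask representation (nested lists with 1 for water and 0 for non-water), the handling of empty masks, and the per-image averaging are all assumptions made for illustration; the abstract does not state how the means were aggregated.

```python
from typing import Iterable, Sequence, Tuple

# Assumed representation: a 2-D binary mask, 1 = water, 0 = non-water.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives, and false negatives for water."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of the water class: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    # Assumption: two masks with no water at all count as perfect agreement.
    return tp / denom if denom else 1.0


def f1_score(pred: Mask, truth: Mask) -> float:
    """F1 score of the water class: 2TP / (2TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def mean_scores(pairs: Iterable[Tuple[Mask, Mask]]) -> Tuple[float, float]:
    """Average per-image IoU and F1 over (prediction, label) pairs.

    Assumption: an unweighted mean over images; other aggregations
    (e.g. pooling pixels across images) would give different numbers.
    """
    pairs = list(pairs)
    ious = [iou(p, t) for p, t in pairs]
    f1s = [f1_score(p, t) for p, t in pairs]
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


if __name__ == "__main__":
    # Toy 2x2 example: one pixel agrees on water, one false alarm, one miss.
    predicted = [[1, 1], [0, 0]]
    labeled = [[1, 0], [1, 0]]
    print(f"IoU = {iou(predicted, labeled):.3f}")
    print(f"F1  = {f1_score(predicted, labeled):.3f}")
```

For a single image the two metrics are related by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU, which is consistent with the ordering of the scores reported above for both models.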