Scientific Data (Oct 2024)
A Synthetic Dataset for Semantic Segmentation of Waterbodies in Out-of-Distribution Situations
Abstract
In the past decade, substantial global effort has been devoted to developing reliable and efficient solutions for early flood warning and monitoring. One of the most common strategies applies computer vision techniques to images from the many surveillance cameras now present in urban settings. Although various datasets are available for training and testing these techniques, none specifically addresses out-of-distribution (OoD) behavior, an issue that becomes particularly critical when evaluating the reliability of these methods under challenging environmental conditions. Our work is the first attempt to bridge this gap: it introduces a new, highly controlled synthetic dataset that encompasses the essential attributes required for analyzing OoD behavior. The very high correlation between the accuracy of artificial intelligence (AI) models trained on our synthetic dataset and that of models trained on real-world data demonstrates that our dataset can reliably predict real-world OoD behavior.