Data in Brief (Jun 2023)
FloodIMG: Flood image DataBase system
Abstract
A breakthrough in building models for image processing came with the discovery that a convolutional neural network (CNN) can progressively extract higher-level representations of image content. Having high-resolution images to train CNN models is key to optimizing the performance of image segmentation models. This paper presents a new dataset, the Flood Image (FloodIMG) database system, developed for flood-related image processing and segmentation. We developed several Internet of Things Application Programming Interfaces (IoT APIs) to gather flood-related images from Twitter and from the web servers of US federal agencies, such as the US Geological Survey (USGS) and the Department of Transportation (DOT). Overall, >9,200 images of flooding events were collected, preprocessed, and formatted to make the dataset suitable for CNN training. Bounding boxes and polygon primitives were also labeled on each image to localize and classify the objects in it. Two use cases of FloodIMG are presented in this paper, in which the Fast Region-based CNN (R-CNN) algorithm was used to estimate flood severity and depth during recent flooding events in the US. Of the >9,200 images, 7,400 were assigned to the training set, whereas >1,800 were used for R-CNN testing. Users can access the FloodIMG database freely through the Kaggle platform to create more accessible, accurate, and optimized image segmentation models. The FloodIMG workflow concludes with a visualization of colors and labels per image that can serve as a benchmark for flood image processing and segmentation.
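
The training/testing partition described above can be illustrated with a minimal sketch. The abstract does not state how images were assigned to each subset, so the seeded random shuffle here is an assumption, as are the function name `split_dataset`, the placeholder image IDs, and the rounded total of 9,200 (the abstract reports only ">9200").

```python
import random


def split_dataset(image_ids, n_train, seed=0):
    """Shuffle image IDs reproducibly and split them into train/test lists."""
    if not 0 <= n_train <= len(image_ids):
        raise ValueError("n_train must be between 0 and len(image_ids)")
    shuffled = list(image_ids)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical IDs; the counts only mirror the approximate figures in the abstract.
image_ids = [f"img_{i:05d}" for i in range(9200)]
train_ids, test_ids = split_dataset(image_ids, n_train=7400)
print(len(train_ids), len(test_ids))  # 7400 1800
```

With these rounded counts the split is roughly 80% training and 20% testing. Fixing the seed keeps the partition reproducible across runs, which matters when the dataset is meant to serve as a benchmark.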