IEEE Access (Jan 2024)
Identifying Disaster Regions in Images Through Attention Shifting With a Retarget Network
Abstract
Disasters disrupt lives and necessitate quick location of affected areas for rescue efforts. Computer vision has improved the detection of disasters such as landslides and floods; however, traditional computer vision methods often overlook smaller, critical details in favor of prominent objects. This research introduces the Retarget Network (RetNet), a novel framework aimed at improving image captioning techniques to identify and prioritize these less evident, yet crucial, objects in disaster scenarios, enhancing scene recognition and aiding in more effective disaster response. By masking images, we direct the model’s focus towards additional significant areas within the image. RetNet employs anchor boxes to refine the targeting of specific areas, optimizing their center positions, heights, and widths. Additionally, RetNet determines which anchors to mask prior to captioning, facilitating the identification of challenging objects such as boulders, soil, and water, which are indicative of natural disasters. We validated RetNet across multiple disaster scenarios (landslides, floods, and wildfires) using images taken from various perspectives, including side-view, aerial, and shipborne views. Our findings reveal an accuracy of 91.60% in landslide detection from side-view image captions and 87.50% for detection from shipborne views. These results underscore RetNet’s effectiveness in enhancing the identification of disaster-affected regions.
Keywords