IEEE Access (Jan 2024)
Advancements in Polyp Detection: A Developed Single Shot Multibox Detector Approach
Abstract
A major challenge in detecting polyps in wireless capsule endoscopy (WCE) and colonoscopy images lies in identifying the small ones. Polyps vary in color, shape, texture, morphology, and size, which makes the detection task highly variable. Much recent research has addressed polyp detection with one-stage or two-stage detectors. Nevertheless, their real-world application demands significant computational power and memory, resulting in a trade-off between speed and precision. This study proposes the Small Polyp Detection network (SPDNet) for identifying small polyp regions in WCE and colonoscopy frames. The underlying architecture is designed to balance precision and speed, and this trade-off was explored extensively. Deep transfer learning is used to transfer knowledge to polyp images, allowing the single-shot detector (SSD) to extract highly representative features and contextual information. The SSD has proven efficient in medical imaging applications, but its limited ability to detect small polyp areas persists. First, the last layers of VGG-16, the original backbone network of the SSD, were modified. Next, the feature maps from different layers and scales were resized to match one another. A branch block and a concatenation module were introduced to integrate the feature maps from different layers and deliver them to the next layer. A transition block then generated new pyramidal layers, and an attention mechanism was applied in the middle of SPDNet. Finally, the refined feature maps were passed to the multibox to generate the final detection results.
The experimental results showed that the proposed method notably improves the detection of small polyp regions, achieving a mean average precision (mAP) of 94.32% and an F1-score of 91.37%. The model was also efficient, requiring less computational time.
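The resize-and-concatenate step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `resize_nearest` and `fuse`, the nearest-neighbour resizing, and the channel counts are assumptions chosen for illustration, and the spatial sizes (38, 19, 10) only mirror those commonly seen in SSD feature pyramids. The branch, transition, and attention blocks are not modelled here.

```python
import numpy as np

def resize_nearest(fmap, size):
    """Resize a (C, H, W) feature map to (C, size, size) by nearest-neighbour sampling.

    Nearest-neighbour is an assumption; the abstract does not say how sizes are matched.
    """
    _, h, w = fmap.shape
    rows = np.arange(size) * h // size  # source row for each target row
    cols = np.arange(size) * w // size  # source column for each target column
    return fmap[:, rows[:, None], cols[None, :]]

def fuse(fmaps, size):
    """Bring every feature map to a common spatial size, then concatenate on channels."""
    resized = [resize_nearest(f, size) for f in fmaps]
    return np.concatenate(resized, axis=0)

# Hypothetical feature maps from three layers at different scales.
shallow = np.random.rand(8, 38, 38)
middle = np.random.rand(16, 19, 19)
deep = np.random.rand(8, 10, 10)

fused = fuse([shallow, middle, deep], size=38)
print(fused.shape)
```

Resizing everything to the resolution of the shallowest map keeps the fine spatial detail that small polyps depend on, while the deeper maps contribute context through the extra channels.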
Keywords