IEEE Access (Jan 2020)

GeminiNet: Combine Fully Convolution Network With Structure of Receptive Fields for Object Detection

  • Shangjie Yao,
  • Yaowu Chen,
  • Xiang Tian,
  • Rongxin Jiang

DOI
https://doi.org/10.1109/ACCESS.2020.2982939
Journal volume & issue
Vol. 8
pp. 60305 – 60313

Abstract


Pneumonia is a relatively common disease that can endanger patients' lives if left untreated. End-to-end detection of pneumonia with neural networks can help reduce the workforce required. Because convolutional neural networks (CNNs) perform remarkably well on images, CNN-based methods for assisted reading are likely to become a trend in modern medicine. The performance of current detection algorithms is not yet satisfactory, so further research is needed. In this article, we design GeminiNet to identify and localize pneumonia in chest X-ray (CXR) images. It uses a popular fully convolutional architecture with computation shared across the entire image, and combines RoI Align and PSRoI Pooling to capture global and local information for the output. Our approach incorporates DetNet59, a network designed specifically for detection, to capture deep features. In the sixth stage of DetNet59, a structure of retina-like convolutional layers is added to replace the fully connected layer. This structure uses dilated convolution to enlarge the receptive field, and convolution kernels of three different scales are computed in parallel to collect rich feature information. GeminiNet is validated on the RSNA dataset. Because the amount of data is small, we augment the dataset with horizontal and vertical flips. At an IoU (Intersection over Union) threshold of 0.5, AP reaches 0.4575, which is 0.078 higher than ResNet50, and AUC reaches 0.7758. GeminiNet achieves a detection speed of 8 fps, which is faster than the 7 fps of the popular Faster R-CNN architecture.
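
The IoU criterion used for the AP figure above can be illustrated with a short sketch. This is not the authors' evaluation code: the function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` corner format for boxes are assumptions made only for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped to zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: a prediction counts as a hit when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold
```

Under this criterion, a predicted box that covers a ground-truth region but is twice its area has an IoU of exactly 0.5 and sits on the boundary of being counted as a correct detection.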

Keywords