Journal of Army Medical University (陆军军医大学学报) (Nov 2023)

Impact of input image resolution in medical X-ray images on effectiveness of YOLO network for recognition of intertrochanteric fractures

  • LIU Xuesi,
  • DU Zhenwei,
  • NIE Rui

DOI
https://doi.org/10.16016/j.2097-0927.202307027
Journal volume & issue
Vol. 45, no. 22
pp. 2327 – 2333

Abstract

Objective To explore how the input resolution of X-ray images affects the performance of You Only Look Once (YOLO) networks in recognizing intertrochanteric fractures.
Methods Anteroposterior X-ray images of patients with intertrochanteric fractures admitted to the Army Medical Center of PLA from 2017 to 2022 were collected; after the exclusion criteria were applied, 426 patients and 847 images were retained. Based on the 2018 guideline of the Arbeitsgemeinschaft für Osteosynthesefragen/Orthopaedic Trauma Association (AO/OTA) and actual clinical incidence, the fractures were reclassified into grades A1.2, A1.3, A2.2, A2.3, and A3. The images were assigned to a training set (678 images), a validation set (84 images), and a test set (85 images) in a ratio of 8:1:1 to maintain strict consistency across experiments. Eight common resolutions were used as the input size for the YOLOX-Swin-Transformer, YOLOX, YOLOv5, and YOLOv4 object detection networks. Each network was trained on the training set both from scratch and with transfer learning. Training time was recorded, the trained models were evaluated on the test set, and the evaluation metrics were recorded. SPSS 20.0 was used for statistical analysis. Regression analysis was used to test the curve fits for training time and mean average precision (mAP) values. Frequency statistics were used to count how often the evaluation metrics were rated as excellent at each input resolution, in order to determine the optimal range.
Results Image input resolution was positively correlated with the training time of the networks, with all P-values 0.5) and P=0.011 (P < 0.05), indicating a good curve fit and statistical significance in the regression analysis. When the input image resolution was 480×480, 576×576, or 640×640, the frequency of optimal evaluation metrics was highest, accounting for 42.86%.
Conclusion Training time increases with input resolution. To achieve optimal recognition performance when YOLO-series networks are used for downstream medical image recognition tasks without altering the network architecture, the input image resolution should be kept within 480×480, 576×576, or 640×640.
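
The 8:1:1 split described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `split_8_1_1`, the seed value, the rounding rule, and the image IDs are all assumptions made for illustration. It only shows that a seeded shuffle of 847 items, cut at 80% and then halved, yields the reported 678/84/85 set sizes and gives the same assignment on every run.

```python
import random


def split_8_1_1(items, seed=0):
    """Shuffle with a fixed seed, then cut into roughly 8:1:1 parts.

    The seed and the rounding rule are assumptions, not taken from the paper.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_train = round(len(shuffled) * 0.8)   # 80% for training
    n_val = (len(shuffled) - n_train) // 2  # half of the remainder for validation
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever is left goes to the test set
    return train, val, test


# Hypothetical image identifiers standing in for the 847 X-ray images.
image_ids = [f"img_{i:04d}" for i in range(847)]
train, val, test = split_8_1_1(image_ids)
print(len(train), len(val), len(test))
```

Reusing one fixed assignment for every network and every input resolution keeps differences in mAP attributable to the resolution rather than to a different sample of images.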

Keywords