IEEE Access (Jan 2024)

Deep Learning-Based YOLO Models for the Detection of People With Disabilities

  • Madallah Alruwaili,
  • Muhammad Nouman Atta,
  • Muhammad Hameed Siddiqi,
  • Abdullah Khan,
  • Asfandyar Khan,
  • Yousef Alhwaiti,
  • Saad Alanazi

DOI: https://doi.org/10.1109/ACCESS.2023.3347169
Journal volume & issue: Vol. 12, pp. 2543–2566

Abstract

People with disabilities, such as those experiencing paralysis, limb deficiencies, or amputations, may encounter discrimination and inadequate support, and the detection methods currently in use continue to grapple with accuracy and effectiveness concerns. Dependable solutions capable of detecting and categorizing people according to their assistive devices are therefore needed. Hence, this research was undertaken to detect and track people with conditions such as paralysis, limb deficiency (amelia), or amputation among the differently-abled population. Earlier investigations have predominantly focused on recognizing people and their mobility aids using methods such as Fast R-CNN, Faster R-CNN, RGB or RGB-D cameras, Kalman filters, and hidden Markov models. Modern deep learning models, including YOLO (You Only Look Once) and its variants, have gained substantial acceptance in current applications owing to their distinctive architectural designs and performance characteristics. In this study, a substantial dataset comprising 4,300 images and 8,447 labels spanning five distinct categories is employed to assess the efficacy of the YOLOv5, YOLOv7, and YOLOv8 models in identifying people with disabilities. The evaluation shows that YOLOv8, with an overall precision of 0.907, outperforms both YOLOv5 (precision: 0.885) and YOLOv7 (precision: 0.906). Notably, YOLOv8 achieves the best wheelchair-detection precision (0.998). YOLOv8 also leads in recall (0.943), ahead of YOLOv7 (0.925) and YOLOv5 (0.887). In terms of mean average precision at an IoU threshold of 0.5 (mAP@0.5), YOLOv7 records the highest value (0.954), followed by YOLOv8 (0.951) and YOLOv5 (0.942); on the stricter mAP@0.5:0.95 metric, however, YOLOv8 performs best of the three models (0.713).
The analysis of detection time likewise favors YOLOv8, which processed 5,597 frames at just 5.9 milliseconds per frame, a frame rate of 169.49 frames per second.
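The reported frame rate follows directly from the per-frame latency. A minimal sanity check, assuming the 5.9 ms figure is the average inference time per frame:

```python
# Reported average inference time per frame for YOLOv8 (from the abstract).
inference_time_ms = 5.9

# Frames per second is the reciprocal of the per-frame latency in seconds.
fps = 1000.0 / inference_time_ms
print(f"{fps:.2f} FPS")  # matches the reported 169.49 frames per second
```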

Keywords