Scientific Reports (Oct 2024)
Improved yolov5 algorithm combined with depth camera and embedded system for blind indoor visual assistance
Abstract
Abstract To assist the visually impaired in their daily lives and solve the problems associated with poor portability, high hardware costs, and environmental susceptibility of indoor object-finding aids for the visually impaired, an improved YOLOv5 algorithm was proposed. It was combined with a RealSense D435i depth camera and a voice system to realise an indoor object-finding device for the visually impaired using a Raspberry Pi 4 B device as its core. The algorithm uses GhostNet instead of the YOLOv5s backbone network to reduce the number of parameters and computation of the model, incorporates an attention mechanism (coordinate attention), and replaces the YOLOv5 neck network with a bidirectional feature pyramid network to enhance feature extraction. Compared to the YOLOv5 model, the model size was reduced by 42.4%, number of parameters was reduced by 47.9%, and recall rate increased by 1.2% with the same precision. This study applied the improved YOLOv5 algorithm to an indoor object-finding device for the visually impaired, where the searched object was input by voice, and the RealSense D435i was used to acquire RGB and depth images to realize the detection and ranging of the object, broadcast the specific distance of the target object by voice, and assist the visually impaired in finding the object.
Keywords