IEEE Access (Jan 2021)
A Google Glass Based Real-Time Scene Analysis for the Visually Impaired
Abstract
Blind and Visually Impaired People (BVIP) are likely to experience difficulties with tasks that involve scene recognition. Wearable technology has played a significant role in researching and evaluating systems developed for and with the BVIP community. This paper presents a Google Glass based system that serves as a visual assistant for BVIP in scene recognition tasks. The camera embedded in the smart glasses captures an image of the surroundings, which is analyzed using the Custom Vision Application Programming Interface (Vision API) from Azure Cognitive Services by Microsoft. The output of the Vision API is converted to speech, which the BVIP user hears through the Google Glass. A dataset of 5000 newly annotated images was created to improve the performance of the scene description task in Indian scenarios. Training and testing the Vision API on this dataset increased the mean Average Precision (mAP) score from 63% to 84%, with an Intersection over Union (IoU) > 0.5. The overall response time of the proposed application was measured to be less than 1 second, allowing it to deliver results in real time. A Likert scale analysis was performed with the help of the BVIP teachers and students at the “Roman & Catherine Lobo School for the Visually Impaired” in Mangalore, Karnataka, India. Their responses indicate that the application helps BVIP better recognize their surrounding environment in real time, suggesting that the device is effective as an assistant for BVIP.
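The IoU > 0.5 criterion behind the reported mAP can be illustrated with a minimal Python sketch. It is not the evaluation code used in this work or by the Vision API. The function names `iou` and `is_true_positive` and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x_min, y_min, x_max, y_max) tuples.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area = sum of both areas minus the overlap counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """A predicted box counts as a match only if its IoU exceeds the threshold."""
    return iou(pred_box, truth_box) > threshold
```

Under this criterion, a predicted box that overlaps the annotated box only slightly is counted as a miss even when the predicted label is correct, which is why the threshold is stated alongside the mAP figures.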
Keywords