A Google Glass Based Real-Time Scene Analysis for the Visually Impaired

Hafeez Ali A.; Sanjeev U. Rao; Swaroop Ranganath; T. S. Ashwin; Guddeti Ram Mohana Reddy

doi:10.1109/ACCESS.2021.3135024

IEEE Access (Jan 2021)

Affiliations

Hafeez Ali A.: Department of Information Technology, National Institute of Technology Karnataka Surathkal, Mangalore, India
Sanjeev U. Rao: Department of Information Technology, National Institute of Technology Karnataka Surathkal, Mangalore, India
Swaroop Ranganath: Department of Information Technology, National Institute of Technology Karnataka Surathkal, Mangalore, India
T. S. Ashwin: Department of Information Technology, National Institute of Technology Karnataka Surathkal, Mangalore, India
Guddeti Ram Mohana Reddy: Department of Information Technology, National Institute of Technology Karnataka Surathkal, Mangalore, India

DOI: https://doi.org/10.1109/ACCESS.2021.3135024
Journal volume & issue: Vol. 9
pp. 166351 – 166369

Abstract

Blind and Visually Impaired People (BVIP) are likely to experience difficulties with tasks that involve scene recognition. Wearable technology has played a significant role in researching and evaluating systems developed for and with the BVIP community. This paper presents a Google Glass based system that acts as a visual assistant, helping BVIP with scene recognition tasks. The camera embedded in the smart glasses captures an image of the surroundings, which is analyzed using the Custom Vision Application Programming Interface (Vision API) from Azure Cognitive Services by Microsoft. The output of the Vision API is converted to speech, which is heard by the BVIP user wearing the Google Glass. A dataset of 5000 newly annotated images is created to improve the performance of the scene description task in Indian scenarios. The Vision API is trained and tested on this dataset, increasing the mean Average Precision (mAP) score from 63% to 84% at an IoU threshold of 0.5 (IoU > 0.5). The overall response time of the proposed application was measured to be less than 1 second, thereby providing accurate results in real time. A Likert scale analysis was performed with the help of the BVIP teachers and students at the “Roman & Catherine Lobo School for the Visually Impaired” in Mangalore, Karnataka, India. From their responses, it can be concluded that the application helps the BVIP better recognize their surrounding environment in real time, indicating that the device is effective as a potential assistant for the BVIP.
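
As a reading aid for the IoU > 0.5 criterion mentioned in the abstract, the following is a minimal sketch of the standard Intersection over Union computation for two axis-aligned bounding boxes. The box format (x_min, y_min, x_max, y_max) and the function names `iou` and `is_match` are illustrative assumptions; they are not taken from the paper or from the Vision API.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes.

    Each box is assumed to be (x_min, y_min, x_max, y_max); this layout
    is an assumption made for illustration only.
    """
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero so disjoint boxes give no overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(predicted, ground_truth, threshold=0.5):
    """Hypothetical helper: a prediction counts as a match when IoU > threshold."""
    return iou(predicted, ground_truth) > threshold
```

For example, a predicted box (0, 0, 10, 10) compared against an annotated box (0, 0, 10, 8) overlaps on 80 of 100 units of combined area, so `is_match` treats it as a correct detection at the 0.5 threshold.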

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
