Automated Image Captioning Using Sparrow Search Algorithm With Improved Deep Learning Model

Munya A. Arasi; Haya Mesfer Alshahrani; Nuha Alruwais; Abdelwahed Motwakel; Noura Abdelaziz Ahmed; Abdullah Mohamed

doi:10.1109/ACCESS.2023.3317276

IEEE Access (Jan 2023)

Affiliations

Munya A. Arasi: Department of Computer Science, College of Science and Arts in Rijal Almaa, King Khalid University, Abha, Saudi Arabia
Haya Mesfer Alshahrani: Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
Nuha Alruwais: Department of Computer Science and Engineering, College of Applied Studies and Community Services, King Saud University, Riyadh, Saudi Arabia
Abdelwahed Motwakel: Department of Management Information Systems, College of Business Administration Hawtat Bani Tamim, Prince Sattam bin Abdulaziz University, Al-Kharj, Saudi Arabia
Noura Abdelaziz Ahmed: Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, Al-Kharj, Saudi Arabia
Abdullah Mohamed: Research Centre, Future University in Egypt, New Cairo, Egypt

DOI: https://doi.org/10.1109/ACCESS.2023.3317276
Journal volume & issue: Vol. 11
pp. 104633 – 104642

Abstract

Image captioning is a deep learning task that generates textual descriptions, or captions, for images. It integrates computer vision and natural language processing (NLP) to comprehend the visual content of an image and produce human-like descriptions. Deep learning (DL) based image captioning models can be trained on large-scale datasets, allowing them to generalize across various types of images and generate captions that apply to a wide range of visual scenarios. By combining computer vision and NLP, DL-enabled image captioning models can understand both visual and textual information, which enables them to generate captions that not only describe the visual content but also incorporate contextual and semantic information. This study develops an Automated Image Captioning using Sparrow Search Algorithm with Improved Deep Learning (AIC-SSAIDL) technique. The main aim of the AIC-SSAIDL technique is the automated generation of textual captions for input images. To accomplish this, the AIC-SSAIDL technique uses the MobileNetv2 model to generate feature descriptors of the input images, with its hyperparameters tuned by the sparrow search algorithm (SSA). For the captioning process, the AIC-SSAIDL technique uses an attention mechanism with a long short-term memory (AM-LSTM) network. Finally, the hyperparameters of the AM-LSTM model are selected by the fruit fly optimization (FFO) algorithm. A wide range of experiments was conducted on benchmark data to demonstrate the performance of the AIC-SSAIDL method. The comprehensive result analysis highlighted the enhanced captioning results of the AIC-SSAIDL method, with maximum CIDEr scores of 46.12, 61.89, and 137.45 on the Flickr8k, Flickr30k, and MSCOCO datasets, respectively.
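As a rough illustration of the attention step that the abstract names, the sketch below computes soft attention weights over a handful of image-region feature vectors, given a decoder hidden state. The abstract does not specify the scoring function used in the AM-LSTM network, so the dot-product scoring here is an assumption, and the names `softmax` and `attend` as well as the toy feature values are hypothetical. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, hidden_state):
    """Soft attention with dot-product scoring (an assumed scoring choice).

    region_features: list of feature vectors, one per image region
    hidden_state:    decoder state vector with the same dimension
    Returns (weights, context), where context is the weighted sum of regions.
    """
    # Score each region by its dot product with the decoder state.
    scores = [
        sum(f * h for f, h in zip(feat, hidden_state))
        for feat in region_features
    ]
    # Normalise the scores into a probability distribution over regions.
    weights = softmax(scores)
    # Context vector: attention-weighted average of the region features.
    dim = len(hidden_state)
    context = [
        sum(w * feat[i] for w, feat in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Toy values only; real descriptors would come from a CNN such as MobileNetv2.
    regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    hidden = [2.0, 0.0]
    weights, context = attend(regions, hidden)
    print("weights:", weights)
    print("context:", context)
```

In a full captioning model, the context vector would be fed to the LSTM at each decoding step together with the previous word embedding, so that each generated word can draw on different image regions.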

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639