Metaheuristics Optimization with Deep Learning Enabled Automated Image Captioning System

Mesfer Al Duhayyim; Sana Alazwari; Hanan Abdullah Mengash; Radwa Marzouk; Jaber S. Alzahrani; Hany Mahgoub; Fahd Althukair; Ahmed S. Salama

doi:10.3390/app12157724

Applied Sciences (Jul 2022)

Affiliations

Mesfer Al Duhayyim: Department of Computer Science, College of Sciences and Humanities-Aflaj, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
Sana Alazwari: Department of Information Technology, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
Hanan Abdullah Mengash: Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
Radwa Marzouk: Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
Jaber S. Alzahrani: Department of Industrial Engineering, College of Engineering at Alqunfudah, Umm Al-Qura University, Mecca 24382, Saudi Arabia
Hany Mahgoub: Department of Computer Science, College of Science & Art at Mahayil, King Khalid University, Abha 62529, Saudi Arabia
Fahd Althukair: Department of Electrical Engineering and Computer Sciences, College of Engineering, University of California, Berkeley, CA 94720, USA
Ahmed S. Salama: Department of Electrical Engineering, Faculty of Engineering & Technology, Future University in Egypt, New Cairo 11845, Egypt

DOI: https://doi.org/10.3390/app12157724
Journal volume & issue: Vol. 12, no. 15, p. 7724

Abstract

Image captioning is an active research topic at the intersection of computer vision and natural language processing (NLP). Recent advances in deep learning (DL) have improved the overall performance of image captioning approaches. This study develops a metaheuristic optimization with deep learning-enabled automated image captioning technique (MODLE-AICT). The proposed MODLE-AICT model generates captions for input images in two stages: an encoding unit and a decoding unit. In the encoding stage, the salp swarm algorithm (SSA) is combined with a HybridNet model to represent each input image as a fixed-length vector, which constitutes the novelty of the work. In the decoding stage, a bidirectional gated recurrent unit (BiGRU) model generates the descriptive sentences. The SSA-based hyperparameter optimizer helps the model attain effective performance. To assess the MODLE-AICT model, a series of simulations was carried out and the results were examined from several aspects. The experimental results suggest that the MODLE-AICT model outperforms recent approaches.
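
The abstract names the salp swarm algorithm (SSA) as the hyperparameter optimizer but does not give its details. As a rough illustration only, the sketch below shows a generic single-leader SSA minimizing a toy objective over a box. The function name `salp_swarm_minimize`, its parameters, and the `sphere` objective are assumptions made for this example; it is not the authors' implementation and does not involve HybridNet or BiGRU. In a hyperparameter search, the objective would presumably be a validation loss evaluated at a candidate hyperparameter vector.

```python
import math
import random


def salp_swarm_minimize(objective, lower, upper, n_salps=20, n_iters=200, seed=0):
    """Minimize `objective` over a box with a basic single-leader SSA (illustrative)."""
    rng = random.Random(seed)
    dim = len(lower)

    def clip(value, j):
        # Keep coordinate j inside its bounds.
        return min(max(value, lower[j]), upper[j])

    # Random initial positions inside the box.
    salps = [[rng.uniform(lower[j], upper[j]) for j in range(dim)]
             for _ in range(n_salps)]
    fitness = [objective(s) for s in salps]
    best_idx = min(range(n_salps), key=fitness.__getitem__)
    # The "food source" is the best position found so far.
    food, food_fit = list(salps[best_idx]), fitness[best_idx]

    for it in range(1, n_iters + 1):
        # c1 decays with the iteration count, moving from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iters) ** 2))
        for i in range(n_salps):
            for j in range(dim):
                if i == 0:
                    # Leader: random step around the food source.
                    step = c1 * ((upper[j] - lower[j]) * rng.random() + lower[j])
                    sign = 1.0 if rng.random() >= 0.5 else -1.0
                    salps[i][j] = clip(food[j] + sign * step, j)
                else:
                    # Follower: move to the midpoint between itself and its predecessor.
                    salps[i][j] = clip(0.5 * (salps[i][j] + salps[i - 1][j]), j)
            fit = objective(salps[i])
            if fit < food_fit:
                food, food_fit = list(salps[i]), fit
    return food, food_fit


def sphere(x):
    # Toy objective (assumption for this example): minimum 0 at the origin.
    return sum(v * v for v in x)


best_position, best_value = salp_swarm_minimize(sphere, [-5.0, -5.0], [5.0, 5.0])
```

The fixed `seed` makes runs repeatable, and only the best position found so far (the food source) is returned, so the result can never be worse than the best initial sample.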

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
