Deep CNN and Deep GAN in Computational Visual Perception-Driven Image Analysis

R. Nandhini Abirami; P. M. Durai Raj Vincent; Kathiravan Srinivasan; Usman Tariq; Chuan-Yu Chang

doi:10.1155/2021/5541134

Complexity (Jan 2021)

Affiliations

R. Nandhini Abirami: School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632014, India
P. M. Durai Raj Vincent: School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632014, India
Kathiravan Srinivasan: School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632014, India
Usman Tariq: College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
Chuan-Yu Chang: Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Yunlin 64002, Taiwan

DOI: https://doi.org/10.1155/2021/5541134
Journal volume & issue: Vol. 2021

Abstract

Computational visual perception, also known as computer vision, is a field of artificial intelligence that enables computers to process digital images and videos in a way similar to biological vision. It involves developing methods that replicate the capabilities of biological vision. The goal of computer vision is to surpass biological vision in extracting useful information from visual data. The massive volume of data generated today is one of the driving factors behind the tremendous growth of computer vision. This survey gives an overview of existing applications of deep learning in computational visual perception. It explores deep learning techniques adapted to solve computer vision problems using deep convolutional neural networks and deep generative adversarial networks. The pitfalls of deep learning and their solutions, namely dropout and data augmentation, are briefly discussed. The results show a significant improvement in accuracy when dropout and data augmentation are used. Applications of deep convolutional neural networks, namely image classification, localization and detection, document analysis, and speech recognition, are discussed in detail. Applications of deep generative adversarial networks, namely image-to-image translation, image denoising, face aging, and facial attribute editing, are analyzed in depth. The deep generative adversarial network is an unsupervised learning method, but adding a certain number of labels in practical applications can improve its generating ability. Acquiring many data labels is challenging, whereas a small number can be acquired. Therefore, combining semisupervised learning with generative adversarial networks is one of the future directions.
This article surveys the recent developments in this direction, provides a critical review of the related significant aspects, and investigates the current opportunities and future challenges in emerging domains such as handwriting recognition, semantic mapping, webcam-based eye trackers, lumen center detection, query-by-string word, intermittently closed and open lakes and lagoons, and landslides.

Published in Complexity

ISSN: 1076-2787 (Print); 1099-0526 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://onlinelibrary.wiley.com/journal/8503