IEEE Access (Jan 2024)
Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers
Abstract
The launch of ChatGPT in 2022 garnered global attention, marking a significant milestone in the Generative Artificial Intelligence (GAI) field. While GAI has been in effect for the past decade, the introduction of ChatGPT sparked a new wave of research and innovation in the Artificial Intelligence (AI) domain. This surge has led to the development and release of numerous cutting-edge tools, such as Bard, Stable Diffusion, DALL-E, Make-A-Video, Runway ML, and Jukebox, among others. These tools exhibit remarkable capabilities, encompassing tasks ranging from text generation and music composition, image creation, video production, code generation, and even scientific work. They are built upon various state-of-the-art models, including Stable Diffusion, transformer models like GPT-3 (recent GPT-4), variational autoencoders, and generative adversarial networks. This advancement in GAI presents a wealth of exciting opportunities across various sectors, such as business, healthcare, education, entertainment, and media. However, concurrently, it poses unprecedented challenges such as impersonation, job displacement, privacy breaches, security vulnerabilities, and misinformation. To addressing these challenges requires a new direction for research to develop solutions and refine existing products. In our endeavor to contribute profound insights to society and advance research on GAI, we present a comprehensive journal which explores the theoretical and mathematical foundations of GAI state-of-the-art models, exploring the diverse spectrum of tasks they can perform, examining the challenges they entail, and discussing the promising prospects for the future of GAI.
Keywords