The Potential of Visual ChatGPT for Remote Sensing

Lucas Prado Osco; Eduardo Lopes de Lemos; Wesley Nunes Gonçalves; Ana Paula Marques Ramos; José Marcato Junior

doi:10.3390/rs15133232

Remote Sensing (Jun 2023)

Affiliations

Lucas Prado Osco: Faculty of Engineering and Architecture and Urbanism, University of Western São Paulo (UNOESTE), Rod. Raposo Tavares, km 572, Limoeiro, Presidente Prudente 19067-175, Brazil
Eduardo Lopes de Lemos: Faculty of Computing, Federal University of Mato Grosso do Sul (UFMS), Av. Costa e Silva-Pioneiros, Cidade Universitária, Campo Grande 79070-900, Brazil
Wesley Nunes Gonçalves: Faculty of Computing, Federal University of Mato Grosso do Sul (UFMS), Av. Costa e Silva-Pioneiros, Cidade Universitária, Campo Grande 79070-900, Brazil
Ana Paula Marques Ramos: Department of Cartography, São Paulo State University (UNESP), Centro Educacional, R. Roberto Simonsen, 305, Presidente Prudente 19060-900, Brazil
José Marcato Junior: Faculty of Engineering, Architecture and Urbanism and Geography, Federal University of Mato Grosso do Sul (UFMS), Av. Costa e Silva-Pioneiros, Cidade Universitária, Campo Grande 79070-900, Brazil

DOI: https://doi.org/10.3390/rs15133232
Journal volume & issue: Vol. 15, no. 13, p. 3232

Abstract

Recent advancements in Natural Language Processing (NLP), particularly in Large Language Models (LLMs), combined with deep learning-based computer vision techniques, have shown substantial potential for automating a variety of tasks. Models that pair the two are known as Visual LLMs; one notable example is Visual ChatGPT, which combines ChatGPT’s LLM capabilities with visual computation to enable effective image analysis. The ability of these models to process images based on textual inputs can revolutionize diverse fields. Their application in the remote sensing domain remains unexplored, although novel implementations are to be expected. Thus, this is the first paper to examine the potential of Visual ChatGPT, a cutting-edge LLM founded on the GPT architecture, to tackle the aspects of image processing related to the remote sensing domain. Among its current capabilities, Visual ChatGPT can generate textual descriptions of images, perform Canny edge and straight line detection, and conduct image segmentation. These capabilities offer valuable insights into image content and facilitate the interpretation and extraction of information. By exploring the applicability of these techniques on publicly available datasets of satellite images, we demonstrate the current model’s limitations in dealing with remote sensing images, highlighting its challenges and future prospects. Although this approach is still in early development, we believe that the combination of LLMs and visual models holds significant potential to transform remote sensing image processing, creating accessible and practical application opportunities in the field.

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
