LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education

Unggi Lee; Minji Jeon; Yunseo Lee; Gyuri Byun; Yoorim Son; Jaeyoon Shin; Hongkyu Ko; Hyeoncheol Kim

Computers and Education: Artificial Intelligence (Dec 2024)

Affiliations

Unggi Lee: Department of Computer Science and Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul, South Korea; Corresponding author.
Minji Jeon: Teaching, Learning and Teacher Education, University of Nebraska-Lincoln, United States
Yunseo Lee: Poongnap Elementary School, Seoul Metropolitan Office of Education, South Korea
Gyuri Byun: Department of Education, Seoul National University, South Korea
Yoorim Son: Interdisciplinary Program in Art Education (Art Education Major), Seoul National University, South Korea
Jaeyoon Shin: Department of Elementary Art Education, Seoul National University of Education, South Korea
Hongkyu Ko: Department of Elementary Art Education, Seoul National University of Education, South Korea; Corresponding author.
Hyeoncheol Kim: Department of Computer Science and Engineering, Korea University, South Korea; Corresponding author.

Journal volume & issue: Vol. 7, p. 100297

Abstract

Although many AI systems have been developed to support learning across domains, AI assistance for art appreciation education has not been extensively explored. Art appreciation, which most students perceive as unfamiliar and challenging, can become more accessible with a generative-AI-enabled conversation partner that asks tailored questions and encourages the audience to appreciate artwork deeply. This study explores the application of multimodal large language models (MLLMs) in art appreciation education, with a focus on developing LLaVA-Docent, a model designed to serve as a personal tutor for art appreciation. Our approach followed design and development research, iteratively refining the application to produce a functional MLLM-enabled chatbot along with a data design framework for art appreciation education. To that end, we built a virtual dialogue dataset generated by GPT-4, which was instrumental in training our MLLM, LLaVA-Docent. We evaluated LLaVA-Docent by benchmarking it against alternative settings, which revealed its distinct strengths and weaknesses. Our findings highlight the efficacy of the MLLM-based personalized art appreciation chatbot and demonstrate its applicability as a novel approach to how art appreciation is taught and experienced.

Published in Computers and Education: Artificial Intelligence

ISSN: 2666-920X (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.journals.elsevier.com/computers-and-education-artificial-intelligence
