Alexandria Engineering Journal (Dec 2024)
Integrating IoT and visual question answering in smart cities: Enhancing educational outcomes
Abstract
Emerging as a paradigmatic shift in urban development, smart cities harness the potential of advanced information and communication technologies to seamlessly integrate urban functions, optimize resource allocation, and improve the effectiveness of city management. Within the domain of smart education, the imperative application of Visual Question Answering (VQA) technology encounters significant limitations at the prevailing stage, particularly the absence of a robust Internet of Things (IoT) framework and the inadequate incorporation of large pre-trained language models (LLMs) within contemporary smart education paradigms, especially in addressing zero-shot VQA scenarios, which pose considerable challenges. In response to these constraints, this paper introduces an IoT-based smart city framework that is designed to refine the functionality and efficacy of educational systems. This framework is delineated into four cardinal layers: the data collection layer, data transmission layer, data management layer, and application layer. Furthermore, we introduce the innovative TeachVQA methodology at the application layer, synergizing VQA technology with extensive pre-trained language models, thereby considerably enhancing the dissemination and assimilation of educational content. Evaluative metrics in the VQAv2 and OKVQA datasets substantiate that the TeachVQA methodology not only outperforms existing VQA approaches, but also underscores its profound potential and practical relevance in the educational sector.