Dataset of computer science course queries from students: Categorized and scored according to Bloom's taxonomy

Khandoker Ashik Uz Zaman; Ashraful Islam; Yusuf Mahbubul Islam; Md Abu Sayed

Data in Brief (Apr 2024)

Affiliations

Khandoker Ashik Uz Zaman: Department of Computer Science and Engineering, Independent University Bangladesh, Dhaka 1229, Bangladesh
Ashraful Islam: Department of Computer Science and Engineering, Independent University Bangladesh, Dhaka 1229, Bangladesh; Center for Computational & Data Sciences, Independent University Bangladesh, Dhaka 1229, Bangladesh; Corresponding author.
Yusuf Mahbubul Islam: Department of Computer Science and Engineering, Independent University Bangladesh, Dhaka 1229, Bangladesh
Md Abu Sayed: Department of Computer Science and Engineering, Independent University Bangladesh, Dhaka 1229, Bangladesh

Journal volume & issue: Vol. 53, p. 110109

Abstract

“Why don't students learn?” is a common question that educators try to address. To engage students more fully in the learning process, we believe in fostering their natural curiosity by encouraging them to ask high-level questions. To support this approach, we have compiled a dataset of questions that we hope will aid in training artificial intelligence (AI) models and ultimately improve students' learning experience. We collected anonymous student questions during the Summer 2023 semester through our online application, “Palta Question”, yielding 8,811 unique questions. Each question underwent basic validation using a keyword-based approach, manual categorization by topic and course content, and complexity assessment using Bloom's taxonomy keywords, which are also included in the dataset. To ensure uniqueness, we used the Levenshtein distance algorithm to exclude questions that were highly similar to one another. The dataset provides targeted insights into student inquiry patterns and knowledge gaps in the 'Introduction to Computers and Research' and 'Data Structure' courses at Independent University, Bangladesh (IUB). Although its scope is confined to a specific student group and academic context, which limits broader applicability, it remains valuable for detailed studies in these subjects and serves as a useful foundation for AI-based educational research tools. To demonstrate the dataset's usefulness, we also used it to train an AI model on basic tasks such as sorting questions by course and topic. We envision researchers using it to enhance education and support students' learning.
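The two automated steps named in the abstract, near-duplicate removal with Levenshtein distance and complexity scoring with Bloom's taxonomy keywords, can be illustrated with a minimal sketch. This is not the authors' implementation: the similarity threshold of 0.85, the text normalization, the keyword lists, the 1 to 6 level numbering, and all function names below are assumptions chosen for illustration, since the abstract does not specify them.

```python
import re
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def normalize(question: str) -> str:
    """Lowercase and collapse whitespace (an assumed preprocessing step)."""
    return " ".join(question.lower().split())


def deduplicate(questions: List[str], threshold: float = 0.85) -> List[str]:
    """Keep a question only if it is below the (assumed) similarity
    threshold against every question kept so far."""
    kept: List[str] = []
    for question in questions:
        candidate = normalize(question)
        if all(similarity(candidate, normalize(k)) < threshold for k in kept):
            kept.append(question)
    return kept


# Hypothetical keyword lists; the dataset's actual keywords may differ.
BLOOM_KEYWORDS: Dict[int, Tuple[str, ...]] = {
    1: ("define", "list", "name"),              # Remember
    2: ("explain", "describe", "summarize"),    # Understand
    3: ("apply", "use", "implement"),           # Apply
    4: ("compare", "analyze", "differentiate"), # Analyze
    5: ("evaluate", "justify", "critique"),     # Evaluate
    6: ("design", "create", "develop"),         # Create
}


def bloom_score(question: str) -> int:
    """Return the highest Bloom level whose keyword appears, or 0 if none."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    matched = [level for level, keywords in BLOOM_KEYWORDS.items()
               if words & set(keywords)]
    return max(matched, default=0)
```

The pairwise comparison in `deduplicate` is quadratic in the number of questions, which is workable for a dataset of a few thousand entries but would need blocking or indexing at larger scale.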

Published in Data in Brief

ISSN: 2352-3409 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Science (General)
Website: http://www.journals.elsevier.com/data-in-brief/
