An FAQ dataset for E-learning system used on a Japanese University

Yasunobu Sumikawa; Masaaki Fujiyoshi; Hisashi Hatakeyama; Masahiro Nagai

Data in Brief (Aug 2019)

Affiliations

Yasunobu Sumikawa (corresponding author): Tokyo Metropolitan University, Japan
Masaaki Fujiyoshi: Tokyo Metropolitan University, Japan
Hisashi Hatakeyama: Tokyo Metropolitan University, Japan
Masahiro Nagai: Tokyo Metropolitan University, Japan

Journal volume & issue: Vol. 25

Abstract

In this data article, we present an FAQ dataset written in Japanese, together with its English translation, for training chatbot models for e-learning systems. We first collected raw Q&A data on difficulties reported from April 2015 to July 2018 by users of the e-learning system introduced at Tokyo Metropolitan University. We then divided the data into 11 categories according to the features provided by the e-learning system. Finally, we merged questions that share the same answer to put the data into FAQ form. The dataset contains 427 questions and 79 answers, which were examined by experts who have more than three years of experience using the e-learning system. Using this dataset, we performed statistical analyses to evaluate the quality of the FAQ dataset. The proposed applications of the dataset include not only academic research but also practical activities: for example, translating the dataset from Japanese into another language such as Chinese, adapting or modifying it for another e-learning system, and developing language models to obtain highly accurate responses from chatbots.
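
The step of merging questions that share the same answer can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `build_faq`, the record layout (category, question, answer), the choice to group within a category, and the sample records (including the category names "Login" and "Quiz") are all assumptions made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict


def build_faq(pairs):
    """Group (category, question, answer) records so that each distinct
    answer within a category appears once, with all of its questions.

    Hypothetical helper: the record layout is an assumption, not the
    published dataset's format.
    """
    grouped = defaultdict(list)
    for category, question, answer in pairs:
        # Exact string match after trimming whitespace; real data may
        # need looser matching, which this sketch does not attempt.
        grouped[(category, answer.strip())].append(question.strip())
    return [
        {"category": category, "answer": answer, "questions": questions}
        for (category, answer), questions in grouped.items()
    ]


# Invented sample records, for illustration only.
raw_pairs = [
    ("Login", "I cannot log in.", "Reset your password."),
    ("Login", "My password is rejected.", "Reset your password."),
    ("Quiz", "Where do I find my quiz score?", "Open the grade page."),
]

faq = build_faq(raw_pairs)
# Number of FAQ entries, then total number of questions.
print(len(faq), sum(len(entry["questions"]) for entry in faq))  # prints: 2 3
```

Applied to the real data, a grouping of this kind would be expected to yield many more questions than answers, in line with the 427 questions and 79 answers reported above.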

Published in Data in Brief

ISSN: 2352-3409 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Science (General)
Website: http://www.journals.elsevier.com/data-in-brief/
