Mathematical Biosciences and Engineering (May 2021)
Mathematical modeling and mining real-world Big education datasets with application to curriculum mapping
Abstract
This paper proposes an approach for modeling and mining curriculum Big data from real-world education datasets crawled from Australian university websites. It addresses the scenario of giving a student a study plan for completing a course by accumulating credits on top of subjects he or she has already completed. One challenge is that similar subjects offered by different universities can hinder the construction of a reasonable, time-saving learning path, because the student may be unable to distinguish them without intensive research on all subjects related to the degree at those universities. We use concept graph-based learning techniques and discuss the data representations and techniques that are better suited to large datasets. We created a ground truth of subject relations and subject descriptions with Bag-of-Words representations based on natural language processing. The generated ground truth was used to train a model that summarizes a subject network and a concept graph, where the concepts are automatically extracted from the subject descriptions across all the universities. The practical challenges of collecting and extracting the data from the university websites are also discussed. The work was validated on nineteen real-world education datasets crawled from Australian university websites and showed good performance.
Keywords