Knowledge Engineering and Data Science (May 2023)
Optimizing Random Forest Algorithm to Classify Player's Memorisation via In-game Data
Abstract
Assessing a player's knowledge has long been part of educational games. Traditional evaluation conducted during or around a gaming session, however, may disrupt the player's immersion. This research uses an optimized Random Forest (RF) to build a non-invasive predictor of a player's Memorization from in-game data. First, we obtained the dataset from a 3-month survey that recorded the in-game data of 50 players, each of whom played 4-15 stages of Chem Fight (the test-case game). Next, we generated three dataset variants through preprocessing: resampling (SMOTE), normalization (min-max), and a combination of resampling and normalization. We then trained and optimized three RF classifiers to predict the player's Memorization. We chose RF because it generalizes well on high-dimensional data, and we optimized it through a single hyperparameter, n_estimators (the number of trees). We used Grid Search Cross Validation (GSCV) to identify the best value of n_estimators, then used the statistics of the GSCV results to reduce n_estimators further by inspecting the region of interest in the classifiers' performance graphs. Overall, the classifiers fitted with the BEST n_estimators from GSCV (89, 31, 89, and 196 trees) performed well, at around 80% accuracy. We also identified smaller values of n_estimators (OPTIMAL), each at most half the corresponding BEST value, and retrained all classifiers with them (37, 12, 37, and 41 trees). The classifiers' performance remained relatively steady at ~80%, which indicates that we successfully optimized the Random Forest for predicting a player's Memorization in Chem Fight. The automated technique presented in this paper can monitor student interactions and evaluate their abilities from in-game data, and can therefore offer objective data about the skills used.
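The step from BEST to OPTIMAL n_estimators can be illustrated with a minimal sketch. The abstract describes choosing OPTIMAL by inspecting performance graphs; the numeric rule below (take the smallest forest whose mean cross-validation score lies within a tolerance of the top score) is an assumption standing in for that visual inspection, not the authors' procedure. The function name `smallest_near_best`, the `tolerance` parameter, and the scores are hypothetical. With scikit-learn, the two input sequences would typically be read from `GridSearchCV.cv_results_["param_n_estimators"]` and `GridSearchCV.cv_results_["mean_test_score"]`.

```python
from typing import Sequence, Tuple


def smallest_near_best(
    n_estimators: Sequence[int],
    mean_scores: Sequence[float],
    tolerance: float = 0.01,
) -> Tuple[int, int]:
    """Return (best, optimal) tree counts from grid-search statistics.

    best    : the n_estimators value with the highest mean CV score
    optimal : the smallest n_estimators whose mean CV score is within
              `tolerance` of that highest score (an assumed rule)
    """
    if len(n_estimators) != len(mean_scores) or not n_estimators:
        raise ValueError("need two non-empty sequences of equal length")

    pairs = list(zip(n_estimators, mean_scores))
    # Highest score wins; ties go to the smaller forest.
    best_n, best_score = max(pairs, key=lambda p: (p[1], -p[0]))
    # Smallest forest whose score is still close to the best one.
    optimal_n = min(n for n, s in pairs if s >= best_score - tolerance)
    return best_n, optimal_n


# Made-up scores for illustration only; these are not results from the paper.
grid = [10, 20, 40, 80, 160]
scores = [0.74, 0.78, 0.80, 0.81, 0.80]
best, optimal = smallest_near_best(grid, scores, tolerance=0.015)
print(best, optimal)
```

A wider tolerance trades a little accuracy for a smaller, cheaper forest, so the value would need to be chosen with the application's accuracy requirements in mind.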