SoftwareX (Jul 2021)
ImbTreeAUC: An R package for building classification trees using the area under the ROC curve (AUC) on imbalanced datasets
Abstract
In this paper, we propose ImbTreeAUC, a novel R package for building binary and multiclass decision trees using the area under the receiver operating characteristic (ROC) curve. The package provides nonstandard measures for selecting both the optimal split point of an attribute and the optimal attribute for splitting, through the application of local, semiglobal and global AUC measures. Additionally, ImbTreeAUC can handle imbalanced data, which is a challenging issue in many practical applications. The package supports cost-sensitive learning through a user-defined misclassification cost matrix, as well as weight-sensitive learning. It accepts all types of attributes, including continuous, ordered and nominal attributes. The package and its code are made freely available.
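To make the idea of AUC-driven split selection concrete, the following is a minimal sketch in Python of how a split point on one continuous attribute could be chosen by maximizing AUC in the binary case. It is not the ImbTreeAUC API and does not reproduce the package's local, semiglobal or global measures. The function names (`auc`, `split_auc`, `best_split`) and the toy data are hypothetical, and the scoring rule (each side of the split is scored by its share of positive samples) is an assumption made for illustration.

```python
def auc(scores, labels):
    """AUC via the Mann-Whitney statistic; tied scores count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split_auc(values, labels, threshold):
    """AUC of a two-leaf split; each leaf is scored by its positive rate.

    Assumption for illustration only: this is one simple way to turn a
    candidate split into a ranking, not the package's definition.
    """
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.5  # a degenerate split ranks nothing
    p_left = sum(left) / len(left)
    p_right = sum(right) / len(right)
    scores = [p_left if x <= threshold else p_right for x in values]
    return auc(scores, labels)


def best_split(values, labels):
    """Return (threshold, AUC) for the midpoint candidate with the highest AUC."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    scored = [(t, split_auc(values, labels, t)) for t in candidates]
    return max(scored, key=lambda pair: pair[1])


# Toy imbalanced data: 2 positives among 8 samples (hypothetical values).
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 0, 0, 1, 0, 1]
threshold, score = best_split(values, labels)
print(threshold, round(score, 4))
```

On this toy data the sketch selects the threshold 5.5. Because AUC depends on how positives are ranked against negatives rather than on raw accuracy, a criterion of this kind does not simply favor the majority class.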