A Chi-MIC Based Adaptive Multi-Branch Decision Tree
Jiahao Ye,
Jingjing Yang,
Jiang Yu,
Siqiao Tan,
Feng Luo,
Zheming Yuan,
Yuan Chen
Affiliations
Jiahao Ye
Hunan Engineering and Technology Research Center for Agricultural Big Data Analysis and Decision-Making, Hunan Agricultural University, Changsha, China
Jingjing Yang
Hunan Engineering and Technology Research Center for Agricultural Big Data Analysis and Decision-Making, Hunan Agricultural University, Changsha, China
Jiang Yu
Hunan Engineering and Technology Research Center for Agricultural Big Data Analysis and Decision-Making, Hunan Agricultural University, Changsha, China
Siqiao Tan
Department of Information Intelligence, Hunan Agricultural University, Changsha, China
Hunan Engineering and Technology Research Center for Agricultural Big Data Analysis and Decision-Making, Hunan Agricultural University, Changsha, China
Hunan Engineering and Technology Research Center for Agricultural Big Data Analysis and Decision-Making, Hunan Agricultural University, Changsha, China
Decision trees (DTs) have an advantage over “black-box” models, such as neural networks or support vector machines, in terms of comprehensibility, which makes them worth optimizing further. Node-splitting measures and pruning methods are among the primary techniques for improving the generalization ability of DTs. Here, we introduce unequal-interval optimization for node splitting and a local chi-square test for tree pruning. We name the resulting method the Chi-MIC based adaptive multi-branch decision tree (CMDT). Eleven benchmark data sets of different scales were chosen from the UCI Machine Learning Repository and used, together with 12 comparison classifiers, to evaluate the CMDT algorithm. The results show that CMDT can be more reliable than the twelve comparative approaches, especially on imbalanced data sets. We also discuss the performance metrics and the weighted decision-making table for imbalanced data sets. The CMDT algorithm is available at https://github.com/chenyuan0510/CMDT.
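To make the idea of chi-square based pruning concrete, the following is a minimal sketch of a generic Pearson chi-square test applied to the branch-by-class contingency table of a single node. It is not taken from the CMDT repository: the function names `chi_square_statistic` and `keep_split`, the table layout (one row per branch, one column per class), and the rule of comparing against a caller-supplied critical value are all assumptions made for illustration, and the local test used by CMDT may differ in its details.

```python
def chi_square_statistic(table):
    """Return (chi2, dof) for a contingency table given as a list of rows.

    Assumed layout (hypothetical): each row is one branch of a candidate
    split, each column is one class, and each cell is a sample count.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        return 0.0, 0

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of branch and class.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    # Empty rows or columns carry no information, so they are not counted.
    rows = sum(1 for total in row_totals if total > 0)
    cols = sum(1 for total in col_totals if total > 0)
    dof = max(rows - 1, 0) * max(cols - 1, 0)
    return chi2, dof


def keep_split(table, critical_value):
    """Keep the split only if the statistic exceeds the critical value.

    The caller chooses critical_value for the table's degrees of freedom
    and significance level, e.g. 3.841 for dof = 1 at the 0.05 level.
    """
    chi2, _ = chi_square_statistic(table)
    return chi2 > critical_value


if __name__ == "__main__":
    separating = [[10, 0], [0, 10]]     # branches separate the classes
    uninformative = [[5, 5], [5, 5]]    # branches mirror the parent node
    print(chi_square_statistic(separating), keep_split(separating, 3.841))
    print(chi_square_statistic(uninformative), keep_split(uninformative, 3.841))
```

Under this sketch, a split whose branches have the same class distribution as the parent node yields a statistic near zero and would be pruned, while a split that separates the classes yields a large statistic and would be kept.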