Jisuanji kexue (Jan 2022)

Multivariate Regression Forest for Categorical Attribute Data

  • LIU Zhen-yu, SONG Xiao-ying

DOI
https://doi.org/10.11896/jsjkx.201200189
Journal volume & issue
Vol. 49, no. 1
pp. 108 – 114

Abstract

As categorical attributes cannot be used directly in some regression models, such as linear regression, SVR, and most multivariate regression trees, this paper proposes a multivariate split method that handles multiple types of data. We define the centers of sample sets on categorical attributes and the distances from samples to those centers, so that categorical attributes can participate in the clustering process in the same way as numerical attributes. A suitable ensemble scheme is then selected for the decision trees generated by this method, yielding an ensemble called the cluster regression forest (CRF). Finally, we compare CRF with 9 other regression models in terms of mean absolute error (MAE) and root mean square error (RMSE) on 12 public UCI data sets. The experimental results show that CRF has the best performance among the 10 regression models.
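
The abstract does not give the exact definitions of the categorical center or the sample-to-center distance. As a rough illustration only, the sketch below assumes one common choice: the center of a sample set on a categorical attribute is the relative frequency of each category, and the distance from a sample to that center is one minus the frequency of the sample's category. The function names, the unweighted sum in mixed_distance, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def categorical_center(values):
    """Assumed 'center' of a sample set on one categorical attribute:
    the relative frequency of each category."""
    counts = Counter(values)
    n = len(values)
    return {category: count / n for category, count in counts.items()}


def categorical_distance(value, center):
    """Assumed distance from a sample to the center: 1 minus the
    frequency of the sample's category (1.0 for unseen categories)."""
    return 1.0 - center.get(value, 0.0)


def mixed_distance(sample, num_centers, cat_centers):
    """Combine squared numerical distances with categorical distances.

    num_centers maps attribute index -> mean value (numerical attributes).
    cat_centers maps attribute index -> frequency dict (categorical attributes).
    The unweighted sum is an illustrative assumption.
    """
    d_num = sum((sample[i] - mean) ** 2 for i, mean in num_centers.items())
    d_cat = sum(categorical_distance(sample[j], c) for j, c in cat_centers.items())
    return d_num + d_cat


# Toy sample set: one categorical attribute with four observations.
colors = ["red", "red", "blue", "green"]
center = categorical_center(colors)

# A sample with one numerical attribute (index 0) and one categorical (index 1).
distance = mixed_distance((1.0, "red"), {0: 2.0}, {1: center})
```

With a distance of this kind, a clustering step such as k-means-style assignment could treat categorical and numerical attributes uniformly, which is the role the abstract describes for the centers and distances.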

Keywords