Jisuanji kexue (Jun 2022)

Tri-training Algorithm Based on DECORATE Ensemble Learning and Credibility Assessment

  • WANG Yu-fei, CHEN Wen

DOI
https://doi.org/10.11896/jsjkx.211100043
Journal volume & issue
Vol. 49, no. 6
pp. 127 – 133

Abstract

Tri-training is a disagreement-based semi-supervised learning algorithm that applies semi-supervised learning and ensemble learning mechanisms simultaneously. It can improve model performance by effectively leveraging some labeled samples along with a large number of unlabeled ones through collaboration and iteration among base classifiers. However, when the labeled sample size is insufficient, the initial classifiers generated by Tri-training are not sufficiently trained. Furthermore, mislabeled noisy data may be generated during the collaborative labeling process among the classifiers. To address these problems, a collaborative learning algorithm is proposed that combines DECORATE ensemble learning, diversity measurement, and credibility assessment. In the proposed method, to improve generalization performance, multiple preference classifiers are generated based on DECORATE with differentiated artificial data and labels, and the diversity among the classifiers is measured by Jensen-Shannon divergence so that the selected classifiers are maximally diverse. At the same time, the credibility of the pseudo-labeled samples is assessed during the iterations by a label propagation algorithm to reduce noisy data. Classification experiments on UCI data sets demonstrate that the proposed algorithm achieves higher accuracy and F1-score than the Tri-training algorithm and its improved versions.
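
As a rough illustration of the diversity measure mentioned in the abstract, the following Python sketch computes the Jensen-Shannon divergence between the class-probability outputs of two classifiers and picks the most diverse subset of candidates. It is not the authors' implementation: the function names, the averaging over samples, the exhaustive search over subsets, and the toy probabilities are all assumptions made for illustration only.

```python
import math
from itertools import combinations


def kl_divergence(p, q):
    """KL(p || q) in bits. Terms with p_i == 0 contribute 0.

    Only called below with q being a mixture that includes p,
    so q_i > 0 whenever p_i > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (0..1 in bits)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def pairwise_diversity(probs_a, probs_b):
    """Mean JS divergence between two classifiers' predicted class
    distributions on the same samples (assumed aggregation)."""
    total = sum(js_divergence(p, q) for p, q in zip(probs_a, probs_b))
    return total / len(probs_a)


def most_diverse_subset(outputs, k):
    """Return the k classifier names whose summed pairwise diversity is largest.

    `outputs` maps a (hypothetical) classifier name to a list of
    per-sample class-probability vectors.
    """
    def score(names):
        return sum(pairwise_diversity(outputs[a], outputs[b])
                   for a, b in combinations(names, 2))

    return max(combinations(sorted(outputs), k), key=score)


# Toy predictions for a single sample and two classes (made-up values).
outputs = {
    "a": [[1.0, 0.0]],
    "b": [[0.0, 1.0]],
    "c": [[0.5, 0.5]],
}
chosen = most_diverse_subset(outputs, 2)
```

In this toy case the two classifiers that disagree completely are chosen. The exhaustive search is only practical for a small pool of candidates; a greedy selection would be a natural substitute for larger pools.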

Keywords