IEEE Access (Jan 2019)
Ensemble Learners of Multiple Deep CNNs for Pulmonary Nodules Classification Using CT Images
Abstract
Various deep convolutional neural networks (CNNs) have been used to distinguish between benign and malignant pulmonary nodules in CT images. However, a single learner often performs unsatisfactorily because of a limited hypothesis space, convergence to a local minimum, or a poorly chosen hypothesis space. To tackle these issues, we propose building ensemble learners by fusing multiple deep CNN learners for pulmonary nodule classification. CT image patches of 743 nodules are extracted from the LIDC-IDRI database. First, eight deep CNN learners with different architectures are trained and evaluated by 10-fold cross-validation, so that each nodule has eight predictions, one from each primary learner. Second, we fuse these eight predictions by majority voting (VOT), averaging (AVE), or machine learning. Specifically, the following machine learning algorithms are implemented: K-Nearest-Neighbor (KNN), Support Vector Machines (SVM), Naive Bayes (NB), Decision Trees (DT), Multi-layer Perceptron (MLP), Random Forests (RF), Gradient Boosting Regression Trees (GBRT), and Adaptive Boosting (AdaBoost). Moreover, the correlation coefficients between the predictions of the 10 ensemble learners are calculated, and a hierarchical clustering dendrogram is drawn. The ensemble learners achieve higher prediction accuracy than a single CNN learner (84.0% vs 81.7%). The overlap ratio among the 10 ensemble learners is much higher than that among the 8 primary learners (62.9% vs 33.2%). In addition, the ensemble learners fall roughly into three categories: the first (SVM, MLP, GBRT, and RF) achieves the best performance; the second (VOT and AVE) performs better than the third (AdaBoost, DT, NB, and KNN). VOT and AVE yield higher recall than the machine learning algorithms.
These results indicate that ensemble learners built on multiple CNN learners can achieve better performance for pulmonary nodule classification using CT images, and that the preferred fusion strategies are SVM, MLP, GBRT, and RF.
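The two simplest fusion strategies named above, VOT and AVE, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each primary learner outputs a malignancy probability in [0, 1], that 0.5 is the decision threshold, and that a 4-4 tie among the eight voters is resolved by a hypothetical `tie_label` parameter; the function names and the example probabilities are likewise illustrative.

```python
from statistics import mean


def majority_vote(probs, threshold=0.5, tie_label=1):
    """Fuse per-learner probabilities by majority voting (VOT).

    Each learner casts one vote: malignant (1) if its probability reaches
    the threshold, benign (0) otherwise. With an even number of learners a
    tie is possible; `tie_label` is an assumed tie-breaking rule.
    """
    malignant_votes = sum(1 for p in probs if p >= threshold)
    benign_votes = len(probs) - malignant_votes
    if malignant_votes == benign_votes:
        return tie_label
    return 1 if malignant_votes > benign_votes else 0


def average_fusion(probs, threshold=0.5):
    """Fuse per-learner probabilities by averaging (AVE).

    The mean probability is thresholded once, so confident learners
    carry more weight than they do under voting.
    """
    return 1 if mean(probs) >= threshold else 0


# Hypothetical predictions of eight primary learners for one nodule.
nodule_probs = [0.9, 0.8, 0.4, 0.3, 0.45, 0.4, 0.95, 0.35]

# Only three of the eight learners vote malignant, so VOT returns 0,
# while the mean probability (0.56875) exceeds 0.5, so AVE returns 1.
print(majority_vote(nodule_probs), average_fusion(nodule_probs))
```

The example shows that the two rules can disagree on the same nodule: voting discards how confident each learner is, whereas averaging keeps it. The machine learning fusion strategies instead treat the eight predictions as a feature vector and train a second-level classifier on it.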
Keywords