Using Segmentation to Boost Classification Performance and Explainability in CapsNets

Dominik Vranay; Maroš Hliboký; László Kovács; Peter Sinčák

doi:10.3390/make6030068

Machine Learning and Knowledge Extraction (Jun 2024)

Affiliations

Dominik Vranay: Department of Cybernetics and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, 042 52 Košice, Slovakia
Maroš Hliboký: Department of Cybernetics and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, 042 52 Košice, Slovakia
László Kovács: Faculty of Mechanical Engineering and Informatics, University of Miskolc, 3515 Miskolc, Hungary
Peter Sinčák: Department of Cybernetics and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, 042 52 Košice, Slovakia

DOI: https://doi.org/10.3390/make6030068
Journal volume & issue: Vol. 6, no. 3
pp. 1439 – 1465

Abstract

In this paper, we present Combined-CapsNet (C-CapsNet), a novel approach for enhancing the performance and explainability of Capsule Neural Networks (CapsNets) in image classification tasks. Our method integrates segmentation masks as reconstruction targets within the CapsNet architecture. This integration improves feature extraction by focusing the network on significant image parts while reducing the number of parameters required for accurate classification. C-CapsNet combines principles from Efficient-CapsNet and the original CapsNet and introduces several improvements, such as using segmentation masks as reconstruction targets and a number of tweaks to the routing algorithm, which enhance both classification accuracy and interpretability. We evaluated C-CapsNet on the Oxford-IIIT Pet and SIIM-ACR Pneumothorax datasets, achieving mean F1 scores of 93% and 67%, respectively. These results demonstrate a significant performance improvement over traditional CapsNet and CNN models. The method’s effectiveness is further highlighted by its ability to produce clear and interpretable segmentation masks, which can be used to validate the network’s focus during classification. Our findings suggest that C-CapsNet not only improves the accuracy of CapsNets but also enhances their explainability, making them more suitable for real-world applications, particularly in medical imaging.
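
To illustrate the idea of using a segmentation mask as the reconstruction target, the following is a minimal sketch and not the authors' implementation. It assumes the margin loss of the original CapsNet for classification and a mean-squared-error term between the decoder output and the mask; the function names, the `recon_weight` parameter, and its default value are hypothetical and are not taken from the paper.

```python
import numpy as np

def margin_loss(lengths, labels, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss as defined for the original CapsNet (assumed here).

    lengths: (batch, classes) output-capsule lengths in [0, 1]
    labels:  (batch, classes) one-hot class labels
    """
    pos = labels * np.maximum(0.0, m_pos - lengths) ** 2
    neg = lam * (1.0 - labels) * np.maximum(0.0, lengths - m_neg) ** 2
    return float(np.mean(np.sum(pos + neg, axis=1)))

def mask_reconstruction_loss(pred_mask, target_mask):
    """MSE between the decoder output and the segmentation mask.

    The target is the mask rather than the input image; the choice of
    MSE is an assumption made for this sketch.
    """
    return float(np.mean((pred_mask - target_mask) ** 2))

def combined_loss(lengths, labels, pred_mask, target_mask, recon_weight=0.5):
    # recon_weight is a hypothetical balancing factor, not a value from the paper.
    cls = margin_loss(lengths, labels)
    rec = mask_reconstruction_loss(pred_mask, target_mask)
    return cls + recon_weight * rec

# Toy usage: one sample, two classes, a 2x2 mask.
lengths = np.array([[0.4, 0.6]])
labels = np.array([[1.0, 0.0]])
pred_mask = np.zeros((1, 2, 2))
target_mask = np.ones((1, 2, 2))
loss = combined_loss(lengths, labels, pred_mask, target_mask)
```

Because the decoder is trained against the mask, its output can be inspected directly as a segmentation of the region the network relied on, which is the explainability benefit the abstract describes.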

Published in Machine Learning and Knowledge Extraction

ISSN: 2504-4990 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware
Website: https://www.mdpi.com/journal/make
