An Empirical Study on Ensemble of Segmentation Approaches

Loris Nanni; Alessandra Lumini; Andrea Loreggia; Alberto Formaggio; Daniela Cuza

doi:10.3390/signals3020022

Signals (Jun 2022)

Affiliations

Loris Nanni: Department of Information Engineering (DEI), University of Padua, Viale Gradenigo 6, 35131 Padova, Italy
Alessandra Lumini: Department of Computer Science and Engineering (DISI), Università di Bologna, Via dell’Università 50, 47521 Cesena, Italy
Andrea Loreggia: Department of Information Engineering (DII), Università di Brescia, Via Branze 38, 25123 Brescia, Italy
Alberto Formaggio: Department of Information Engineering (DEI), University of Padua, Viale Gradenigo 6, 35131 Padova, Italy
Daniela Cuza: Department of Information Engineering (DEI), University of Padua, Viale Gradenigo 6, 35131 Padova, Italy

DOI: https://doi.org/10.3390/signals3020022
Journal volume & issue: Vol. 3, no. 2
pp. 341 – 358

Abstract

Recognizing objects in images requires complex skills that involve knowledge about the context and the ability to identify the borders of the objects. In computer vision, this task is called semantic segmentation, and it consists of classifying each pixel in an image. The task is of primary importance in many real-life scenarios: in autonomous vehicles, it allows the identification of objects surrounding the vehicle; in medical diagnosis, it improves the early detection of dangerous pathologies and thus mitigates the risk of serious consequences. In this work, we propose a new ensemble method for semantic segmentation. The model is based on convolutional neural networks (CNNs) and transformers. An ensemble uses many different models whose predictions are aggregated to form the output of the ensemble system. The performance and quality of the ensemble prediction depend strongly on several factors; one of the most important is the diversity among the individual models. In our approach, this diversity is enforced by adopting different loss functions and testing different data augmentations. We developed the proposed method by combining DeepLabV3+, HarDNet-MSEG, and Pyramid Vision Transformers. The developed solution was then assessed through an extensive empirical evaluation in five different scenarios: polyp detection, skin detection, leukocyte recognition, environmental microorganism detection, and butterfly recognition. The model provides state-of-the-art results.
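
To illustrate how the predictions of several segmentation models might be aggregated into a single ensemble output, the following is a minimal sketch. The abstract does not state the fusion rule, so this sketch assumes a simple average of per-pixel foreground probabilities followed by a fixed threshold. The function name `ensemble_average`, the `threshold` parameter, and the toy probability maps are hypothetical and are not taken from the paper.

```python
from typing import List

ProbMap = List[List[float]]  # per-pixel foreground probability from one model
Mask = List[List[int]]       # binary segmentation mask (1 = foreground)


def ensemble_average(prob_maps: List[ProbMap], threshold: float = 0.5) -> Mask:
    """Fuse per-pixel probabilities by averaging, then threshold.

    Assumption: averaging is used here only for illustration; the
    source abstract does not specify how predictions are combined.
    """
    if not prob_maps:
        raise ValueError("need at least one probability map")
    n_models = len(prob_maps)
    height, width = len(prob_maps[0]), len(prob_maps[0][0])
    mask: Mask = []
    for r in range(height):
        row = []
        for c in range(width):
            # Mean foreground probability across all models for this pixel.
            mean_p = sum(pm[r][c] for pm in prob_maps) / n_models
            row.append(1 if mean_p >= threshold else 0)
        mask.append(row)
    return mask


# Hypothetical outputs of three different models on a 2x2 image.
model_a = [[0.9, 0.2], [0.6, 0.1]]
model_b = [[0.8, 0.4], [0.3, 0.2]]
model_c = [[0.7, 0.3], [0.7, 0.0]]

fused = ensemble_average([model_a, model_b, model_c])
print(fused)
```

In this toy case the bottom-left pixel is labeled foreground by the ensemble even though one model scores it below the threshold, which shows how diverse models can compensate for each other's individual errors.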

Published in Signals

ISSN: 2624-6120 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Applied mathematics. Quantitative methods
Website: https://www.mdpi.com/journal/signals
