Erroneous pixel prediction for semantic image segmentation

Lixue Gong; Yiqun Zhang; Yunke Zhang; Yin Yang; Weiwei Xu

doi:10.1007/s41095-021-0235-7

Computational Visual Media (Oct 2021)

Affiliations

Lixue Gong: State Key Lab of CAD&CG, Zhejiang University
Yiqun Zhang: State Key Lab of CAD&CG, Zhejiang University
Yunke Zhang: State Key Lab of CAD&CG, Zhejiang University
Yin Yang: School of Computing, Clemson University
Weiwei Xu: State Key Lab of CAD&CG, Zhejiang University

Journal volume & issue: Vol. 8, no. 1, pp. 165–175

Abstract

We consider semantic image segmentation. Our method is inspired by Bayesian deep learning, which improves segmentation accuracy by modeling the uncertainty of the network output. Rather than modeling uncertainty, our method directly learns to predict the erroneous pixels of a segmentation network, which we formulate as a binary classification problem. This speeds up training compared with the Monte Carlo integration often used in Bayesian deep learning, and it also allows us to train a branch to correct the labels of erroneous pixels. Our method consists of three stages: (i) predict the pixel-wise error probability of the initial result, (ii) redetermine labels for pixels with high error probability, and (iii) fuse the initial result and the redetermined result according to the error probability. An error-prediction branch in the network predicts the pixel-wise error probabilities, and a detail branch focuses the training process on the erroneous pixels. We have experimentally validated our method on the Cityscapes and ADE20K datasets. Our model can be easily added to various advanced segmentation networks to improve their performance. Taking DeepLabv3+ as an example, our network achieves 82.88% mIoU on the Cityscapes test set and 45.73% mIoU on the ADE20K validation set, improving on the corresponding DeepLabv3+ results by 0.74% and 0.13%, respectively.
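The three stages above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the exact fusion rule, so the linear blend weighted by error probability is an assumption, and the function names `error_target` and `fuse_predictions` are hypothetical. The first function shows how a binary training target for an error-prediction branch could be derived from an initial prediction and the ground truth.

```python
import numpy as np

def error_target(pred_labels, gt_labels):
    """Binary target for error prediction: 1 where the initial label is wrong.

    Hypothetical helper; both inputs have shape (H, W).
    """
    return (pred_labels != gt_labels).astype(np.float32)

def fuse_predictions(initial_probs, refined_probs, error_prob):
    """Blend two per-pixel class distributions by the error probability.

    Assumed fusion rule (not taken from the paper):
        fused = (1 - e) * initial + e * refined
    initial_probs, refined_probs: shape (H, W, C); error_prob: shape (H, W).
    Returns the per-pixel argmax label map of shape (H, W).
    """
    w = error_prob[..., None]  # add a class axis so the weight broadcasts
    fused = (1.0 - w) * initial_probs + w * refined_probs
    return fused.argmax(axis=-1)

# Toy 1x2 image with two classes.
initial = np.array([[[0.6, 0.4], [0.9, 0.1]]])   # stage (i) input: initial result
refined = np.array([[[0.1, 0.9], [0.2, 0.8]]])   # stage (ii): redetermined result
error = np.array([[0.8, 0.1]])                   # predicted error probability
labels = fuse_predictions(initial, refined, error)  # stage (iii): fusion

# Training target for the error branch, given a ground-truth label map.
gt = np.array([[1, 0]])
target = error_target(initial.argmax(axis=-1), gt)
```

In this toy case the first pixel has a high error probability, so the fused label follows the redetermined result, while the second pixel keeps its initial label.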

Published in Computational Visual Media

ISSN: 2096-0433 (Print); 2096-0662 (Online)
Publisher: SpringerOpen
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://www.springer.com/41095
