Using deep learning in pathology image analysis: A novel active learning strategy based on latent representation

Yixin Sun; Lei Wu; Peng Chen; Feng Zhang; Lifeng Xu

doi:10.3934/era.2023271

Electronic Research Archive (Jul 2023)

Affiliations

Yixin Sun: 1. School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China 2. Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 314099, China
Lei Wu: 1. School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China 2. Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 314099, China
Peng Chen: 3. School of Computer and Software Engineering, Xihua University, Chengdu 611731, China
Feng Zhang: 4. The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou 324000, China
Lifeng Xu: 4. The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou 324000, China

DOI: https://doi.org/10.3934/era.2023271
Journal volume & issue: Vol. 31, no. 9
pp. 5340 – 5361

Abstract

Most countries worldwide continue to face a shortage of pathologists, which significantly impedes the timely diagnosis and effective treatment of cancer patients. Deep learning techniques have performed remarkably well in pathology image analysis; however, they require expert pathologists to annotate large amounts of pathology image data. This study aims to minimize the data annotation needed for pathology image analysis. Active learning (AL) is an iterative approach that searches for a small number of high-quality samples with which to train a model. We propose an active learning framework that first learns latent representations of all pathology images with an auto-encoder in order to train a binary classification model, and then selects samples through a novel ALHS (Active Learning Hybrid Sampling) strategy. This strategy effectively alleviates the sample redundancy problem and allows more informative and diverse examples to be selected. We validate the effectiveness of our method on classification tasks using two cancer pathology image datasets. We achieve the target performance of 90% accuracy using 25% labeled samples on Kather's dataset and reach 88% accuracy using 65% labeled data on the BreakHis dataset, which means our method can save 75% and 35% of the annotation budget on the two datasets, respectively.
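
The abstract does not spell out how ALHS scores samples, so the following is only a minimal sketch of the general idea of hybrid sampling, not the authors' algorithm. It assumes latent vectors (for example, from an auto-encoder) and positive-class probabilities from a binary classifier are already available. The function names `binary_entropy` and `hybrid_select` and the parameter `pool_factor` are hypothetical. The sketch first keeps the most uncertain unlabeled samples (informativeness), then spreads the final picks across latent space with greedy farthest-point selection (diversity), which is one common way to reduce redundancy among selected samples.

```python
import numpy as np


def binary_entropy(p):
    """Entropy of positive-class probabilities (higher means more uncertain)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))


def hybrid_select(latents, probs, unlabeled_idx, k, pool_factor=2):
    """Pick up to k unlabeled samples that are uncertain and mutually distant.

    Hypothetical stand-in for a hybrid sampling strategy; not the paper's ALHS.
    """
    unlabeled_idx = np.asarray(unlabeled_idx)
    if k <= 0 or len(unlabeled_idx) == 0:
        return []
    # Step 1 (informativeness): keep the k * pool_factor most uncertain samples.
    scores = binary_entropy(probs[unlabeled_idx])
    pool_size = min(len(unlabeled_idx), k * pool_factor)
    pool = unlabeled_idx[np.argsort(-scores, kind="stable")[:pool_size]]
    # Step 2 (diversity): greedy farthest-point selection inside the pool,
    # starting from the single most uncertain sample.
    chosen = [pool[0]]
    min_dist = np.linalg.norm(latents[pool] - latents[pool[0]], axis=1)
    min_dist[0] = -np.inf  # never pick the same sample twice
    while len(chosen) < min(k, len(pool)):
        nxt = int(np.argmax(min_dist))
        chosen.append(pool[nxt])
        dist = np.linalg.norm(latents[pool] - latents[pool[nxt]], axis=1)
        min_dist = np.minimum(min_dist, dist)
        min_dist[nxt] = -np.inf
    return [int(i) for i in chosen]


# Toy usage: samples 0 and 1 are near-duplicates in latent space.
latents = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [1.0, 1.0], [9.0, 9.0]])
probs = np.array([0.50, 0.52, 0.45, 0.60, 0.99])
picked = hybrid_select(latents, probs, [0, 1, 2, 3, 4], k=2)
```

In this toy example, ranking by uncertainty alone would return the two near-duplicate samples 0 and 1, whereas the hybrid rule keeps sample 0 and replaces sample 1 with a more distant, still uncertain sample. In a full active learning loop, the selected samples would be sent for annotation, added to the labeled set, and the classifier retrained before the next round.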

Published in Electronic Research Archive

ISSN: 2688-1594 (Online)
Publisher: AIMS Press
Country of publisher: United States
LCC subjects: Science: Mathematics; Technology: Technology (General): Industrial engineering. Management engineering: Applied mathematics. Quantitative methods
Website: https://www.aimspress.com/journal/era
