Neural Network Assisted Pathology Case Identification

Jerome Cheng

doi:10.1016/j.jpi.2022.100008

Journal of Pathology Informatics (Jan 2022)

Affiliations

Jerome Cheng: Corresponding author at: University of Michigan, NCRC Bldg. 35, Rm 30-1597, 2800 Plymouth Road, Ann Arbor, MI 48109, USA; Department of Pathology, University of Michigan, Ann Arbor, MI, USA

DOI: https://doi.org/10.1016/j.jpi.2022.100008
Journal volume & issue: Vol. 13
p. 100008

Abstract

Background: Traditionally, cases for cohort selection and quality assurance purposes are identified through structured query language (SQL) searches matching specific keywords. Recently, several neural network-based natural language processing (NLP) pipelines have emerged as accurate alternative or complementary methods for case retrieval. Methods: The diagnosis sections of 1000 pathology reports containing the terms “colon” and “carcinoma” were retrieved from our laboratory information system through a SQL query. Each report was labeled as either positive or negative; a case was considered positive if it was a primary adenocarcinoma of the colon. Negative cases comprised adenocarcinomas from other sites, metastatic adenocarcinomas, benign conditions, rectal cancers, and other cases that did not fit the primary colonic adenocarcinoma category. The 1000 cases were randomly separated into training, validation, and holdout sets. A convolutional neural network (CNN) model built using Keras (a neural network library) was trained to identify positive cases, and the model was applied to the holdout set to predict the category of each case. Results: The CNN model correctly classified 141 out of 149 primary colonic adenocarcinoma cases and 43 out of 51 negative cases, achieving an accuracy of 92% and an area under the ROC curve (AUC) of 0.957. Conclusion: Trained convolutional neural network models, by themselves or as an adjunct to keyword and pattern-based text extraction methods, may be used to search for pathology cases of interest with high accuracy.
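The holdout figures in the Results can be checked with a little arithmetic. The following Python sketch rebuilds the confusion-matrix counts from the reported numbers (141 of 149 positive and 43 of 51 negative cases classified correctly) and recomputes the accuracy. The class and function names are hypothetical, chosen only for illustration. Sensitivity and specificity are not stated in the abstract; they are derived here from the same counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HoldoutCounts:
    """Confusion-matrix counts for a binary classifier (hypothetical helper)."""
    true_pos: int   # positive cases classified as positive
    false_neg: int  # positive cases classified as negative
    true_neg: int   # negative cases classified as negative
    false_pos: int  # negative cases classified as positive

    @property
    def total(self) -> int:
        return self.true_pos + self.false_neg + self.true_neg + self.false_pos


def summarize(counts: HoldoutCounts) -> dict:
    """Return accuracy, sensitivity, and specificity as fractions."""
    return {
        "accuracy": (counts.true_pos + counts.true_neg) / counts.total,
        "sensitivity": counts.true_pos / (counts.true_pos + counts.false_neg),
        "specificity": counts.true_neg / (counts.true_neg + counts.false_pos),
    }


# Counts taken from the Results: 141/149 positives and 43/51 negatives correct.
counts = HoldoutCounts(true_pos=141, false_neg=149 - 141,
                       true_neg=43, false_pos=51 - 43)
metrics = summarize(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The recomputed accuracy, (141 + 43) / 200, matches the reported 92%. The AUC of 0.957 cannot be recovered from these counts, since it depends on the per-case prediction scores.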

Published in Journal of Pathology Informatics

ISSN: 2229-5089 (Print); 2153-3539 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Medicine: Pathology
Website: https://www.journals.elsevier.com/journal-of-pathology-informatics
