Scientific Reports (Aug 2024)
Indirect reference interval estimation using a convolutional neural network with application to cancer antigen 125
Abstract
Indirect methods for reference interval (RI) estimation use data acquired from routine pathology testing. They have the potential to accelerate the establishment of RIs that account for variables such as gender and age, and so improve clinical assessments. However, they require more sophisticated methods of analysis because raw clinical datasets can include results from pathological patients. In this paper we develop a novel convolutional neural network (CNN) model, trained on synthetic data, to identify underlying healthy distributions within pathological admixtures. We present both the methodology to generate synthetic data and the CNN model. We evaluate the CNN using two synthetic test datasets, including samples from a proposed benchmark for indirect methods (RIBench), and show significant improvements compared to refineR, the reported state-of-the-art method on that benchmark. We also demonstrate a real-world application of the model, estimating age-specific RIs for cancer antigen 125 (CA-125), a crucial biomarker for ovarian cancer diagnostics. Our results show that CA-125 RIs are strongly age-dependent, which could have important diagnostic consequences.
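To make the idea of a healthy distribution hidden inside a pathological admixture concrete, the following is a minimal sketch and not the paper's data generator. It assumes log-normal components, a 20% pathological fraction, and the conventional central 95% definition of an RI (2.5th to 97.5th percentiles of the healthy population). All function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_admixture(n, patho_fraction, rng, mu=3.0, sigma=0.4, patho_shift=1.5):
    """Draw n test results: a log-normal 'healthy' component mixed with a
    right-shifted log-normal 'pathological' component (illustrative values)."""
    n_patho = int(round(n * patho_fraction))
    healthy = rng.lognormal(mean=mu, sigma=sigma, size=n - n_patho)
    patho = rng.lognormal(mean=mu + patho_shift, sigma=sigma, size=n_patho)
    return np.concatenate([healthy, patho])

def true_reference_interval(mu, sigma):
    """Central 95% interval of the healthy log-normal component, in closed form."""
    z = 1.959964  # standard normal quantile for 97.5%
    return float(np.exp(mu - z * sigma)), float(np.exp(mu + z * sigma))

def naive_reference_interval(values):
    """Percentiles taken directly on the mixed data, ignoring contamination."""
    lower, upper = np.percentile(values, [2.5, 97.5])
    return float(lower), float(upper)

rng = np.random.default_rng(0)
sample = make_admixture(10_000, patho_fraction=0.2, rng=rng)
true_ri = true_reference_interval(mu=3.0, sigma=0.4)
naive_ri = naive_reference_interval(sample)

print("true healthy RI:", true_ri)
print("naive RI on admixture:", naive_ri)
```

In this toy setup the naive upper limit lies well above the healthy upper limit, because the pathological results dominate the upper tail of the mixture. An indirect method aims to recover the healthy interval from the mixed sample alone.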