Scientific Reports (Oct 2023)
Unsupervised machine learning method for indirect estimation of reference intervals for chronic kidney disease in the Puerto Rican population
Abstract
Abstract Reference intervals (RIs) for clinical laboratory values are extremely important for diagnostics and treatment of patients. However, the determination of these ranges is costly and time-consuming. As a result, often different unverified RIs are used in practice for the same analyte and the same range is used for all patients despite evidence that the values are gender, age, and ethnicity dependent. Moreover, the abnormal flags are rudimentary, merely indicating if a value is within the RI. At the same time, clinical lab data generated in the everyday medical practice contains a wealth of information, that given the correct methodology, can help determine the RIs for each specific segment of the population, including populations that suffer from health disparities. In this work, we develop unsupervised machine learning methods, based on Gaussian mixtures, to determine RIs of analytes related to chronic kidney disease, using millions of routine lab results for the Puerto Rican population. We show that the measures are both gender and age dependent and we find evidence for normal age-related organ function deterioration and failure. We also show that the joint distribution of measures improves the diagnostic value of the lab results.