Scientific Reports (Jan 2024)
New marker for chronic kidney disease progression and mortality in medical-word virtual space
Abstract
A new marker that reflects the pathophysiology of chronic kidney disease (CKD) has long been sought for CKD therapy. In this study, we developed a virtual space in which data expressed as medical words and data from actual CKD patients were unified by natural language processing and category theory. The virtual space of medical words was constructed from the CKD-related literature (n = 165,271) using Word2Vec, yielding a network of 106,612 words. The network supported vector calculations and retained the meanings of the medical words. Data from CKD patients in a 3-year cohort study (n = 26,433) were mapped into the network as medical-word vectors. We defined the marker as the inner product relating the patient-data vectors to the outcome (dialysis or death). The inner product predicted the outcomes accurately, with a C-statistic of 0.911 (95% CI 0.897, 0.924). Cox proportional hazards models showed that the risk of the outcomes in the high-inner-product group was 21.92 times higher (95% CI 14.77, 32.51) than that in the low-inner-product group. This study showed that CKD patients can be treated as a network of medical words that reflects the pathophysiological condition of CKD and the risks of CKD progression and mortality.
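The idea of an inner-product marker can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the study's actual pipeline: the three-dimensional toy embeddings, the word list, the choice to represent a patient as the mean of their word vectors, the use of a single outcome word ("dialysis"), and the helper names `patient_vector`, `inner_product_marker`, and `c_statistic`. The C-statistic below is the simple binary-outcome version and ignores follow-up time and censoring.

```python
import numpy as np

# Hypothetical toy embeddings; the study's Word2Vec space is not reproduced here.
EMBEDDINGS = {
    "dialysis": np.array([0.9, 0.1, 0.0]),
    "proteinuria": np.array([0.8, 0.2, 0.1]),
    "anemia": np.array([0.6, 0.3, 0.2]),
    "normotension": np.array([-0.5, 0.7, 0.1]),
    "exercise": np.array([-0.7, 0.4, 0.3]),
}


def patient_vector(words):
    """Assumption: a patient is the mean of their medical-word vectors."""
    return np.mean([EMBEDDINGS[w] for w in words], axis=0)


def inner_product_marker(words, outcome_word="dialysis"):
    """Inner product between the patient vector and the outcome word vector."""
    return float(np.dot(patient_vector(words), EMBEDDINGS[outcome_word]))


def c_statistic(scores, events):
    """Binary-outcome concordance: share of (event, non-event) pairs in which
    the event case has the higher score; tied scores count as one half."""
    scores = np.asarray(scores, dtype=float)
    events = np.asarray(events, dtype=bool)
    diff = scores[events][:, None] - scores[~events][None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


# Four made-up patients and made-up outcomes (True = dialysis or death).
patients = [
    ["proteinuria", "anemia"],
    ["normotension", "exercise"],
    ["proteinuria", "exercise"],
    ["anemia", "normotension"],
]
events = [True, False, True, False]

scores = [inner_product_marker(p) for p in patients]
print(scores)
print(c_statistic(scores, events))
```

In this toy setting, patients whose words lie close to the outcome word receive larger inner products, and the C-statistic measures how well that ordering separates patients with and without the outcome.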