Frontiers in Oncology (Oct 2021)

Electronic Medical Records as Input to Predict Postoperative Immediate Remission of Cushing’s Disease: Application of Word Embedding

  • Wentai Zhang,
  • Dongfang Li,
  • Ming Feng,
  • Baotian Hu,
  • Yanghua Fan,
  • Qingcai Chen,
  • Renzhi Wang

DOI
https://doi.org/10.3389/fonc.2021.754882
Journal volume & issue
Vol. 11

Abstract

Background: No existing machine learning (ML)-based models use free text from electronic medical records (EMR) as input to predict immediate remission (IR) of Cushing’s disease (CD) after transsphenoidal surgery.

Purpose: The aim of the present study is to develop an ML-based model that uses EMR, including both structured features and free text, as input to preoperatively predict IR after transsphenoidal surgery.

Methods: A total of 419 patients with CD from Peking Union Medical College Hospital were enrolled between January 2014 and August 2020. The EMR of the patients were embedded and transformed into low-dimensional dense vectors that can be included in four ML-based models together with structured features. The area under the curve (AUC) of receiver operating characteristic curves was used to evaluate the performance of the models.

Results: The overall remission rate of the 419 patients was 75.7%. In multivariate logistic regression analysis, operation (p < 0.001), invasion of cavernous sinus from MRI (p = 0.046), and ACTH (p = 0.024) were significantly associated with IR. The AUC values for the four ML-based models ranged from 0.686 to 0.793. The highest AUC value (0.793) was for logistic regression when 11 structured features and “individual conclusions of the case by doctor” were included.

Conclusion: An ML-based model was developed using both structured and unstructured features (after being processed using a word embedding method) as input to preoperatively predict postoperative IR.
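
The pipeline described in the abstract (embed free text into a dense vector, join it with structured features, score a model by AUC) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how word vectors are pooled, so mean pooling is an assumption here, and the function names (`mean_pool`, `combine`, `roc_auc`), the two-dimensional toy vectors, and the example scores are all hypothetical.

```python
from typing import Dict, List, Sequence


def mean_pool(tokens: Sequence[str], vectors: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the embeddings of the tokens found in `vectors`.

    Assumption: mean pooling; the abstract does not name the pooling method.
    Unknown tokens are skipped; a zero vector is returned if none are known.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def combine(structured: Sequence[float], text_vec: Sequence[float]) -> List[float]:
    """Concatenate structured features with the pooled text vector."""
    return list(structured) + list(text_vec)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy embeddings and a toy clinical note (not study data).
toy_vectors = {"adenoma": [1.0, 0.0], "invasion": [0.0, 1.0]}
note_vec = mean_pool(["adenoma", "invasion", "unseen"], toy_vectors, dim=2)
features = combine([1.0, 0.0], note_vec)  # two made-up structured features

# Hypothetical model scores for four patients (1 = remission, 0 = none).
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In a real setting the combined feature rows would be passed to a classifier such as logistic regression, and the AUC would be computed on held-out patients rather than on made-up scores.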

Keywords