Cancer Control (Oct 2023)

RoBERTa-Assisted Outcome Prediction in Ovarian Cancer Cytoreductive Surgery Using Operative Notes

  • Alexandros Laios MD, PhD, PostDoc (Ox)
  • Evangelos Kalampokis PhD
  • Marios Evangelos Mamalis BA, MSc
  • Constantine Tarabanis MD
  • David Nugent MBChB, MRCOG, PhD
  • Amudha Thangavelu MBChB, MRCOG, MD
  • Georgios Theophilou MBBS, MRCOG, MD
  • Diederick De Jong MBBCh, PhD, MSc

DOI
https://doi.org/10.1177/10732748231209892
Journal volume & issue
Vol. 30

Abstract

Introduction Contemporary efforts to predict surgical outcomes focus on the associations between traditional discrete surgical risk factors. We aimed to determine whether natural language processing (NLP) of unstructured operative notes improves the prediction of residual disease in women with advanced epithelial ovarian cancer (EOC) following cytoreductive surgery.
Methods Electronic health records were queried to identify women with advanced EOC, together with their operative notes. The Term Frequency – Inverse Document Frequency (TF-IDF) score was used to quantify how well sequences of words (n-grams) discriminated between cases with and without residual disease. We used a state-of-the-art RoBERTa-based classifier to process the unstructured surgical notes. Discrimination was measured using standard performance metrics. An XGBoost model was then trained on the same dataset using discrete and engineered clinical features together with the probabilities output by the RoBERTa classifier.
Results The cohort consisted of 555 cases of EOC cytoreduction performed by eight surgeons between January 2014 and December 2019. Distinct word clouds, weighted by the difference in n-gram TF-IDF score between R0 and non-R0 resections, were identified. The terms ‘adherent’ and ‘miliary disease’ best discriminated between the two groups. The RoBERTa model achieved an AUROC of .86 and an AUPRC of .87, with precision, recall, and F1 score of .77, and accuracy of .81. It outperformed models that used discrete clinical and engineered features, as well as other state-of-the-art NLP tools. When the probabilities from the RoBERTa classifier were combined with commonly used predictors in the XGBoost model, a marginal improvement in overall performance was observed (AUROC and AUPRC of .91, with all other metrics unchanged).
Conclusion/Implications We applied a sui generis approach to extract information from the abundant textual surgical data and demonstrated that it can be used effectively for classification, outperforming models that rely on conventional structured data. State-of-the-art NLP applications in biomedical texts can improve modern EOC care.
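
As an illustration of the n-gram TF-IDF comparison described in the Methods, the following is a minimal sketch that assumes whitespace tokenisation, a common smoothed IDF formula, and a simple mean per group. None of these details are stated in the abstract, so this is not the authors' implementation. The function names and the four toy notes are hypothetical and were invented purely for demonstration; they are not study data.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max (assumed tokenisation: lowercase + split)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def mean_tfidf_by_group(notes, labels, n_max=2):
    """Mean TF-IDF score of every n-gram within each label group."""
    docs = [Counter(ngrams(t, n_max)) for t in notes]
    n_docs = len(docs)
    doc_freq = Counter()
    for d in docs:
        doc_freq.update(d.keys())
    # Smoothed IDF (an assumption; the abstract does not specify the variant).
    idf = {g: math.log((1 + n_docs) / (1 + c)) + 1 for g, c in doc_freq.items()}
    group_sizes = Counter(labels)
    sums = {}
    for d, lab in zip(docs, labels):
        group = sums.setdefault(lab, Counter())
        total = sum(d.values())
        if total == 0:
            continue  # an empty note contributes nothing
        for g, c in d.items():
            group[g] += (c / total) * idf[g]
    return {lab: {g: s / group_sizes[lab] for g, s in group.items()}
            for lab, group in sums.items()}


def score_difference(notes, labels, positive, negative, n_max=2):
    """Per n-gram difference: mean TF-IDF in `positive` minus mean TF-IDF in `negative`."""
    means = mean_tfidf_by_group(notes, labels, n_max)
    pos, neg = means.get(positive, {}), means.get(negative, {})
    return {g: pos.get(g, 0.0) - neg.get(g, 0.0) for g in set(pos) | set(neg)}


# Hypothetical toy notes, invented for illustration only.
notes = [
    "miliary disease on diaphragm tumour adherent to bowel",
    "adherent tumour miliary disease residual nodules",
    "complete resection no visible disease",
    "no visible disease complete cytoreduction",
]
labels = ["non-R0", "non-R0", "R0", "R0"]

diff = score_difference(notes, labels, positive="non-R0", negative="R0")
top = sorted(diff, key=diff.get, reverse=True)[:5]
print(top)
```

N-grams with a large positive difference are more characteristic of the non-R0 group and those with a large negative difference are more characteristic of the R0 group. Such differences could then serve as the weights of a word cloud.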