Applied Sciences (Jan 2021)

Integrating Speculation Detection and Deep Learning to Extract Lung Cancer Diagnosis from Clinical Notes

  • Oswaldo Solarte Pabón,
  • Maria Torrente,
  • Mariano Provencio,
  • Alejandro Rodríguez-Gonzalez,
  • Ernestina Menasalvas

DOI
https://doi.org/10.3390/app11020865
Journal volume & issue
Vol. 11, no. 2
p. 865

Abstract

Despite efforts to develop models for extracting medical concepts from clinical notes, challenges remain, in particular relating concepts to dates. The large number of clinical notes written for each patient, the use of negation and speculation, and the variety of date formats create ambiguity that must be resolved to reconstruct the patient’s natural history. In this paper, we concentrate on extracting the cancer diagnosis from clinical narratives and relating it to the diagnosis date. To address this challenge, we propose a hybrid approach that combines deep learning-based and rule-based methods. The approach integrates three steps: (i) lung cancer named entity recognition, (ii) negation and speculation detection, and (iii) relating the cancer diagnosis to a valid date. In particular, we apply the proposed approach to extract the lung cancer diagnosis and its diagnosis date from clinical narratives written in Spanish. The results show an F-score of 90% in the named entity recognition task and an F-score of 89% in the task of relating the cancer diagnosis to the diagnosis date. Our findings suggest that speculation detection, together with negation detection, is a key component for properly extracting cancer diagnoses from clinical notes.
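
The three steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: a plain dictionary lookup stands in for the deep learning-based named entity recognizer, and the term list, cue lists, context window, date format, and the function name `extract_diagnosis` are all hypothetical choices made only for illustration.

```python
import re
from datetime import date

# All lists below are illustrative assumptions, not taken from the paper.
CANCER_TERMS = ("adenocarcinoma de pulmón", "carcinoma de pulmón", "cáncer de pulmón")
NEGATION_RE = re.compile(r"\b(no|sin|descarta)\b")
SPECULATION_RE = re.compile(r"\b(posible|probable|sospecha)\b")
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")  # assumes dd/mm/yyyy only
WINDOW = 30  # characters of left context scanned for cues (assumed value)


def extract_diagnosis(sentence):
    """Return (term, status, linked_date), or None if no cancer term is found."""
    text = sentence.lower()

    # Step (i): entity recognition. A dictionary lookup stands in for the
    # deep learning model described in the abstract.
    hits = [(text.find(term), term) for term in CANCER_TERMS if term in text]
    if not hits:
        return None
    start, term = min(hits)  # earliest mention in the sentence

    # Step (ii): negation and speculation detection using cue words that
    # appear shortly before the mention.
    context = text[max(0, start - WINDOW):start]
    if NEGATION_RE.search(context):
        status = "negated"
    elif SPECULATION_RE.search(context):
        status = "speculated"
    else:
        status = "affirmed"

    # Step (iii): link an affirmed diagnosis to the nearest date in the sentence.
    linked = None
    if status == "affirmed":
        matches = list(DATE_RE.finditer(text))
        if matches:
            nearest = min(matches, key=lambda m: abs(m.start() - start))
            day, month, year = map(int, nearest.groups())
            try:
                linked = date(year, month, day)
            except ValueError:
                linked = None  # not a valid calendar date
    return term, status, linked


print(extract_diagnosis("Diagnosticado de adenocarcinoma de pulmón el 12/03/2015."))
print(extract_diagnosis("Posible cáncer de pulmón, pendiente de biopsia."))
```

In this sketch only affirmed mentions are linked to a date, which reflects the abstract's point that negated or speculated mentions should not be treated as a diagnosis. A real system would also need to handle the multiple date formats the abstract mentions.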

Keywords