Negation and uncertainty detection in clinical texts written in Spanish: a deep learning-based approach

Oswaldo Solarte Pabón; Orlando Montenegro; Maria Torrente; Alejandro Rodríguez González; Mariano Provencio; Ernestina Menasalvas

doi:10.7717/peerj-cs.913

PeerJ Computer Science (Mar 2022)

Affiliations

Oswaldo Solarte Pabón: Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain
Orlando Montenegro: Escuela de Ingeniería de Sistemas y Computación, Universidad del Valle, Cali, Colombia
Maria Torrente: Hospital Universitario Puerta de Hierro, Madrid, Spain
Alejandro Rodríguez González: Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain
Mariano Provencio: Hospital Universitario Puerta de Hierro, Madrid, Spain
Ernestina Menasalvas: Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain

DOI: https://doi.org/10.7717/peerj-cs.913
Journal volume & issue: Vol. 8, p. e913

Abstract

Detecting negation and uncertainty is crucial for medical text mining applications; otherwise, extracted information can be incorrectly identified as real or factual events. Although several approaches have been proposed to detect negation and uncertainty in clinical texts, most efforts have focused on the English language. Most proposals developed for Spanish have focused mainly on negation detection and do not deal with uncertainty. In this paper, we propose a deep learning-based approach for both negation and uncertainty detection in clinical texts written in Spanish. The proposed approach explores two deep learning methods to achieve this goal: (i) Bidirectional Long Short-Term Memory with a Conditional Random Field layer (BiLSTM-CRF) and (ii) Bidirectional Encoder Representations from Transformers (BERT). The approach was evaluated using NUBES and IULA, two public corpora for the Spanish language. The results showed F-scores of 92% for negation and 80% for uncertainty in the scope recognition task. We also present the results of a validation process conducted using a real-life annotated dataset of clinical notes belonging to cancer patients. The proposed approach shows the feasibility of deep learning-based methods for detecting negation and uncertainty in Spanish clinical texts. Experiments also showed that this approach improves performance in the scope recognition task compared to other proposals in the biomedical domain.
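To make the scope recognition task and its F-score concrete, the following is a minimal sketch that treats cue and scope detection as token-level sequence labeling, the usual way to frame a problem for BiLSTM-CRF or BERT taggers. Everything in it is an assumption for illustration and is not taken from the paper: the BIO tag names (`NEG_CUE`, `NEG_SCOPE`), the toy sentence, the helper names `extract_spans` and `span_f1`, and the use of exact span matching as the scoring criterion.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (label, start, end_exclusive)


def extract_spans(tags: List[str]) -> List[Span]:
    """Collect labeled spans from a BIO tag sequence."""
    spans: List[Span] = []
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.append((label, start, i))
            label = None
        if tag.startswith("B-"):
            label, start = tag[2:], i
    return spans


def span_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match span F1 (an assumed criterion, not the paper's)."""
    gold_spans = set(extract_spans(gold))
    pred_spans = set(extract_spans(pred))
    tp = len(gold_spans & pred_spans)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_spans)
    recall = tp / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: "No presenta fiebre" ("Does not present fever").
tokens = ["No", "presenta", "fiebre"]
gold = ["B-NEG_CUE", "B-NEG_SCOPE", "I-NEG_SCOPE"]
pred = ["B-NEG_CUE", "O", "B-NEG_SCOPE"]  # scope only partially recovered

print(extract_spans(gold))
print(span_f1(gold, pred))
```

Under exact matching, the partially recovered scope in `pred` counts as an error, so only the cue span is credited. A more lenient token-level or partial-match criterion would score the same prediction higher, which is why the matching criterion matters when comparing reported F-scores.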

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
