Proceedings of the XXth Conference of Open Innovations Association FRUCT (Apr 2020)
Named Entity Recognition in Spanish Biomedical Literature: Short Review and Bert Model
Abstract
Named Entity Recognition (NER) is the first step in knowledge acquisition when dealing with an unknown corpus of texts. Once these entities have been extracted, we can form a parameter space and solve text mining problems such as concept normalization, speech recognition, etc. Recent advances in NER are related to word embedding technology, which transforms text into a form that Deep Learning can process effectively. In this paper, we show how NER detects pharmacological substances, compounds, and proteins in a dataset obtained from the Spanish Clinical Case Corpus (SPACCC). To achieve this goal, we use contextualized word embeddings based on the BERT language representation, which show better results than standard word embeddings.
Keywords