CLEI Electronic Journal (Dec 2019)

A New Statistical and Verbal-Semantic Approach to Pattern Extraction in Text Mining Applications

  • Dildre Georgiana Vasques,
  • Paulo Sérgio Martins,
  • Solange Oliveira Rezende

DOI
https://doi.org/10.19153/cleiej.22.3.5
Journal volume & issue
Vol. 22, no. 3

Abstract

Knowledge discovery in textual databases seeks implicit relationships between concepts that appear in different natural-language documents, with the aim of identifying new and useful knowledge. Text Mining techniques can assist in this process. Despite the progress made, researchers in this area must still deal with the large number of false relationships generated by most available processes. A statistical and verbal-semantic approach that supports understanding the logic behind these relationships may bridge this gap. The objective of this work is therefore to support the user in identifying implicit relationships between concepts present in different texts, taking into account the causal relationships between those concepts. To this end, this work proposes a hybrid approach for discovering implicit knowledge in a text corpus, combining association-rule analysis with metrics from complex networks and verbal semantics. In a case study on a set of texts from alternative medicine, the different extractions showed that the proposed approach facilitates the identification of implicit knowledge by the user.