Linguística (Dec 2018)

Análise de sentimento em artigos de opinião (Sentiment analysis in opinion articles)

  • Fátima Silva,
  • Purificação Silvano,
  • António Leal,
  • Fátima Oliveira,
  • Pavel Brazdil,
  • João Cordeiro,
  • Débora Oliveira

Journal volume & issue
Vol. 13
pp. 74 – 114

Abstract

This study, developed at the interface between linguistics and computer science within the framework of sentiment analysis, presents a computational analysis of opinion articles in the area of economics and finance. Its main objectives are: i) to determine the semantic orientation of text segments that express opinion by annotating the polarity (positive or negative) and the strength (on a scale from -3 to 3) of nouns and adjectives, and ii) to verify whether a lexicon specific to economics and finance has advantages over a general lexicon in the automatic annotation of sentiment. To achieve these objectives, a corpus of 45 texts was selected and analyzed in two phases by annotators with different training. First, a sample of 10 texts was annotated by linguists, co-authors of this paper, in order to develop a linguistic annotation model for ascertaining the polarity and strength of words in opinion articles and to extract the words relevant to this area of study. Then, a set of 35 texts was annotated by university students, who replicated the annotation model developed during the first phase. Based on the linguistic annotation, the computer science team sought to establish to what extent a general sentiment lexicon for Portuguese, SentiLex, was sufficient to extract the sentiment of a sentence satisfactorily, or whether EconoLex, a specific sentiment lexicon, would be more efficient. The specific lexicon, developed by the authors of this study, includes terms and multiword expressions that are relevant to the area of economics and finance and to the Portuguese language. The data were analyzed using a mixed methodology, both qualitative and quantitative.
The results of the analysis allow us to consider the following as contributions of this study: i) the development of a linguistic annotation model for the analysis of the polarity and strength of the lexicon, especially of nouns and adjectives; ii) the key, though not exclusive, role of adjectives in determining the polarity of opinion segments in the corpus articles; iii) the creation of a new specific sentiment lexicon for Portuguese in the area of economics and finance; iv) the improved performance of EconoLexSentiLex relative to SentiLex in the automatic annotation of sentiment. Despite these positive results, there are some limitations, which we intend to overcome as this interdisciplinary work continues through: a more detailed linguistic analysis of the word classes studied, the consideration of other linguistic elements and structures that are essential to ascertain the sentiment of an NP or sentence, the extension of the corpus, the expansion of the specific lexicon for economics and finance, and the improvement of automatic methods for identifying evaluative words in opinion texts and for assigning them polarity and strength.
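
To make the comparison between a general and a domain-specific lexicon concrete, the following is a minimal sketch of lexicon-based sentence scoring. It is not the authors' implementation: the lexicon entries are hypothetical toy values (not taken from SentiLex or EconoLex), and both the summing of strengths and the "neutral" label for a zero score are assumptions made for illustration only.

```python
import re

# Hypothetical toy entries on the -3..3 strength scale.
# These are NOT real SentiLex or EconoLex entries.
GENERAL_LEXICON = {"bom": 2, "mau": -2, "grave": -2}
DOMAIN_LEXICON = {"recessão": -3, "crescimento": 2, "dívida pública": -2}


def score_sentence(sentence, lexicon):
    """Sum the strengths of all lexicon entries found in the sentence.

    Summation is an assumed aggregation rule, chosen for simplicity.
    """
    text = sentence.lower()
    total = 0
    # Try longer entries first so multiword expressions are matched
    # before any shorter entry that might overlap with them.
    for entry in sorted(lexicon, key=len, reverse=True):
        pattern = r"\b" + re.escape(entry) + r"\b"
        matches = re.findall(pattern, text)
        total += lexicon[entry] * len(matches)
        # Blank out matched spans so they are not counted twice.
        text = re.sub(pattern, " ", text)
    return total


def polarity(score):
    """Map a numeric score to a label; 'neutral' for 0 is an assumption."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


sentence = "A recessão é grave e a dívida pública aumenta."
combined_lexicon = {**GENERAL_LEXICON, **DOMAIN_LEXICON}

general_score = score_sentence(sentence, GENERAL_LEXICON)
combined_score = score_sentence(sentence, combined_lexicon)
```

With these toy values, the general lexicon only recognizes "grave", while the combined lexicon also picks up "recessão" and the multiword expression "dívida pública", which illustrates why a domain lexicon might capture sentiment that a general one misses.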

Keywords