Corporum (Jun 2019)

A Corpus-based Study of Reporting Verbs in Citation Texts Using Natural Language Processing

  • Imran Ihsan,
  • Sarah Imran,
  • Osama Ahmed,
  • Muhammad Abdul Qadir

Journal volume & issue
Vol. 2, no. 1
pp. 25 – 36

Abstract

In scientific writing, authors often cite other research to support their opinions and findings. The choice of reporting verb plays an important role in such citations. Reporting verbs may exhibit a variety of strengths when used in different contexts and scenarios. Therefore, compiling the reporting verbs that authors use in various contexts into a dataset can provide a basis for corpus-based analysis of citations and the reasons behind them. Sentiment analysis techniques can categorize a citation as Positive, Negative, or Neutral. Natural Language Processing (NLP) techniques can automatically tag the verbs used in a citation with high accuracy. This paper presents a sentiment-based study conducted to build a corpus of reporting verbs used in citations by categorizing the citation texts from a selected dataset into three sentiments. Using NLP techniques, reporting verbs are extracted from these citation texts and their frequencies are calculated. The study also analyzes the extracted verbs within each sentiment.
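
As a rough illustration of the frequency step described in the abstract, the following minimal sketch counts reporting verbs per sentiment class. It is not the authors' pipeline: the abstract says verbs are tagged with NLP techniques, whereas this sketch swaps in a simple lookup against a small hypothetical lexicon (`REPORTING_VERBS`), and the sample citations and their sentiment labels are invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical mini-lexicon (an assumption, not from the paper):
# maps inflected forms to a base reporting verb.
REPORTING_VERBS = {
    "propose": "propose", "proposed": "propose", "proposes": "propose",
    "show": "show", "showed": "show", "shown": "show", "shows": "show",
    "claim": "claim", "claimed": "claim", "claims": "claim",
    "fail": "fail", "failed": "fail", "fails": "fail",
}


def verb_frequencies(citations):
    """Count reporting verbs per sentiment.

    citations: iterable of (citation_text, sentiment_label) pairs.
    Returns a dict-like mapping sentiment -> Counter of base verbs.
    """
    freqs = defaultdict(Counter)
    for text, sentiment in citations:
        # Lowercase and keep alphabetic tokens only.
        for token in re.findall(r"[a-z]+", text.lower()):
            base = REPORTING_VERBS.get(token)
            if base is not None:
                freqs[sentiment][base] += 1
    return freqs


# Invented sample data; sentiment labels are assumed to be already assigned.
sample = [
    ("Smith et al. [3] proposed an efficient parser.", "Positive"),
    ("The method in [7] failed on noisy input.", "Negative"),
    ("Lee [9] showed results on two datasets.", "Neutral"),
    ("Khan [4] proposed and showed a robust model.", "Positive"),
]

freqs = verb_frequencies(sample)
for sentiment, counter in freqs.items():
    print(sentiment, counter.most_common())
```

A lexicon lookup cannot tell a verb from a noun with the same spelling (for example "claims"), which is one reason a part-of-speech tagger, as the abstract describes, would be the more robust choice in practice.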

Keywords