کتابداری و اطلاع‌رسانی (Oct 2019)

Analyzing the Application of Hyland Metadiscourse Model for Citation-based Automatic Text Summarization: A proposed Annotation Scheme for Citation Contexts

  • Pegah Tajer,
  • Abdolrasoul Jowkar,
  • Seyed Mostafa Fakhrahmad,
  • Hajar Sotoudeh,
  • Alireza Khormaee

DOI
https://doi.org/10.30481/lis.2019.81993
Journal volume & issue
Vol. 22, no. 3
pp. 91 – 111

Abstract

Objective: An author's abstract presents the contributions that the author considers important, but these may matter less to the scientific community. Complementary information can be obtained by analyzing citing articles: the citation contexts that refer to an article are, in effect, summaries of that article produced by the scientific community. Such a summary, called a citation summary, can provide deeper insight into the article's impact on the scientific community. Selecting useful citation sentences for inclusion in a system summary is one of the major challenges of citation-based automatic text summarization. A semantic analysis of citation contexts reveals citation functions, which can be used to refine citation contexts and to place important content in the final summary. Approaches such as metadiscourse analysis, which provide more information, should therefore help produce useful summaries. Accordingly, this paper analyzes the application of the Hyland metadiscourse model to citation-based automatic summarization of scientific texts. In addition, based on the Hyland metadiscourse model, it proposes an annotation scheme for citation contexts that could be used in corpus-based citation summarization systems. Methodology: This is library research that answers the research questions by studying and analyzing resources on the Hyland metadiscourse model, scientific text summarization, citation context analysis, and citation function classification. The scheme evolved over two stages of analysis. First, an initial scheme was created from a study of existing schemes. Then, its metadiscourse version was derived by analyzing the Hyland metadiscourse model. The proposed annotation scheme was validated through expert evaluation: three experts in Information Science and two in Linguistics confirmed the scheme.
Findings: Hyland's interactional metadiscourse is suitable for analyzing citation contexts because it represents the author's perspective on both the propositional information and the reader. Moreover, interactional metadiscourse analysis applies language tools appropriate to the critique genre. A scheme was therefore proposed based on boosters, attitude markers, hedges, engagement markers, and self-mentions, which are the main components of Hyland's interactional metadiscourse. The proposed scheme includes 70 classes. Conclusion: Hyland's interactional metadiscourse can be used to construct suitable corpora for automatic citation-based text summarization. Other phases of automatic summarization, such as classifier development, citation context refinement, and sentence selection, could also be based on this type of metadiscourse. Corpora are usually annotated using an annotation scheme, so the proposed scheme should be beneficial. However, it is a conceptual scheme built on existing theories. Annotators should therefore be asked to record any new labels they create while annotating, along with notes on their reasons for creating them. In the next stage, if sufficient agreement is reached, those labels could be added to the scheme.
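
As a rough illustration of how the five interactional categories named above might be attached to a citation context, the following Python sketch tags a sentence by simple lexicon lookup. It is not the paper's 70-class scheme, which the abstract does not list: the marker word lists, the `MARKERS` table, and the `tag_citation_context` function are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny marker lists (assumptions for illustration).
# The paper's actual scheme has 70 classes that are not given in the abstract.
MARKERS = {
    "hedge": {"may", "might", "possibly", "perhaps"},
    "booster": {"clearly", "definitely", "certainly"},
    "attitude_marker": {"unfortunately", "surprisingly", "importantly"},
    "engagement_marker": {"note", "consider"},
    "self_mention": {"we", "our", "us", "i", "my"},
}


def tag_citation_context(sentence):
    """Return {category: [matched tokens]} for one citation sentence."""
    # Lowercase and keep alphabetic tokens only; citation markers such as
    # "[3]" are dropped by this simple tokenizer.
    tokens = re.findall(r"[a-z]+", sentence.lower())
    found = defaultdict(list)
    for token in tokens:
        for category, words in MARKERS.items():
            if token in words:
                found[category].append(token)
    return dict(found)


example = "We clearly outperform [3], although their method may generalize better."
labels = tag_citation_context(example)
print(labels)
```

A lexicon lookup like this ignores context (for example, "may" as a month name), so in practice it would at most pre-annotate sentences for human annotators, who could then correct the labels or propose new ones as the conclusion recommends.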

Keywords