Cogent Arts & Humanities (Dec 2023)

Exploring the efficacy and reliability of automatic text summarisation systems: Arabic texts in focus

  • Abdulfattah Omar,
  • Waheed M. A. Altohami,
  • Wafya Hamouda

DOI
https://doi.org/10.1080/23311983.2023.2185968
Journal volume & issue
Vol. 10, no. 1

Abstract

This study compared the salient features of the three basic types of automatic text summarisation methods (ATSMs), namely extractive, abstractive, and real-time, along with the approaches available for each type. The data set comprised 12 reports on current issues in automatic text summarisation methods and techniques across languages, with a special focus on Arabic, whose structure has widely been claimed to be problematic for most ATSMs. Three main summarisers were compared: TAAM, OTExtSum, and OntoRealSumm. In addition, a human-written summary of the data set was prepared and compared with the automatically generated summary. A 10-item questionnaire was built to help assess the target ATSMs. ROUGE analysis was also performed to assess the efficacy of all techniques in minimising redundancy in the data set. Findings showed that the precision of the target summarisers differed considerably, as 80% of the data set was shown to be aware of the problems underlying ATSMs. The remaining parameters were in the normal range (65–75%). In the equation-based assessment of ATSMs, the highest range was noted for stop-word removal, and the lowest for POS tagging, stem weight, and stem collection. For Arabic, statistical analysis proved to be the most effective summarisation method (accuracy = 57.59%; recall = 58.79%; F-value = 57.99%). Further research is required to explore how the lexicogrammatical nature of languages and generic text structure affect the text summarisation process.

Keywords