Al-Iraqia Journal for Scientific Engineering Research (Mar 2024)

Prioritise Five Tafseer Translators Using Clustering Technique for Surah Al-Baqarah

  • Mohammed A. Ahmed,
  • Shahad Mahgoob Nafl,
  • Hanif Baharin,
  • Puteri Nor Ellyza Nohuddin

DOI
https://doi.org/10.58564/IJSER.3.1.2024.147
Journal volume & issue
Vol. 3, no. 1

Abstract

English Tafseer translation of the Holy Quran is essential for non-Arabic-speaking Muslims to comprehend and interpret Allah’s words. This research took the work of five English translators (TR1-TR5) on the chapter (Surah) Al-Baqarah and used text clustering to rank (prioritise) these five input datasets. Because the datasets have no ground truth (they are not standard datasets), unsupervised learning (clustering) is required rather than supervised techniques such as classification. The study assessed both partitioning-based and hierarchical clustering algorithms: k-means for the partitioning-based approach and the Agglomerative algorithm for the hierarchical one. The research aim was achieved through a three-stage procedure. The first stage applies text cleansing (tokenisation, POS tagging, normalisation, stemming, and stop-word removal) to remove unnecessary words; feature selection then uses the VSM (Vector Space Model) and TF-IDF (Term Frequency-Inverse Document Frequency) to build the five corpora. The second stage performs the clustering. In the third stage, the clustering is validated using the SC (Silhouette Coefficient) and DBI (Davies-Bouldin Index) metrics together with the execution time (ET). Principal Component Analysis (PCA) is used to visualise the clustering outputs. For the k-means algorithm, the rankings of the five translators based on ET, SC, and DBI agree only at ranks 1 and 3. In contrast, for the Agglomerative algorithm, each of ET, SC, and DBI gives the same five translators a distinct ranking. Obtaining an optimal unified rank will therefore require a modern technique such as MCDM (Multi-Criteria Decision-Making), which is left for future work.
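The two validation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the Euclidean distance, and the six toy 2-D points are assumptions chosen for illustration, whereas the study clusters TF-IDF vectors. A higher Silhouette Coefficient (at most 1) and a lower Davies-Bouldin Index both indicate more compact, better-separated clusters.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_by_label(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def silhouette_coefficient(points, labels):
    """Mean silhouette over all points (requires at least two clusters)."""
    clusters = group_by_label(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def davies_bouldin_index(points, labels):
    """Mean, over clusters, of the worst scatter-to-separation ratio."""
    clusters = group_by_label(points, labels)
    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in clusters.items()
    }
    # Scatter: mean distance from a cluster's members to its centroid
    scatter = {
        label: sum(euclidean(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / euclidean(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst_ratios) / len(worst_ratios)


# Toy data (illustrative only, not taken from the study)
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the two visible groups
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

for name, labels in (("good", good_labels), ("bad", bad_labels)):
    sc = silhouette_coefficient(points, labels)
    dbi = davies_bouldin_index(points, labels)
    print(f"{name}: SC={sc:.3f}, DBI={dbi:.3f}")
```

On this toy data the labelling that follows the two visible groups scores a higher SC and a lower DBI than the mixed labelling, which is the direction of comparison used when ranking clusterings by these metrics.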

Keywords