Informatics (Mar 2023)

Generating Paraphrase Using Simulated Annealing for Citation Sentences

  • Ridwan Ilyas,
  • Masayu Leylia Khodra,
  • Rinaldi Munir,
  • Rila Mandala,
  • Dwi Hendratmo Widyantoro

DOI
https://doi.org/10.3390/informatics10020034
Journal volume & issue
Vol. 10, no. 2
p. 34

Abstract

A paraphrase generator for citation sentences produces several alternative sentences to help authors avoid plagiarism. The generated sentences must remain semantically similar to the source while diverging from it lexically. This study proposes StoPGEN, a model for generating citation paraphrases with stochastic output. Generation is guided by an objective function within a simulated annealing algorithm so that both semantic similarity and lexical divergence are maintained. The objective function combines the METEOR and PINC scores in a linear weighting function whose weight can be adjusted to favor one of the two metrics. A dataset of citation sentences labeled with paraphrases was used to test StoPGEN and other models for comparison. On the citation sentence dataset, StoPGEN achieved a BLEU score of 55.37, outperforming the bidirectional LSTM method (28.93). StoPGEN was also tested on Quora data by changing the language source in the architecture, yielding a BLEU score of 22.37 and outperforming UPSA (18.21). In addition, a qualitative evaluation of the generated citation sentences by respondents yielded an acceptance value of 50.80.
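The idea of guiding generation with a weighted objective inside simulated annealing can be illustrated with a minimal sketch. It is not the StoPGEN implementation: the synonym table, the `similarity` function (a simple synonym-aware unigram F1 standing in for METEOR), the geometric cooling schedule, and all parameter names and default values are assumptions made for illustration only. PINC is computed here over sets of n-grams, as one minus the mean fraction of candidate n-grams that also occur in the source.

```python
import math
import random

# Toy synonym table (hypothetical); a real system would use a lexical resource.
SYNONYMS = {
    "proposed": ["introduced", "presented"],
    "method": ["approach", "technique"],
    "improves": ["enhances", "boosts"],
}
# Map each synonym back to its head word so synonyms count as matches.
CANON = {s: head for head, syns in SYNONYMS.items() for s in syns}


def ngrams(tokens, n):
    """Set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def pinc(source, candidate, max_n=4):
    """Lexical divergence: 1 minus the mean n-gram overlap with the source."""
    scores = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        if cand:  # skip orders longer than the candidate
            scores.append(1.0 - len(cand & ngrams(source, n)) / len(cand))
    return sum(scores) / len(scores) if scores else 0.0


def similarity(source, candidate):
    """Stand-in for METEOR (assumption): synonym-aware unigram F1."""
    src = {CANON.get(t, t) for t in source}
    cand = {CANON.get(t, t) for t in candidate}
    overlap = len(src & cand)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(src)
    return 2 * precision * recall / (precision + recall)


def objective(source, candidate, weight=0.5):
    """Linear weighting of semantic similarity and lexical divergence."""
    return weight * similarity(source, candidate) + (1 - weight) * pinc(source, candidate)


def anneal(source, steps=200, t_start=1.0, cooling=0.97, weight=0.5, seed=0):
    """Simulated annealing over synonym substitutions; returns the best candidate seen."""
    rng = random.Random(seed)
    current = list(source)
    current_score = objective(source, current, weight)
    best, best_score = current, current_score
    temperature = t_start
    replaceable = [i for i, tok in enumerate(source) if tok in SYNONYMS]
    for _ in range(steps):
        if not replaceable:
            break
        # Propose a neighbor: swap one replaceable word for a synonym or the original.
        i = rng.choice(replaceable)
        proposal = list(current)
        proposal[i] = rng.choice(SYNONYMS[source[i]] + [source[i]])
        score = objective(source, proposal, weight)
        delta = score - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = proposal, score
            if score > best_score:
                best, best_score = proposal, score
        temperature *= cooling  # geometric cooling (assumed schedule)
    return best, best_score


source = "the proposed method improves accuracy".split()
paraphrase, score = anneal(source)
print(" ".join(paraphrase), round(score, 3))
```

In this sketch, raising `weight` toward 1 favors candidates that stay close to the source in meaning, while lowering it toward 0 favors candidates that share fewer n-grams with the source. Changing `seed` gives different accepted moves, which mirrors the stochastic output described in the abstract.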

Keywords