Tehnički Vjesnik (Jan 2017)

Document similarity in repeatedly translated corpora

  • Vladimir Mateljan,
  • Vedran Juričić,
  • Dario Ogrizović

DOI
https://doi.org/10.17559/TV-20150831012553
Journal volume & issue
Vol. 24, no. 2
pp. 599 – 602

Abstract

The paper analyzes the changes in the relationships between documents in a text corpus that occur when the corpus is translated into another language. The authors analyzed the similarities between documents in the original Croatian corpus and compared them with the similarities between the corresponding documents in the translated English corpus. The changes were analyzed using two measures: the P-value of the chi-square test and a newly proposed measure, the correction coefficient.
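
As an illustration of the first measure only, the following is a minimal sketch of how a chi-square P-value could serve as a similarity score between two documents. It assumes that documents are compared by their whitespace-tokenized word counts in a 2 x V contingency table; the abstract does not state the authors' exact procedure, and the correction coefficient is not defined here, so it is not sketched. The function names `chi2_sf` and `chi_square_p_value` are hypothetical.

```python
import math
from collections import Counter


def chi2_sf(x: float, dof: int) -> float:
    """Upper-tail probability of the chi-square distribution.

    Uses the series expansion of the regularized lower incomplete
    gamma function P(dof/2, x/2) and returns 1 - P.
    """
    if x <= 0:
        return 1.0
    a, half = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 10000):
        term *= half / (a + n)
        total += term
        if term < total * 1e-15:
            break
    lower = total * math.exp(-half + a * math.log(half) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_p_value(doc_a: str, doc_b: str) -> float:
    """P-value of a chi-square homogeneity test on two word-count profiles.

    Assumption: each document is reduced to a bag of lowercase,
    whitespace-separated tokens. A P-value near 1 suggests similar word
    distributions; a P-value near 0 suggests different ones.
    """
    counts_a = Counter(doc_a.lower().split())
    counts_b = Counter(doc_b.lower().split())
    vocab = sorted(set(counts_a) | set(counts_b))
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    if len(vocab) < 2 or total_a == 0 or total_b == 0:
        raise ValueError("need two non-empty documents and at least two distinct words")
    grand = total_a + total_b

    stat = 0.0
    for word in vocab:
        column_total = counts_a[word] + counts_b[word]
        for observed, row_total in ((counts_a[word], total_a), (counts_b[word], total_b)):
            expected = row_total * column_total / grand
            stat += (observed - expected) ** 2 / expected

    return chi2_sf(stat, dof=len(vocab) - 1)
```

Under these assumptions, computing the score for a document pair in the Croatian corpus and again for the corresponding pair in the English corpus would give two values whose difference reflects the effect of translation. The chi-square approximation is unreliable when expected counts are small, so very short documents would need a different treatment.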

Keywords