IJCCS (Indonesian Journal of Computing and Cybernetics Systems) (Apr 2021)
Author Obfuscation on Indonesian News Articles Using Genetic Algorithms
Abstract
Authorship attribution is a method for identifying the author of a text from a group of candidate authors, and it can be used to reveal the identity of an unknown author. Such a method threatens privacy, especially for those who wish to write anonymously. To address this issue, author obfuscation modifies a text to disguise its author. In this research, a genetic algorithm-based author obfuscation model was created to modify Indonesian news articles so that they avoid identification by authorship attribution while preserving their semantics. The model iteratively changes some words in an article using crossover and mutation, guided by a fitness function that involves the identification probability and the similarity to the original article. The model is evaluated on three parameters: safety, soundness, and sensibleness. The model has good safety, since it reduces the accuracy of the given authorship attribution model by 0.3018, but this reduction drops to 0.1179 when tested on different models. Its soundness is fairly good, since the similarity of the modified articles to the original articles reaches 0.7817. In terms of sensibleness, the model obtained a score of 2.571 on a scale of 0 to 4, which indicates that some articles are grammatically acceptable but many are messy.
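The loop described in the abstract (word changes through crossover and mutation, guided by a fitness function that combines identification probability and similarity to the original) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the synonym table, the marker-word stand-in for an authorship attribution model, the position-wise similarity measure, the `safety_weight` parameter, and all function names.

```python
import random

# Hypothetical synonym table (illustrative only; not the paper's resource).
SYNONYMS = {
    "besar": ["raya", "agung"],
    "kata": ["ujar", "tutur"],
    "akan": ["bakal", "hendak"],
}

# Assumption: the toy attribution model treats these words as author markers.
AUTHOR_MARKERS = {"besar", "kata", "akan"}


def identification_probability(words):
    """Toy stand-in for an attribution model: share of marker words."""
    return sum(w in AUTHOR_MARKERS for w in words) / len(words)


def similarity(words, original):
    """Toy similarity: fraction of positions left unchanged."""
    return sum(a == b for a, b in zip(words, original)) / len(original)


def fitness(words, original, safety_weight=0.7):
    """Higher is better: low identification probability, high similarity."""
    safety = 1 - identification_probability(words)
    return safety_weight * safety + (1 - safety_weight) * similarity(words, original)


def mutate(words, original, rng, rate=0.3):
    """Replace some positions with a synonym of the original word there."""
    out = list(words)
    for i, w in enumerate(original):
        if w in SYNONYMS and rng.random() < rate:
            out[i] = rng.choice(SYNONYMS[w])
    return out


def crossover(a, b, rng):
    """Single-point crossover of two equal-length candidates (length >= 2)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def obfuscate(original, generations=30, pop_size=20, seed=0):
    """Evolve word-substituted variants of `original`; return the fittest."""
    rng = random.Random(seed)
    population = [mutate(original, original, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half as parents (elitism), refill with children.
        population.sort(key=lambda c: fitness(c, original), reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), original, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=lambda c: fitness(c, original))
```

Calling `obfuscate("presiden kata harga akan naik besar".split())` returns a word list of the same length in which only words that have an entry in the synonym table may have been replaced. In a realistic setting the two toy scoring functions would be replaced by a trained attribution classifier and a semantic similarity measure, and the weight would control the trade-off between safety and soundness.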
Keywords