Chinese Journal of Network and Information Security (网络与信息安全学报) (Jun 2022)
Tampered text detection via RGB and frequency relationship modeling
Abstract
In recent years, the widespread dissemination of tampered text images on the Internet has become a serious threat to the security of text images. However, tampered text detection (TTD) methods have not been sufficiently explored. The TTD task aims to locate all text regions in an image and to judge, from the authenticity of their texture, whether each region has been tampered with. Unlike general text detection, TTD therefore also requires fine-grained information to classify text as real-world or tampered. The task poses two main challenges. On the one hand, because real-world and tampered texts are highly similar in texture, TTD methods that learn only from RGB-domain features have limited ability to distinguish the two categories. On the other hand, because real-world and tampered texts differ in detection difficulty, the network cannot balance learning across the two categories well, which leads to imbalanced detection performance between them. Compared with RGB-domain features, the discontinuity of text texture in the frequency domain can help the network identify the authenticity of text instances. Accordingly, a new TTD method based on RGB and frequency information relationship modeling is proposed. Features in the RGB and frequency domains are extracted by independent feature extractors, so that introducing frequency information during texture perception enhances the ability to identify tampered texture. Then, a global RGB-frequency relationship module (GRM) is introduced to model the texture authenticity relationship between different text instances. GRM refers to the RGB-frequency features of other text instances in the same image to help judge the authenticity of the current text instance, which addresses the imbalanced detection performance. Furthermore, a new TTD dataset (Tampered-SROIE), containing 986 images (626 training images and 360 test images), is proposed to evaluate the effectiveness of the proposed method. On Tampered-SROIE, the proposed method obtains F-measures of 95.97% and 96.80% for real-world and tampered texts respectively, and reduces the imbalance in detection accuracy by 1.13%. The proposed method offers new insights to the TTD community from the perspective of network structure and detection strategy, and Tampered-SROIE provides an evaluation benchmark for future TTD methods.
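The two ideas in the abstract, a frequency-domain cue for texture discontinuity and letting each text instance refer to the other instances in the same image, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say which frequency transform or which relationship architecture is used, so the DCT, the dot-product attention without learned weights, and the names `dct2`, `high_freq_ratio`, and `relate` are all assumptions made purely for illustration.

```python
import math

def dct2(patch):
    """Naive orthonormal 2D DCT-II of a square grayscale patch (list of lists).
    Assumption: the DCT stands in for the unspecified frequency transform."""
    n = len(patch)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (patch[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def high_freq_ratio(patch, cutoff=2):
    """Share of spectral energy in coefficients with u + v >= cutoff.
    Hypothetical cue: abrupt texture changes raise this share."""
    coeffs = dct2(patch)
    n = len(coeffs)
    total = sum(c * c for row in coeffs for c in row)
    if total == 0.0:
        return 0.0
    high = sum(coeffs[u][v] ** 2
               for u in range(n) for v in range(n) if u + v >= cutoff)
    return high / total

def relate(features):
    """Refine each instance feature with a softmax-weighted average of all
    instance features from the same image, plus a residual connection.
    Assumption: generic scaled dot-product attention with no learned weights,
    used only as a stand-in for a relationship module."""
    d = len(features[0])
    refined = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in features]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        context = [sum(w * k[i] for w, k in zip(weights, features))
                   for i in range(d)]
        refined.append([a + c for a, c in zip(q, context)])
    return refined

# Toy patches: a flat region and a checkerboard with sharp transitions.
flat = [[5.0] * 4 for _ in range(4)]
checker = [[10.0 * ((x + y) % 2) for y in range(4)] for x in range(4)]

# One hypothetical feature vector per text instance: [mean intensity, ratio].
instances = [[0.5, high_freq_ratio(flat)], [0.5, high_freq_ratio(checker)]]
refined = relate(instances)
```

In this toy setting the flat patch has essentially no high-frequency energy, while the checkerboard puts half of its energy there. A learned detector would replace both hand-written functions with trained feature extractors and trained attention weights.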
Keywords