IEEE Access (Jan 2020)
Deep Learning Local Descriptor for Image Splicing Detection and Localization
Abstract
In this paper, a novel image splicing detection and localization scheme is proposed, based on a local feature descriptor learned by a deep convolutional neural network (CNN). A two-branch CNN, which serves as an expressive local descriptor, is presented and applied to automatically learn hierarchical representations from input RGB color or grayscale test images. The first layer of the proposed CNN model, deliberately designed for image splicing detection, suppresses the effects of image content and extracts diverse and expressive residual features. Specifically, the kernels of the first convolutional layer are initialized with an optimized combination of the 30 linear high-pass filters used to compute residual maps in the spatial rich model (SRM), and are fine-tuned through a constrained learning strategy so that the learned kernels retain their high-pass filtering properties. Both contrastive loss and cross-entropy loss are utilized jointly to improve the generalization ability of the proposed CNN model. With the block-wise dense features of a test image extracted by the pre-trained CNN-based local descriptor, an effective feature fusion strategy, known as block pooling, is adopted to obtain the final discriminative features for image splicing detection with an SVM. Based on the pre-trained CNN model, an image splicing localization scheme is further developed by incorporating a fully connected conditional random field (CRF). Extensive experimental results on several public datasets show that the proposed CNN-based scheme outperforms several state-of-the-art methods, not only in image splicing detection and localization performance but also in robustness against JPEG compression.
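The SRM initialization and constrained fine-tuning described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it shows two of the SRM linear high-pass residual kernels (a first-order difference and the 5x5 "KV" kernel, both standard in steganalysis) and one plausible form of the constraint, namely re-projecting each kernel to zero mean after a gradient update so its DC response, and hence its high-pass property, is preserved. The exact set of 30 filters, their optimized combination, and the precise constraint used in the paper are assumptions here.

```python
import numpy as np

# Two example SRM linear high-pass residual kernels (assumed subset of the 30):
# a first-order horizontal difference ...
first_order = np.array([[0,  0, 0],
                        [1, -1, 0],
                        [0,  0, 0]], dtype=float)

# ... and the 5x5 "KV" kernel widely used in the spatial rich model.
kv = (1.0 / 12.0) * np.array([[-1,  2,  -2,  2, -1],
                              [ 2, -6,   8, -6,  2],
                              [-2,  8, -12,  8, -2],
                              [ 2, -6,   8, -6,  2],
                              [-1,  2,  -2,  2, -1]], dtype=float)

def project_high_pass(kernel: np.ndarray) -> np.ndarray:
    """Hypothetical constrained-learning step: after a gradient update,
    subtract the kernel's mean so its coefficients sum to zero, keeping
    the filter high-pass (zero response to constant image regions)."""
    return kernel - kernel.mean()

# Simulate a gradient update that drifts the kernel away from zero mean,
# then re-impose the high-pass constraint.
drifted = kv + 0.01
constrained = project_high_pass(drifted)
print(np.isclose(kv.sum(), 0.0))           # SRM kernel is high-pass to start
print(np.isclose(constrained.sum(), 0.0))  # property restored after projection
```

In a CNN framework, this projection would run on the first layer's weight tensor after each optimizer step; the later layers are left unconstrained so they can learn freely on the residual maps.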
Keywords