Tongxin xuebao (Jul 2019)
IMM4HT:an identification method of malicious mirror website for high-speed network traffic
Abstract
Aiming at the problem that some information causing harm to the network environment was transmitted through the mirror website so as to bypass the detection,an identification method of malicious mirror website for high-speed network traffic was proposed.At first,fragmented data from the traffic was extracted,and the source code of the webpage was restored.Next,a standardized processing module was utilized to improve the accuracy.Additionally,the source code of the webpage was divided into blocks,and the hash value of each block was calculated by the simhash algorithm.Therefore,the simhash value of the webpage source codes was obtained,and the similarity between the webpage source codes was calculated by the Hamming distance.The page snapshot was then taken and SIFT feature points were extracted.The perceptual hash value was obtained by clustering analysis and mapping processing.Finally,the similarity of webpages was calculated by the perceptual hash values.Experiments under real traffic show that the accuracy of the method is 93.42%,the recall rate is 90.20%,the F value is 0.92,and the processing delay is 20 μs.Through the proposed method,malicious mirror website can be effectively detected in the high-speed network traffic environment.