Tongxin xuebao (Jan 2009)
TPCAD: a text-oriented multi-protocol inference approach
Abstract
Protocol inference, which automatically discovers protocol characteristics, is an important problem in traffic classification. An accurate and efficient inference method based on semantic analysis of text traffic is proposed. First, the text content of the traffic is parsed into token sequences by semantic analysis. Then the token sequences are clustered using a similarity criterion defined on the semantic space. Finally, recognition characteristics are extracted from the token sequences that belong to the same protocol. In experiments, precision is above 95% for common protocols, and the speed meets the needs of online training and classification.
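The three stages named in the abstract (tokenization, similarity-based clustering, and characteristic extraction) can be illustrated with a minimal sketch. The abstract does not specify the token classes, the similarity criterion, or the extraction rule, so everything below is an assumption made for illustration: the regular expression and the `<NUM>` class are hypothetical, `difflib.SequenceMatcher` stands in for the paper's semantic-space similarity, the greedy single-pass clustering and its `threshold` are arbitrary choices, and the "characteristic" is reduced to the common token prefix of a cluster.

```python
import re
from difflib import SequenceMatcher

# Hypothetical token classes: alphabetic words, digit runs, single symbols.
TOKEN_RE = re.compile(r"[A-Za-z]+|\d+|\S")


def tokenize(payload):
    """Turn a text payload into a token sequence (assumed token classes)."""
    tokens = []
    for tok in TOKEN_RE.findall(payload):
        # Digit runs are collapsed into one class so values do not matter.
        tokens.append("<NUM>" if tok.isdigit() else tok.upper())
    return tokens


def similarity(a, b):
    """Stand-in similarity: difflib ratio in [0, 1], not the paper's criterion."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster(seqs, threshold=0.5):
    """Greedy clustering: join the first cluster whose representative is close."""
    clusters = []  # each cluster is a list of indices into seqs
    for i, seq in enumerate(seqs):
        for members in clusters:
            if similarity(seqs[members[0]], seq) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def common_prefix(seqs):
    """Simplified characteristic: tokens shared at the start of every sequence."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


if __name__ == "__main__":
    payloads = [
        "GET /index.html HTTP/1.1",
        "GET /a/b.png HTTP/1.0",
        "220 mail.example.com ESMTP ready",
        "220 smtp.test.org ESMTP ready",
    ]
    seqs = [tokenize(p) for p in payloads]
    for members in cluster(seqs):
        print(members, common_prefix([seqs[i] for i in members]))
```

On these four made-up payloads the sketch groups the two HTTP-like request lines together and the two SMTP-like greetings together. A real system would need a similarity measure and extraction rule that are robust to much more varied traffic.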