Tongxin xuebao (Dec 2016)

FilterFA: a multiple string matching algorithm based on specification of character set

  • Ping ZHANG,
  • Hui-min HE,
  • Chun-yan ZHANG,
  • Cong CAO,
  • Yan-bing LIU,
  • Jian-long TAN

Journal volume & issue
Vol. 37
pp. 103 – 114

Abstract

Read online

Multiple string matching is one of the core techniques of intrusion detection system, where Aho-Corasick al-gorithm is widely used. To solve the problem that huge storage overhead of AC would influence performance deeply, an improved algorithm ——FilterFA, based on specification of character set was proposed. This algorithm compressed large character by the character set mapping function, and constructed a new automata based on the mapped character set,then space complexity decreased to O(|P||Σ′|). Experiments on synthetic datasets and real-world datasets (such as ClamAV) show that the storage overhead of FilterFA is only about 3% of that of AC, while the size of the character set is 8, and the false recognition rate is less than 2%.

Keywords