作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程 ›› 2007, Vol. 33 ›› Issue (05): 38-40. doi: 10.3969/j.issn.1000-3428.2007.05.013

• 软件技术与数据库 • 上一篇    下一篇

基于网络处理器的高效中英文多模式匹配算法

廖明涛,张德运,李金库   

  1. (西安交通大学电信学院网络研究所,西安 710049)
  • 收稿日期:1900-01-01 修回日期:1900-01-01 出版日期:2007-03-05 发布日期:2007-03-05

High Efficiency Chinese-English Multi-pattern Match Algorithm Based on Network Processor

LIAO Mingtao, ZHANG Deyun, LI Jinku   

  1. (Institute of Network, School of Information, Xi’an Jiaotong University, Xi’an 710049)
  • Received:1900-01-01 Revised:1900-01-01 Online:2007-03-05 Published:2007-03-05

摘要: 由于中英文字符在编码方面的差异,传统面向英文字符环境的多模式匹配算法无法直接应用于中英文字符混合环境。提出了一种适用于网络处理器和中英文混合环境的高效多模式匹配算法。该算法采用从左向右的正向匹配,以字节为最小匹配单位,以字符为最小移位单位,在Trie树结构基础上,利用块字符匹配降低逐字匹配的概率,结合Quick Search(QS)算法进行跳跃加速。实验表明,算法能够在中英文混合环境下避免字节错位和误匹配,匹配速度优于已有算法,且不存在空间膨胀问题,能够满足高速网络信息审计的要求。

关键词: 网络处理器, 多模式匹配, 字符串匹配

Abstract: Due to different encoding of Chinese and English characters, traditional multi-pattern match algorithms do not work for Chinese- English characters mixed text. This paper proposes a high efficiency multi-pattern match algorithm for Chinese-English mixed environment based on network processor. The algorithm uses left to right positive direction match, considers one byte as least match unit and one character as least shift unit. Based on hashed Trie structure, it utilizes block characters match to reduce the probability of ordered character at a time match, uses Quick Search algorithm to shift and speed up the match. The experiment shows that the algorithm is faster than existing algorithms, avoids byte misplace and mismatch under Chinese-English mixed environment, satisfies the requirement of high performance network information audit system.

Key words: Network processor, Multi-pattern match, String match