
Computer Engineering (计算机工程) ›› 2006, Vol. 32 ›› Issue (20): 48-50. doi: 10.3969/j.issn.1000-3428.2006.20.018

• Software Technology and Database •

Algorithm of Multiple Approximate String for Chinese Characters Based on Filtering (基于过滤的中文多模式近似字符串匹配算法)

FAN Lixin (范立新)1,2; XIE Xiaoneng (谢晓能)1,3; WU Fei (吴飞)1

1. College of Computer, Zhejiang University, Hangzhou 310027; 2. Department of Computer, Shaoxing Arts and Science University, Shaoxing 312000; 3. College of Information Engineering, Hangzhou Radio & TV University, Hangzhou 310012
• Received: 1900-01-01 Revised: 1900-01-01 Online: 2006-10-20 Published: 2006-10-20

Abstract: Most approximate string matching algorithms are designed for small or medium-sized character sets such as English. Few efficient algorithms exist for large character sets such as Chinese characters, and multi-pattern approximate matching algorithms suited to such character sets are especially scarce. This paper presents MBPM-BM, a multi-pattern approximate matching algorithm suited to large character sets such as Chinese characters. Experimental results show that MBPM-BM works well in practice, especially for matching Chinese characters.

Key words: approximate string matching, Chinese string matching, multi-pattern matching, bit-parallel computation, filtering
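
For readers unfamiliar with the bit-parallel technique named in the key words, the following is a minimal sketch of the classic Wu-Manber extension of Shift-And for a single pattern with at most k edit errors. It is not the MBPM-BM algorithm, whose details are not given in the abstract, and it covers neither multiple patterns nor filtering. The function name `find_approx` and the dictionary-based mask table are illustrative assumptions; a dictionary is used because a fixed table with one entry per possible character becomes wasteful for a large character set such as Chinese.

```python
def find_approx(text, pattern, k):
    """Return the end indices in text where pattern matches with at most k edits.

    Illustrative sketch of Wu-Manber bit-parallel matching; not MBPM-BM.
    """
    m = len(pattern)
    if m == 0:
        return []
    full = (1 << m) - 1        # keeps every state within m bits
    accept = 1 << (m - 1)      # bit set when the whole pattern is matched

    # Sparse mask table: bit i of masks[ch] is set when pattern[i] == ch.
    # Only characters that occur in the pattern get an entry.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    def shift1(x):
        # Shift left and feed a 1 into bit 0 (the empty prefix always matches).
        return ((x << 1) | 1) & full

    # state[d]: bit i set when pattern[:i+1] matches a suffix of the text
    # read so far with at most d errors. Initially d pattern characters
    # may be deleted, so the lowest d bits are set.
    state = [((1 << d) - 1) & full for d in range(k + 1)]

    hits = []
    for j, ch in enumerate(text):
        b = masks.get(ch, 0)
        prev_old = state[0]
        state[0] = shift1(prev_old) & b
        for d in range(1, k + 1):
            cur_old = state[d]
            state[d] = (
                (shift1(cur_old) & b)      # text character matches
                | prev_old                 # extra character in the text
                | shift1(prev_old)         # substituted character
                | shift1(state[d - 1])     # pattern character missing from text
            ) & full
            prev_old = cur_old
        if state[k] & accept:
            hits.append(j)
    return hits
```

Under these assumptions, `find_approx("中文字符串匹配", "字符串", 0)` returns `[4]`, and `find_approx("中文字符匹配", "字符串", 1)` returns `[3, 4]`, because both "字符" and "字符匹" are within one edit of the pattern.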