Computer Engineering, 2015, Vol. 41, Issue (1): 143-149. doi: 10.3969/j.issn.1000-3428.2015.01.027

• Artificial Intelligence and Recognition Technology •

Research on Multiple Pattern Matching Algorithm for Uyghur

Yiliyaer Dawut, Halidan Abudureyimu, YANG Nana

  1. College of Electrical Engineering, Xinjiang University, Urumqi 830047, China
  • Received: 2014-02-24 Revised: 2014-04-02 Online: 2015-01-15 Published: 2015-01-16
  • About the authors: Yiliyaer Dawut (born 1988), male, master's student; main research interests: intelligent information processing and image processing. Halidan Abudureyimu, professor. YANG Nana, master's student.
  • Supported by:
    National Natural Science Foundation of China (61163026, 60865001)

Abstract: Multiple pattern matching is one of the key steps that determine the performance of Uyghur keyword filtering and detection. This paper therefore proposes a multiple pattern matching algorithm based on Uyghur syllable partition, taking into account Uyghur grammatical characteristics, writing style, the positional forms of letters, and special letters. The algorithm uses the Bohum-sani function, a Uyghur syllable decomposition method, to compute the number of syllables in a string, and the Bohum-xekli function to obtain the string's syllable structure. It then compares patterns from right to left, in line with the grammatical characteristics of the language, to perform multiple pattern matching. Experimental results show that the proposed algorithm achieves higher matching efficiency than existing pattern matching algorithms.

Key words: Uyghur, special letters, word boundary, syllable partition, syllable structure, pattern matching
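
The idea summarized in the abstract (filter candidate patterns by syllable count and syllable structure, then compare characters from right to left) might be sketched roughly as follows. This is only an illustrative sketch under several assumptions, not the authors' Bohum-sani or Bohum-xekli functions, whose details are not given here. It assumes Latin-script transliteration instead of the Arabic-based script, counts syllables as the number of vowels, treats digraphs such as "sh" as two consonant characters, and matches whole space-separated tokens only, ignoring suffixes. All function names are hypothetical.

```python
# Assumption: Uyghur Latin-script vowels; the paper works on the Arabic-based script.
VOWELS = set("aeiouéöü")


def syllable_count(word: str) -> int:
    """Hypothetical stand-in for a syllable counter: one vowel per syllable."""
    return sum(ch in VOWELS for ch in word.lower())


def cv_structure(word: str) -> str:
    """Hypothetical stand-in for syllable structure: a character-level C/V string."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())


def matches_right_to_left(token: str, pattern: str) -> bool:
    """Compare two strings character by character, starting from the last one."""
    if len(token) != len(pattern):
        return False
    for i in range(len(pattern) - 1, -1, -1):
        if token[i] != pattern[i]:
            return False
    return True


def find_keywords(text: str, patterns: list[str]) -> list[tuple[int, str]]:
    """Return (token position, pattern) pairs for tokens equal to some pattern."""
    # Group patterns by (syllable count, C/V structure) so that each token is
    # compared only against patterns sharing both properties.
    index: dict[tuple[int, str], list[str]] = {}
    for p in patterns:
        index.setdefault((syllable_count(p), cv_structure(p)), []).append(p)

    hits = []
    for pos, token in enumerate(text.split()):
        key = (syllable_count(token), cv_structure(token))
        for p in index.get(key, ()):
            if matches_right_to_left(token, p):
                hits.append((pos, p))
    return hits


if __name__ == "__main__":
    print(find_keywords("men kitab oqudum", ["kitab", "qelem", "oqudum"]))
```

In this sketch the syllable-based key acts as a cheap pre-filter: "kitab" and "qelem" share the same key, so the right-to-left comparison is what separates them, while tokens with a different syllable count or structure are never compared character by character.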

CLC Number: