Abstract: Multiple pattern matching is one of the key steps that determine the performance of Uyghur keyword filtering and detection systems. Taking into account Uyghur grammatical characteristics, writing conventions, letter-form variations, and special letters, this paper proposes a multiple pattern matching algorithm based on Uyghur syllable partition. The number of syllables in a string is computed with a syllable decomposition method built on the Bohum-sani function, and the syllable structure of the string is obtained with the Bohum-xekli function. Patterns are then compared from right to left, following the grammatical characteristics of the language, to achieve multiple pattern matching for Uyghur. Experimental results show that the proposed algorithm achieves higher matching efficiency than existing pattern matching algorithms.
Key words:
Uyghur,
special letters,
word boundary,
syllable partition,
syllable structure,
pattern matching
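The matching scheme outlined in the abstract can be illustrated with a minimal sketch. The abstract does not define the Bohum-sani and Bohum-xekli functions, so `syllable_count` and `syllable_structure` below are hypothetical stand-ins, not the authors' implementation. The sketch assumes Latin-script (ULY) input, assumes each syllable contains exactly one vowel, treats every character as one letter (digraphs such as "sh" or "ch" are not merged), and reads "right to left" as comparing from the last character to the first.

```python
from collections import defaultdict

# Assumption: vowels of the Uyghur Latin alphabet; one vowel per syllable.
VOWELS = set("aeéiouöü")
PUNCTUATION = ".,;:!?\"'()"


def syllable_count(word: str) -> int:
    """Hypothetical stand-in for Bohum-sani: syllables = number of vowels."""
    return sum(ch in VOWELS for ch in word.lower())


def syllable_structure(word: str) -> str:
    """Hypothetical stand-in for Bohum-xekli: consonant/vowel skeleton."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())


def build_index(patterns):
    """Group patterns by (syllable count, syllable structure)."""
    index = defaultdict(list)
    for pattern in patterns:
        pattern = pattern.lower()
        key = (syllable_count(pattern), syllable_structure(pattern))
        index[key].append(pattern)
    return index


def equal_right_to_left(a: str, b: str) -> bool:
    """Compare two strings character by character, last character first."""
    if len(a) != len(b):
        return False
    for i in range(len(a) - 1, -1, -1):
        if a[i] != b[i]:
            return False
    return True


def find_keywords(text: str, patterns):
    """Return (word position, pattern) pairs for words that equal a pattern."""
    index = build_index(patterns)
    hits = []
    for pos, raw in enumerate(text.split()):
        word = raw.strip(PUNCTUATION).lower()
        key = (syllable_count(word), syllable_structure(word))
        # Only patterns with the same syllable profile are compared in full.
        for candidate in index.get(key, ()):
            if equal_right_to_left(word, candidate):
                hits.append((pos, candidate))
    return hits


if __name__ == "__main__":
    # "kitab" and "qelem" share the profile (2, "CVCVC"), so both are
    # candidates for the word "kitab"; only one survives the comparison.
    print(find_keywords("u kitab oqudi", ["kitab", "qelem"]))
```

In this sketch the syllable profile acts as a cheap filter: a word is compared character by character only against patterns that share its syllable count and structure, which is one plausible reading of how syllable information could reduce the number of comparisons.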
CLC number:
Yiliyaer Dawut, Halidan Abudureyimu, YANG Nana. Research on Multiple Pattern Matching Algorithm for Uyghur[J]. Computer Engineering, 2015, 41(1): 143-149.