
Computer Engineering, 2009, Vol. 35, Issue 23: 12-14. doi: 10.3969/j.issn.1000-3428.2009.23.005

• Doctoral Thesis •

Grammatical Feature Extraction Algorithm for String Kernel False Instance

LV Wei1,2, LIN Wen-chang2, LI Lei2

  1. School of Information Technology, Zhuhai Campus, Beijing Normal University, Zhuhai 519085; 2. Software Research Institute, Zhongshan University, Guangzhou 510275
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-12-05 Published: 2009-12-05

Abstract: This paper uses a String kernel method to convert the negative (false) instances in a grammar database into a kernel matrix, then applies Kmeans clustering to that matrix to split the original negative-instance database into several smaller feature tables. This transforms the large-scale $O(n^3)$ kernel matrix into a ( ) matrix, which reduces the amount of computation. The paper also analyzes how grammar-checking accuracy varies with the Kmeans clustering parameters. Experimental results show that the algorithm increases grammar-checking speed without reducing grammar-checking accuracy.

Key words: Kmeans method, clustering, String kernel, false instance, feature extraction
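
The pipeline summarized in the abstract (string kernel, kernel matrix, Kmeans clustering, smaller feature tables) can be illustrated with a minimal sketch. The abstract does not say which string kernel or which Kmeans variant the authors use, so everything here is an assumption: a character p-spectrum kernel stands in for the String kernel, each row of the cosine-normalized kernel matrix is treated as a feature vector, and a plain Lloyd-style k-means with deterministic farthest-point initialization does the clustering. All function names and the sample sentences are hypothetical.

```python
from collections import Counter
from math import sqrt


def spectrum_features(s, p=3):
    """Count every contiguous substring of length p in s (assumed kernel choice)."""
    return Counter(s[i:i + p] for i in range(len(s) - p + 1))


def spectrum_kernel(a, b, p=3):
    """p-spectrum kernel: inner product of the two substring-count vectors."""
    fa, fb = spectrum_features(a, p), spectrum_features(b, p)
    return sum(count * fb[gram] for gram, count in fa.items())


def kernel_matrix(strings, p=3):
    """Cosine-normalized kernel matrix over all pairs of strings."""
    n = len(strings)
    raw = [[spectrum_kernel(strings[i], strings[j], p) for j in range(n)]
           for i in range(n)]
    return [[raw[i][j] / sqrt(raw[i][i] * raw[j][j])
             if raw[i][i] and raw[j][j] else 0.0
             for j in range(n)]
            for i in range(n)]


def sq_dist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))


def kmeans(rows, k, iters=20):
    """Lloyd-style k-means on the rows, with farthest-point initialization."""
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(r, centers[j]))
                  for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def split_into_tables(strings, k, p=3):
    """Partition the negative instances into k smaller tables by cluster label."""
    labels = kmeans(kernel_matrix(strings, p), k)
    tables = [[] for _ in range(k)]
    for s, lab in zip(strings, labels):
        tables[lab].append(s)
    return tables


# Hypothetical negative instances (ungrammatical sentences).
samples = [
    "he go to school", "she go to school", "he go to work",
    "i has a book", "you has a book", "i has a pen",
]
tables = split_into_tables(samples, k=2)
for table in tables:
    print(table)
```

In this sketch the saving comes from matching a sentence against one small table rather than the full database; the number of clusters `k` plays the role of the Kmeans parameter whose effect on accuracy the paper analyzes.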
