Computer Engineering ›› 2008, Vol. 34 ›› Issue (5): 79-81. doi: 10.3969/j.issn.1000-3428.2008.05.028

• Software Technology and Database •

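Autolabeling of VN Combination Based on Multi-classifier

CHEN Li-jiang

  1. (Library of Nanjing Normal University, Nanjing 210036)
  • Received: 1900-01-01 Revised: 1900-01-01 Published: 2008-03-05 Online: 2008-03-05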

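Abstract: Verb-noun combinations in Chinese often cause ambiguity in syntactic parsing. This paper uses the Adaboost algorithm to combine multiple Bayesian classifiers and automatically label common Chinese verb-noun combinations, distinguishing attributive-head (VN) constructions from verb-object (VO) constructions. For feature selection, it borrows from word sense disambiguation methods and builds feature vectors from the context words, the verb and noun themselves, and their syllable counts. Experimental results show that, without consulting any other resources, the method performs well, reaching an average precision of 90.5% and an average recall of 88.2%.

Key words: VN construction, VO construction, context, Adaboost algorithm, Bayesian classification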
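
The approach summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the page does not give the context window, the number of boosting rounds, or the exact Bayesian classifier, so the sketch assumes a one-word window on each side, a weighted categorical naive Bayes base learner, and binary discrete Adaboost. The names `extract_features`, `WeightedNaiveBayes`, `adaboost_train`, and `adaboost_predict` are hypothetical, and the eight training examples are invented toy data, not the paper's corpus.

```python
import math
from collections import defaultdict

def extract_features(left, verb, noun, right):
    # Context words, the verb and noun themselves, and their syllable counts.
    # Assumption: one Chinese character is one syllable, so len() counts syllables.
    return {"left": left, "verb": verb, "noun": noun, "right": right,
            "verb_syl": len(verb), "noun_syl": len(noun)}

class WeightedNaiveBayes:
    """Categorical naive Bayes trained on weighted examples (add-alpha smoothing)."""
    def __init__(self, smoothing=0.5):
        self.smoothing = smoothing

    def fit(self, X, y, w):
        scale = len(X) / sum(w)          # rescale weights to pseudo-counts
        self.labels = sorted(set(y))
        self.prior = defaultdict(float)  # label -> weighted count
        self.count = defaultdict(float)  # (label, feature, value) -> weighted count
        self.values = defaultdict(set)   # feature -> values seen in training
        for x, label, wi in zip(X, y, w):
            self.prior[label] += wi * scale
            for f, v in x.items():
                self.count[(label, f, v)] += wi * scale
                self.values[f].add(v)
        return self

    def predict(self, x):
        def score(label):
            s = math.log(self.prior[label])
            for f, v in x.items():
                num = self.count.get((label, f, v), 0.0) + self.smoothing
                den = self.prior[label] + self.smoothing * (len(self.values[f]) + 1)
                s += math.log(num / den)
            return s
        return max(self.labels, key=score)

def adaboost_train(X, y, rounds=5):
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []                        # list of (vote weight, classifier)
    for _ in range(rounds):
        clf = WeightedNaiveBayes().fit(X, y, w)
        pred = [clf.predict(x) for x in X]
        err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
        if err >= 0.5:                   # no better than chance: stop boosting
            if not ensemble:
                ensemble.append((1.0, clf))  # keep at least one classifier
            break
        err = max(err, 1e-10)            # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, clf))
        if err <= 1e-10:                 # training set already fit perfectly
            break
        # Raise the weight of misclassified examples, lower the rest, renormalize.
        w = [wi * math.exp(alpha if p != t else -alpha)
             for wi, p, t in zip(w, pred, y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def adaboost_predict(ensemble, x):
    votes = defaultdict(float)
    for alpha, clf in ensemble:
        votes[clf.predict(x)] += alpha
    return max(votes, key=votes.get)

# Invented toy data: (left word, verb, noun, right word, label).
# "VN" = attributive-head reading, "VO" = verb-object reading.
TOY = [
    ("一辆", "出租", "汽车", "停", "VN"), ("要", "出租", "汽车", "给", "VO"),
    ("这些", "学习", "材料", "很", "VN"), ("正在", "学习", "文件", "。", "VO"),
    ("去", "看", "电影", "了", "VO"),     ("一份", "调查", "报告", "已", "VN"),
    ("需要", "调查", "事故", "原因", "VO"), ("那个", "研究", "方法", "是", "VN"),
]
X = [extract_features(l, v, n, r) for l, v, n, r, _ in TOY]
y = [label for *_, label in TOY]
ensemble = adaboost_train(X, y)
print(adaboost_predict(ensemble, extract_features("要", "出租", "汽车", "给")))
```

On real data the base classifiers would rarely fit the training set perfectly, so several boosting rounds would each contribute a weighted vote; the toy set is small enough that boosting may stop after a single round.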
