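Abstract: Verb-noun combinations in Chinese often cause ambiguity in syntactic parsing. This paper uses the Adaboost algorithm to combine several Bayesian classifiers and automatically label common verb-noun combinations in Chinese, distinguishing attributive-head (VN) structures from verb-object (VO) structures. For feature selection, it follows the practice of word sense disambiguation and builds feature vectors from the context words, the verb and noun themselves, and their syllable counts. Experiments show that, without consulting any other resources, the method performs well, with average precision and recall reaching 90.5% and 88.2%, respectively.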
Keywords:
VN structure,
VO structure,
context,
Adaboost algorithm,
Bayesian classification
Abstract: Verb-noun combinations in Chinese often create ambiguity in parsing. This paper proposes a method that labels such combinations automatically, using the Adaboost algorithm to combine several Bayesian classifiers and distinguish VN constructions from VO constructions. The feature vector is built from the verb and noun, their syllable counts, and the surrounding context words. Experimental results show that the method achieves a precision of 90.5% and a recall of 88.2% without using any other resources.
Key words:
VN construction,
VO construction,
context,
Adaboost algorithm,
Bayesian classification
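The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes discrete AdaBoost over two labels, a categorical naive Bayes base classifier with add-one smoothing, one Chinese character counted as one syllable, and a single context word on each side. The function names and the six training examples are hypothetical and were invented purely for illustration.

```python
import math
from collections import defaultdict

VN, VO = 1, -1  # attributive-head vs. verb-object

def featurize(prev_word, verb, noun, next_word):
    # Assumption: one Chinese character counts as one syllable.
    return {"prev": prev_word, "verb": verb, "noun": noun, "next": next_word,
            "verb_syl": len(verb), "noun_syl": len(noun)}

class WeightedNaiveBayes:
    """Categorical naive Bayes fitted on weighted examples (add-one smoothing)."""

    def fit(self, X, y, w):
        self.class_w = defaultdict(float)  # label -> total weight
        self.counts = defaultdict(float)   # (label, feature, value) -> weight
        self.values = defaultdict(set)     # feature -> values seen in training
        for x, label, wi in zip(X, y, w):
            self.class_w[label] += wi
            for f, v in x.items():
                self.counts[(label, f, v)] += wi
                self.values[f].add(v)
        return self

    def predict(self, x):
        total = sum(self.class_w.values())
        best, best_score = None, -math.inf
        for label, cw in self.class_w.items():
            score = math.log(cw / total)
            for f, v in x.items():
                c = self.counts.get((label, f, v), 0.0)
                score += math.log((c + 1.0) / (cw + len(self.values[f])))
            if score > best_score:
                best, best_score = label, score
        return best

def adaboost_train(X, y, rounds=5):
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Rescale weights to sum to n so smoothing does not swamp the counts.
        clf = WeightedNaiveBayes().fit(X, y, [wi * n for wi in w])
        preds = [clf.predict(x) for x in X]
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        if err >= 0.5:          # no better than chance: stop boosting
            break
        perfect = err == 0.0
        alpha = 0.5 * math.log((1.0 - max(err, 1e-10)) / max(err, 1e-10))
        ensemble.append((alpha, clf))
        if perfect:             # nothing left to reweight
            break
        # Raise the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * t * p) for wi, p, t in zip(w, preds, y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * clf.predict(x) for alpha, clf in ensemble)
    return VN if score >= 0 else VO

# Hypothetical toy data: (previous word, verb, noun, next word, label).
toy = [("的", "学习", "材料", "很", VN), ("新", "研究", "方法", "是", VN),
       ("的", "管理", "系统", "很", VN), ("要", "学习", "汉语", "了", VO),
       ("在", "研究", "问题", "的", VO), ("要", "写", "文章", "了", VO)]
X = [featurize(p, v, n, nx) for p, v, n, nx, _ in toy]
y = [label for *_, label in toy]
model = adaboost_train(X, y)
print("VN" if predict(model, featurize("要", "学习", "英语", "了")) == VN else "VO")
```

On data this small the first base classifier already fits the training set, so boosting stops after one round. Later rounds only come into play on a realistic corpus, where reweighting focuses each new classifier on the combinations that earlier ones mislabeled.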
陈丽江. 基于多分类器决策的VN组合自动标注[J]. 计算机工程, 2008, 34(5): 79-81.
CHEN Li-jiang. Autolabeling of VN Combination Based on Multi-classifier[J]. Computer Engineering, 2008, 34(5): 79-81.