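Summary: To address the differences in how feature vectors are used for different categories of noun phrases in machine-learning-based Chinese coreference resolution, a method based on a feature respective selection strategy is proposed. When selecting feature vectors, the method handles personal pronouns and ordinary noun phrases separately, making full use of the existing features of each kind of noun phrase for coreference resolution and reducing the “noise” that some ineffective features produce during resolution. Experimental results show that this Chinese coreference resolution method improves coreference resolution performance, with the F-value reaching 80.72%.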
Keywords:
coreference resolution,
feature selection,
natural language processing,
support vector machine,
data dictionary
Abstract: This paper examines how feature vectors are used differently for different types of noun phrases in machine-learning-based Chinese coreference resolution, and proposes a method based on a feature respective selection strategy. When selecting feature vectors, the method handles pronouns and other noun phrases separately, so that it uses the available features of each noun phrase type effectively and reduces the “noise” introduced by ineffective features. Experimental results show that the method improves the performance of the coreference resolution system, with an F-measure of 80.72%.
Key words:
coreference resolution,
feature selection,
natural language processing,
Support Vector Machine (SVM),
data dictionary
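
The feature respective selection idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not list the actual features, so every feature name below, the `Mention` type, and the `select_features` helper are hypothetical and chosen only for illustration; the `f_measure` helper uses the standard harmonic-mean definition, which is assumed (not confirmed by the abstract) to be the one behind the reported 80.72%.

```python
from dataclasses import dataclass

# Hypothetical feature names: the abstract does not give the real feature set.
PRONOUN_FEATURES = ("gender_match", "number_match", "sentence_distance")
NOUN_PHRASE_FEATURES = ("string_match", "head_match", "alias", "sentence_distance")


@dataclass
class Mention:
    text: str
    is_pronoun: bool
    features: dict  # all pairwise features computed against a candidate antecedent


def select_features(mention: Mention) -> list:
    """Build a feature vector from only the features assigned to the
    mention's category, dropping the rest as potential noise."""
    names = PRONOUN_FEATURES if mention.is_pronoun else NOUN_PHRASE_FEATURES
    # Features missing from the input default to 0.0.
    return [float(mention.features.get(name, 0.0)) for name in names]


def f_measure(precision: float, recall: float) -> float:
    """Standard balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the two categories yield vectors of different lengths in this sketch, one plausible design is to train a separate classifier (for example an SVM, as the keywords suggest) per category; that pairing is an assumption rather than something the abstract states.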
CLC number:
李渝勤, 甘润生, 杨永红, 施水才. 基于特征分选策略的中文共指消解方法[J]. 计算机工程, 2011, 37(18): 180-182.
LI Yu-Qin, GAN Run-Sheng, YANG Yong-Hong, SHI Shui-Cai. Chinese Coreference Resolution Method Based on Feature Respective Selection Strategy[J]. Computer Engineering, 2011, 37(18): 180-182.