
Computer Engineering ›› 2012, Vol. 38 ›› Issue (01): 48-50,54. doi: 10.3969/j.issn.1000-3428.2012.01.012

• Networks and Communications •

Chinese Entity Relation Extraction Based on Subtree Feature

YAO Quan-zhu, WANG Mei-jun, LI Ru-qiong   

  1. (School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China)
  • Received:2011-07-01 Online:2012-01-05 Published:2012-01-05

  • About the authors: YAO Quan-zhu (born 1960), male, professor, Ph.D.; main research interests: database technology, natural language processing, and data mining. WANG Mei-jun and LI Ru-qiong, master's students.

Abstract: Kernel-based methods for entity relation extraction represent the feature space implicitly inside the kernel function, so they cannot distinguish useful features from useless ones. As a result, they introduce noise, which degrades performance. To address this problem, this paper presents an entity relation extraction method based on subtree features. The method uses subtree mining and feature selection to obtain the effective subtrees, which then serve as feature templates for constructing the feature vectors used in classification. Experimental results on a Chinese corpus show that the proposed method achieves good classification performance.

Key words: entity relation extraction, phrase structure grammar, dependency grammar, feature selection, Chi-squared statistic
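The abstract describes selecting useful subtrees by feature selection, and the key words name the Chi-squared statistic. The following is a minimal sketch of that idea, not the authors' implementation: it assumes each training instance has already been reduced to a set of mined subtrees (written here as bracketed strings), scores each subtree with the standard 2x2 chi-squared statistic against each relation class, and keeps the top-scoring subtrees as binary feature templates. The function names, the relation labels, the toy subtrees, and the choice of taking the maximum score over classes are all assumptions made for illustration.

```python
def chi_squared(a, b, c, d):
    """Chi-squared statistic for a 2x2 contingency table.

    a: instances of the class that contain the subtree
    b: instances of other classes that contain the subtree
    c: instances of the class that lack the subtree
    d: instances of other classes that lack the subtree
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # The subtree (or the class) occurs in all or none of the
        # instances, so it carries no discriminative information.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_subtrees(instances, labels, k):
    """Return the k best subtrees and all scores.

    Each subtree is scored by its maximum chi-squared value over the
    classes (an assumed aggregation; averaging is another option).
    """
    classes = sorted(set(labels))
    vocab = sorted(set().union(*instances))
    pairs = list(zip(instances, labels))
    scores = {}
    for t in vocab:
        best = 0.0
        for cls in classes:
            a = sum(1 for s, y in pairs if t in s and y == cls)
            b = sum(1 for s, y in pairs if t in s and y != cls)
            c = sum(1 for s, y in pairs if t not in s and y == cls)
            d = sum(1 for s, y in pairs if t not in s and y != cls)
            best = max(best, chi_squared(a, b, c, d))
        scores[t] = best
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(vocab, key=lambda t: (-scores[t], t))
    return ranked[:k], scores


def to_vector(instance, templates):
    """Binary feature vector: 1 if the instance contains the template."""
    return [1 if t in instance else 0 for t in templates]


# Hypothetical toy data: each instance is the set of subtrees mined
# from the parse of one entity pair's sentence.
instances = [
    {"(NP (NR))", "(VP (VV) (NP))"},
    {"(NP (NR))", "(VP (VV) (NP))"},
    {"(NP (NR))", "(PP (P) (NP))"},
    {"(NP (NR))", "(PP (P) (NP))"},
]
labels = ["EMPLOYMENT", "EMPLOYMENT", "LOCATED", "LOCATED"]

templates, scores = select_subtrees(instances, labels, k=2)
vectors = [to_vector(s, templates) for s in instances]
print(templates)
print(vectors)
```

In this toy data the subtree `(NP (NR))` appears in every instance, so it scores zero and is dropped, while the two subtrees that each occur in only one class are kept. The resulting vectors could then be passed to any standard classifier.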

