
Computer Engineering ›› 2010, Vol. 36 ›› Issue (24): 289-inside back cover. doi: 10.3969/j.issn.1000-3428.2010.24.104

• Development Research and Design Technology •

Improvement of Named Entity Relation Extraction Algorithm

LI Wu-ke1, GUO Sai-qiu2, YIN Yan1

  (1. College of Computer Science and Technology, Hunan University of Arts and Science, Changde 415000, China; 2. Hunan City University, Yiyang 413000, China)
  • Online: 2010-12-20 Published: 2010-12-14
  • About the authors: LI Wu-ke (born 1982), female, M.S., main research interest: pattern matching; GUO Sai-qiu and YIN Yan, M.S.

Abstract:

Existing named entity relation extraction algorithms do not consider pattern differences among relation feature sequences. To address this shortcoming, this paper proposes an improved named entity relation extraction algorithm. It identifies all named entities in the corpus, extracts entity relation features using the shortest dependency path together with the words most closely related to the entities themselves, computes the similarity of relation feature sequences with a kernel function, and outputs the candidate named entity pairs and their relations. Experimental results show that the improved algorithm achieves good recall and precision, with their harmonic mean reaching up to 78%.

Key words: named entity relation extraction, shortest dependency path, kernel function, harmonic mean
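The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the kernel or the feature encoding, so everything below is an assumption made for illustration: the toy dependency edges, the dependency labels, the function names, and the choice of an all-substrings kernel are hypothetical and are not taken from the paper. The sketch finds the shortest dependency path between two entities by breadth-first search, compares two feature sequences with a normalized kernel, and computes the harmonic mean of precision and recall.

```python
import math
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Shortest path between two tokens, treating dependency edges as undirected."""
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    if source not in graph or target not in graph:
        return None
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):  # sorted for determinism
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


def substring_kernel(a, b):
    """Count matching contiguous subsequences shared by a and b.

    This equals the inner product of the two substring-count vectors,
    so it is a valid kernel. It stands in for the paper's unspecified one.
    """
    total = 0
    # dp[i][j] = length of the common run ending at a[i-1] and b[j-1]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                total += dp[i][j]
    return total


def normalized_similarity(a, b):
    """Kernel value scaled into [0, 1]; 0.0 if either sequence is empty."""
    denom = math.sqrt(substring_kernel(a, a) * substring_kernel(b, b))
    return substring_kernel(a, b) / denom if denom else 0.0


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy parse of "Jobs founded Apple in California" as (head, dependent) pairs.
edges = [("founded", "Jobs"), ("founded", "Apple"),
         ("founded", "in"), ("in", "California")]
print(shortest_dependency_path(edges, "Jobs", "Apple"))

# Hypothetical feature sequences built from paths like the one above.
seq_a = ["nsubj", "founded", "dobj"]
seq_b = ["nsubj", "acquired", "dobj"]
print(normalized_similarity(seq_a, seq_b))
```

In a setup like this, a candidate entity pair would be assigned the relation of the labeled feature sequence it is most similar to. How the paper weights pattern differences between sequences is not stated in the abstract and is not modeled here.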
