
Computer Engineering

• Artificial Intelligence and Recognition Technology •

Personal Social Relation Extraction Based on Label Propagation and Active Learning

LIU Jinwen, XU Jing, ZHANG Liping, RUI Weikang

  1. (School of Computer Science and Technology, University of Science and Technology of China, Hefei 230022, China)
  • Received: 2016-01-29 Online: 2017-02-15 Published: 2017-02-15
  • About the authors: LIU Jinwen (born 1992), female, master's student, whose main research interests are text data mining and natural language processing; XU Jing, doctoral student; ZHANG Liping and RUI Weikang, master's students.
  • Supported by:
    National Natural Science Foundation of China (61332004).

Abstract: Semi-supervised learning based on label propagation can improve relation extraction when only a small amount of labeled data is available, but selecting training samples at random degrades extraction performance. To extract highly reliable personal relations from massive amounts of web information, this paper combines the label propagation algorithm with active learning for personal relation extraction. When acquiring training data, the samples with the greatest uncertainty are actively selected for labeling. Experimental results on personal relations show that introducing active learning raises the average F1 value by 2.3% compared with the label propagation algorithm.

Key words: personal social relation, feature extraction, label propagation, active learning, relation extraction, semi-supervised learning
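The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the uncertainty measure, the similarity function, or the feature set, so the sketch assumes entropy of the propagated label distribution as the uncertainty score, a Gaussian kernel over hypothetical 2-D feature vectors as the similarity, and a hypothetical `oracle` callback standing in for the human annotator.

```python
import math

def transition_matrix(points, sigma=1.0):
    """Row-normalised Gaussian similarities (assumed kernel, not from the paper)."""
    n = len(points)
    rows = []
    for i in range(n):
        weights = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                     / (2 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

def clamp(dist, labels, n_classes):
    """Reset labelled samples to their known class (one-hot distribution)."""
    for i, c in labels.items():
        dist[i] = [1.0 if k == c else 0.0 for k in range(n_classes)]

def propagate(trans, labels, n_classes, iters=200):
    """Iterative label propagation: spread label mass, then re-clamp the labelled samples."""
    n = len(trans)
    dist = [[1.0 / n_classes] * n_classes for _ in range(n)]
    clamp(dist, labels, n_classes)
    for _ in range(iters):
        dist = [[sum(trans[i][j] * dist[j][k] for j in range(n))
                 for k in range(n_classes)] for i in range(n)]
        clamp(dist, labels, n_classes)
    return dist

def entropy(p):
    """Shannon entropy, used here as the (assumed) uncertainty score."""
    return -sum(v * math.log(v) for v in p if v > 0)

def most_uncertain(dist, labels):
    """Index of the unlabelled sample whose predicted distribution has the highest entropy."""
    candidates = [i for i in range(len(dist)) if i not in labels]
    return max(candidates, key=lambda i: entropy(dist[i]))

def active_label_propagation(points, seed_labels, oracle, n_classes, budget):
    """Alternate propagation and querying until the annotation budget is spent."""
    trans = transition_matrix(points)
    labels = dict(seed_labels)
    for _ in range(budget):
        dist = propagate(trans, labels, n_classes)
        query = most_uncertain(dist, labels)
        labels[query] = oracle(query)  # hypothetical annotator callback
    dist = propagate(trans, labels, n_classes)
    preds = [max(range(n_classes), key=lambda k: d[k]) for d in dist]
    return preds, labels

# Toy usage: two clusters plus one ambiguous sample halfway between them.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 4), (4, 5), (2.5, 2.5)]
truth = [0, 0, 0, 1, 1, 1, 0]  # made-up labels for illustration only
preds, labels = active_label_propagation(
    points, {0: 0, 3: 1}, lambda i: truth[i], n_classes=2, budget=1)
print(preds, sorted(labels))
```

In this toy setting the sample lying between the two clusters receives a near-uniform label distribution and is therefore the one sent for annotation, whereas random selection would most likely spend the budget on a sample whose label propagation already predicts confidently.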

CLC number: