
Computer Engineering ›› 2023, Vol. 49 ›› Issue (1): 92-99,112. doi: 10.19678/j.issn.1000-3428.0063788

• Artificial Intelligence and Pattern Recognition •

Joint Extraction Method for Chinese Entity Relationship Based on Bidirectional Semantics

YU Keqiang, HUANG Fang, WU Qi, OUYANG Yang

  1. School of Computer Science and Engineering, Central South University, Changsha 410083, China
  • Received: 2022-01-19 Revised: 2022-02-28 Published: 2022-03-23
  • About the authors: YU Keqiang (born 1997), male, master's student, main research interest: natural language processing; HUANG Fang (corresponding author), professor, Ph.D.; WU Qi and OUYANG Yang, master's students.
  • Funding:
    Science and Technology Program of Hunan Province (2016JC2011).

Abstract: Existing Chinese entity relation extraction methods typically rely on the semantic features of one-way relations between entities. One-way features alone cannot fully exploit the semantic relations between entities, which limits the effectiveness of extraction. A joint extraction method for Chinese entities and relations based on bidirectional semantics is proposed. A pretrained RoBERTa model produces context-aware character-level vector representations of the text, and head-tail (start and end) pointer tagging identifies the entities in a sentence that may participate in relations. To use the relation semantics in both directions at once, each entity is treated in turn as the subject and as the object of a relation, which defines positive and negative relations. Two sets of fully connected neural networks form a positive-negative relation mapper, so that candidate relation triples are constructed for every input entity from both the positive-relation and the negative-relation perspective. The probability distribution sequences of the candidate triples under the positive and negative relations are then combined with entity position embedding features to discriminate among the candidates and determine the final relation triples. In comparative experiments on the DuIE dataset, the method achieves better precision and recall than baseline models such as MultiR and CoType, and its F1 score reaches 0.805, which is 12.8% higher on average than that of the baseline models.
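
The head-tail pointer tagging step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the tagger outputs, for every character, one probability that an entity starts there and one that an entity ends there. The function name `decode_spans`, the 0.5 threshold, the nearest-end pairing rule, and the toy sentence and probabilities are all illustrative assumptions; the abstract does not specify how the authors decode their pointers.

```python
from typing import List, Tuple


def decode_spans(
    start_probs: List[float],
    end_probs: List[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[int, int]]:
    """Pair each start pointer with the nearest end pointer at or after it.

    Returns inclusive (start, end) character index pairs.
    """
    if len(start_probs) != len(end_probs):
        raise ValueError("start_probs and end_probs must have the same length")

    # Positions whose probability passes the threshold become pointers.
    starts = [i for i, p in enumerate(start_probs) if p >= threshold]
    ends = [i for i, p in enumerate(end_probs) if p >= threshold]

    spans = []
    for s in starts:
        # Nearest end pointer that is not before the start (assumed pairing rule).
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


# Hypothetical 9-character sentence with made-up per-character probabilities.
sentence = "周杰伦演唱了七里香"
start_probs = [0.9, 0.1, 0.0, 0.1, 0.0, 0.0, 0.8, 0.1, 0.0]
end_probs = [0.0, 0.1, 0.9, 0.0, 0.1, 0.0, 0.0, 0.2, 0.7]

spans = decode_spans(start_probs, end_probs)
entities = [sentence[s : e + 1] for s, e in spans]
print(spans)     # [(0, 2), (6, 8)]
print(entities)  # ['周杰伦', '七里香']
```

Working at the character level matches the character-level representations the abstract describes, so the decoded indices slice the original sentence directly without a word segmentation step.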

Key words: entity relationship joint extraction, bidirectional relational semantics, positive-negative relationship mapping, fully connected neural network, pretrained language model

CLC Number: