Computer Engineering ›› 2022, Vol. 48 ›› Issue (10): 116-122. doi: 10.19678/j.issn.1000-3428.0062902

• Artificial Intelligence and Pattern Recognition •

Knowledge Graph Automatic Construction Model in Open Domain Based on Knowledge-Informed Graph Convolutional Neural Network

SUN Yaru, YANG Ying, WANG Yongjian   

  1. The Third Research Institute of Ministry of Public Security, Shanghai 201204, China
  • Received:2021-10-11 Revised:2021-11-28 Published:2022-10-09

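  • About the authors: SUN Yaru (孙亚茹, born 1993), female, master's student; main research interests: natural language processing and data mining. YANG Ying (杨莹, corresponding author) and WANG Yongjian (王永剑), associate research fellows, Ph.D.
  • Funding:
    Research Program Project of the Ministry of Public Security (C21361).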

Abstract: Solving the problems of multi-source knowledge alignment and knowledge redundancy is the key to automatically building a knowledge graph in the open data domain. To this end, an automatic knowledge graph construction model that combines knowledge-informed learning with deep learning is proposed. The theoretical relationship between the Graph Convolutional Neural Network (GCN) model and knowledge-informed learning is analyzed, an entity semantic joint space is constructed by combining prior knowledge with deep learning, the intervention of prior knowledge on the model is formalized, and an autoencoder is used to implement a fine-grained entity alignment and relation extraction model. Furthermore, the GCN is combined with multi-head attention to mitigate the loss of entity dependency information caused by multi-hop reasoning in the structural data. Experimental results on the open-source datasets SemEval and FB15k and on the collected and curated MD dataset show that the F1 values of the model for the relation extraction, entity alignment, and triplet extraction tasks reach 89.5%, 86.6%, and 84.2%, which are 0.3, 2.4, and 0.3 percentage points higher than those of the BERT-Softmax model, respectively. The proposed model thus shows better information learning ability.

Key words: open data domain, knowledge graph, knowledge-informed learning, Graph Convolutional Neural Network (GCN), attention mechanism
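
The abstract states that a GCN is combined with multi-head attention so that entity dependencies spanning several hops are not lost. The following is a minimal NumPy sketch of that general idea, not the authors' architecture: the function names, the symmetric normalization, the ReLU activation, the toy chain graph, and all dimensions are assumptions chosen for illustration. A single graph convolution only mixes each node with its direct neighbours, whereas the attention step that follows lets every node weigh every other node directly, regardless of hop distance.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (a common GCN choice, assumed here)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, x, w):
    """One graph convolution: aggregate one-hop neighbours, project, apply ReLU."""
    return np.maximum(adj_norm @ x @ w, 0.0)


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)     # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(h, wq, wk, wv, num_heads):
    """Scaled dot-product self-attention over all nodes, split into heads."""
    n, d = h.shape
    d_head = d // num_heads

    def split(m):                               # (n, d) -> (heads, n, d_head)
        return m.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(h @ wq), split(h @ wk), split(h @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, n, n)
    weights = softmax(scores, axis=-1)
    out = (weights @ v).transpose(1, 0, 2).reshape(n, d)  # concatenate heads
    return out, weights


# Hypothetical toy input: a 4-node chain 0-1-2-3 with random 8-dimensional features.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = rng.normal(size=(4, 8))
w = rng.normal(size=(8, 8))
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))

adj_norm = normalize_adjacency(adj)
h = gcn_layer(adj_norm, x, w)                   # node 0 has only seen node 1 so far
out, attn = multi_head_attention(h, wq, wk, wv, num_heads=2)
print(out.shape, attn.shape)
```

In this toy graph, nodes 0 and 3 are three hops apart, so one convolution cannot relate them, while the attention weights `attn[:, 0, 3]` connect them in a single step. How the paper actually fuses the two components is not specified in the abstract.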

