
Computer Engineering ›› 2021, Vol. 47 ›› Issue (6): 104-114. doi: 10.19678/j.issn.1000-3428.0057563

• Artificial Intelligence and Pattern Recognition •

Open-World Knowledge Graph Completion Based on Bayesian Network

LI Xinbai, WU Xinran, YUE Kun

  1. School of Information Science and Engineering, Yunnan University, Kunming 650500, China
  • Received: 2020-03-02 Revised: 2020-04-13 Published: 2020-05-29
  • About the authors: LI Xinbai (born 1995), female, master's student, main research interest: data and knowledge engineering; WU Xinran, doctoral student; YUE Kun (corresponding author), professor, Ph.D., doctoral supervisor.
  • Supported by: Scientific Research Fund of the Yunnan Provincial Department of Education, Graduate Student Project (2020Y0010).
  • Contact: YUE Kun, E-mail: kyue@ynu.edu.cn

Abstract: The relations that entities participate in within a Knowledge Graph (KG) are usually interdependent, and this interdependency can be used to construct additional triples for new entities in open-world data, thereby completing the KG. A Bayesian Network (BN) is an effective model for representing and reasoning about dependencies and uncertain knowledge among variables, so this paper adopts the BN as the model framework and studies a BN-based method for open-world KG completion. First, a method is proposed for constructing a model that represents the dependencies among relations in the KG. Construction consists of building the basic structure of the model and computing its parameter tables. Because relations describe entities, the basic structure is built on the rule that more descriptive relations determine less descriptive ones. Next, a method is given for extracting data sets from the triples in the KG, and the parameter tables are computed from the basic structure and these data sets by maximum likelihood estimation. Then, a triple construction method based on BN probabilistic inference is proposed. It takes as evidence the relations and tail entities of the triples that contain a new entity in the open-world data, uses probabilistic inference to compute the conditional probability that a relation holds between the new entity and other entities, and constructs further triples for the new entity on that basis to complete the KG. The method is evaluated on triple type prediction and link prediction tasks on the FB15k and DBpedia data sets. The results show that the method is effective: its prediction recall and MR value both improve significantly over existing KG completion methods.
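The parameter estimation and inference steps summarized in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation. It assumes each BN variable is a binary indicator of whether an entity has a given relation (the paper's evidence also involves tail entities, which this sketch omits), and the relation names, the structure `PARENTS`, and the rows in `ROWS` are all hypothetical. Parameters are plain maximum likelihood counts with no smoothing, and inference is done by brute-force enumeration.

```python
from collections import Counter
from itertools import product

# Hypothetical structure: each relation lists its parent relations
# (assumed here: more descriptive relations are parents of less descriptive ones).
PARENTS = {
    "profession": [],
    "nationality": [],
    "award": ["profession"],
}

# Hypothetical data set: one row per entity, 1 if the entity has the relation.
ROWS = [
    {"profession": 1, "nationality": 1, "award": 1},
    {"profession": 1, "nationality": 0, "award": 1},
    {"profession": 1, "nationality": 1, "award": 0},
    {"profession": 0, "nationality": 1, "award": 0},
    {"profession": 0, "nationality": 1, "award": 0},
    {"profession": 1, "nationality": 1, "award": 1},
]


def fit_cpts(parents, rows):
    """Maximum likelihood estimate: P(x | pa) = count(x, pa) / count(pa)."""
    cpts = {}
    for var, pa in parents.items():
        joint = Counter((tuple(r[p] for p in pa), r[var]) for r in rows)
        marginal = Counter(tuple(r[p] for p in pa) for r in rows)
        cpts[var] = {
            (cfg, val): n / marginal[cfg] for (cfg, val), n in joint.items()
        }
    return cpts


def joint_prob(parents, cpts, assignment):
    """Joint probability of a full assignment as a product of CPT entries."""
    p = 1.0
    for var, pa in parents.items():
        cfg = tuple(assignment[q] for q in pa)
        # Combinations never seen in the data get probability 0 (no smoothing).
        p *= cpts[var].get((cfg, assignment[var]), 0.0)
    return p


def posterior(parents, cpts, target, evidence):
    """P(target = 1 | evidence), summing out all unobserved variables."""
    hidden = [v for v in parents if v != target and v not in evidence]
    score = {0: 0.0, 1: 0.0}
    for t in (0, 1):
        for values in product((0, 1), repeat=len(hidden)):
            assignment = dict(evidence, **dict(zip(hidden, values)))
            assignment[target] = t
            score[t] += joint_prob(parents, cpts, assignment)
    total = score[0] + score[1]
    return score[1] / total if total else 0.0


cpts = fit_cpts(PARENTS, ROWS)
# A new entity is observed with a "profession" triple; how likely is "award"?
p_award = posterior(PARENTS, cpts, "award", {"profession": 1})
print(round(p_award, 2))  # 0.75
```

In a completion setting, a candidate triple for the new entity would be kept only if this conditional probability exceeds a chosen threshold; the threshold is a design choice and is not specified in the abstract. Enumeration is exponential in the number of unobserved variables, so a real system would likely use a dedicated inference algorithm instead.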

Key words: open-world Knowledge Graph Completion (KGC), interdependent relation, Bayesian Network (BN), probabilistic reasoning, triple construction
