Computer Engineering, 2021, Vol. 47, Issue (1): 87-93, 100. doi: 10.19678/j.issn.1000-3428.0056688

• Artificial Intelligence and Pattern Recognition •

An Adaptive Approach for Knowledge Representation Fused with Topic Feature

CHEN Wenjie

  1. Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041, China
  • Received: 2019-11-25  Revised: 2020-01-14  Published: 2020-01-20
  • About the author: CHEN Wenjie (1990-), male, assistant librarian, M.S.; main research interests: representation learning and knowledge graphs.
  • Funding:
    Chinese Academy of Sciences "13th Five-Year" Informatization Special Project (XXH13506).

Abstract: Since the translation-based representation learning model TransE was proposed, a series of models such as TransH, TransG and TransR have been developed to improve and extend it. However, such models tend to learn triple information in isolation and ignore the descriptive text and category information associated with entities and relations. This paper therefore constructs the TransATopic model based on topic features, which fuses the descriptive text information of relations while learning triples in order to improve knowledge graph representation. A relation vector construction method based on a topic model and a Variational Autoencoder (VAE) maps the same relation to different real-valued vectors according to the topic distribution of that relation. In addition, the distance metric in the loss function is changed from Euclidean distance to the more flexible Mahalanobis distance, which allows the weights of different vector dimensions to be assigned adaptively. Experimental results show that, on link prediction and triple classification tasks, TransATopic significantly improves on TransE in terms of MeanRank, HITS@5 and HITS@10.
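The change of distance metric described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact score function, so the quadratic form d^T W d with residual d = h + r - t, the function names, the toy vectors, and the diagonal weight matrix W below are all assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def euclidean_score(h, r, t):
    """TransE-style score: squared Euclidean norm of the residual h + r - t."""
    d = h + r - t
    return float(d @ d)

def mahalanobis_score(h, r, t, W):
    """Mahalanobis-style score d^T W d.

    W is assumed to be symmetric positive semi-definite; a diagonal W
    simply re-weights each embedding dimension.
    """
    d = h + r - t
    return float(d @ W @ d)

# Toy head, relation, and tail embeddings (hypothetical values).
h = np.array([1.0, 0.0, 2.0])
r = np.array([0.5, 1.0, -1.0])
t = np.array([1.0, 1.0, 0.0])

# Hypothetical weights that down-weight the third dimension.
W = np.diag([1.0, 1.0, 0.5])

print(euclidean_score(h, r, t))
print(mahalanobis_score(h, r, t, W))
```

With W set to the identity matrix the two scores coincide, so under these assumptions the Euclidean metric is a special case of the weighted one. In a trained model W would be learned rather than fixed by hand.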

Key words: knowledge graph, representation learning, topic model, Variational Autoencoder (VAE), Mahalanobis distance
