Research on Domain Knowledge Graph Inference Technology Based on Graph Representation Learning

doi:10.19678/j.issn.1000-3428.0065447

Abstract:

Most existing inference models for domain knowledge graphs are migrated from models designed for general-purpose encyclopedic knowledge graphs, and the heterogeneity of domain knowledge graphs is not properly handled. In addition, existing research treats relation prediction and triple classification as two independent tasks and ignores the connection between them, and domain knowledge is not fully exploited when domain models are built. To address these problems, TransSep, an improved inference model based on translation distance, is established to assign different feature spaces to heterogeneous entity types. A joint training strategy is proposed in which relation prediction and triple classification guide each other's negative sampling and alternately learn the entity embeddings, thereby improving the training of both tasks. Taking a medical knowledge graph as an example, domain knowledge is introduced into TransSep through the idea of meta-paths to enhance the model's expressive power. Experiments are carried out on the precision medicine knowledge graph constructed by Fudan University. The results show that, compared with models such as TransE, DistMult, and TriModel, TransSep improves the MR score by at least 17.4% on the relation prediction task and raises the F1 score on the triple classification task to 0.9286.

Key words: domain knowledge graph, knowledge inference, graph representation learning, graph neural network, meta-path
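
To make the translation-distance idea concrete, the following is a minimal sketch and not the TransSep formulation itself, which the abstract does not specify. It assumes that each entity type has its own linear map into a shared relation space, scores a triple with the TransE-style distance ||h + r - t||, ranks candidate relations by that distance, and reports the mean rank (MR) of the true relation. All entity names, types, vectors, and matrices are hypothetical toy values.

```python
import math

# Hypothetical toy data: entity types, embeddings, and relations are illustrative only.
ENTITY_TYPE = {"aspirin": "drug", "ibuprofen": "drug", "PTGS2": "gene"}

# Each entity is embedded in the feature space of its own type (2-d here).
ENTITY_VEC = {
    "aspirin": [1.0, 0.0],
    "ibuprofen": [0.9, 0.1],
    "PTGS2": [0.0, 1.0],
}

# Assumption: one linear map per entity type into a shared relation space.
TYPE_PROJ = {
    "drug": [[1.0, 0.0], [0.0, 1.0]],
    "gene": [[0.0, 1.0], [1.0, 0.0]],
}

RELATION_VEC = {
    "targets": [0.0, 0.0],
    "interacts_with": [-0.1, 0.1],
    "causes": [2.0, 2.0],
}


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def project(entity):
    """Map an entity from its type-specific space into the shared space."""
    return matvec(TYPE_PROJ[ENTITY_TYPE[entity]], ENTITY_VEC[entity])


def distance(head, relation, tail):
    """TransE-style translation distance ||h + r - t|| (lower is more plausible)."""
    h, t, r = project(head), project(tail), RELATION_VEC[relation]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_of_true_relation(head, true_relation, tail):
    """1-based rank of the true relation among all candidate relations."""
    ranked = sorted(RELATION_VEC, key=lambda rel: distance(head, rel, tail))
    return ranked.index(true_relation) + 1


def mean_rank(triples):
    """Mean rank (MR) over a list of (head, relation, tail) test triples."""
    ranks = [rank_of_true_relation(h, r, t) for h, r, t in triples]
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    test_triples = [
        ("aspirin", "targets", "PTGS2"),
        ("ibuprofen", "interacts_with", "aspirin"),
    ]
    print(mean_rank(test_triples))
```

Because MR averages rank positions, a lower value is better, so an improvement in MR corresponds to a decrease in the score.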

Guohua SUI, Taoran LI, Hao LIU, Lin CHEN, Wei WANG. Research on Domain Knowledge Graph Inference Technology Based on Graph Representation Learning[J]. Computer Engineering, 2023, 49(9): 89-98.

http://www.ecice06.com/CN/Y2023/V49/I9/89

