Knowledge Graph Automatic Construction Model in Open Domain Based on Knowledge-Informed Graph Convolutional Neural Network

doi:10.19678/j.issn.1000-3428.0062902

Abstract

Resolving multi-source knowledge alignment and knowledge redundancy is key to automatically building a knowledge graph in the open data domain. To address this, an automatic knowledge graph construction model that combines knowledge-informed learning with deep learning is proposed. The work analyzes the theoretical relationship between the Graph Convolutional Neural Network (GCN) model and knowledge-informed learning, constructs a joint entity semantic space by combining prior knowledge with deep learning, formalizes the intervention of prior knowledge on the model, and uses an autoencoder to implement a fine-grained entity alignment and relation extraction model. In addition, the GCN is combined with multi-head attention to mitigate the loss of entity dependency information caused by multi-hop reasoning over structured data. Experiments on the open-source datasets SemEval and FB15k and on the collected and curated MD dataset show that the model's F1 values for relation extraction, entity alignment, and triplet extraction reach 89.5%, 86.6%, and 84.2%, respectively, which are 0.3, 2.4, and 0.3 percentage points higher than those of the BERT-Softmax model. These results suggest that the proposed model has a stronger information learning ability.

Key words: open data domain, knowledge graph, knowledge-informed learning, Graph Convolutional Neural Network (GCN), attention mechanism
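
The abstract does not specify how the GCN and multi-head attention are wired together, so the following is only a minimal NumPy sketch of the general idea, not the authors' architecture. It assumes a single GCN layer with symmetric adjacency normalization, standard scaled dot-product attention over the GCN output, and a residual sum of the two. All function names, the toy 4-node path graph, the feature size, and the random weights are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def normalized_adjacency(adj):
    # Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2.
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj, h, w):
    # One graph convolution: aggregate 1-hop neighbours, project, apply ReLU.
    return np.maximum(normalized_adjacency(adj) @ h @ w, 0.0)


def multi_head_attention(h, wq, wk, wv, num_heads):
    # Scaled dot-product self-attention over all node pairs, split into heads.
    n, d = h.shape
    d_head = d // num_heads

    def split(x):
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(h @ wq), split(h @ wk), split(h @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n, n)
    weights = softmax(scores, axis=-1)
    out = (weights @ v).transpose(1, 0, 2).reshape(n, d)  # concatenate heads
    return out, weights


def gcn_attention_block(adj, h, params, num_heads=2):
    # Assumed combination: local GCN features plus a global attention residual.
    local = gcn_layer(adj, h, params["w_gcn"])
    global_ctx, weights = multi_head_attention(
        local, params["wq"], params["wk"], params["wv"], num_heads
    )
    return local + global_ctx, weights


# Toy path graph 0 - 1 - 2 - 3: nodes 0 and 3 are three hops apart.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
rng = np.random.default_rng(0)
dim = 8
features = rng.standard_normal((4, dim))
params = {
    name: 0.1 * rng.standard_normal((dim, dim))
    for name in ("w_gcn", "wq", "wk", "wv")
}

out, attn = gcn_attention_block(adj, features, params, num_heads=2)
print(out.shape, attn.shape)
```

In this toy graph, the normalized adjacency has a zero entry between nodes 0 and 3, so a single graph convolution cannot pass information between them, whereas each attention head assigns a nonzero weight to that pair. This is one way to read the abstract's claim that attention mitigates the loss of dependency information across multiple hops.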

CLC Number: TP18

SUN Yaru, YANG Ying, WANG Yongjian. Knowledge Graph Automatic Construction Model in Open Domain Based on Knowledge-Informed Graph Convolutional Neural Network[J]. Computer Engineering, 2022, 48(10): 116-122.

URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0062902

http://www.ecice06.com/EN/Y2022/V48/I10/116
