Entity Alignment Based on Dynamic Graph Attention and Label Propagation

doi:10.19678/j.issn.1000-3428.0067814

Abstract:

Entity alignment is an effective approach to multi-source database fusion that aims to identify co-referring entities across multi-source knowledge graphs. Recently, Graph Convolutional Networks (GCNs) have emerged as a new paradigm for representation learning in entity alignment. However, organizations construct knowledge graphs with very different objectives and rules, so entity alignment models must accurately capture long-tail entity features across knowledge graphs. Moreover, existing GCN-based entity alignment models focus too heavily on the structural representation of relation triples and neglect the rich semantic information in attribute triples. To address these issues, an entity alignment model is proposed that introduces a dynamic graph attention network to aggregate attribute structure triple representations and reduce the impact of irrelevant attribute structures on the entity representations. In addition, to alleviate relation heterogeneity between knowledge graphs, multi-dimensional label propagation is introduced to compress the different dimensions of the entity adjacency matrix, and entity features are propagated along the compressed adjacency relations to obtain a relation structure representation. Finally, a linear programming algorithm iterates over the similarity matrix of the entity representations to obtain the final alignment result. Experiments are conducted on the public datasets EN-FR-15K and EN-ZH-15K and on the Chinese medical dataset MED-BBK-9K. On these three datasets, respectively, the model achieves Hits@1 of 0.942, 0.926, and 0.427, Hits@10 of 0.963, 0.952, and 0.604, and Mean Reciprocal Rank (MRR) of 0.949, 0.939, and 0.551. Ablation experiments also verify the effectiveness of each module in the model.
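The propagation and assignment steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it propagates seed labels over a single row-normalized adjacency matrix rather than the compressed multi-dimensional one, it omits the dynamic graph attention over attribute triples, and it uses Sinkhorn normalization as an assumed stand-in for the linear programming step, whose solver the abstract does not name. All function names, the toy graphs, and the parameters `rounds`, `iterations`, and `temperature` are hypothetical.

```python
import math


def row_normalize(matrix):
    """Scale each row to sum to 1 (all-zero rows are returned unchanged)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total for v in row] if total else list(row))
    return normalized


def propagate(adjacency, features, rounds=2):
    """Average each entity's features over itself and its neighbours, `rounds` times."""
    n = len(adjacency)
    # Self-loops let an entity keep part of its own signal.
    with_loops = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    weights = row_normalize(with_loops)
    current = features
    for _ in range(rounds):
        dims = len(current[0])
        current = [[sum(weights[i][k] * current[k][d] for k in range(n))
                    for d in range(dims)] for i in range(n)]
    return current


def cosine(u, v):
    """Cosine similarity; defined as 0 when either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def sinkhorn(similarity, iterations=20, temperature=0.05):
    """Alternately normalize rows and columns of exp(similarity / temperature)."""
    plan = [[math.exp(s / temperature) for s in row] for row in similarity]
    for _ in range(iterations):
        plan = row_normalize(plan)
        col_sums = [sum(col) for col in zip(*plan)]
        plan = [[v / col_sums[j] for j, v in enumerate(row)] for row in plan]
    return plan


def align(adj_a, feats_a, adj_b, feats_b):
    """Return, for each entity of graph A, the index of its best match in graph B."""
    rep_a = propagate(adj_a, feats_a)
    rep_b = propagate(adj_b, feats_b)
    similarity = [[cosine(u, v) for v in rep_b] for u in rep_a]
    plan = sinkhorn(similarity)
    return [max(range(len(row)), key=row.__getitem__) for row in plan]


# Toy input (hypothetical): graph A is the chain a0 - a1 - a2; graph B is the
# same chain with its entities stored in a different order (b1 - b0 - b2).
# Seed pairs a0/b1 and a2/b2 share one-hot labels; unseeded entities start at zero.
adj_a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
adj_b = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
feats_a = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
feats_b = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(align(adj_a, feats_a, adj_b, feats_b))  # [1, 0, 2] for this toy input
```

In this toy case the unseeded entity a1 is matched to b0 purely through the labels propagated from its seeded neighbours, which is the intuition behind propagating features along adjacency relations.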

Keywords: database fusion, Graph Convolutional Network (GCN), entity alignment, label propagation, linear programming
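For readers unfamiliar with the reported metrics, the following minimal sketch shows how Hits@k and MRR are commonly computed from a similarity matrix. The function names, the toy similarity values, and the tie-handling rule (candidates tied with the gold entity do not push its rank down) are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
def rank_of_gold(similarity_row, gold_index):
    """1-based rank of the gold candidate under descending similarity."""
    gold_score = similarity_row[gold_index]
    # Assumption: only strictly higher scores count against the gold candidate.
    return 1 + sum(1 for score in similarity_row if score > gold_score)


def evaluate(similarity, gold, ks=(1, 10)):
    """Return Hits@k for each k in `ks`, plus the Mean Reciprocal Rank (MRR)."""
    ranks = [rank_of_gold(similarity[i], gold[i]) for i in range(len(gold))]
    n = len(ranks)
    metrics = {f"Hits@{k}": sum(r <= k for r in ranks) / n for k in ks}
    metrics["MRR"] = sum(1.0 / r for r in ranks) / n
    return metrics


# Toy similarity matrix (hypothetical): row i scores source entity i against
# four candidates; gold[i] is the index of the correct candidate.
toy_similarity = [
    [0.9, 0.1, 0.2, 0.3],  # gold 0 is ranked 1st
    [0.8, 0.6, 0.1, 0.2],  # gold 1 is ranked 2nd
    [0.2, 0.3, 0.7, 0.1],  # gold 2 is ranked 1st
    [0.5, 0.6, 0.7, 0.4],  # gold 3 is ranked 4th
]
toy_gold = [0, 1, 2, 3]
print(evaluate(toy_similarity, toy_gold, ks=(1, 2)))
# {'Hits@1': 0.5, 'Hits@2': 0.75, 'MRR': 0.6875}
```

Hits@k is the fraction of source entities whose correct counterpart appears among the top k candidates, and MRR averages the reciprocal of the correct counterpart's rank, so both lie in [0, 1] and higher is better.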

Shaocong MO, Qingfeng CHEN, Ze XIE, Chunyu LIU, Junlai QIU. Entity Alignment Based on Dynamic Graph Attention and Label Propagation[J]. Computer Engineering, 2024, 50(4): 150-159.

http://www.ecice06.com/CN/Y2024/V50/I4/150

Figures/Tables (9)

Fig.1 Example of entity alignment

Fig.2 Framework of the model in this paper

Fig.3 Experimental results under different vector dimensions

Fig.4 Experimental results under different propagation iteration rounds

Fig.5 Experimental results under different nearest neighbor numbers

Fig.6 Experimental results under different seed set ratios

