
Computer Engineering


Multi-modal Entity Alignment Method Based on Attribute Filtering Enhancement and Multi-round Instruction Reasoning


  • Published: 2025-12-10


Abstract: Multi-modal Entity Alignment (MMEA) aims to integrate structural, textual, and visual information to identify nodes in different multi-modal knowledge graphs that represent the same real-world entity. When fusing multi-modal features, existing methods often ignore inconsistencies in attribute-type descriptions across knowledge graphs, which biases entity representations and degrades alignment performance. To address this issue, this paper proposes an MMEA method based on attribute filtering enhancement and multi-round instruction reasoning. The method consists of three main modules. First, multi-modal information is fused and entity similarities are computed to obtain candidate entity sequences. Second, in the entity information processing stage, an attribute filtering enhancement mechanism selects semantically similar entity attribute types across knowledge graphs, mitigating the interference caused by differences in attribute descriptions and by redundant information. Finally, the alignment task is modeled as a multiple-choice problem: the filtered attributes and natural-language descriptions of entities are combined into instructions to fine-tune a large language model. During inference, a multi-round reasoning strategy divides the large set of candidate entities into subsequences, strengthening the model's ability to distinguish semantic differences among the entities within each subsequence and thereby improving the accuracy of the final alignment. Experiments on the public datasets FB-DB15K, FB-YAGO15K, EN-FR-15K V2, and EN-DE-15K V2 show that the proposed method consistently outperforms baseline methods in entity alignment.
Specifically, on the FB-DB15K, EN-FR-15K V2, and EN-DE-15K V2 datasets, our method achieves absolute MRR gains of 2%, 1%, and 0.2%, respectively, over the second-best model. Notably, on the FB-YAGO15K dataset, it outperforms the second-best model MCCEA by substantial margins of 9.1% in MRR and 7.8% in Hits@1.
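The multi-round reasoning strategy described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `choose` callback stands in for the fine-tuned large language model that answers each multiple-choice round, and the recursive tournament-style partitioning and `round_size` are assumptions.

```python
def multi_round_align(query_entity, candidates, choose, round_size=5):
    """Split a long candidate list into subsequences, pick one winner per
    subsequence via the `choose` callback (a stand-in for one multiple-choice
    LLM round), then recurse on the winners until one final round remains."""
    if len(candidates) <= round_size:
        return choose(query_entity, candidates)
    winners = [
        choose(query_entity, candidates[i:i + round_size])
        for i in range(0, len(candidates), round_size)
    ]
    return multi_round_align(query_entity, winners, choose, round_size)


if __name__ == "__main__":
    # Toy selector: pick the numerically closest candidate to the query.
    choose = lambda q, cands: min(cands, key=lambda c: abs(c - q))
    print(multi_round_align(42, list(range(100)), choose))  # → 42
```

With 100 candidates and `round_size=5`, the first pass produces 20 subsequence winners, a second pass narrows them to 4, and a final round selects the answer, so the model only ever compares a handful of entities at a time.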

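As a rough illustration of the attribute filtering enhancement step (keeping only attribute types that are semantically similar across the two knowledge graphs), the sketch below uses string similarity from `difflib` as a stand-in for the semantic similarity the method actually computes; the function name, toy attribute names, and threshold are all illustrative assumptions.

```python
from difflib import SequenceMatcher

def filter_attributes(attrs_kg1, attrs_kg2, threshold=0.4):
    """Pair each attribute type in KG1 with its most similar type in KG2,
    keeping only pairs above a similarity threshold; unmatched types are
    treated as noise and dropped."""
    pairs = []
    for a in attrs_kg1:
        best = max(attrs_kg2, key=lambda b: SequenceMatcher(None, a, b).ratio())
        if SequenceMatcher(None, a, best).ratio() >= threshold:
            pairs.append((a, best))
    return pairs

print(filter_attributes(["birth_date", "height_cm", "nickname"],
                        ["date_of_birth", "height", "team"]))
```

Here "nickname" has no sufficiently similar counterpart in the second graph, so it is filtered out rather than allowed to distort the entity representation.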