
Computer Engineering

   

A Graph Pooling Method by Fusing Multiple Structures and Features of Graph Data

  

  • Published: 2024-04-16


Abstract: In graph neural networks, graph pooling is a key operation that downsamples graph data to extract graph-level representations. Because graph data combine complex topological structure with high-dimensional features, existing graph pooling methods have two design problems: (1) they fail to fully exploit and jointly fuse the topological structure information and the long-range node dependency information of graph data; (2) they ignore the features of discarded nodes during pooling, which inevitably loses important information. To address these problems, this paper proposes a graph pooling method based on multi-feature fusion that simultaneously captures the local topological structure, the global topological structure, and the long-range node dependencies of graph data, and then uses an aggregation module to combine these features into a new pooled graph. In addition, to mitigate the loss of node features during pooling, a new feature fusion method is proposed that aggregates the features of the discarded nodes onto the retained nodes in a certain proportion. Based on this pooling method, a graph classification model with hierarchical pooling is constructed and evaluated on multiple public datasets. The experimental results show that the proposed model outperforms the best baseline model on the graph classification task, improving classification accuracy by 2.97%, 3.59%, 0.48%, and 0.24% on the D&D, PROTEINS, NCI1, and NCI109 datasets, respectively. This suggests that the method makes more effective use of the feature information, topological information, and long-range dependency information of graph data to improve graph classification performance.
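The abstract does not give the formulas, so the following is only a minimal, dependency-free sketch of the two ideas it describes: fusing several per-node scores, and folding the features of discarded nodes into the retained nodes in a fixed proportion. Everything specific here is an assumption made for illustration and not the paper's actual design: the function names `fuse_scores` and `pool_with_feature_fusion`, the weighted-sum aggregator, the top-k selection rule, the `ratio` and `alpha` parameters, the neighbor-weighted fusion rule, and the use of the induced subgraph as the pooled adjacency.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def fuse_scores(score_views: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Combine per-node scores from several views (e.g. local, global,
    long-range) with a weighted sum. The weighted sum is an assumed
    stand-in for the paper's aggregation module."""
    n = len(score_views[0])
    return [sum(w * view[i] for w, view in zip(weights, score_views))
            for i in range(n)]


def pool_with_feature_fusion(adj: Matrix, feats: Matrix,
                             scores: Sequence[float],
                             ratio: float = 0.5, alpha: float = 0.5
                             ) -> Tuple[List[int], Matrix, Matrix]:
    """Keep the top ceil(ratio * n) nodes by score and add a share `alpha`
    of each discarded neighbor's features to the retained node (assumed rule)."""
    n = len(feats)
    k = max(1, math.ceil(ratio * n))
    # Indices of the k highest-scoring nodes, restored to original order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])
    keep_set = set(keep)
    dropped = [i for i in range(n) if i not in keep_set]

    pooled_feats: Matrix = []
    for i in keep:
        row = list(feats[i])
        # Discarded nodes that are directly connected to retained node i.
        nbrs = [j for j in dropped if adj[i][j] != 0]
        if nbrs:
            total = sum(adj[i][j] for j in nbrs)
            for j in nbrs:
                share = alpha * adj[i][j] / total  # edge-weight-normalized share
                for d in range(len(row)):
                    row[d] += share * feats[j][d]
        pooled_feats.append(row)

    # Assumption: the pooled graph is the subgraph induced by the kept nodes.
    pooled_adj = [[adj[i][j] for j in keep] for i in keep]
    return keep, pooled_feats, pooled_adj


# Toy path graph 0 - 1 - 2 - 3 with two-dimensional node features.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
feats = [[1, 0], [0, 1], [2, 2], [4, 0]]
views = [[0.1, 0.9, 0.8, 0.2],   # hypothetical local-structure scores
         [0.2, 0.7, 0.9, 0.1],   # hypothetical global-structure scores
         [0.3, 0.8, 0.6, 0.4]]   # hypothetical long-range scores
scores = fuse_scores(views, weights=[1.0, 1.0, 1.0])
keep, pooled_feats, pooled_adj = pool_with_feature_fusion(adj, feats, scores)
print(keep)          # [1, 2]
print(pooled_feats)  # [[0.5, 1.0], [4.0, 2.0]]
```

In this toy run, nodes 0 and 3 are discarded, but half of their features are carried over to their retained neighbors 1 and 2, so the pooled graph still reflects them. A learned model would replace the fixed scores and the fixed `alpha` with trainable components.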
