
Computer Engineering ›› 2025, Vol. 51 ›› Issue (1): 128-137. doi: 10.19678/j.issn.1000-3428.0068357

• Artificial Intelligence and Pattern Recognition •

A Graph Pooling Method Fusing Multiple Structures and Features of Graph Data

WANG Xiang1,2,*, WEI Yuxin1, MAO Guojun1,2

  1. School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, Fujian, China
  2. Fujian Provincial Key Laboratory of Big Data Mining and Application, Fujian University of Technology, Fuzhou 350118, Fujian, China
  • Received: 2023-09-07  Publication date: 2025-01-15  Release date: 2024-04-16
  • Corresponding author: WANG Xiang
  • Funding:
    National Natural Science Foundation of China (61773415); National Key Research and Development Program of China; Fujian University of Technology Science and Technology Project (GY-Z21183)

Abstract:

In graph neural networks, graph pooling is an important class of operations that downsample graph data to extract graph representations. Because graph data combine complex network topology with high-dimensional feature information, existing graph pooling methods do not integrate the topological information of the graph and the long-distance dependency information of nodes at the same time. They also ignore the features of discarded nodes during pooling, so important information in the graph data is lost. To address these issues, this study proposes a graph pooling method based on multi-feature fusion that simultaneously captures the local topology, the global topology, and the long-distance node dependencies of graph data. An aggregation module then combines these features to obtain a new pooled graph. To solve the problem of node feature information loss during graph pooling, a new feature fusion method is proposed that aggregates the information of discarded nodes, in a certain proportion, onto the retained nodes. Using this pooling method, a graph classification model based on hierarchical pooling is constructed. Experimental results on four datasets (D&D, PROTEINS, NCI1, and NCI109) indicate that, compared with the best baseline model, the proposed model improves classification accuracy by 2.97, 3.59, 0.48, and 0.24 percentage points, respectively. The model makes more effective use of the feature, topological, and long-distance node-dependency information of graph data and achieves better results on graph classification tasks.

Key words: graph pooling, graph classification, topological information, long-distance node dependencies, feature fusion
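
The abstract describes aggregating the information of discarded nodes onto retained nodes "in a certain proportion" but gives no formula. The following is a minimal sketch of that general idea, not the authors' method. The function name `pool_with_fusion`, the parameters `ratio` and `alpha`, and the rule of splitting a discarded node's features evenly among its retained neighbours are all assumptions made for illustration. Node scores are taken as given inputs, because the abstract does not specify how the local, global, and long-distance information is combined into a score.

```python
import math


def pool_with_fusion(features, adjacency, scores, ratio=0.5, alpha=0.5):
    """Top-k node selection followed by fusion of discarded-node features.

    features:  list of per-node feature vectors (lists of floats)
    adjacency: square 0/1 matrix as a list of lists
    scores:    one importance score per node (assumed to be given)
    ratio:     fraction of nodes to keep (hypothetical parameter)
    alpha:     proportion of a discarded node's features that is passed on
               to its retained neighbours (hypothetical parameter)
    """
    n = len(features)
    k = max(1, math.ceil(ratio * n))

    # Keep the k highest-scoring nodes, listed in their original order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:k])
    kept_set = set(kept)

    fused = {i: list(features[i]) for i in kept}
    for j in range(n):
        if j in kept_set:
            continue
        neighbours = [i for i in kept if adjacency[j][i]]
        if not neighbours:
            # Assumption: a discarded node with no retained neighbour
            # contributes nothing to the pooled graph.
            continue
        share = alpha / len(neighbours)
        for i in neighbours:
            fused[i] = [a + share * b for a, b in zip(fused[i], features[j])]

    pooled_features = [fused[i] for i in kept]
    # Induced subgraph on the retained nodes.
    pooled_adjacency = [[adjacency[a][b] for b in kept] for a in kept]
    return pooled_features, pooled_adjacency, kept


# Path graph 0 - 1 - 2 - 3 with one feature per node.
features = [[1.0], [2.0], [3.0], [4.0]]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = [0.1, 0.9, 0.8, 0.2]

pooled_x, pooled_a, kept = pool_with_fusion(features, adjacency, scores)
print(kept)      # [1, 2]
print(pooled_x)  # [[2.5], [5.0]]
print(pooled_a)  # [[0, 1], [1, 0]]
```

In this toy example nodes 1 and 2 are retained. Node 0 passes half of its feature to node 1, and node 3 passes half of its feature to node 2, so the pooled graph keeps part of the information that plain top-k selection would drop.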