
Computer Engineering, 2025, Vol. 51, Issue (6): 38-48. doi: 10.19678/j.issn.1000-3428.0068976

• Research Hotspots and Reviews •

Graph Neural Network Enhancement Based on Personalized PageRank Higher Order Neighborhood Aggregation

SHANG Yaming1, WU Anbiao1, YUAN Ye2,*, WANG Yishu1

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110167, Liaoning, China
  2. School of Computer, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2023-12-06 Online: 2025-06-15 Published: 2024-06-03
  • Contact: YUAN Ye

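  • Funding:
    National Natural Science Foundation of China (62302084); China Postdoctoral Science Foundation (2023M730518); Fundamental Research Funds for the Central Universities (N232405-16)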

Abstract:

The key idea behind Graph Neural Networks (GNNs) is to learn the representation of a target node by aggregating neighborhood information along the topology of a graph. However, edges that are irrelevant to the downstream task, or nodes with few neighbors, can limit the expressive power of the network. Existing enhancement methods seldom enhance graph data in terms of both structure and features at the same time. In particular, existing local enhancement methods use generative models to generate features from first-order neighborhoods only and therefore cannot supply nodes with relevant higher-order neighborhood information. To address these limitations, this study presents a data enhancement strategy with three steps. First, an edge prediction model adjusts the topology of the graph to improve the Signal-to-Noise Ratio (SNR) and facilitate message passing between nodes. Second, a Personalized PageRank (PPR) algorithm aggregates the effective information in multi-order neighborhoods from a global perspective for global feature enhancement. Finally, a generative model generates additional features for local enhancement, which enriches node representations, especially for low-degree nodes. Experiments show that, with this strategy, the test accuracies of the Graph Convolutional Network (GCN) and Graph Attention Network (GAT) models improve by 3.1 and 1.3 percentage points on average, respectively, on the Cora, CiteSeer, and PubMed datasets. These results indicate that the strategy yields a performance gain across different neural network architectures and benchmark datasets.

Key words: data augmentation, Personalized PageRank (PPR), generative model, neural network, global aggregation, multi-order neighborhood
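
The global feature enhancement step in the abstract (aggregating multi-order neighborhood information with PPR) can be illustrated with a minimal sketch. The abstract does not give the paper's formulation, so the code below assumes a commonly used PPR-style power iteration, Z ← (1 − α)·Â·Z + α·X, where Â is the symmetrically normalized adjacency matrix with self-loops and X is the node feature matrix. The function names `normalized_adjacency` and `ppr_aggregate`, the teleport probability `alpha=0.15`, and `num_steps=10` are all hypothetical choices for illustration and are not taken from the paper.

```python
# Sketch only: generic PPR-style feature propagation under the assumptions
# stated above. This is not the authors' implementation.


def normalized_adjacency(edges, num_nodes):
    """Return D^-1/2 (A + I) D^-1/2 as a dense list of lists (undirected graph)."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop, so every degree is at least 1
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    degree = [sum(row) for row in adj]
    return [
        [adj[i][j] / (degree[i] * degree[j]) ** 0.5 for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def matmul(left, right):
    """Plain dense matrix product for lists of lists."""
    inner, cols = len(right), len(right[0])
    return [
        [sum(row[k] * right[k][j] for k in range(inner)) for j in range(cols)]
        for row in left
    ]


def ppr_aggregate(edges, features, alpha=0.15, num_steps=10):
    """Approximate PPR aggregation: Z <- (1 - alpha) * A_hat @ Z + alpha * X.

    alpha is the (assumed) teleport probability back to a node's own features;
    num_steps bounds how many hops of neighborhood information are mixed in.
    """
    num_nodes, num_feats = len(features), len(features[0])
    a_hat = normalized_adjacency(edges, num_nodes)
    z = [row[:] for row in features]
    for _ in range(num_steps):
        propagated = matmul(a_hat, z)
        z = [
            [
                (1 - alpha) * propagated[i][j] + alpha * features[i][j]
                for j in range(num_feats)
            ]
            for i in range(num_nodes)
        ]
    return z


if __name__ == "__main__":
    # Toy path graph 0-1-2-3 with one-hot features, so that row i of the
    # output shows how much weight node i places on every other node.
    path_edges = [(0, 1), (1, 2), (2, 3)]
    one_hot = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for row in ppr_aggregate(path_edges, one_hot):
        print([round(value, 3) for value in row])
```

In this sketch a single propagation step only reaches first-order neighbors, while several steps let a node draw on nodes multiple hops away, which is the kind of higher-order information the abstract says first-order local methods miss. Setting `alpha` to 1.0 disables propagation and returns the input features unchanged.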
