
Computer Engineering ›› 2023, Vol. 49 ›› Issue (10): 13-21. doi: 10.19678/j.issn.1000-3428.0065807

• Hot Topics and Reviews •

Representation Learning in Heterogeneous Information Network Based on Hyper Adjacency Graph

Bin YANG, Yitong WANG*

  1. School of Software, Fudan University, Shanghai 200433, China
  • Received: 2022-09-20 Online: 2023-10-15 Published: 2023-10-10
  • Contact: Yitong WANG
  • About the author:

    Bin YANG (born 1997), male, master's student; main research interest: representation learning in heterogeneous information networks

  • Supported by:
    National Key Research and Development Program of China, Key Special Project (2020YFC2008400)

Abstract:

A Heterogeneous Information Network (HIN) typically contains different types of nodes and relationships. This rich semantic information and these complex relationships pose significant challenges to representation learning in HINs. Most existing approaches use predefined meta-paths to capture heterogeneous semantic and structural information, but predefined meta-paths are costly to specify and offer low coverage. In addition, most existing methods cannot precisely and effectively capture and learn from influential high-order neighbor nodes. This study proposes the HIN-HG model to address both problems. HIN-HG generates a hyper-adjacency graph of the HIN to precisely and effectively capture the neighbor nodes that influence each target node. It then uses convolutional neural networks with a multichannel mechanism to aggregate different types of neighbor nodes under different relationships. HIN-HG automatically learns the weights of different neighbor nodes and meta-paths, so they do not need to be specified manually. It can also capture nodes across the entire graph that are similar to the target node, treat them as high-order neighbors, and update the representation of the target node through information propagation. Experimental results on three real-world datasets (DBLP, ACM, and IMDB) show that, on the node classification task, HIN-HG outperforms state-of-the-art HIN representation learning methods, including HAN, GTN, and HGSL. On average, HIN-HG improves the multi-class evaluation metrics Macro-F1 and Micro-F1 by 5.6 and 5.7 percentage points, respectively.

Key words: Heterogeneous Information Network (HIN), meta-path, neighborhood aggregation, representation learning, graph convolution
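
The abstract reports results in terms of Macro-F1 and Micro-F1. As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed for single-label multi-class node classification. It is not taken from the paper: the function name `f1_scores` and the toy labels are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions.

    Hypothetical helper for illustration only; not the authors' evaluation code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    def f1(tp_c, fp_c, fn_c):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp_c + fp_c + fn_c
        return 2 * tp_c / denom if denom else 0.0

    # Macro-F1: unweighted mean of per-class F1, so rare classes count equally
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro-F1: F1 over counts pooled across all classes
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


if __name__ == "__main__":
    # Toy labels (made up): three classes, the last one rare
    y_true = [0, 0, 0, 0, 1, 1, 2]
    y_pred = [0, 0, 0, 0, 1, 0, 1]
    macro, micro = f1_scores(y_true, y_pred)
    print(f"Macro-F1 = {macro:.4f}, Micro-F1 = {micro:.4f}")
```

In the single-label setting Micro-F1 equals overall accuracy, while Macro-F1 drops sharply when a small class is misclassified, which is why the two metrics are usually reported together.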