
Computer Engineering, 2025, Vol. 51, Issue 11: 133-143. doi: 10.19678/j.issn.1000-3428.0069679

• Artificial Intelligence and Pattern Recognition •

Heterogeneous Hypernetwork Representation Learning Based on Importance Sampling

XIA Qingqing1,2, ZHU Yu1,2,*, WANG Xiaoying1,2, HUANG Jianqiang1,2, CAO Tengfei1,2

  1. School of Computer Technology and Application, Qinghai University, Xining 810016, Qinghai, China
  2. Qinghai Provincial Laboratory for Intelligent Computing and Application, Qinghai University, Xining 810016, Qinghai, China
  • Received: 2024-04-01 Revised: 2024-05-15 Online: 2025-11-15 Published: 2025-11-26
  • Contact: ZHU Yu
  • Funding:
    National Natural Science Foundation of China (62166032); National Natural Science Foundation of China (62162053); Natural Science Foundation of Qinghai Province (2022-ZJ-961Q)

Abstract:

Heterogeneous hypernetworks can model the high-order tuple relations found in the real world and represent the heterogeneous high-order information they carry. These hypernetworks exhibit varying degrees of indecomposability, yet existing methods do not fully account for the indecomposability of high-order tuple relations (hyperedges). To address this issue, a heterogeneous hypernetwork representation learning method based on importance sampling, called HRIS, is proposed, which incorporates tight high-order tuple relations into hypernetwork representation learning. First, the method introduces the concept of judgment nodes and combines an indecomposability factor with tuple similarity to improve how random walks sample important nodes, thereby capturing the tight high-order tuple relations in the hypernetwork. Second, to make the node sequences more global and diverse, the random swap method from data augmentation is introduced to mitigate overfitting, and a random deletion method based on node degree is proposed to improve robustness. Finally, a skip-gram model with negative sampling enhancement, called NSE-skip-gram, is constructed to obtain high-quality node representation vectors. Experiments on four real-world datasets show that, for the link prediction task, HRIS significantly outperforms the baseline methods; for the hypernetwork reconstruction task, across all reconstruction ratios, HRIS improves on the best-performing baseline by an average of 3.75 and 9.79 percentage points on the Global Positioning System (GPS) and drug datasets, respectively.
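
The two sequence-augmentation steps named in the abstract (random swap and degree-based random deletion) can be illustrated with a minimal Python sketch. The abstract does not give the exact rules, so everything here is an assumption made for illustration: the function names `random_swap` and `degree_based_deletion`, the parameters `n_swaps` and `max_drop`, and in particular the choice to make low-degree nodes more likely to be deleted. The method in the paper may weight deletion differently.

```python
import random


def random_swap(walk, n_swaps, rng):
    """Return a copy of `walk` with `n_swaps` random pairs of positions exchanged."""
    out = list(walk)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)  # two distinct positions
        out[i], out[j] = out[j], out[i]
    return out


def degree_based_deletion(walk, degree, max_drop, rng):
    """Randomly delete nodes from `walk`, with a probability that depends on degree.

    Assumption (not from the source): a node is dropped with probability
    max_drop * (1 - deg / max_deg), so the highest-degree node in the walk
    is never dropped and low-degree nodes are dropped most often.
    """
    if not walk:
        return []
    max_deg = max(degree[v] for v in walk)
    if max_deg == 0:
        return list(walk)  # no degree information to act on
    kept = [
        v for v in walk
        if rng.random() >= max_drop * (1 - degree[v] / max_deg)
    ]
    # Never return an empty sequence: fall back to the first node.
    return kept if kept else [walk[0]]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    # Hypothetical walk over user / location / activity nodes.
    walk = ["u1", "loc3", "act2", "u4", "loc1"]
    degree = {"u1": 5, "loc3": 2, "act2": 8, "u4": 1, "loc1": 3}
    augmented = degree_based_deletion(random_swap(walk, 2, rng), degree, 0.5, rng)
    print(augmented)
```

Sequences augmented this way would then be passed to a skip-gram style model in place of, or alongside, the original walks. The sketch does not cover how NSE-skip-gram enhances negative sampling.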

Key words: representation learning, high-order tuple relation, importance sampling, data augmentation, negative sampling enhancement, link prediction, hypernetwork reconstruction