
Computer Engineering (计算机工程), 2023, Vol. 49, Issue (9): 60-68, 78. doi: 10.19678/j.issn.1000-3428.0066245

• Artificial Intelligence and Pattern Recognition •

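Multi-density Graph-based Session Recommendation Using Self-supervised Learning

Xiaoli LIU, Yitong WANG

  1. School of Software, Fudan University, Shanghai 200433, China
  • Received: 2022-11-11 Online: 2023-09-15 Published: 2023-09-14
  • About the authors:

    Xiaoli LIU (b. 1996), male, master's student; main research interests: graph neural networks and recommender systems

    Yitong WANG, associate professor, Ph.D.

  • Funding:
    National Key Research and Development Program of China, Key Special Project (2020YFC2008400)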



Abstract:

Session-based recommendation aims to predict the next item an anonymous user is likely to interact with, given the user's behavior sequence over a short period; the items in a session come from user feedback data. However, out of a massive candidate set, users give feedback on only a small fraction of items, so apart from a few popular items, feedback on the many long-tail items is very sparse. Existing methods mostly model the sequential patterns of sessions and the complex relationships among items, and neglect the long-tail distribution in session-based recommendation. To address this problem, this paper proposes a session-based recommendation model based on multi-task self-supervised learning. On top of the original recommendation task, a self-supervised learning task is built with an inverse sampler based on item frequency, which strengthens the learning of long-tail item embeddings and alleviates data sparsity. In addition, a multi-density session graph is constructed, and session embeddings are learned in an interpretable way through a unified graph neural network, so that user intent is captured more accurately. To prevent overfitting, cross-entropy with label-smoothing regularization is used as the objective function. Experiments on four real-world datasets, Diginetica, Tmall, Gowalla, and Last.FM, show that, compared with advanced baselines such as the Global Context Enhanced Graph Neural Network (GCE-GNN), the CO-Training framework for session-based RECommendation (COTREC), and Multi-granularity intent heterogeneous Session Graph and Intent Fusion ranking for Session-based Recommendation (MSGIFSR), the proposed method significantly improves both hit rate and macro hit rate. Specifically, for top-20 recommendation, the hit rate increases by 1.37%, 5.88%, 0.30%, and 2.16% on the four datasets, respectively, and the macro hit rate increases by 2.49%, 12.86%, 1.97%, and 10.19%, respectively.

Key words: session recommendation, Graph Neural Network (GNN), Self-Supervised Learning (SSL), multi-task learning, multi-density graph, long-tail distribution
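
The abstract names two ingredients that are easy to illustrate in isolation: an inverse sampler based on item frequency, and cross-entropy with label smoothing. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the sampler's exact form, so sampling probability proportional to 1 / count^alpha is assumed here, and the names `inverse_frequency_weights`, `sample_items`, `label_smoothing_cross_entropy`, `alpha`, and `epsilon` are hypothetical. The smoothing shown is the common uniform variant.

```python
import math
import random
from collections import Counter


def inverse_frequency_weights(sessions, alpha=1.0):
    """Sampling probability proportional to 1 / count(item) ** alpha.

    ASSUMPTION: this functional form and the `alpha` knob are illustrative;
    the paper's sampler may differ. alpha = 0 gives uniform sampling, and
    larger values favor rare (long-tail) items more strongly.
    """
    counts = Counter(item for session in sessions for item in session)
    raw = {item: 1.0 / (c ** alpha) for item, c in counts.items()}
    total = sum(raw.values())
    return {item: w / total for item, w in raw.items()}


def sample_items(weights, k, seed=0):
    """Draw k items (with replacement) according to the given probabilities."""
    rng = random.Random(seed)
    items = sorted(weights)
    return rng.choices(items, weights=[weights[i] for i in items], k=k)


def label_smoothing_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy against a uniformly smoothed target distribution.

    The true class gets probability 1 - epsilon + epsilon / n and every
    other class gets epsilon / n, where n is the number of classes.
    """
    n = len(logits)
    m = max(logits)  # subtract the max for a numerically stable log-softmax
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    smooth = [epsilon / n] * n
    smooth[target] += 1.0 - epsilon
    return -sum(q * lp for q, lp in zip(smooth, log_probs))


# Toy sessions: item "a" is popular, "c" is a long-tail item.
sessions = [["a", "b", "a"], ["a", "c"], ["a", "b"]]
weights = inverse_frequency_weights(sessions)
tail_biased_sample = sample_items(weights, k=5)
loss = label_smoothing_cross_entropy([2.0, 0.0], target=0, epsilon=0.1)
```

In this toy data the item counts are 4, 2, and 1, so the rare item `"c"` receives the largest sampling probability. Setting `epsilon=0.0` reduces the loss to ordinary cross-entropy, which makes the regularizing effect of smoothing easy to compare.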