Computer Engineering

A Sequential Recommendation Algorithm with Data Augmentation and Integrated Self-Attention

  • Published: 2025-03-25

Abstract: In recommender systems, sequential recommendation aims to predict a user's future interests from their historical interaction sequence. However, existing deep learning models typically focus on capturing users' long-term behavior patterns and neglect fine-grained modeling of temporal information, which limits further gains in recommendation performance. To address this issue, we propose a method that combines global and local self-attention mechanisms. The method splits the user interaction sequence into blocks and introduces a weight decay mechanism that assigns each block a different weight according to its temporal distance, thereby capturing changes in users' long-term and short-term interests more accurately. This approach, however, introduces more parameters and increases model complexity. Stochastic Shared Embedding (SSE) can reduce the parameter count and mitigate overfitting, but its random embedding replacement may introduce noisy data and degrade recommendation accuracy. To address this problem, we propose a strategy that combines a Generative Adversarial Network (GAN) with SSE. The GAN generates high-quality interaction data that matches the distribution of user interests, and this data is combined with the random replacement mechanism of SSE for data augmentation: when replacement data is selected at random, GAN-generated high-quality data is added to the candidates, which lowers the risk of introducing noise while retaining the advantage of SSE in reducing overfitting. Experiments on three public datasets, MovieLens-1M, Amazon Beauty, and Yahoo Music, show that the proposed method performs well on Normalized Discounted Cumulative Gain (NDCG), Hit Rate (HR), and Mean Reciprocal Rank (MRR).
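The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so everything here is an assumption made for illustration: the exponential form of the decay, the parameter names (`decay_rate`, `replace_prob`, `generated_prob`), and the use of a plain list `generated_pool` as a stand-in for GAN output. No GAN or attention layer is implemented, and this should not be read as the authors' method.

```python
import math
import random


def split_into_blocks(sequence, block_size):
    """Split an interaction sequence (oldest first) into consecutive blocks."""
    return [sequence[i:i + block_size] for i in range(0, len(sequence), block_size)]


def block_decay_weights(num_blocks, decay_rate=0.5):
    """Return one weight per block, oldest first, normalized to sum to 1.

    Assumption: the weight decays exponentially with the block's distance
    from the most recent block. The paper's actual decay function is not
    stated in the abstract.
    """
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    # Distance 0 is the most recent block; num_blocks - 1 is the oldest.
    raw = [math.exp(-decay_rate * (num_blocks - 1 - i)) for i in range(num_blocks)]
    total = sum(raw)
    return [w / total for w in raw]


def augment_sequence(sequence, generated_pool, item_pool,
                     replace_prob=0.1, generated_prob=0.7, rng=None):
    """SSE-style random replacement that prefers generated items.

    Each item is replaced with probability `replace_prob`. A replacement is
    drawn from `generated_pool` (hypothetical stand-in for GAN samples) with
    probability `generated_prob`, otherwise from the full `item_pool`.
    """
    rng = rng or random.Random()
    augmented = []
    for item in sequence:
        if rng.random() < replace_prob:
            use_generated = bool(generated_pool) and rng.random() < generated_prob
            pool = generated_pool if use_generated else item_pool
            augmented.append(rng.choice(pool))
        else:
            augmented.append(item)
    return augmented
```

In this sketch, raising `generated_prob` shifts replacements away from uniformly random items and toward the generated pool, which is the intuition behind reducing noise while keeping the regularizing effect of random replacement.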