Improved Epsilon-greedy Algorithm for Cold-start Problem of New Users


Computer Engineering, 2018, Vol. 44, Issue (11): 172-177. doi: 10.19678/j.issn.1000-3428.0048631

WANG Suqin¹, ZHANG Yang¹, JIANG Hao², ZHU Dengming²

1. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China; 2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China

Received: 2017-09-11 Online: 2018-11-15 Published: 2018-11-15
About the authors: WANG Suqin (born 1970), female, associate professor, M.S., whose main research interests are data mining and computer vision; ZHANG Yang, M.S.; JIANG Hao, assistant researcher, Ph.D.; ZHU Dengming, associate researcher, Ph.D.
Funding:
National Natural Science Foundation of China, "Research on Realistic and Stable Cloth Animation Methods" (61300131); Beijing Municipal Co-construction Project (2014JG48)

Abstract

Abstract: When addressing the cold-start problem for new users, a fixed Epsilon parameter makes the traditional Epsilon-greedy algorithm converge slowly. An improved Epsilon-greedy algorithm is therefore proposed, in which an immune feedback model dynamically adjusts the Epsilon parameter so that the algorithm converges quickly. The algorithm is validated experimentally using Monte Carlo simulation. The results show that it can make effective recommendations when users have had few interactions with the recommendation system, and that its recommendation performance is better than that of the traditional Epsilon-greedy, Softmax, and UCB algorithms.

Key words: recommendation system, cold-start, Epsilon-greedy algorithm, immune feedback model, bandit algorithm
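
To make the idea in the abstract concrete, the following is a minimal Python sketch of an Epsilon-greedy bandit whose Epsilon changes with reward feedback, evaluated by Monte Carlo simulation. The abstract does not give the immune feedback update rule, so the rule used here (shrink Epsilon after a reward, grow it after a miss, clamped to a range) is only an assumed stand-in and not the authors' model. The class `AdaptiveEpsilonGreedy`, the function `simulate`, the Bernoulli click model, and all parameter values are likewise hypothetical and chosen for illustration.

```python
import random


class AdaptiveEpsilonGreedy:
    """Epsilon-greedy bandit with a feedback-adjusted epsilon.

    NOTE: the epsilon update is an assumed placeholder, not the
    immune feedback model of the paper (the abstract does not state it).
    """

    def __init__(self, n_arms, epsilon=0.5, eps_min=0.01, eps_max=0.5,
                 rate=0.1, seed=None):
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.epsilon = epsilon
        self.eps_min = eps_min
        self.eps_max = eps_max
        self.rate = rate
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return self.values.index(max(self.values))

    def update(self, arm, reward):
        # Incremental update of the mean reward of the pulled arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
        # Assumed rule: explore less after a reward, more after a miss.
        factor = (1 - self.rate) if reward > 0 else (1 + self.rate)
        self.epsilon = min(self.eps_max,
                           max(self.eps_min, self.epsilon * factor))


def simulate(click_probs, horizon, n_runs, seed=0):
    """Monte Carlo estimate of the average reward per interaction."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        agent = AdaptiveEpsilonGreedy(len(click_probs),
                                      seed=rng.randrange(2 ** 32))
        for _ in range(horizon):
            arm = agent.select_arm()
            # Bernoulli reward: 1.0 if the simulated user clicks the item.
            reward = 1.0 if rng.random() < click_probs[arm] else 0.0
            agent.update(arm, reward)
            total += reward
    return total / (horizon * n_runs)


if __name__ == "__main__":
    # Three hypothetical items with different click probabilities.
    print(simulate([0.1, 0.2, 0.8], horizon=200, n_runs=100))
```

A short `horizon` mimics the few-interaction setting described in the abstract. Comparing against fixed-Epsilon, Softmax, or UCB baselines would require running the same simulation loop with those selection policies, which this sketch omits.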

CLC Number: TP181

WANG Suqin, ZHANG Yang, JIANG Hao, ZHU Dengming. Improved Epsilon-greedy Algorithm for Cold-start Problem of New Users[J]. Computer Engineering, 2018, 44(11): 172-177.

https://www.ecice06.com/CN/Y2018/V44/I11/172

