Computer Engineering ›› 2020, Vol. 46 ›› Issue (2): 103-109. doi: 10.19678/j.issn.1000-3428.0054147

Research on ALS Acceleration Algorithm Based on Spark Platform

JIA Xiaofang, SANG Guoming, QI Wenkai   

  1. School of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China
  • Received: 2019-03-08   Revised: 2019-05-06   Published: 2019-06-06

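  • About the authors: JIA Xiaofang (born 1994), female, master's student, whose main research interests are big data applications and data mining algorithms; SANG Guoming (corresponding author), associate professor; QI Wenkai, master's student.
  • Supported by:
    National Natural Science Foundation of China (61672122); Fundamental Research Funds for the Central Universities project "Research on Large-Scale Cooperative Multi-Agent Reinforcement Learning Technology" (3132019207).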

Abstract: Collaborative filtering algorithms play an important role in recommendation systems, but they suffer from low execution efficiency and low ranking accuracy. The Alternating Least Squares (ALS) algorithm supports parallel computation and thereby improves execution efficiency, but its data loading and iterative convergence take a long time. To address this, this paper combines the Nonlinear Conjugate Gradient (NCG) algorithm with the ALS algorithm and proposes an ALS-NCG algorithm that accelerates ALS. The performance of the ALS-NCG algorithm is evaluated in the Spark distributed data processing environment. Experimental results show that, compared with the ALS algorithm, the ALS-NCG algorithm needs fewer iterations and less time to obtain a high-accuracy recommendation ranking.

Key words: collaborative filtering, recommendation algorithm, Alternating Least Squares (ALS) algorithm, Nonlinear Conjugate Gradient (NCG), Spark platform
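
The alternating update that the abstract refers to can be illustrated with a minimal sketch. This is not the paper's ALS-NCG method and not Spark's implementation: it is a single-machine, rank-1 ALS written in plain Python, and the function names (`loss`, `als_rank1`), the plain L2 penalty `lam`, and the dictionary-based rating format are all assumptions made for illustration. With one factor fixed, each user's (or item's) factor has a closed-form least-squares solution that is independent of the others, which is why ALS parallelizes well.

```python
# Illustrative rank-1 ALS for matrix factorization (assumed setup, not the paper's code).
# ratings: dict mapping (user_index, item_index) -> observed rating.

def loss(ratings, u, v, lam):
    """Regularized squared error over observed ratings."""
    err = sum((r - u[i] * v[j]) ** 2 for (i, j), r in ratings.items())
    reg = lam * (sum(x * x for x in u) + sum(x * x for x in v))
    return err + reg

def als_rank1(ratings, n_users, n_items, lam=0.1, iters=20):
    """Alternate exact least-squares updates of user and item factors."""
    u = [1.0] * n_users
    v = [1.0] * n_items
    history = []
    for _ in range(iters):
        # Fix v: each user's factor is an independent scalar least-squares problem.
        for i in range(n_users):
            obs = [(j, r) for (a, j), r in ratings.items() if a == i]
            num = sum(r * v[j] for j, r in obs)
            den = lam + sum(v[j] ** 2 for j, _ in obs)
            u[i] = num / den
        # Fix u: each item's factor is solved the same way.
        for j in range(n_items):
            obs = [(i, r) for (i, b), r in ratings.items() if b == j]
            num = sum(r * u[i] for i, r in obs)
            den = lam + sum(u[i] ** 2 for i, _ in obs)
            v[j] = num / den
        history.append(loss(ratings, u, v, lam))
    return u, v, history
```

Calling `als_rank1(ratings, n_users, n_items)` on a small rating dictionary returns the two factor lists and the per-iteration loss. Because each half-step minimizes the objective exactly over one block of variables, the recorded loss never increases. The slow tail of that decrease is the convergence cost that an acceleration scheme such as the ALS-NCG algorithm described in the abstract aims to reduce.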
