Computer Engineering, 2018, Vol. 44, Issue (12): 316-320. doi: 10.19678/j.issn.1000-3428.0049038

• Development Research and Engineering Application •

A Collaborative Filtering Algorithm Based on Two-stage Joint Hashing

ZHANG Huiyi, HOU Yaozu, TAO Tao

  1. College of Computer Science, Anhui University of Technology, Maanshan, Anhui 243032, China
  • Received: 2017-10-23 Online: 2018-12-15 Published: 2018-12-15
  • About the authors: ZHANG Huiyi (born 1963), male, professor, main research interest: machine learning; HOU Yaozu, master's student; TAO Tao, associate professor.
  • Funding: Key Natural Science Research Project of Anhui Provincial Universities (KJ2017A063).

Abstract:

Traditional recommendation algorithms that rely on similarity measurement and nearest-neighbor retrieval suffer from heavy computation and low recommendation efficiency when faced with massive high-dimensional data. To address this problem, a Two-stage Joint Hashing (TSH) collaborative filtering algorithm based on the user and item perspectives is proposed. First, Principal Component Analysis (PCA) and iterative quantization are applied to the rating data from either the user or the item perspective to generate the corresponding binary codes. Then, the binary codes of the other perspective are generated by using the ratings to constrain the Hamming distance between users and items. Finally, the top-K recommendation task is completed with the binary codes. Experimental results on the MovieLens-1M dataset show that, compared with the ITQ and BinMF algorithms, the proposed algorithm effectively reduces the computational cost of the recommendation process and improves recommendation quality.

Key words: Two-stage Joint Hashing (TSH), collaborative filtering, Principal Component Analysis (PCA), iterative quantization, Hamming distance
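The first step named in the abstract (PCA followed by iterative quantization to obtain binary codes) and the final Hamming-distance ranking can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `pca_itq_codes` and `hamming_top_k`, the iteration count, and the random initialization are assumptions made for illustration, and the rotation update is the standard orthogonal Procrustes step commonly used for iterative quantization. The rating-constrained generation of the other perspective's codes is not reproduced here, because the abstract does not give its formulation.

```python
import numpy as np


def pca_itq_codes(X, n_bits, n_iter=50, seed=0):
    """Hypothetical helper: binary codes for the rows of X via PCA + ITQ.

    X is an (n_samples, n_features) rating matrix; n_bits must not exceed
    min(n_samples, n_features).
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                      # center each feature
    # PCA: rows of Vt are principal directions, strongest first.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Xc @ Vt[:n_bits].T                       # (n_samples, n_bits)
    # Start from a random orthogonal rotation.
    R, _ = np.linalg.qr(rng.standard_normal((n_bits, n_bits)))
    for _ in range(n_iter):
        # Fix R, update codes: B = sign(V R) with entries in {-1, +1}.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Fix B, update R: orthogonal Procrustes solution via SVD of V^T B.
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt
    return (V @ R >= 0).astype(np.uint8)         # codes as 0/1 bits


def hamming_top_k(query_code, item_codes, k):
    """Hypothetical helper: indices of the k codes closest in Hamming distance."""
    dists = np.count_nonzero(item_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")[:k]


if __name__ == "__main__":
    # Toy rating matrix (assumed data): 6 users x 5 items, 0 = unrated.
    ratings = np.array([
        [5, 4, 0, 1, 0],
        [4, 5, 1, 0, 0],
        [0, 1, 5, 4, 0],
        [1, 0, 4, 5, 2],
        [0, 0, 2, 1, 5],
        [5, 5, 0, 0, 1],
    ], dtype=float)
    user_codes = pca_itq_codes(ratings, n_bits=3)
    # Rank the other users by Hamming distance to user 0's code.
    print(hamming_top_k(user_codes[0], user_codes[1:], k=2))
```

Comparing binary codes by Hamming distance needs only bit comparisons and counts, which is consistent with the reduced computational cost the abstract attributes to hashing-based recommendation.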

CLC number: