Computer Engineering (计算机工程)

• Artificial Intelligence and Recognition Technology •

Query Recommendation Based on Collaborative Similarity Calculation

SHI Yan (石雁), LI Chaofeng (李朝锋)

  1. (School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China)
  • Received: 2015-07-27 Online: 2016-08-15 Published: 2016-08-15
  • About the authors: SHI Yan (born 1986), male, master's degree, main research interests: search engines and recommender systems; LI Chaofeng, professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61170120).

Abstract: The historical search-click data of a single user is sparse, which tends to make query recommendations inaccurate and prevents diverse queries from being offered. To address this, the paper treats each user's query log as a document and uses the vector space model to compute the similarity between these documents. The frequency with which a user clicked a link in the historical data is taken as that user's preference score for the link, and an improved Euclidean distance is used to find the user's nearest neighbors. This yields the set of users similar to the current user, whose historical behavior data is then added to the single user's data. A naive Bayes model is trained on the data to predict the click-through rate of each query-link pair; these rates serve as edge weights in the click graph, and click propagation over the graph generates the query recommendations. Experimental results show that the method achieves relatively high precision and mean average precision.

Keywords: query recommendation, nearest neighbor, vector space model, Euclidean distance, naive Bayes, click prediction
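
The similar-user step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the "improved" Euclidean distance or how the two measures are combined, so the sketch assumes a plain Euclidean distance over normalized click frequencies and an equal-weight average of the two scores. All function names and the data layout are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between two query logs treated as bag-of-words documents."""
    tf_a, tf_b = Counter(doc_a), Counter(doc_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def click_preferences(clicks):
    """Convert raw click counts per link into frequencies that sum to 1."""
    total = sum(clicks.values())
    if total == 0:
        return {}
    return {link: n / total for link, n in clicks.items()}


def euclidean_distance(pref_a, pref_b):
    """Plain Euclidean distance over the union of links; a missing link counts as 0."""
    links = pref_a.keys() | pref_b.keys()
    return math.sqrt(
        sum((pref_a.get(link, 0.0) - pref_b.get(link, 0.0)) ** 2 for link in links)
    )


def nearest_neighbors(target, users, k=2):
    """Return the k users most similar to `target` as (user_id, score) pairs.

    Assumption: the score is an equal-weight average of the query-log cosine
    similarity and a click-based similarity 1 / (1 + distance).
    """
    me = users[target]
    my_prefs = click_preferences(me["clicks"])
    scored = []
    for uid, data in users.items():
        if uid == target:
            continue
        text_sim = cosine_similarity(me["queries"], data["queries"])
        dist = euclidean_distance(my_prefs, click_preferences(data["clicks"]))
        click_sim = 1.0 / (1.0 + dist)
        scored.append((uid, 0.5 * text_sim + 0.5 * click_sim))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In a pipeline like the one the abstract outlines, the click records of the returned neighbors would then be merged into the target user's data before the click-prediction step.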

CLC number: