
Computer Engineering (计算机工程), 2009, Vol. 35, Issue 8: 47-49. doi: 10.3969/j.issn.1000-3428.2009.08.016

• Software Technology and Database •

Improved Mining Approach of User’s Preferred Browsing Paths
(一种改进的用户浏览偏爱路径挖掘方法)

REN Yong-gong (任永功), FU Yu (付玉), ZHANG Liang (张亮)

  1. (School of Computer and Information Technology, Liaoning Normal University, Dalian 116029)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-04-20 Published: 2009-04-20

Abstract: This paper proposes an approach to mining users' preferred browsing paths from Web logs, based on a "three-matrix" model. On top of a cell-array storage structure (the storage matrix), the approach builds a session matrix and a path matrix whose basic elements are browsing interest levels. On the session matrix, pages are clustered using the cosine of the angle between two page vectors as the page distance formula for similar users, which yields the set of related pages for similar users. The path choice preference is then applied to the similar users' path matrix to mine their preferred browsing paths. Experiments show that the method is reasonable and effective and that it obtains more accurate preferred browsing paths.

Key words: Web logs, browsing interest level, page clustering algorithm
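
The page-clustering step named in the abstract can be illustrated with a minimal sketch. The abstract states only that the cosine of the angle between two page vectors in the session matrix serves as the page distance; everything else here is an assumption made for illustration: the matrix layout (rows are sessions, columns are pages), the interest values, the similarity threshold, and the simple grouping of pages whose similarity reaches that threshold. The function names `page_cosine` and `cluster_pages` are hypothetical and are not taken from the paper.

```python
import math


def page_cosine(session_matrix, page_a, page_b):
    """Cosine of the angle between two page columns of a session matrix.

    Assumption: rows are sessions, columns are pages, and each cell
    holds a browsing interest level (0 means the page was not visited).
    """
    col_a = [row[page_a] for row in session_matrix]
    col_b = [row[page_b] for row in session_matrix]
    dot = sum(x * y for x, y in zip(col_a, col_b))
    norm_a = math.sqrt(sum(x * x for x in col_a))
    norm_b = math.sqrt(sum(y * y for y in col_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a page nobody visited is similar to nothing
    return dot / (norm_a * norm_b)


def cluster_pages(session_matrix, threshold):
    """Group pages whose pairwise cosine reaches the threshold.

    Assumption: a simple transitive grouping (union-find) stands in for
    the clustering procedure, which the abstract does not specify.
    """
    n_pages = len(session_matrix[0])
    parent = list(range(n_pages))

    def find(i):
        # Follow parent links to the group representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(n_pages):
        for b in range(a + 1, n_pages):
            if page_cosine(session_matrix, a, b) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for page in range(n_pages):
        clusters.setdefault(find(page), []).append(page)
    return sorted(clusters.values())


# Made-up interest levels: 3 sessions (rows) over 4 pages (columns).
sessions = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.6, 0.1, 0.0],
    [0.0, 0.1, 0.8, 0.9],
]

print(cluster_pages(sessions, 0.9))  # prints [[0, 1], [2, 3]]
```

With this toy data, pages 0 and 1 are visited with interest by the same sessions, so their vectors point in nearly the same direction and they fall into one related-page set; pages 2 and 3 form the other.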
