Abstract: This paper proposes a method for mining users' preferred browsing paths from Web logs based on a "three-matrix" model. On top of a cell-array storage structure (the storage matrix), the method builds a session matrix and a path matrix whose basic elements are browsing interest levels. On the session matrix, the cosine of the angle between two page vectors serves as the page distance formula for similar users; pages are clustered with this measure to obtain the set of related pages for similar users. Path choice-preference is then applied to the path matrix of the similar users to mine their preferred browsing paths. Experiments show that the method is reasonable and effective and that it yields more accurate preferred browsing paths.
Key words:
Web logs,
browsing interest level,
page clustering algorithm
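The page-similarity step described in the abstract can be illustrated with a minimal sketch. It assumes a session matrix with one row per session and one column per page, where each entry is a browsing interest level. The function names, the sample values, and the threshold-based pairing are hypothetical illustrations; the abstract does not specify how the paper turns the cosine measure into clusters.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def page_vectors(session_matrix):
    """Transpose a sessions-by-pages matrix into one interest vector per page."""
    return [list(column) for column in zip(*session_matrix)]


def similar_page_pairs(session_matrix, threshold):
    """Return index pairs of pages whose cosine similarity reaches the threshold.

    Assumption: a simple pairwise threshold stands in for the paper's
    clustering procedure, which the abstract does not detail.
    """
    pages = page_vectors(session_matrix)
    pairs = []
    for i in range(len(pages)):
        for j in range(i + 1, len(pages)):
            if cosine_similarity(pages[i], pages[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical session matrix: rows are sessions, columns are pages,
# entries are made-up browsing interest levels.
SESSIONS = [
    [0.9, 0.8, 0.0],
    [0.7, 0.6, 0.1],
    [0.0, 0.1, 0.9],
]

if __name__ == "__main__":
    print(similar_page_pairs(SESSIONS, threshold=0.8))
```

With these made-up values, pages 0 and 1 attract interest from the same sessions and are the only pair that passes a threshold of 0.8; page 2 is visited by a different session and stays separate.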
REN Yong-gong; FU Yu; ZHANG Liang. Improved Mining Approach of User's Preferred Browsing Paths[J]. Computer Engineering (计算机工程), 2009, 35(8): 47-49.