Improvement of GITC Algorithm on Web Log Mining

doi:10.3969/j.issn.1000-3428.2008.04.021

Computer Engineering, 2008, Vol. 34, Issue (4): 60-62.

GUO Wei

(Dept. of Computer Science and Technology, Anhui University of Science and Technology, Huainan 232001)

Received:1900-01-01 Revised:1900-01-01 Online:2008-02-20 Published:2008-02-20

Abstract

Abstract: The GITC and Tree-DM algorithms are both mining algorithms based on the intersection relation. This paper analyzes the performance characteristics of the two algorithms and proposes GI, an improved version of GITC. GI keeps support counts in a suitable data structure, which avoids the time-consuming scans of the original database otherwise needed to count the support of each candidate. It also addresses the repeated and redundant intersection computations found in Tree-DM. Experimental results show that GI mines users' frequent access patterns more efficiently than GITC.

Key words: Web log mining, frequent access pattern, intersection relation
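
The abstract does not describe the data structure GI uses, so the following is only a minimal sketch of the general idea it mentions: record support information once, then answer support queries by intersection instead of rescanning the log. It uses the well-known vertical (session-id set) representation, not the GI, GITC, or Tree-DM algorithms themselves. The function names, the sample sessions, and the simplification of treating a session as an unordered set of pages are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_index(sessions):
    """Map each page to the set of session ids that visit it (one pass over the log)."""
    index = defaultdict(set)
    for sid, pages in enumerate(sessions):
        for page in pages:
            index[page].add(sid)
    return index


def support(index, pattern):
    """Support count of a page set, computed from the index alone (no rescan)."""
    sid_sets = [index.get(page, set()) for page in pattern]
    if not sid_sets:
        return 0
    return len(set.intersection(*sid_sets))


def frequent_pairs(index, min_support):
    """All page pairs whose support count reaches min_support."""
    # Only pages that are frequent on their own can form a frequent pair.
    pages = sorted(p for p, sids in index.items() if len(sids) >= min_support)
    result = {}
    for pair in combinations(pages, 2):
        count = support(index, pair)
        if count >= min_support:
            result[pair] = count
    return result


# Hypothetical sessions: each is the set of pages one user visited.
sessions = [
    {"/home", "/news", "/about"},
    {"/home", "/news"},
    {"/home", "/about"},
    {"/news", "/contact"},
]
index = build_index(sessions)
pairs = frequent_pairs(index, min_support=2)
```

Here the sessions are read once in `build_index`. Every later support query is an intersection of stored session-id sets, which is the kind of saving over repeated database scans that the abstract attributes to GI.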

CLC Number: TP311.12

GUO Wei. Improvement of GITC Algorithm on Web Log Mining[J]. Computer Engineering, 2008, 34(4): 60-62.

https://www.ecice06.com/CN/Y2008/V34/I4/60
