
Computer Engineering ›› 2010, Vol. 36 ›› Issue (3): 1-3. doi: 10.3969/j.issn.1000-3428.2010.03.001

• Doctoral Dissertation •

Clustering Algorithm Based on Web Pages Structure and Dominant Color

ZHAO Juan-juan, CHEN Jun-jie, LI Yuan-jun

  1. (College of Computer and Software, Taiyuan University of Technology, Taiyuan 030024)
  • Received:1900-01-01 Revised:1900-01-01 Online:2010-02-05 Published:2010-02-05

Abstract: To address the low accuracy of current Web clustering, this paper proposes a clustering algorithm based on the link structure of Web pages and the dominant color of the images they contain. The similarity between pages is measured by analyzing both the link structure of each page and the dominant color of the images displayed on it, and the pages of a Web site are then clustered according to this similarity. The clustering process thus takes into account both page structure and the main color features of each page. Experimental results show that the algorithm effectively improves clustering accuracy.

Key words: clustering, Web mining, link structure, dominant color
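
The abstract does not state how the two similarity cues are computed or combined, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes (hypothetically) that link similarity is the Jaccard overlap of outgoing link sets, that the dominant color is the most frequent bin of a coarsely quantized RGB histogram, that the two scores are mixed with a weight `alpha`, and that pages are grouped by a simple greedy threshold rule. All function names, the `alpha` and `threshold` parameters, and the sample pages are made up for illustration.

```python
from collections import Counter


def dominant_color(pixels, levels=4):
    """Most frequent quantized RGB bin among (r, g, b) pixels (assumed definition)."""
    step = 256 // levels
    bins = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    return bins.most_common(1)[0][0]


def color_similarity(c1, c2, levels=4):
    """1.0 for identical bins, decreasing linearly to 0.0 at the largest bin distance."""
    max_dist = 3 * (levels - 1)
    dist = sum(abs(a - b) for a, b in zip(c1, c2))
    return 1.0 - dist / max_dist


def link_similarity(links_a, links_b):
    """Jaccard overlap of two collections of outgoing link targets."""
    a, b = set(links_a), set(links_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def page_similarity(p, q, alpha=0.5):
    """Weighted mix of link and color similarity; alpha is a hypothetical weight."""
    return (alpha * link_similarity(p["links"], q["links"])
            + (1 - alpha) * color_similarity(p["color"], q["color"]))


def cluster(pages, threshold=0.6, alpha=0.5):
    """Greedy single pass: a page joins the first cluster whose seed page is similar enough."""
    clusters = []
    for name, page in pages.items():
        for members in clusters:
            seed = pages[members[0]]
            if page_similarity(page, seed, alpha) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical pages: two reddish pages sharing links, one bluish page with other links.
pages = {
    "news1": {"links": ["/a", "/b", "/c"],
              "color": dominant_color([(250, 10, 10)] * 3 + [(0, 0, 0)])},
    "news2": {"links": ["/a", "/b", "/d"],
              "color": dominant_color([(240, 20, 5)] * 2)},
    "shop1": {"links": ["/x", "/y"],
              "color": dominant_color([(10, 10, 240)] * 4)},
}

print(cluster(pages))  # [['news1', 'news2'], ['shop1']]
```

In this toy run the two "news" pages share half of their links and the same dominant color bin, so their combined score (0.75) clears the threshold, while the "shop" page differs on both cues and starts its own cluster. A real system would need to extract links and image pixels from the pages and would likely use a more robust clustering procedure.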

CLC Number: