Abstract:
To address the low accuracy of current Web clustering methods, this paper proposes a clustering algorithm based on the link structure of Web pages and the dominant color of the images they contain. The algorithm measures the similarity between pages by analyzing their link structure and the dominant color of the images displayed on them, and uses this similarity to cluster the pages of a Web site. The clustering process therefore takes into account both the page structure and the main color characteristics of each page. Experimental results show that the algorithm effectively improves clustering accuracy.
Key words:
clustering,
Web mining,
linkage structure,
dominant color
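The abstract does not give the similarity formula or the clustering procedure, so the following is only a minimal sketch of the general idea of combining link-structure similarity with dominant-color similarity. Everything in it is an assumption made for illustration and is not taken from the paper: the Jaccard measure over outgoing links, the coarse RGB quantization used to pick a dominant color, the weight `alpha`, the `threshold`, and the greedy single-pass clustering.

```python
from collections import Counter


def link_similarity(links_a, links_b):
    """Jaccard similarity of two collections of outgoing link targets (assumed measure)."""
    a, b = set(links_a), set(links_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def dominant_color(pixels, levels=4):
    """Most frequent RGB color after quantizing each channel into `levels` bins."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    return counts.most_common(1)[0][0]


def color_similarity(c1, c2, levels=4):
    """1 minus the normalized Euclidean distance between two quantized colors."""
    max_dist = (3 * (levels - 1) ** 2) ** 0.5
    dist = sum((x - y) ** 2 for x, y in zip(c1, c2)) ** 0.5
    return 1.0 - dist / max_dist


def page_similarity(page_a, page_b, alpha=0.5):
    """Weighted mix of link and color similarity; `alpha` is a hypothetical weight."""
    s_link = link_similarity(page_a["links"], page_b["links"])
    s_color = color_similarity(dominant_color(page_a["pixels"]),
                               dominant_color(page_b["pixels"]))
    return alpha * s_link + (1 - alpha) * s_color


def cluster(pages, threshold=0.5, alpha=0.5):
    """Greedy single pass: join the first cluster whose first member is similar enough."""
    clusters = []  # each cluster is a list of page indices
    for i, page in enumerate(pages):
        for members in clusters:
            if page_similarity(page, pages[members[0]], alpha) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


if __name__ == "__main__":
    # Toy pages: hypothetical link lists and image pixels as (R, G, B) tuples.
    pages = [
        {"links": ["/a", "/b"], "pixels": [(250, 10, 10)] * 3},
        {"links": ["/a", "/b", "/c"], "pixels": [(240, 20, 5)] * 3},
        {"links": ["/x"], "pixels": [(5, 5, 250)] * 3},
    ]
    print(cluster(pages))
```

In this toy input the first two pages share most of their links and a reddish dominant color, while the third page has different links and a bluish one. A single weight `alpha` is used here only to make the trade-off between structure and color explicit.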
摘要: 针对目前Web聚类准确率不高的问题,提出一种基于Web页面链接结构和页面中图片主色调特征的聚类算法。通过分析Web页面中的链接结构和Web页面中所显示图片的主色调来比较页面之间的相似度,对Web站点中的Web页面进行聚类。聚类过程兼顾Web页面结构和页面的主要色彩特征。系统实验结果表明,该算法能有效提高聚类的准确性。
关键词:
聚类,
Web挖掘,
链接结构,
主色调
ZHAO Juan-juan; CHEN Jun-jie; LI Yuan-jun. Clustering Algorithm Based on Web Pages Structure and Dominant Color[J]. Computer Engineering, 2010, 36(3): 1-3.
赵涓涓;陈俊杰;李元俊. 基于Web页面结构和主色调的聚类算法[J]. 计算机工程, 2010, 36(3): 1-3.