Computer Engineering (计算机工程), 2007, Vol. 33, Issue 11: 56-58. doi: 10.3969/j.issn.1000-3428.2007.11.021

• Software Technology and Database •

Hierarchical Classification-based PageRank Algorithm

LI Shaohua (李绍华)¹,², GAO Wenyu (高文宇)²

  1. Guangdong Province Key Lab of Electronic Commerce Market Application Technology, Guangzhou 510320
  2. Department of Computer Science and Technology, Guangdong University of Business Studies, Guangzhou 510320
  • Online: 2007-06-05 Published: 2007-06-05

Abstract: This paper proposes HC-PageRank, a page-ranking algorithm for search engines based on hierarchical classification. The algorithm classifies pages into a hierarchy of categories and uses that classification to compute the relevance between pages. Incoming links from different pages are then given different weights according to this relevance, so that the PageRank value of each page is computed more fairly and effectively. A hierarchical classification scheme reflects the natural properties of pages more accurately and also makes it easier to design more efficient page classification algorithms. The online computational complexity of HC-PageRank is the same as that of PageRank. Like PageRank, it is independent of the query keywords, so it can serve online searches efficiently and scales well.

Key words: Search engine, Hierarchical classification, PageRank
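
The idea of weighting incoming links by category relevance can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the relevance measure (shared prefix length of two category paths divided by the longer path length), the `floor` constant that keeps unrelated links from vanishing entirely, and the function names `category_similarity` and `weighted_pagerank`. It should not be read as the authors' HC-PageRank definition.

```python
def category_similarity(path_a, path_b):
    """Assumed relevance of two category paths, in [0, 1].

    A path is a tuple from the root category down to the leaf,
    e.g. ("computers", "search"). The score is the length of the
    shared prefix divided by the length of the longer path.
    """
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    longest = max(len(path_a), len(path_b))
    return shared / longest if longest else 0.0


def weighted_pagerank(links, categories, damping=0.85, iterations=50, floor=0.1):
    """PageRank in which each link is weighted by category relevance.

    links:      dict mapping a page to the list of pages it links to
    categories: dict mapping every page to its category path
    floor:      hypothetical minimum weight given to any link
    """
    pages = sorted(categories)
    n = len(pages)

    # Offline step: turn each page's out-links into weights that sum to 1.
    weights = {}
    for src in pages:
        raw = {
            dst: floor + category_similarity(categories[src], categories[dst])
            for dst in links.get(src, [])
        }
        total = sum(raw.values())
        weights[src] = {dst: w / total for dst, w in raw.items()} if total else {}

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not weights[p])
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for src, out in weights.items():
            for dst, w in out.items():
                new_rank[dst] += damping * rank[src] * w
        rank = new_rank
    return rank


# Toy graph: "hub" links to one page in its own category and one unrelated page.
example_categories = {
    "hub": ("computers", "search"),
    "same": ("computers", "search"),
    "other": ("sports", "football"),
}
example_links = {"hub": ["same", "other"], "same": ["hub"], "other": ["hub"]}
example_ranks = weighted_pagerank(example_links, example_categories)
```

In this sketch the link weights depend only on the pages' categories, never on a query, so they can be computed once offline. At query time only the stored rank values are read, which is consistent with the abstract's claim that the online cost matches that of plain PageRank.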
