
Computer Engineering (计算机工程) ›› 2008, Vol. 34 ›› Issue (9): 101-102. doi: 10.3969/j.issn.1000-3428.2008.09.036

• Software Technology and Database •

大型Web站点逻辑域挖掘算法

郑皎凌   

Large Scale Website Logical Domain Mining Algorithm

ZHENG Jiao-ling   

  1. (Department of Software Engineering, Chengdu University of Information Technology, Chengdu 610225)
  • Online: 2008-05-05 Published: 2008-05-05

Abstract: Building on the website logical domain theory of Wen-Syan Li et al., this paper proposes a logical domain core model for websites and a logical domain mining algorithm based on it. The algorithm operates on the graph structure of a website’s hyperlinks to obtain the site’s logical domains. In comparative tests against Wen-Syan Li’s algorithm, it obtains the same number of logical domains while avoiding the efficiency problems caused by that algorithm’s heuristic method. In separate tests on 4 large-scale websites, the logical domain core mining precision reaches 85% on average.

Key words: website structure mining, logical domain, logical domain core
