Computer Engineering, 2008, Vol. 34, Issue 9: 101-102. doi: 10.3969/j.issn.1000-3428.2008.09.036

• Software Technology and Database •

Large Scale Website Logical Domain Mining Algorithm

ZHENG Jiao-ling   

  1. (Department of Software Engineering, Chengdu University of Information Technology, Chengdu 610225)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2008-05-05 Published: 2008-05-05

Abstract: Extending the website logical domain theory of Wen-Syan Li et al., this paper proposes a logical domain core model for websites and a logical domain mining algorithm built on that model. The algorithm operates on the graph structure of a website's hyperlinks to obtain the site's logical domains. In a comparative test against Wen-Syan Li's algorithm, it obtains the same number of logical domains while avoiding the efficiency problems of that algorithm's heuristic method. In separate tests on 4 large-scale websites, the logical domain core mining precision reaches 85% on average.

Key words: website structure mining, logical domain, logical domain core
