Computer Engineering, 2010, Vol. 36, Issue (8): 286-288.

• Developmental Research •

Research of Co-occurrence Words Search-based Topic Crawler

GE Ling, JIANG Zong-li   

  1. (College of Computer, Beijing University of Technology, Beijing 100124)
  • Received:1900-01-01 Revised:1900-01-01 Online:2010-04-20 Published:2010-04-20

基于共现词查询的主题爬虫研究

葛 玲,蒋宗礼   

Abstract: This paper improves the topic model by building a co-occurrence word database, which raises the topical relevance and quality of downloaded Web pages. The model can also describe the context in which keywords appear, infer user intent, and adjust the ranking of search results. On this basis, an FDC topic crawler system is designed and implemented; it uses an improved topic-sensitive FDC-PageRank algorithm to compute the priority of Web pages. Experiments show that the system performs well.

Key words: topic crawler, co-occurrence words, FDC topic model, FDC_Topic Sensitive PageRank algorithm
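
The abstract does not specify the FDC topic model or the FDC-PageRank formula, so the following is only a minimal sketch of the general idea it describes: weighting words that co-occur with a topic keyword, scoring a page against those weights, and using the score to order a crawl frontier. All names here (`cooccurrence_weights`, `relevance`, `Frontier`, the `window` parameter) and the scoring rule are illustrative assumptions, not the authors' method, and a plain relevance score stands in for the PageRank-based priority.

```python
import heapq
from collections import Counter


def cooccurrence_weights(docs, keyword, window=2):
    """Count words within `window` tokens of `keyword`; normalize counts to sum to 1.

    Assumption: a fixed token window defines co-occurrence (the paper may differ).
    """
    counts = Counter()
    for tokens in docs:
        for i, tok in enumerate(tokens):
            if tok != keyword:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def relevance(tokens, keyword, weights):
    """Hypothetical score: keyword hits count 1, co-occurring words count their weight,
    averaged over the page length."""
    if not tokens:
        return 0.0
    hits = sum(1.0 if t == keyword else weights.get(t, 0.0) for t in tokens)
    return hits / len(tokens)


class Frontier:
    """Max-priority URL queue; heapq is a min-heap, so scores are stored negated."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def push(self, url, score):
        # Each URL is queued at most once; later pushes of the same URL are ignored.
        if url not in self._seen:
            self._seen.add(url)
            heapq.heappush(self._heap, (-score, url))

    def pop(self):
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score
```

In this sketch a crawler would score each fetched page with `relevance`, push its outgoing links with that score, and always fetch the highest-scoring URL next. A link-analysis score such as topic-sensitive PageRank could replace the plain relevance value without changing the frontier.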
