
Computer Engineering ›› 2010, Vol. 36 ›› Issue (17): 288-290. doi: 10.3969/j.issn.1000-3428.2010.17.098

• Development Research and Design Technology •

Improvement of ProActive-based P-Spider1.0

ZHANG Lin-cai1, LIANG Zheng-you2, WANG Hong-xia3

  (1. School of Computer and Communication Engineering, Liaoning Shihua University, Fushun 113001; 2. School of Computer and Electronic Information, Guangxi University, Nanning 530004; 3. Computer Department, Beijing Young and Political College, Beijing 100102)
  • Online: 2010-09-05 Published: 2010-09-02
  • About the authors: ZHANG Lin-cai (born 1978), male, M.S.; main research interests: networks and parallel computing, search engines. LIANG Zheng-you, professor. WANG Hong-xia, Ph.D.
  • Supported by:
    Scientific Research Fund of the Guangxi Education Department (Gui Jiao Ke Yan [2006] No. 26); Doctoral Start-up Fund of Guangxi University

Abstract: A distributed parallel Web Spider built around a center node suffers from an overloaded center node, unbalanced communication load, and poor scalability. To address these problems, this paper proposes an improved URL deduplication algorithm based on the Rabin fingerprint algorithm, together with an improved scheme that adopts a peer-to-peer node structure, and uses the ProActive middleware to design and develop the improved distributed parallel Web Spider. Comparative experiments show that the improved Web Spider achieves higher collection efficiency and a balanced communication load, has no node bottleneck, and scales well.

Key words: Web Spider, ProActive middleware, Peer-to-Peer (P2P), distributed, center node
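
The abstract names two ideas, Rabin-fingerprint URL deduplication and a peer-to-peer node layout, without giving their details. The following is a minimal sketch of how the two might fit together, not the paper's implementation. It is written in Python for brevity, whereas the actual system is built on the Java-based ProActive middleware. The names `rabin_fingerprint`, `owner_node`, and `PeerNode`, the fixed modulus polynomial, and the "fingerprint modulo node count" routing rule are all assumptions made for illustration.

```python
# Illustrative sketch only; every name and constant here is an assumption.

POLY_DEGREE = 64
# Fixed modulus x^64 + x^4 + x^3 + x + 1 over GF(2). Rabin's scheme normally
# draws a random irreducible polynomial; a fixed one keeps the sketch short.
POLY = (1 << POLY_DEGREE) | 0b11011


def rabin_fingerprint(data: bytes) -> int:
    """Read data as a polynomial over GF(2) and reduce it modulo POLY."""
    fp = 0
    for byte in data:
        for bit in range(7, -1, -1):
            # Multiply by x and add the next coefficient.
            fp = (fp << 1) | ((byte >> bit) & 1)
            # If the degree reached 64, subtract (XOR) the modulus.
            if fp >> POLY_DEGREE:
                fp ^= POLY
    return fp


def owner_node(url: str, num_nodes: int) -> int:
    """Pick the peer responsible for a URL (hypothetical routing rule)."""
    return rabin_fingerprint(url.encode("utf-8")) % num_nodes


class PeerNode:
    """One peer that deduplicates only the URLs routed to it."""

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        self.seen = set()  # fingerprints of URLs already accepted

    def accept(self, url: str) -> bool:
        """Return True if the URL is new to this peer, False if a duplicate."""
        fp = rabin_fingerprint(url.encode("utf-8"))
        if fp in self.seen:
            return False
        self.seen.add(fp)
        return True


if __name__ == "__main__":
    nodes = [PeerNode(i) for i in range(4)]
    for url in ["http://example.com/a", "http://example.com/b",
                "http://example.com/a"]:
        target = nodes[owner_node(url, len(nodes))]
        print(target.node_id, url, "new" if target.accept(url) else "duplicate")
```

In this sketch each peer stores compact fingerprints rather than full URL strings, and every URL has exactly one owning peer, so no single node has to hold the global visited set. Whether the paper partitions URLs this way is not stated in the abstract.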

CLC Number: