
Computer Engineering ›› 2008, Vol. 34 ›› Issue (19): 47-48,5. doi: 10.3969/j.issn.1000-3428.2008.19.017

• Software Technology and Database •

Design of Distributed Parallel Web Spider Based on ProActive

ZHANG Lin-cai (张林才)1,2, LIANG Zheng-you (梁正友)1

  1. School of Computer and Electronic Information, Guangxi University, Nanning 530004; 2. School of Computer and Communication Engineering, Liaoning Shihua University, Fushun 113001
  • Online: 2008-10-05 Published: 2008-10-05

Abstract: A single-machine Web Spider collects data slowly, and developing a distributed Web Spider with MPI or directly in Java is costly. This paper uses the active object technology, network parallel computing technology, and automatic deployment mechanism provided by the ProActive middleware to design and implement P-Spider, a distributed parallel Web Spider. Experimental results show that the data collection rate of P-Spider is 2.2 times that of a single-machine multi-threaded Web Spider.

Key words: Web Spider program, ProActive middleware, parallel, distributed
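
The abstract names active objects as the building block of P-Spider but gives no code. As a rough illustration of the general active-object idea only, the sketch below uses plain `java.util.concurrent` rather than the ProActive API: each worker owns one service thread and a request queue, calls return futures immediately, and a coordinator spreads URLs over the workers. All names here (`SpiderSketch`, `ActiveFetcher`, `crawl`, `fetch`) are hypothetical, the fetch is simulated with a string instead of a network request, and nothing here is taken from the paper's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical coordinator: spreads URLs over several active-object workers. */
final class SpiderSketch {
    /** Assigns URLs round-robin to the workers and waits for every result. */
    static List<String> crawl(List<String> urls, int workerCount) {
        if (workerCount < 1) {
            throw new IllegalArgumentException("workerCount must be at least 1");
        }
        List<ActiveFetcher> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            workers.add(new ActiveFetcher("worker-" + i));
        }
        try {
            // Issue all requests first; each call returns a future immediately.
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < urls.size(); i++) {
                pending.add(workers.get(i % workerCount).fetch(urls.get(i)));
            }
            // Collect results in submission order; join() blocks until ready.
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> future : pending) {
                results.add(future.join());
            }
            return results;
        } finally {
            for (ActiveFetcher worker : workers) {
                worker.close();
            }
        }
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "http://example.com/a", "http://example.com/b", "http://example.com/c");
        for (String line : crawl(urls, 2)) {
            System.out.println(line);
        }
    }
}

/** Hypothetical worker: an object with its own thread and request queue. */
final class ActiveFetcher implements AutoCloseable {
    // A single-thread executor serves queued requests one at a time, in order.
    private final ExecutorService serviceThread = Executors.newSingleThreadExecutor();
    private final String name;

    ActiveFetcher(String name) {
        this.name = name;
    }

    /** Asynchronous call; the download is simulated (assumption, no network I/O). */
    CompletableFuture<String> fetch(String url) {
        return CompletableFuture.supplyAsync(() -> name + " fetched " + url, serviceThread);
    }

    @Override
    public void close() {
        serviceThread.shutdown();
    }
}
```

In this sketch the workers are threads inside one JVM. A middleware such as ProActive would place comparable objects on remote nodes, which this example does not attempt to show.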
