Computer Engineering ›› 2009, Vol. 35 ›› Issue (20): 273-275. doi: 10.3969/j.issn.1000-3428.2009.20.096

• Development Research and Design Technology •

ProActive-based Distributed Parallel Web Page Index Algorithm

LIANG Zheng-you (梁正友), CHEN Tao (陈涛)

  1. (School of Computer, Electronics and Information, Guangxi University, Nanning 530004)
  • Received:1900-01-01 Revised:1900-01-01 Online:2009-10-20 Published:2009-10-20

Abstract: Single-machine Web page indexers index slowly, yet the serial inverted index algorithm lends itself to parallel processing. Building on these two observations, this paper proposes a distributed parallel inverted index algorithm. The algorithm combines the ProActive distributed parallel computing middleware with the single-machine indexing package Lucene to design and implement a distributed parallel Web page indexer that runs on a cluster system. Experimental results show that the indexer achieves high indexing performance and good scalability.

Key words: inverted index, distributed parallel, middleware
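
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea behind a partitioned inverted index: split the pages across workers, build a local inverted index per partition, then merge the partial indexes. It does not use ProActive or Lucene. A local thread pool stands in for cluster nodes purely to show the structure, and the function names (`build_local_index`, `merge_indexes`, `parallel_index`) and the sample pages are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_local_index(docs):
    """Build an inverted index (term -> set of doc ids) for one partition."""
    index = defaultdict(set)
    for doc_id, text in docs:
        # Naive tokenization: lowercase and split on whitespace (an assumption).
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def merge_indexes(partials):
    """Merge per-partition indexes into one index with sorted posting lists."""
    merged = defaultdict(set)
    for partial in partials:
        for term, doc_ids in partial.items():
            merged[term] |= doc_ids
    return {term: sorted(ids) for term, ids in merged.items()}


def parallel_index(docs, workers=2):
    """Split docs round-robin across workers, index each part, then merge."""
    partitions = [docs[i::workers] for i in range(workers)]
    # The thread pool is a stand-in for remote nodes, not a cluster runtime.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(build_local_index, partitions))
    return merge_indexes(partials)


# Hypothetical sample pages: (doc id, page text).
pages = [
    (1, "distributed parallel index"),
    (2, "inverted index with Lucene"),
    (3, "parallel web page indexer"),
]
index = parallel_index(pages, workers=2)
```

Because each partition is indexed independently and the merge is a set union per term, the result should not depend on the number of workers. That independence is the property that makes the serial algorithm a candidate for parallel execution.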

CLC Number: