Computer Engineering ›› 2006, Vol. 32 ›› Issue (12): 80-81, 84.

• Software Technology and Database •

Research on Topic-Based Distributed Information Retrieval Technology

ZHANG Gang1,2, ZHOU Zhaotao1,2, WANG Bin1

  1. Software Division, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080; 2. Graduate School of Chinese Academy of Sciences, Beijing 100039
  • Online: 2006-06-20 Published: 2006-06-20

Abstract: This paper presents a topic-based distributed information retrieval method and analyzes in depth why the algorithm is effective. Documents are partitioned by topic using text clustering, and experiments show that the answers to a query concentrate in a small number of document collections. This indicates that topic-based distributed information retrieval significantly improves retrieval effectiveness over traditional distributed information retrieval methods.

Key words: distributed information retrieval; text clustering; K-means clustering
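
The idea summarized in the abstract, partitioning documents by topic with K-means and then sending a query only to the most relevant collections, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the fixed seed documents, the cosine-based (spherical) K-means variant, and the helper names `tf_vector`, `kmeans`, and `select_collections` are all assumptions made for illustration.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector, scaled to unit length."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()}

def cosine(a, b):
    """Dot product of two sparse unit vectors (their cosine similarity)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def mean_vector(vectors):
    """Sum the member vectors and rescale the result to unit length."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {t: w / norm for t, w in total.items()}

def kmeans(vectors, seeds, iterations=10):
    """K-means with cosine similarity; `seeds` are indices of the
    documents used as initial centroids (an illustrative choice)."""
    centroids = [vectors[i] for i in seeds]
    assignment = []
    for _ in range(iterations):
        # Assign every document to its most similar centroid.
        assignment = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Recompute each centroid from its members; keep it if the cluster is empty.
        for c in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = mean_vector(members)
    return assignment, centroids

def select_collections(query, centroids, n):
    """Rank topic collections by query-centroid similarity and keep the top n."""
    q = tf_vector(query)
    ranked = sorted(range(len(centroids)),
                    key=lambda c: cosine(q, centroids[c]), reverse=True)
    return ranked[:n]

# Hypothetical toy corpus with three obvious topics.
docs = [
    "kmeans clustering groups documents by topic",
    "text clustering with kmeans centroids",
    "football match ended with a late goal",
    "the goal keeper saved the football match",
    "stock market prices fell sharply",
    "market traders sold stock as prices fell",
]
vectors = [tf_vector(d) for d in docs]
assignment, centroids = kmeans(vectors, seeds=[0, 2, 4])
chosen = select_collections("football goal", centroids, n=1)
print(assignment, chosen)
```

In this sketch each cluster plays the role of one document collection, and a query is forwarded only to the collections returned by `select_collections`. If relevant documents really do concentrate in a few topic clusters, as the abstract reports, searching a small `n` of them should recover most of the answers.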