
Computer Engineering (计算机工程) ›› 2010, Vol. 36 ›› Issue (4): 36-38. doi: 10.3969/j.issn.1000-3428.2010.04.013

• Software Technology and Database •

Resource Selection in Distributed Information Retrieval Based on Set-covering

WANG Xiu-hong (王秀红)

  1. (Institute of Science and Technology Information, Jiangsu University, Zhenjiang 212013)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-02-20 Published: 2010-02-20

Abstract: Different data resources (collections) in Distributed Information Retrieval (DIR) often overlap. To account for this overlap, a resource selection algorithm based on set-covering theory is proposed. Each result retrieved for a query Q is assigned a weight according to its position in the merged ranking, and this weight measures the result's contribution to the collection that contains it. A result that already appears in a previously selected collection is not counted toward the score of any collection selected later. The score of each candidate collection is the weighted sum of its results that are not yet covered. At each step only the collection with the highest score is selected, and the order of selection gives the ranking of the resources. The resulting resource set can then be used to answer a query Q’ that is the same as or similar to Q, which reduces the time spent on repeated retrieval caused by overlap between databases.

Key words: Distributed Information Retrieval (DIR), collection selection, resource selection, set-covering
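
The selection procedure summarized in the abstract can be illustrated with a short greedy sketch. This is not the paper's implementation: the abstract does not state the weighting function, so the reciprocal-rank weight `1 / rank` used here is an assumption, and the names `rank_weights`, `select_resources`, and `holdings` are hypothetical. The sketch also assumes that each document carries an identifier shared across collections, so that overlap can be detected.

```python
from typing import Dict, List, Set, Tuple


def rank_weights(merged_results: List[str]) -> Dict[str, float]:
    """Assumed weighting (not from the paper): rank r (1-based) gets weight 1/r."""
    weights: Dict[str, float] = {}
    for rank, doc in enumerate(merged_results, start=1):
        # Keep the best (earliest) rank if a document id appears more than once.
        weights.setdefault(doc, 1.0 / rank)
    return weights


def select_resources(
    merged_results: List[str],
    holdings: Dict[str, Set[str]],
) -> List[Tuple[str, float]]:
    """Greedily order collections by the weight of results they newly cover.

    merged_results: document ids for query Q in merged rank order.
    holdings: collection name -> set of document ids it contains.
    Returns (collection, score) pairs in selection order.
    """
    weights = rank_weights(merged_results)
    covered: Set[str] = set()
    remaining = dict(holdings)
    order: List[Tuple[str, float]] = []

    while remaining:
        # Score each candidate using only results not covered by earlier picks.
        scores = {
            name: sum(weights.get(doc, 0.0) for doc in docs - covered)
            for name, docs in remaining.items()
        }
        best = max(scores, key=scores.get)
        if scores[best] == 0.0:
            break  # Remaining collections add no new results for Q.
        order.append((best, scores[best]))
        covered |= remaining.pop(best)

    return order


if __name__ == "__main__":
    merged = ["d1", "d2", "d3", "d4"]
    holdings = {"A": {"d1", "d2"}, "B": {"d2", "d3"}, "C": {"d4", "d1"}}
    for name, score in select_resources(merged, holdings):
        print(name, round(score, 3))
```

In this toy input, collection C has a higher raw score than B (it holds the top-ranked document d1), but d1 is already covered once A is selected, so B is chosen before C. That is the effect of discounting results that appeared in earlier selected collections.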
