Computer Engineering ›› 2008, Vol. 34 ›› Issue (19): 30-31,3. doi: 10.3969/j.issn.1000-3428.2008.19.011

• Software Technology and Database •

Parallel Retrieval Strategy Based on Anchor Text

GAO Shan, HE Ting-ting, HU Wen-min   

  1. (Department of Computer Science, Huazhong Normal University, Wuhan 430079)
  • Received:1900-01-01 Revised:1900-01-01 Online:2008-10-05 Published:2008-10-05

Abstract: In Web information retrieval, the anchor text in Web pages is highly relevant to page content, yet most retrieval systems ignore its contribution. This paper proposes a method for improving retrieval precision: two indices are built over the document collection, one on page content and one on anchor text, and the two are queried with a parallel retrieval strategy. Experimental results show that the method is effective for Web page collections with a particular structure.

Key words: anchor text, parallel retrieval, information retrieval
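
The two-index idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how anchor text is assigned to pages or how the two result lists are combined, so the sketch assumes that anchor text is attributed to the page a link points to and that scores are merged by a weighted sum. The names `build_index`, `search`, `parallel_search`, and `anchor_weight`, as well as the toy data, are hypothetical.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_index(texts):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in texts.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def search(index, query):
    """Score each document by the summed frequency of the query terms."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return scores


def parallel_search(body_index, anchor_index, query, anchor_weight=0.5):
    """Query both indices concurrently and merge the scores.

    Assumption: linear fusion with a hypothetical anchor_weight;
    the abstract does not state the merging rule.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        body_future = pool.submit(search, body_index, query)
        anchor_future = pool.submit(search, anchor_index, query)
        body_scores = body_future.result()
        anchor_scores = anchor_future.result()
    merged = {}
    for doc_id in set(body_scores) | set(anchor_scores):
        merged[doc_id] = ((1 - anchor_weight) * body_scores[doc_id]
                          + anchor_weight * anchor_scores[doc_id])
    # Highest score first; ties broken by doc_id for a stable order.
    return [d for d, _ in sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))]


# Toy collection (hypothetical): page bodies and the anchor text of
# links assumed to point at each page.
bodies = {
    "p1": "welcome to our site",
    "p2": "anchor text retrieval survey",
}
anchors = {
    "p1": "information retrieval lab home",
    "p2": "survey",
}
body_index = build_index(bodies)
anchor_index = build_index(anchors)
ranking = parallel_search(body_index, anchor_index, "retrieval lab")
```

In this toy collection, page `p1` never mentions the query terms in its body, so a body-only search would miss it. The anchor index still surfaces it, which is the kind of contribution from anchor text that the abstract describes.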
