Computer Engineering ›› 2010, Vol. 36 ›› Issue (2): 36-38. doi: 10.3969/j.issn.1000-3428.2010.02.013

• Software Technology and Database •

Modified Edit Distance Algorithm and Its Application in Web Search

YE Fu-jun

  1. (Dept. of Animation, Zhejiang Institute of Media and Communications, Hangzhou 310018)

  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-01-20 Published: 2010-01-20

Abstract: Traditional methods do not handle relevance ranking between short Web page fields and user queries well. To address this problem, a Modified Edit Distance (MED) ranking algorithm is proposed. During encoding and calculation, the algorithm incorporates the position, order, and distance of the query terms as they are distributed in the field, so that the relevance between a query and a short field is converted into the similarity between encoded strings. Simulation results show that the algorithm significantly outperforms traditional relevance ranking algorithms on short Web page fields, especially very short ones.

Key words: Web search, relevance ranking, edit distance, string matching
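The abstract reduces query-to-field relevance to the similarity of encoded strings, measured with an edit distance. The sketch below shows only the classic (unmodified) Levenshtein distance, together with a toy encoding invented here for illustration. The `encode` function, its letter and `*` symbols, and the sample titles are assumptions and are not the encoding or the modifications defined in the paper, which this page does not describe.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance: insert, delete, substitute each cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def encode(query_terms: list[str], field_tokens: list[str]) -> str:
    """Hypothetical encoding (not the paper's): the k-th query term becomes
    the k-th lowercase letter, and every other field token becomes '*'."""
    symbol = {term: chr(ord("a") + k) for k, term in enumerate(query_terms)}
    return "".join(symbol.get(tok, "*") for tok in field_tokens)


query = ["edit", "distance"]
target = encode(query, query)  # the query encoded against itself: "ab"

for title in (["edit", "distance", "algorithm"],
              ["distance", "of", "an", "edit"]):
    code = encode(query, title)
    # A smaller distance means the field keeps the query terms closer
    # to their original order and adjacency.
    print(" ".join(title), "->", code, edit_distance(target, code))
```

Under this toy encoding, a title that preserves the query terms' order and adjacency receives a smaller distance than one that reorders or separates them, which is the kind of position, order, and distance signal the abstract refers to.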
