Computer Engineering ›› 2011, Vol. 37 ›› Issue (12): 38-40. doi: 10.3969/j.issn.1000-3428.2011.12.013

• Software Technology and Database •

Sentence Similarity Computation in Information Retrieval

WANG Pin, HUANG Guang-jun

  1. (Electronic Information Engineering College, Henan University of Science and Technology, Luoyang 471003, China)
  • Received: 2010-11-20 Online: 2011-06-20 Published: 2011-06-20
  • About the authors: WANG Pin (born 1982), female, master's student, main research interest: Semantic Web; HUANG Guang-jun, associate professor, Ph.D.
  • Funding:
    Supported by the Science and Technology Research Program of Henan Province (102102210159)

Abstract: To improve the recall and the precision of information retrieval at the same time, this paper proposes an improved sentence similarity algorithm based on semantic dependency. Building on keyword similarity computation, it studies the similarity algorithm based on semantic dependency, adds semantic role labeling information when determining the weights of a sentence's effective collocation pairs so as to weight the algorithm, and demonstrates its feasibility with an example. Once system recall has been raised, the improved algorithm is used to re-rank the query results, which increases the precision of the top K returned results. Experimental data show that the precision of the top 20 returned documents after re-ranking is 3.6% higher than under the system's original ranking. The results indicate that the algorithm can effectively improve system precision.

Key words: information query, similarity, keywords, semantic dependency, dependency tree, re-ranking
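
The re-ranking step and the top-K precision measure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the similarity formula, so a plain keyword-set overlap (Jaccard) stands in for the semantic-dependency similarity, and the names `jaccard`, `rerank`, `precision_at_k`, and the `id`/`keywords` document fields are all hypothetical.

```python
from typing import Callable, Dict, List, Set

Doc = Dict[str, object]  # hypothetical shape: {"id": str, "keywords": set of str}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Stand-in similarity: keyword-set overlap, NOT the paper's measure."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rerank(query: Set[str], docs: List[Doc],
           sim: Callable[[Set[str], Set[str]], float] = jaccard) -> List[Doc]:
    """Return docs sorted by descending similarity to the query.

    sorted() is stable, so documents with equal scores keep the
    order in which the retrieval system originally returned them.
    """
    return sorted(docs, key=lambda d: sim(query, d["keywords"]), reverse=True)


def precision_at_k(ranked: List[Doc], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the first k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for d in ranked[:k] if d["id"] in relevant_ids)
    return hits / k


# Toy data, invented purely for illustration.
query = {"sentence", "similarity"}
docs = [
    {"id": "d1", "keywords": {"weather", "forecast"}},
    {"id": "d2", "keywords": {"sentence", "similarity", "retrieval"}},
    {"id": "d3", "keywords": {"sentence", "parsing"}},
]
relevant = {"d2", "d3"}

before = precision_at_k(docs, relevant, k=2)
after = precision_at_k(rerank(query, docs), relevant, k=2)
print(before, after)
```

Passing a different `sim` function to `rerank` is where a dependency-based, role-weighted similarity of the kind the abstract describes would plug in; the precision comparison before and after re-ranking stays the same.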
