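Abstract: To improve both recall and precision in information retrieval, this paper proposes an improved sentence similarity algorithm based on semantic dependency. Building on keyword similarity computation, it studies the similarity algorithm based on semantic dependency, incorporates semantic role labeling information when determining the weights of a sentence's effective collocation pairs, weights the algorithm accordingly, and demonstrates its feasibility with a worked example. With system recall already raised, the improved algorithm is used to re-rank the query results, which raises the precision of the top K returned results. Experimental data show that, after re-ranking, the precision of the top 20 returned documents is 3.6% higher than under the system's original ranking. The results indicate that the algorithm effectively improves system precision.
Keywords:
information retrieval,
similarity,
keyword,
semantic dependency,
dependency tree,
re-ranking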
Abstract: Improving recall and precision simultaneously is a difficult problem in information retrieval. To address it, this paper proposes an improved sentence similarity algorithm based on the degree of semantic dependency. It analyzes the characteristics of similarity algorithms based on semantic dependency, adds semantic role labeling information when determining the weights of a sentence's effective collocation pairs, and applies these weights on top of the keyword similarity computation; an example shows that the algorithm is feasible. Once the recall of the retrieval system has been raised, the improved algorithm re-ranks the query results, thereby raising the precision of the top K returned results. Experiments show that the algorithm improves system precision: for the top 20 returned documents, precision after re-ranking is 3.6% higher than under the system's original ranking.
Key words:
information retrieval,
similarity,
keyword,
semantic dependency,
dependency tree,
re-ranking
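
The pipeline summarized in the abstract (keyword similarity, a dependency-pair similarity weighted by semantic roles, then re-ranking and precision over the top K results) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's actual formulas: the `Sentence` type, the Jaccard keyword overlap, the `ROLE_WEIGHTS` values, the mixing parameter `alpha`, and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical representation: a collocation pair is (head, dependent, role label).
Pair = Tuple[str, str, str]

@dataclass(frozen=True)
class Sentence:
    keywords: FrozenSet[str]
    pairs: Tuple[Pair, ...]

# Assumed role weights; the paper's actual weighting scheme is not given here.
ROLE_WEIGHTS: Dict[str, float] = {"A0": 1.0, "A1": 1.0}
DEFAULT_ROLE_WEIGHT = 0.5

def role_weight(pair: Pair) -> float:
    """Weight of a collocation pair, looked up by its role label."""
    return ROLE_WEIGHTS.get(pair[2], DEFAULT_ROLE_WEIGHT)

def keyword_similarity(q: Sentence, d: Sentence) -> float:
    """Jaccard overlap of keyword sets (a stand-in for the paper's measure)."""
    union = q.keywords | d.keywords
    return len(q.keywords & d.keywords) / len(union) if union else 0.0

def dependency_similarity(q: Sentence, d: Sentence) -> float:
    """Role-weighted share of the query's pairs that also occur in d."""
    total = sum(role_weight(p) for p in q.pairs)
    if total == 0:
        return 0.0
    d_pairs = set(d.pairs)
    return sum(role_weight(p) for p in q.pairs if p in d_pairs) / total

def score(q: Sentence, d: Sentence, alpha: float = 0.5) -> float:
    """Linear mix of the two similarities; alpha is an assumed parameter."""
    return alpha * keyword_similarity(q, d) + (1 - alpha) * dependency_similarity(q, d)

def rerank(q: Sentence, docs: Dict[str, Sentence], alpha: float = 0.5) -> List[str]:
    """Return document ids sorted by descending combined score."""
    return sorted(docs, key=lambda doc_id: score(q, docs[doc_id], alpha), reverse=True)

def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the first k ranked ids that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k

query = Sentence(
    frozenset({"improve", "retrieval", "precision"}),
    (("improve", "precision", "A1"), ("improve", "algorithm", "A0")),
)
docs = {
    "d2": Sentence(frozenset({"improve", "retrieval", "precision", "speed"}), ()),
    "d1": Sentence(frozenset({"improve", "retrieval", "precision"}),
                   (("improve", "precision", "A1"),)),
}
ranking = rerank(query, docs)
```

In this toy data, `d1` shares one of the query's two collocation pairs while `d2` shares none, so the combined score moves `d1` ahead of `d2`; `precision_at_k(ranking, {"d1"}, 1)` then measures the effect on the top result.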
CLC number:
WANG Pin (王品), HUANG Guang-jun (黄广君). Sentence Similarity Computation in Information Retrieval (信息检索中的句子相似度计算)[J]. Computer Engineering (计算机工程), 2011, 37(12): 38-40.