摘要: 传统的句子相似度计算方法只关注句子的某个特征,导致召回率和准确率的不均衡。针对该问题,提出一种基于多特征的句子相似度计算方法(MFS)。该方法加入包含词性和位置信息的词权重,并综合考虑词的语义和句子结构。实验结果表明,与其他方法相比,MFS方法的F1值较高。在基于实例的问答系统中,使用MFS方法得到的MRR值也较高。
关键词: 句子相似度, 多特征, 词权重, 知网, 问答系统
Abstract: Traditional approaches to computing sentence semantic similarity usually focus on only one feature of a sentence, which leads to an imbalance between recall and accuracy. To address this problem, this paper proposes a sentence similarity computation method based on the Multi-feature of a Sentence (MFS). The method introduces word weights that incorporate part-of-speech and position information, and it jointly considers word semantics and sentence structure. Experimental results show that, compared with the Jaccard coefficient method, the MFS method achieves a higher F-measure. In a case-based question answering system, the MRR value obtained with the MFS method is also higher than that of the other methods compared.
Key words: sentence similarity, multi-feature, word weight, HowNet, question answering system
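
For readers unfamiliar with the baseline and the evaluation measures named in the abstract, the following is a minimal sketch of the Jaccard coefficient, the F-measure, and the mean reciprocal rank (MRR) in their standard textbook forms. It is not the MFS method itself, whose formulas are not given here. The function names, the token-list inputs, and the balanced (F1) weighting are assumptions made for illustration.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard coefficient between two sentences given as token lists (assumed input form)."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sentences
    return len(a & b) / len(a | b)


def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_reciprocal_rank(first_correct_ranks):
    """MRR over questions; each entry is the 1-based rank of the first
    correct answer, or None when no correct answer was returned."""
    if not first_correct_ranks:
        return 0.0
    total = sum(1.0 / r for r in first_correct_ranks if r)
    return total / len(first_correct_ranks)
```

A question whose correct answer is never returned contributes zero to the MRR sum but still counts in the denominator, so missing answers lower the score.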
中图分类号:
赵臻, 吴宁, 宋盼盼. 基于多特征融合的句子语义相似度计算[J]. 计算机工程, 2012, 38(01): 171-173.
ZHAO Zhen, WU Ning, SONG Pan-pan. Sentence Semantic Similarity Calculation Based on Multi-feature Fusion[J]. Computer Engineering, 2012, 38(01): 171-173.