Computer Engineering ›› 2011, Vol. 37 ›› Issue (7): 193-195.

• Networks and Communications •

Semantic Similarity Computing Method Based on Wikipedia

SHENG Zhi-chao, TAO Xiao-peng   

  1. (School of Computer Science, Fudan University, Shanghai 200433, China)
  • Online: 2011-04-05 Published: 2011-03-31

基于维基百科的语义相似度计算方法

盛志超,陶晓鹏   

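  1. (School of Computer Science and Technology, Fudan University, Shanghai 200433)
  • About the authors: SHENG Zhi-chao (b. 1984), male, master's student, main research interests: semantic comparison and text classification; TAO Xiao-peng, associate professor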

Abstract: To address the low accuracy and poor interpretability of current semantic analysis algorithms, a semantic similarity computing method based on Wikipedia is proposed. Unlike approaches that compute the semantic similarity of words from category information, this method uses link information to calculate the similarity of different words in a way that resembles human thinking. The results are easy to understand, and the accuracy can be increased by incorporating semantic categories. Experiments comparing the method with current algorithms demonstrate its advantage.

Key words: PageNet, CategoryNet, Wikipedia, human thinking

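摘要 (Abstract): To address the low accuracy and poor interpretability of current semantic computation, a semantic similarity computing method based on Wikipedia is proposed. Rather than using category information to compute the semantic similarity of words, the method uses the link information of pages and computes the similarity between words by imitating human association. The results obtained are relatively easy to understand, and combining them with the semantic categories of the words improves the accuracy of the computation. Comparative experiments with existing algorithms demonstrate the superiority of the method.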

关键词 (Key words): page network, category network, Wikipedia, human thinking
