Abstract: Existing word similarity methods do not examine closely the primary-secondary relationship between sememe distance and sememe depth, or they directly assign a similarity value to concepts whose description expressions contain specific words. In both cases the computed similarity is insufficiently precise. To address this problem, this paper uses the distance between sememes to limit the influence of sememe depth on sememe similarity. It analyzes and tallies the sense expressions of concepts in HowNet, and replaces each specific word that appears in a concept's sense expression with its first basic sememe, which reflects the essence of that word. On this basis, an improved algorithm for computing word semantic similarity is proposed. Experimental results show that the algorithm effectively improves the precision of word similarity calculation.
Key words: word similarity, word semantics, depth of sememe, concept
CLC number:
ZHANG Huyin, LIU Daobo, WEN Chunyan. Research on Improved Algorithm of Word Semantic Similarity Based on HowNet[J]. Computer Engineering.