Abstract:
This paper proposes a new Information Content (IC) model for computing the IC value of a concept in WordNet. The model introduces entropy so that it accounts not only for a concept's number of child nodes and its depth in the WordNet taxonomy tree, but also for the spatial structure of the concept's hyponyms, which makes the computed IC value more accurate. Experimental results show that IC-based semantic similarity algorithms achieve higher precision when they use the IC values computed by this entropy model.
Key words:
Information Content (IC),
ontology,
semantic similarity,
child node,
classification tree,
entropy
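Abstract: This paper proposes a model for computing the Information Content (IC) value of concepts in WordNet. By introducing entropy, the model considers not only the number of child nodes of a concept and the concept's depth in the classification tree, but also the spatial structure of its child nodes, which makes the concept's IC value more accurate. When the model is substituted into IC-based semantic similarity algorithms, experimental results show that it effectively improves their accuracy.
Key words:
information content,
ontology,
semantic similarity,
child node,
classification tree,
entropy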
CLC Number:
HE Yan, ZHOU Zi-li. Concept IC Model in WordNet Based on Entropy[J]. Computer Engineering.
何艳,周子力. 基于熵的WordNet概念IC模型[J]. 计算机工程.