
Computer Engineering, 2009, Vol. 35, Issue 24: 114-116. doi: 10.3969/j.issn.1000-3428.2009.24.038

• Software Technology and Database •

Ontological Concept Extraction Method Based on Maximum Entropy Model

WEI Xiao-li¹, SUN Yong¹,², ZHANG Shu-kui¹,², MIAO Yan-jun¹

  1. Department of Computer Science and Technology, Soochow University, Suzhou 215006
  2. Jiangsu Province Key Laboratory of Computer Information Processing Technology, Suzhou 215006

  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-12-20 Published: 2009-12-20

Abstract: Ontology is the core of semantic retrieval. Ontology construction mainly comprises the extraction of domain concepts and the extraction of relationships between concepts, and domain concept extraction is its foundation. This paper extracts concepts with a method based on the maximum entropy model: noun phrases are mined from domain texts, domain-specific phrases are selected from them using an improved TF-IDF formula, and the selected phrases are manually corrected to obtain the ontology concepts. Experimental results show that the method improves the accuracy and completeness of the extracted concepts.

Key words: ontology, maximum entropy model, natural language processing
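
The abstract does not state the improved TF-IDF formula, so the following is only a minimal sketch of the standard TF-IDF weighting that such a filtering step builds on, not the authors' method. The function name `tf_idf_scores`, the input format (each document as a list of candidate noun phrases), and the use of a separate background corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(domain_docs, background_docs):
    """Score candidate phrases from the domain corpus by standard TF-IDF.

    Each document is a list of candidate phrases (hypothetical input format).
    This is plain TF-IDF, not the improved formula the paper refers to.
    """
    all_docs = domain_docs + background_docs
    n_docs = len(all_docs)

    # Document frequency: number of documents that contain each phrase.
    df = Counter()
    for doc in all_docs:
        df.update(set(doc))

    # Term frequency: relative frequency of each phrase in the domain corpus.
    tf = Counter(phrase for doc in domain_docs for phrase in doc)
    total = sum(tf.values())

    # A phrase that occurs in every document gets log(1) = 0, so it is
    # treated as carrying no domain-specific information.
    return {p: (c / total) * math.log(n_docs / df[p]) for p, c in tf.items()}


# Toy data, invented for illustration only.
domain = [
    ["ontology", "semantic retrieval", "the method"],
    ["ontology", "maximum entropy model", "the method"],
]
background = [
    ["the method", "stock market"],
    ["the method", "weather report"],
]

scores = tf_idf_scores(domain, background)
for phrase, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{phrase}: {score:.3f}")
```

In this toy run the generic phrase "the method" scores zero because it appears in every document, while phrases confined to the domain texts score higher. A threshold on such scores would yield the candidate list that is then corrected by hand.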
