
Computer Engineering ›› 2011, Vol. 37 ›› Issue (16): 206-208. doi: 10.3969/j.issn.1000-3428.2011.16.070

• Artificial Intelligence and Recognition Technology •

Recognition Method of Chinese Phrase Structure Based on Maximum Entropy

HUO Ya-ge, HUANG Guang-jun

  1. (Electronic & Information Engineering College, Henan University of Science and Technology, Luoyang 471003, China)
  • Received: 2011-02-21 Online: 2011-08-20 Published: 2011-08-20
  • About the authors: HUO Ya-ge (born 1984), female, master's student, whose main research interests are natural language processing and Chinese information retrieval; HUANG Guang-jun, associate professor, Ph.D.
  • Funding:

    Supported by the Science and Technology Research Program of Henan Province (102102210159)

Abstract: To improve computer processing of Chinese-language information and to support better shallow parsing, this paper presents a method for recognizing Chinese phrase structure based on Maximum Entropy (ME). Mutual Information (MI) between words is used to predict the phrase-structure boundaries of a sentence. The ME model is then used to build atomic and composite templates, from which effective features are selected to form the feature set, and the phrase structure of the sentence is recognized. Experiments show that the MI-based ME model achieves good precision and recall.

Key words: shallow parsing, Mutual Information (MI), boundary prediction, Maximum Entropy (ME) model, feature selection
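
The boundary-prediction step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes sentences are already segmented into words, estimates pointwise mutual information for adjacent word pairs from raw counts, and marks a boundary wherever the score falls below a threshold. The function names `pmi_table` and `predict_boundaries`, the threshold rule, the handling of unseen pairs, and the English placeholder tokens are all assumptions made for illustration.

```python
import math
from collections import Counter


def pmi_table(sentences):
    """Estimate pointwise mutual information for adjacent word pairs.

    `sentences` is a list of word lists (already segmented; an assumption).
    """
    unigrams = Counter()
    bigrams = Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    table = {}
    for (left, right), count in bigrams.items():
        p_pair = count / n_bi
        p_left = unigrams[left] / n_uni
        p_right = unigrams[right] / n_uni
        table[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return table


def predict_boundaries(words, table, threshold=0.0):
    """Return gap indices i where a boundary is predicted between
    words[i] and words[i + 1].

    A gap is a boundary when the pair's PMI is below `threshold`.
    Unseen pairs score -inf and so always count as boundaries
    (a simplifying assumption of this sketch).
    """
    return [
        i
        for i in range(len(words) - 1)
        if table.get((words[i], words[i + 1]), float("-inf")) < threshold
    ]


# Toy corpus with hypothetical tokens standing in for segmented words.
corpus = [
    ["natural", "language", "processing", "method"],
    ["natural", "language", "is", "hard"],
    ["new", "data", "processing", "method"],
    ["natural", "language"],
    ["processing", "method"],
]
table = pmi_table(corpus)
gaps = predict_boundaries(
    ["natural", "language", "processing", "method"], table, threshold=2.0
)
```

On this toy corpus, "natural language" and "processing method" co-occur more strongly than "language processing", so the only predicted boundary is the middle gap. In a pipeline of the kind the abstract describes, such predicted boundaries would presumably feed the ME classifier as one source of evidence rather than serve as the final decision.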

CLC number: