Abstract:
To improve computers' ability to process Chinese-language information and to support better shallow parsing, this paper presents a method for recognizing Chinese phrase structure based on Maximum Entropy (ME). Mutual Information (MI) between words is used to predict the phrase-structure boundaries of a sentence. The ME model is then used to build atomic and composite feature templates, and the more effective features are selected to form the final feature set, with which the phrase structure of the sentence is recognized. Experiments show that the MI-based ME model achieves good precision and recall.
Key words:
shallow parsing,
Mutual Information (MI),
boundary prediction,
Maximum Entropy (ME) model,
feature selection
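
The boundary-prediction step described in the abstract can be illustrated with a minimal sketch. It assumes that MI is estimated as pointwise mutual information over adjacent word pairs in a pre-segmented corpus, and that a boundary is predicted wherever the score falls below a threshold. The function names `pmi_table` and `predict_boundaries`, the threshold value, and the handling of unseen pairs are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def pmi_table(sentences):
    """Estimate pointwise mutual information for adjacent word pairs.

    `sentences` is a list of pre-segmented sentences (lists of words).
    """
    unigrams = Counter()
    bigrams = Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    table = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        # PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) )
        table[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return table


def predict_boundaries(words, table, threshold=0.0):
    """Return each index i where a boundary is predicted
    between words[i] and words[i + 1]."""
    boundaries = []
    for i, pair in enumerate(zip(words, words[1:])):
        # Assumption: a pair never seen in training is treated as
        # unassociated, so it always yields a boundary.
        score = table.get(pair, float("-inf"))
        if score < threshold:
            boundaries.append(i)
    return boundaries
```

In a pipeline like the one the abstract outlines, the predicted boundaries would serve as one input among the features of the ME classifier rather than as the final decision. The threshold would need to be tuned on held-out data.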
摘要: 为提高计算机对汉语信息的处理能力,更好地进行浅层句法分析,提出一种基于最大熵的汉语短语结构识别方法。利用词语之间的互信息知识对句子的短语结构边界进行预测,应用最大熵模型建立原子模板与复合模板,选择有效的特征构成特征集,实现对句子短语结构的识别。实例证明,基于互信息的最大熵模型能取得较好的精确率和召回率。
关键词:
浅层句法分析,
互信息,
边界预测,
最大熵模型,
特征选择
CLC Number:
HUO Ya-Ge, HUANG Guang-Jun. Recognition Method of Chinese Phrase Structure Based on Maximum Entropy[J]. Computer Engineering, 2011, 37(16): 206-208.
霍亚格, 黄广君. 基于最大熵的汉语短语结构识别方法[J]. 计算机工程, 2011, 37(16): 206-208.