Abstract: This paper applies a maximum entropy (ME) model to dependency parsing of Chinese sentences. The dependency tree of a sentence is built bottom-up, and each step of the construction selects one of three actions: left-concatenation, right-concatenation, or non-concatenation. The maximum entropy principle is used to estimate the probability of each action, which yields the probability of each edge in the dependency tree; the dependency tree with the maximum probability is then selected. Experimental results show that the model achieves good parsing precision. The model has been applied in a Chinese natural-language information retrieval project.
Key words:
Statistical parsing; Dependency grammar; Maximum entropy principle
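The bottom-up procedure summarized in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the feature templates, the toy weights, the POS tags, the reading of LEFT as "the left word attaches to the right word", and the greedy choice of the most probable attachment at each step (the abstract describes a search for the most probable tree, which this sketch simplifies). The only part that follows the standard maximum entropy form is the action distribution p(a | x) = exp(sum_i w_i f_i(a, x)) / Z(x).

```python
import math

ACTIONS = ("LEFT", "RIGHT", "NONE")  # attach leftwards, attach rightwards, do not attach


def features(left, right, action):
    # Hypothetical feature templates: POS pair conjoined with the action, plus a bias.
    return [f"pos={left[1]}_{right[1]}|{action}", f"bias|{action}"]


def action_probs(weights, left, right):
    # Maximum entropy (log-linear) distribution over the three actions.
    scores = {a: sum(weights.get(f, 0.0) for f in features(left, right, a))
              for a in ACTIONS}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {a: math.exp(s - top) for a, s in scores.items()}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}


def parse(tokens, weights):
    """Greedy bottom-up parse of a list of (word, pos) pairs.

    Returns (heads, log_prob): heads[i] is the head index of token i
    (-1 for the root); log_prob sums the log-probabilities of the chosen edges.
    """
    heads = [-1] * len(tokens)
    roots = list(range(len(tokens)))  # heads of the current subtrees, left to right
    log_prob = 0.0
    while len(roots) > 1:
        best = None
        for k in range(len(roots) - 1):
            i, j = roots[k], roots[k + 1]
            probs = action_probs(weights, tokens[i], tokens[j])
            # NONE only takes probability mass away from the two linking actions;
            # one link is always made per round so the loop terminates.
            for action in ("LEFT", "RIGHT"):
                if best is None or probs[action] > best[0]:
                    best = (probs[action], k, action)
        prob, k, action = best
        i, j = roots[k], roots[k + 1]
        if action == "LEFT":   # assumed meaning: left word depends on right word
            heads[i] = j
            del roots[k]
        else:                  # assumed meaning: right word depends on left word
            heads[j] = i
            del roots[k + 1]
        log_prob += math.log(prob)
    return heads, log_prob


# Toy weights and tags, invented for this example.
toy_weights = {"pos=r_v|LEFT": 2.0, "pos=v_n|RIGHT": 1.5}
sentence = [("我", "r"), ("喜欢", "v"), ("音乐", "n")]
heads, log_prob = parse(sentence, toy_weights)
```

In a real system the weights would be estimated from a treebank, and the greedy loop would be replaced by a search over candidate trees that compares their total probabilities.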
LIU Guiquan (刘贵全), ZENG Yubin (曾宇斌). Chinese Dependency Parsing with Maximum Entropy Principle (基于最大熵模型的汉语依存分析)[J]. Computer Engineering (计算机工程), 2006, 32(11): 216-218.