Computer Engineering ›› 2013, Vol. 39 ›› Issue (4): 300-304. doi: 10.3969/j.issn.1000-3428.2013.04.069

• Development Research and Engineering Application •

Tibetan Text Dependency Syntactic Analysis Based on Discriminant

Hua-que-cai-rang 1,2, ZHAO Hai-xing 1

  1. Tibetan Information Research Center, Qinghai Normal University, Xining 810008, China; 2. School of Computer Science, Shaanxi Normal University, Xi’an 710062, China
  • Received: 2012-04-16 Online: 2013-04-15 Published: 2013-04-12
  • About the authors: Hua-que-cai-rang (born 1976), male, associate professor and Ph.D. candidate, main research interests: computational linguistics and Tibetan information processing; ZHAO Hai-xing, professor and doctoral supervisor
  • Supported by:
    National Natural Science Foundation of China (61063033, 61163018); Preliminary Research Special Fund of the National "973" Program (2010CB334708); Science and Technology Foundation of Qinghai Province (2011-Z-752)

Abstract: Existing Tibetan syntactic frameworks are complex, which hinders their use in Tibetan natural language processing. This paper therefore proposes a discriminative method for Tibetan dependency parsing: the parsing model is trained with the perceptron algorithm, and decoding uses a bottom-up CYK algorithm to produce the maximum spanning tree. Experimental results show that the method reaches a parsing accuracy of 81.2% on a manually annotated test set, and that it can be applied in practice to building a Tibetan dependency treebank and to other natural language processing tasks.

Key words: Tibetan dependency syntax, syntax tagging specification, maximum spanning tree, feature template, dependency syntax, perceptron
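
The pipeline summarized in the abstract (perceptron training plus bottom-up chart decoding of the highest-scoring dependency tree) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the feature templates in `features` are hypothetical, the update is a plain unaveraged structured perceptron, Eisner's projective algorithm stands in for the CYK-style bottom-up decoder, and the toy sentences are invented English placeholders rather than Tibetan data.

```python
NEG_INF = float("-inf")

def features(words, tags, h, m):
    """Hypothetical feature templates for the arc head h -> modifier m."""
    pair = f"ht={tags[h]}|mt={tags[m]}"
    return [
        f"hw={words[h]}|mw={words[m]}",
        pair,
        f"{pair}|dir={'R' if h < m else 'L'}",
        f"{pair}|dist={min(abs(h - m), 5)}",
    ]

def arc_scores(words, tags, w):
    """score[h][m] = w . f(h, m); index 0 is ROOT and can never be a modifier."""
    n = len(words)
    score = [[NEG_INF] * n for _ in range(n)]
    for h in range(n):
        for m in range(1, n):
            if h != m:
                score[h][m] = sum(w.get(f, 0.0) for f in features(words, tags, h, m))
    return score

def eisner(score):
    """Bottom-up chart decoding of the best projective tree (Eisner's algorithm).
    Direction 0: head is the right end of the span; direction 1: the left end."""
    n = len(score)
    comp = [[[0.0, 0.0] for _ in range(n)] for _ in range(n)]   # complete spans
    inc = [[[NEG_INF, NEG_INF] for _ in range(n)] for _ in range(n)]  # incomplete
    bp_comp = [[[None, None] for _ in range(n)] for _ in range(n)]
    bp_inc = [[None] * n for _ in range(n)]
    for k in range(1, n):                      # span length, shortest first
        for s in range(n - k):
            t = s + k
            r = max(range(s, t), key=lambda r: comp[s][r][1] + comp[r + 1][t][0])
            base = comp[s][r][1] + comp[r + 1][t][0]
            inc[s][t][0] = base + score[t][s]  # arc t -> s
            inc[s][t][1] = base + score[s][t]  # arc s -> t
            bp_inc[s][t] = r
            r = max(range(s, t), key=lambda r: comp[s][r][0] + inc[r][t][0])
            comp[s][t][0], bp_comp[s][t][0] = comp[s][r][0] + inc[r][t][0], r
            r = max(range(s + 1, t + 1), key=lambda r: inc[s][r][1] + comp[r][t][1])
            comp[s][t][1], bp_comp[s][t][1] = inc[s][r][1] + comp[r][t][1], r
    heads = [-1] * n

    def back(s, t, d, complete):
        """Follow backpointers and record the head of every modifier."""
        if s == t:
            return
        if complete:
            r = bp_comp[s][t][d]
            if d == 0:
                back(s, r, 0, True); back(r, t, 0, False)
            else:
                back(s, r, 1, False); back(r, t, 1, True)
        else:
            r = bp_inc[s][t]
            if d == 0:
                heads[s] = t
            else:
                heads[t] = s
            back(s, r, 1, True); back(r + 1, t, 0, True)

    back(0, n - 1, 1, True)
    return heads

def parse(words, tags, w):
    return eisner(arc_scores(words, tags, w))

def train(data, epochs=5):
    """Structured perceptron: reward gold arcs, penalize wrongly predicted arcs."""
    w = {}
    for _ in range(epochs):
        for words, tags, gold in data:
            pred = parse(words, tags, w)
            for m in range(1, len(words)):
                if pred[m] != gold[m]:
                    for f in features(words, tags, gold[m], m):
                        w[f] = w.get(f, 0.0) + 1.0
                    for f in features(words, tags, pred[m], m):
                        w[f] = w.get(f, 0.0) - 1.0
    return w

# Invented toy treebank (not from the paper): heads[m] is the head of token m.
TAGS = ["ROOT", "DET", "N", "V"]
toy = [
    (["<ROOT>", "the", "dog", "barks"], TAGS, [-1, 2, 3, 0]),
    (["<ROOT>", "a", "cat", "sleeps"], TAGS, [-1, 2, 3, 0]),
]
weights = train(toy)
print(parse(["<ROOT>", "this", "bird", "sings"], TAGS, weights))  # [-1, 2, 3, 0]
```

On the unseen third sentence the lexical features are all zero, so the prediction is driven entirely by the tag-based templates learned from the toy data. A practical system would typically average the weights over updates and use a much richer template set; neither detail is stated in the abstract.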
