Computer Engineering (计算机工程)

• Artificial Intelligence and Recognition Technology •

基于语序变换的藏文复述句生成方法

柔特 1a,1b,才让加 1a,1b,孙茂松 2   

  1. (1.青海师范大学 a.计算机学院,西宁 810008; b.藏文信息处理教育部重点实验室,西宁 810008;2.清华大学 计算机科学与技术系 智能技术与系统国家重点实验室,北京 100084)
  • 收稿日期:2017-01-09 出版日期:2018-04-15 发布日期:2018-04-15
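  • About the authors: ROU Te (born 1975), male, associate professor and Ph.D. candidate; main research interest: Tibetan information processing. CAI Rangjia and SUN Maosong, professors and doctoral supervisors.
  • Funding:
    National Natural Science Foundation of China (61662061); National Social Science Foundation of China (14BYY132, 16YY167); Program for Changjiang Scholars and Innovative Research Team in University, Ministry of Education (IRT1068); Qinghai Provincial Key Laboratory Project (2015-Z-Y03, 2017-GX-146).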

Tibetan Paraphrase Sentence Generation Method Based on Word Order Transformation

ROU Te 1a,1b, CAI Rangjia 1a,1b, SUN Maosong 2

  1. (1a. Computer College; 1b. Key Laboratory of Tibetan Information Processing of Ministry of Education, Qinghai Normal University, Xining 810008, China; 2. State Key Laboratory of Intelligent Technology and System, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
  • Received: 2017-01-09 Online: 2018-04-15 Published: 2018-04-15

摘要: 机器理解藏文语句存在灵活性差和复杂性高的问题。为此,针对藏文相同语义句子的不同表达方式,设计复述句自动生成方法。通过对藏文句型结构、句子内部组块进行分析,利用全排列递归算法生成复述句。实验结果显示,与其他语言复述生成方法不同,该方法根据藏文句子中组块数量的不同,通过一个句子可以生成一个或多个,甚至上千个句义相同的复述句并且准确率达到93.4%,可应用于藏汉机器翻译、机器翻译评测和藏文问答系统等领域。

关键词: 复述生成, 藏文, 语序变换, 句型结构, 组块分析

Abstract: Machine understanding of Tibetan sentences suffers from poor flexibility and high complexity. To address this, and targeting the different ways in which Tibetan can express the same meaning, this paper proposes a method for automatically generating Tibetan paraphrase sentences. The method analyzes the Tibetan sentence pattern and the chunks inside each sentence, then applies a recursive full-permutation algorithm to generate paraphrases. Experimental results show that, unlike paraphrase generation methods for other languages, the proposed method can generate from a single sentence one, several, or even thousands of paraphrase sentences with the same meaning, depending on the number of chunks the sentence contains, and that its accuracy reaches 93.4%. The method can be applied to Tibetan-Chinese machine translation, machine translation evaluation, Tibetan question answering systems, and related fields.
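
The abstract names a recursive full-permutation algorithm over sentence chunks but does not give the chunk inventory or the reordering constraints. The following is therefore only a minimal sketch of the general idea, not the authors' implementation. It assumes a sentence has already been split into chunks, and it assumes, purely for illustration, that the last `fixed_tail` chunks (for example a sentence-final predicate) stay in place while the others are reordered. The names `permute`, `generate_paraphrases`, and `fixed_tail`, and the chunk labels in the example, are hypothetical.

```python
from typing import List


def permute(chunks: List[str]) -> List[List[str]]:
    """Recursively enumerate every ordering of the given chunks."""
    if len(chunks) <= 1:
        return [list(chunks)]
    orderings = []
    for i, head in enumerate(chunks):
        # Pick one chunk as the head, then permute the remaining ones.
        rest = chunks[:i] + chunks[i + 1:]
        for tail in permute(rest):
            orderings.append([head] + tail)
    return orderings


def generate_paraphrases(chunks: List[str], fixed_tail: int = 1) -> List[List[str]]:
    """Reorder the leading chunks; keep the last `fixed_tail` chunks in place.

    Keeping the tail fixed is an illustrative assumption, not a rule
    taken from the paper.
    """
    if fixed_tail < 0 or fixed_tail > len(chunks):
        raise ValueError("fixed_tail must be between 0 and len(chunks)")
    split = len(chunks) - fixed_tail
    movable, tail = chunks[:split], chunks[split:]
    return [order + tail for order in permute(movable)]


if __name__ == "__main__":
    # Placeholder chunk labels standing in for real Tibetan chunks.
    sentence = ["AGENT", "TIME", "OBJECT", "VERB"]
    for candidate in generate_paraphrases(sentence):
        print(" ".join(candidate))
```

With n freely movable chunks the sketch yields n! orderings, so three movable chunks give 6 candidates and seven give 5040, which is consistent with the abstract's remark that one sentence can produce thousands of paraphrases. Repeated chunks would produce duplicate orderings, and any real system would still need to filter candidates that are not valid paraphrases.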

Key words: paraphrase generation, Tibetan, word order transformation, sentence structure, chunk parsing