Computer Engineering (计算机工程), 2007, Vol. 33, Issue 13: 210-212. doi: 10.3969/j.issn.1000-3428.2007.13.072

Efficient Small-dictionary-oriented Algorithm for Alignment of English-Chinese Parallel Corpora

XIONG Wei, CHEN Rong, LIU Jia, XU Miao, YU Zhonghua   

  1. (Dept. of Computer Science, Sichuan University, Chengdu 610064)
  • Online: 2007-07-05 Published: 2007-07-05

Abstract: Automatic alignment of parallel corpora is an important research topic in natural language processing. To address the shortcomings of English-Chinese sentence alignment algorithms based on dictionary translations, this paper proposes an efficient small-dictionary-oriented sentence alignment algorithm. The algorithm retains relatively high precision even when the dictionary is small, and is nearly twice as efficient as the traditional algorithm. Theoretical analysis and comparative experiments show that the algorithm is effective.

Key words: machine translation, local alignment, compensation, parallel corpora
