Computer Engineering ›› 2023, Vol. 49 ›› Issue (10): 120-126, 135. doi: 10.19678/j.issn.1000-3428.0066109

• Artificial Intelligence and Pattern Recognition •

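Neural Machine Translation Based on Adaptive Training and Drop Mechanism

Renchong DUAN (段仁翀), Xiangyu DUAN (段湘煜)

  1. School of Computer Science and Technology, Soochow University, Suzhou 215000, Jiangsu, China
  • Received: 2022-10-27  Publication date: 2023-10-15  Online release date: 2023-01-03
  • About the authors:

    Renchong DUAN (born 1998), male, master's student; main research interests: machine translation and natural language processing

    Xiangyu DUAN, professor

  • Funding:
    Priority Academic Program Development of Jiangsu Higher Education Institutions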

Neural Machine Translation Based on Adaptive Training and Drop Mechanism

Renchong DUAN, Xiangyu DUAN   

  1. School of Computer Science and Technology, Soochow University, Suzhou 215000, Jiangsu, China
  • Received: 2022-10-27  Published: 2023-10-15  Online: 2023-01-03

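Abstract:

In machine translation, an important way to improve translation quality is to improve the translation accuracy of phrases. Statistical machine translation models greatly improved phrase translation accuracy by modeling phrases rather than words. For neural machine translation models, however, the conventional training objective minimizes the loss of each word and imposes no explicit constraint that makes the model memorize phrases, so phrase translation accuracy is low. In addition, because neural machine translation models decode autoregressively, a mistranslated phrase affects the accurate translation of subsequent phrases. To address these problems, phrase-aware adaptive training and a phrase drop mechanism are proposed. Phrase-aware adaptive training segments a sentence into multiple phrase segments and uses an adaptive training objective to assign an appropriate weight to each word, which encourages the model to memorize phrases and improves its phrase translation accuracy. The phrase drop mechanism randomly drops target-side phrases during training to make the model more robust to mistranslated phrases and to prevent them from affecting the translation of subsequent phrases. Experimental results on the WMT2014 English-German and NIST Chinese-English translation tasks show that, compared with the Transformer baseline, the proposed method improves BLEU by 1.64 and 0.96 points, respectively. The experiments also show that phrase knowledge, as a general form of knowledge, can be transferred from a teacher model to a student model to further improve translation quality.

Keywords: machine translation, knowledge transfer, adaptive training, phrase, drop mechanism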

Abstract:

In machine translation, improving the translation accuracy of phrases is a key strategy for improving overall translation quality. Statistical machine translation models substantially improved phrase translation accuracy by modeling phrases rather than individual words. Neural Machine Translation (NMT) models, however, face two challenges. First, the traditional training objective minimizes per-word loss and imposes no explicit constraint that encourages the model to memorize phrases, which results in lower phrase translation accuracy. Second, because NMT models decode autoregressively, a mistranslated phrase can degrade the translation of subsequent phrases. To address these challenges, this study proposes two methods: phrase-aware adaptive training and a phrase drop mechanism. Phrase-aware adaptive training first segments each sentence into multiple phrase segments. During training, different weights are assigned to target words based on their positions within phrases, which encourages the model to memorize phrases and improves its phrase translation accuracy. The phrase drop mechanism improves the model's robustness to mistranslated phrases by randomly dropping target-side phrases during training, so that such errors do not affect the translation of subsequent phrases. Experiments on two translation benchmarks, Workshop on Statistical Machine Translation 2014 (WMT2014) English-German and National Institute of Standards and Technology (NIST) Chinese-English, show that the proposed methods improve Bilingual Evaluation Understudy (BLEU) scores by 1.64 and 0.96 points, respectively, over the Transformer baseline. The experiments also show that phrase knowledge, as a general form of knowledge, can be transferred from teacher models to student models, further improving translation quality.

Key words: machine translation, knowledge transfer, adaptive training, phrase, drop mechanism
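
The two mechanisms described in the abstract can be illustrated with a minimal Python sketch. The abstract gives neither the weighting formula nor the way dropped phrases are represented, so everything specific here is an illustrative assumption rather than the authors' implementation: the linear position-based weights in `position_weights`, the `<drop>` placeholder token, the independent per-phrase drop probability, and all function and parameter names. Phrase spans are assumed to be given as half-open token ranges.

```python
import math
import random
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # half-open [start, end) token span of one phrase


def position_weights(spans: Sequence[Span], length: int, step: float = 0.5) -> List[float]:
    """Hypothetical weighting (an assumption, not the paper's formula):
    the k-th token (0-based) inside a phrase gets weight 1 + step * k;
    tokens outside every phrase keep weight 1."""
    weights = [1.0] * length
    for start, end in spans:
        for k, pos in enumerate(range(start, end)):
            weights[pos] = 1.0 + step * k
    return weights


def weighted_nll(token_probs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted negative log-likelihood of the reference tokens,
    normalized by the total weight."""
    total = sum(w * -math.log(p) for p, w in zip(token_probs, weights))
    return total / sum(weights)


def drop_phrases(tokens: Sequence[str], spans: Sequence[Span], drop_prob: float,
                 rng: random.Random, placeholder: str = "<drop>") -> List[str]:
    """Replace every token of each selected target-side phrase with a placeholder.
    Each phrase is selected independently with probability drop_prob
    (the placeholder representation is an assumption)."""
    out = list(tokens)
    for start, end in spans:
        if rng.random() < drop_prob:
            for pos in range(start, end):
                out[pos] = placeholder
    return out


if __name__ == "__main__":
    target = ["the", "cat", "sat", "on", "the", "mat"]
    spans = [(0, 2), (3, 6)]  # "the cat" and "on the mat"
    print(position_weights(spans, len(target)))  # [1.0, 1.5, 1.0, 1.0, 1.5, 2.0]
    print(drop_phrases(target, spans, drop_prob=0.5, rng=random.Random(0)))
```

In a real training loop the weights would multiply the per-token cross-entropy of the model, and the phrase-dropped sequence would plausibly serve as the decoder input while the loss is still computed against the original target; both choices are assumptions made here for illustration.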