Computer Engineering ›› 2023, Vol. 49 ›› Issue (11): 70-76, 84. doi: 10.19678/j.issn.1000-3428.0066031

• Artificial Intelligence and Pattern Recognition •

Terminology Intervention Translation Model Combining Vectorization Method and Mask Mechanism

Jinpeng ZHANG, Xiangyu DUAN

  1. School of Computer Science and Technology, Soochow University, Suzhou 215000, Jiangsu, China
  • Received: 2022-10-19 Publication date: 2023-11-15 Release date: 2023-02-09
  • About the authors:

    Jinpeng ZHANG (张金鹏, born 1996), male, master's student; main research interest: natural language processing

    Xiangyu DUAN (段湘煜), professor

  • Funding:
    National Natural Science Foundation of China (61673289)

Abstract:

Terminology intervention Neural Machine Translation (NMT) models use human-provided term translations to steer the output and thereby improve translation quality. Vectorization methods offer a new paradigm for this task, but they only fuse terminology information with sentence information in vector form and neglect the low contribution of terminology vectors to terminology translation. To address this, a terminology intervention machine translation model combining the vectorization method with a mask mechanism is built. The model encodes the human-provided source-side and target-side terms into feature vectors and explicitly integrates them into the encoder, the decoder, and the output layer of the machine translation model. During training, a mask mechanism masks the keys that correspond to source-side terms in the attention mechanism, which strengthens the attention that the encoder and decoder pay to the term feature vectors. During inference, the mask mechanism is used to optimize the probability distribution of the terminology intervention output layer, further improving the translation accuracy of term tokens. Experimental results on the WMT 2014 German-English and WMT 2021 English-Chinese datasets show that, compared with the Code-Switching machine translation model based on the original vectorization method, the proposed model improves terminology translation accuracy by 9.27 and 2.95 percentage points, respectively, and substantially improves the translation accuracy of long terms.

Key words: machine translation, terminology intervention, vectorization, attention mechanism, mask mechanism
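
The training-time idea in the abstract, masking the attention keys that correspond to source-side terms so that attention has to go elsewhere (for example, to the term feature vectors), can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `masked_attention`, the single-query setup, and the choice to treat the term feature vector as one extra key/value slot are all assumptions made for illustration only.

```python
import math


def masked_attention(query, keys, values, masked_positions):
    """Single-query scaled dot-product attention that ignores masked keys.

    Hypothetical helper for illustration: positions listed in
    `masked_positions` get a score of -inf, so their softmax weight is 0.
    """
    if len(masked_positions) >= len(keys) and all(
        i in masked_positions for i in range(len(keys))
    ):
        raise ValueError("at least one key must stay unmasked")

    d = len(query)
    scores = []
    for i, key in enumerate(keys):
        if i in masked_positions:
            scores.append(float("-inf"))  # masked key: zero weight after softmax
        else:
            dot = sum(q * k for q, k in zip(query, key))
            scores.append(dot / math.sqrt(d))

    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, context


# Assumed toy layout: slot 0 is an ordinary source token, slot 1 is a
# source-side term token (masked), slot 2 stands in for a term feature vector.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
weights, context = masked_attention([1.0, 0.0], keys, values, {1})
print(weights)  # the masked slot 1 receives weight 0.0
```

With the term token's key masked, all attention weight is redistributed over the remaining slots, including the slot that stands in for the term feature vector. How the paper actually injects the term vectors into the encoder and decoder is not described in the abstract, so the layout above should be read as a placeholder.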