
Computer Engineering ›› 2026, Vol. 52 ›› Issue (2): 372-382. doi: 10.19678/j.issn.1000-3428.0069767

• Large Models and Generative Artificial Intelligence •

  • Author biographies:

    LI Bo (CCF student member), male, M.S. candidate; research interests: machine translation and natural language processing

    JI Baijun, Ph.D. candidate

    DUAN Xiangyu (corresponding author), professor

  • Funding:
    National Natural Science Foundation of China (62276179)

Machine Translation with Large Language Models Based on Correction Mechanism of Error-Prone Words in Translations

LI Bo, JI Baijun, DUAN Xiangyu*

  1. School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China
  • Received: 2024-04-19 Revised: 2024-09-11 Online: 2026-02-15 Published: 2026-02-04
  • Contact: DUAN Xiangyu


Abstract:

Large Language Models (LLMs) have demonstrated a certain level of performance in machine translation: given a translation prompt, they can generate translations. However, owing to limitations in the quality and language distribution of their pre-training corpora, translations generated by LLMs still exhibit quality issues such as mistranslation, omission, hallucination, and off-target translation. To mitigate these low-quality translations, this paper proposes a machine translation method for LLMs based on a correction mechanism for error-prone words in translations. First, the error-prone words of an LLM in a given translation direction are identified by comparing the model's translations with the reference translations on the original training set. A correction dataset is then constructed from these error-prone words and their corrections, and a separate small pre-trained model is fine-tuned on this dataset to obtain a correction model. During inference, the correction model rectifies error-prone words in the LLM's translation, after which the LLM completes autoregressive decoding to produce a higher-quality translation. Experiments were conducted with the Llama2-7B model on the WMT2022 test set across six translation directions (Chinese↔English, German↔English, and Russian↔English). Compared with uncorrected translations, the average Crosslingual Optimized Metric for Evaluation of Translation (COMET) and Sacre Bilingual Evaluation Understudy (SacreBLEU) scores improved by 0.0187 and 1.26 points, respectively, in the X→English directions, and by 0.0879 and 7.67 points, respectively, in the English→X directions. These results substantiate the effectiveness of the error-prone word correction mechanism in enhancing the quality of LLM translations.
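The mining-and-correction pipeline described in the abstract can be sketched in plain Python. This is a toy illustration under loose assumptions, not the paper's implementation: the paper fine-tunes a small pre-trained model as the corrector, whereas here a simple lookup table of mined (error, correction) pairs stands in for it, and all function names are hypothetical.

```python
import difflib

def mine_error_prone_words(model_tokens, ref_tokens):
    """Collect (model_word, reference_word) pairs by aligning a model
    translation against its reference with a sequence matcher.

    Illustrative stand-in for the paper's definition of error-prone
    words on the original training set.
    """
    pairs = []
    sm = difflib.SequenceMatcher(a=model_tokens, b=ref_tokens)
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        # 'replace' spans mark positions where the model chose a
        # different word; keep only one-to-one substitutions here.
        if op == "replace" and (i2 - i1) == (j2 - j1) == 1:
            pairs.append((model_tokens[i1], ref_tokens[j1]))
    return pairs

def correct_translation(tokens, correction_table):
    """Rewrite a draft translation with the mined corrections
    (a lookup table standing in for the fine-tuned correction model;
    in the paper, the LLM then resumes autoregressive decoding)."""
    return [correction_table.get(t, t) for t in tokens]

if __name__ == "__main__":
    ref = "the cat sat on the mat".split()
    hyp = "the cat sit on the mat".split()
    table = dict(mine_error_prone_words(hyp, ref))
    print(table)                                          # {'sit': 'sat'}
    print(correct_translation("the dog sit on the rug".split(), table))
```

The one-to-one restriction in the mining step is a simplification; handling multi-word substitutions and deciding which mined pairs are genuinely "error-prone" (rather than one-off mistakes) would require frequency statistics over the whole training set.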

Key words: machine translation, Large Language Model (LLM), error-prone word, correction mechanism, off-target translation