[1] 章钧津, 田永红, 宋哲煜, 等. 神经机器翻译综述[J]. 计算机工程与应用, 2024, 60(4): 57-74. ZHANG J J, TIAN Y H, SONG Z Y, et al. Survey of neural machine translation[J]. Computer Engineering and Applications, 2024, 60(4): 57-74. (in Chinese)
[2] 冯洋, 邵晨泽. 神经机器翻译前沿综述[J]. 中文信息学报, 2020, 34(7): 1-18. FENG Y, SHAO C Z. Frontiers in neural machine translation: a literature review[J]. Journal of Chinese Information Processing, 2020, 34(7): 1-18. (in Chinese)
[3] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, USA: ACL Press, 2014: 1724-1734.
[4] SUTSKEVER I, VINYALS O, LE Q V. Sequence to sequence learning with neural networks[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2014: 3104-3112.
[5] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, USA: Curran Associates Inc., 2017: 6000-6010.
[6] BROWN P F, COCKE J, DELLA PIETRA S A, et al. A statistical approach to machine translation[J]. Computational Linguistics, 1990, 16(2): 79-85.
[7] KOEHN P, OCH F J, MARCU D. Statistical phrase-based translation[C]//Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology. Stroudsburg, USA: ACL Press, 2003: 127-133.
[8] CHIANG D. A hierarchical phrase-based model for statistical machine translation[C]//Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2005: 263-270.
[9] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional Transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg, USA: ACL Press, 2019: 4171-4186.
[10] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[EB/OL]. [2024-03-17]. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf.
[11] LEWIS M, LIU Y H, GOYAL N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2020: 7871-7880.
[12] LIU Y H, GU J T, GOYAL N, et al. Multilingual denoising pre-training for neural machine translation[J]. Transactions of the Association for Computational Linguistics, 2020, 8: 726-742.
[13] WEI J, TAY Y, BOMMASANI R, et al. Emergent abilities of large language models[EB/OL]. [2024-03-17]. https://arxiv.org/abs/2206.07682.
[14] BROWN T, MANN B, RYDER N, et al. Language models are few-shot learners[C]//Proceedings of the 34th International Conference on Neural Information Processing Systems. New York, USA: Curran Associates Inc., 2020: 1877-1901.
[15] DONG Q, LI L, DAI D, et al. A survey for in-context learning[EB/OL]. [2024-03-17]. https://arxiv.org/abs/2301.00234.
[16] LIU J C, SHEN D H, ZHANG Y Z, et al. What makes good in-context examples for GPT-3?[C]//Proceedings of the 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures. Stroudsburg, USA: ACL Press, 2022: 100-114.
[17] MIN S, LYU X X, HOLTZMAN A, et al. Rethinking the role of demonstrations: what makes in-context learning work?[C]//Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: ACL Press, 2022: 11048-11064.
[18] LU Y, BARTOLO M, MOORE A, et al. Fantastically ordered prompts and where to find them: overcoming few-shot prompt order sensitivity[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, USA: ACL Press, 2022: 8086-8098.
[19] ZHAO Z, WALLACE E, FENG S, et al. Calibrate before use: improving few-shot performance of language models[C]//Proceedings of the 38th International Conference on Machine Learning. [S. l.]: PMLR, 2021: 12697-12706.
[20] AGRAWAL S, ZHOU C T, LEWIS M, et al. In-context examples selection for machine translation[C]//Proceedings of the Findings of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2023: 8857-8873.
[21] AYCOCK S, BAWDEN R. Topic-guided example selection for domain adaptation in LLM-based machine translation[C]//Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop. Stroudsburg, USA: ACL Press, 2024: 175-195.
[22] TAORI R, GULRAJANI I, ZHANG T, et al. Stanford Alpaca: an instruction-following LLaMA model[EB/OL]. [2024-03-17]. https://github.com/tatsu-lab/stanford_alpaca.
[23] XU H, KIM Y J, SHARAF A, et al. A paradigm shift in machine translation: boosting translation performance of large language models[EB/OL]. [2024-03-17]. https://arxiv.org/abs/2309.11674.
[24] GUO J X, YANG H, LI Z Y, et al. A novel paradigm boosting translation capabilities of large language models[EB/OL]. [2024-03-17]. https://arxiv.org/abs/2403.11430.
[25] FILIPPOVA K. Controlled hallucinations: learning to generate faithfully from noisy data[C]//Proceedings of the Findings of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2020: 864-870.
[26] MAYNEZ J, NARAYAN S, BOHNET B, et al. On faithfulness and factuality in abstractive summarization[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2020: 1906-1919.
[27] DALE D, VOITA E, BARRAULT L, et al. Detecting and mitigating hallucinations in machine translation: model internal workings alone do well, sentence similarity even better[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, USA: ACL Press, 2023: 36-50.
[28] XIAO Y J, WANG W Y. On hallucination and predictive uncertainty in conditional language generation[C]//Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Stroudsburg, USA: ACL Press, 2021: 2734-2744.
[29] HIMMI A, STAERMAN G, PICOT M, et al. Enhanced hallucination detection in neural machine translation through simple detector aggregation[EB/OL]. [2024-03-17]. https://arxiv.org/abs/2402.13331.
[30] WANG W, JIAO W, WANG S, et al. Understanding and mitigating the uncertainty in zero-shot translation[EB/OL]. [2024-03-17]. https://arxiv.org/abs/2205.10068.
[31] SENNRICH R, VAMVAS J, MOHAMMADSHAHI A. Mitigating hallucinations and off-target machine translation with source-contrastive and language-contrastive decoding[C]//Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2024: 21-33.
[32] BROCKETT C, DOLAN W B, GAMON M. Correcting ESL errors using phrasal SMT techniques[C]//Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the ACL. Stroudsburg, USA: ACL Press, 2006: 249-256.
[33] XIE Z A, AVATI A, ARIVAZHAGAN N, et al. Neural language correction with character-based attention[EB/OL]. [2024-03-17]. https://arxiv.org/abs/1603.09727.
[34] YUAN Z, BRISCOE T. Grammatical error correction using neural machine translation[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, USA: ACL Press, 2016: 380-386.
[35] JI J S, WANG Q L, TOUTANOVA K, et al. A nested attention neural hybrid model for grammatical error correction[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, USA: ACL Press, 2017: 753-762.
[36] SONG K T, TAN X, LU J F. Neural machine translation with error correction[C]//Proceedings of the 29th International Joint Conference on Artificial Intelligence. Yokohama, Japan: International Joint Conferences on Artificial Intelligence Organization, 2020: 3891-3897.
[37] REI R, DE SOUZA J G C, ALVES D, et al. COMET-22: Unbabel-IST 2022 submission for the metrics shared task[C]//Proceedings of the 7th Conference on Machine Translation (WMT). Stroudsburg, USA: ACL Press, 2022: 578-585.
[38] BURCHELL L, BIRCH A, BOGOYCHEV N, et al. An open dataset and model for language identification[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Stroudsburg, USA: ACL Press, 2023: 865-879.
[39] TOUVRON H, MARTIN L, STONE K, et al. Llama 2: open foundation and fine-tuned chat models[EB/OL]. [2024-03-17]. https://arxiv.org/abs/2307.09288.
[40] COSTA-JUSSA M R, CROSS J, ÇELEBI O, et al. No language left behind: scaling human-centered machine translation[EB/OL]. [2024-03-17]. https://arxiv.org/abs/2207.04672.
[41] WOLF T, DEBUT L, SANH V, et al. Transformers: state-of-the-art natural language processing[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Stroudsburg, USA: ACL Press, 2020: 38-45.
[42] XUE L T, CONSTANT N, ROBERTS A, et al. mT5: a massively multilingual pre-trained text-to-text transformer[C]//Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, USA: ACL Press, 2021: 483-498.
[43] SHAZEER N, STERN M. Adafactor: adaptive learning rates with sublinear memory cost[C]//Proceedings of the 35th International Conference on Machine Learning. [S. l.]: PMLR, 2018: 4596-4604.
[44] REI R, TREVISO M, GUERREIRO N M, et al. CometKiwi: IST-Unbabel 2022 submission for the quality estimation shared task[C]//Proceedings of the 7th Conference on Machine Translation (WMT). Stroudsburg, USA: ACL Press, 2022: 634-645.
[45] JIAO W X, HUANG J T, WANG W X, et al. ParroT: translating during chat using large language models tuned with human translation and feedback[C]//Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023. Stroudsburg, USA: ACL Press, 2023: 15009-15020.
[46] CHEN Y, LIU Y, MENG F, et al. Improving translation faithfulness of large language models via augmenting instructions[EB/OL]. [2024-04-17]. https://arxiv.org/abs/2308.12674.
[47] YANG W, LI C, ZHANG J J, et al. BigTrans: augmenting large language models with multilingual translation capability over 100 languages[EB/OL]. [2024-03-17]. https://arxiv.org/abs/2305.18098.
[48] POST M. A call for clarity in reporting BLEU scores[C]//Proceedings of the 3rd Conference on Machine Translation: Research Papers. Stroudsburg, USA: ACL Press, 2018: 186-191.
[49] MÜLLER M, SENNRICH R. Understanding the properties of minimum Bayes risk decoding in neural machine translation[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg, USA: ACL Press, 2021: 259-272.