Survey of Uyghur Machine Translation Research

doi:10.19678/j.issn.1000-3428.0068124

Abstract

Uyghur machine translation is one of the important tasks in China's low-resource machine translation research, and its development and application can better promote cultural exchange and trade between different regions and ethnic groups. However, Uyghur is an agglutinative language, and machine translation for it faces problems such as complex morphology and scarce corpora. In recent years, at different stages of the development of Uyghur machine translation, researchers have continually optimized and innovated algorithms and models to address these characteristics and have achieved a number of research results; however, a systematic review has been lacking. This paper comprehensively reviews the related research on Uyghur machine translation and divides it into three types according to the methods used: rule- and example-based, statistics-based, and neural network-based Uyghur machine translation. Related academic activities and corpus resources are also summarized. To further explore the potential of Uyghur machine translation, the ChatGPT model is applied in a preliminary study of the Uyghur-Chinese machine translation task. The experimental results show that, in the few-shot scenario, translation performance first rises and then falls as the number of examples increases, with the best performance at 10-shot. In addition, the chain-of-thought approach does not demonstrate better translation ability on the Uyghur machine translation task. Finally, future research directions for Uyghur machine translation are discussed.

Keywords: Uyghur, rule- and example-based machine translation, statistical machine translation, neural machine translation (NMT), large language model (LLM)
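As a rough illustration of the few-shot setting mentioned in the abstract, the sketch below assembles a k-shot Uyghur-to-Chinese prompt from demonstration pairs. The function name `build_few_shot_prompt`, the prompt wording, and the placeholder sentence pairs are all assumptions made for illustration; they are not the prompt template or data used in the paper, and no model is called.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], source: str, k: int) -> str:
    """Assemble a k-shot Uyghur-to-Chinese translation prompt (hypothetical template)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    lines = ["Translate the following Uyghur sentence into Chinese."]
    # Use at most k demonstration pairs; k = 0 gives a zero-shot prompt.
    for uyghur, chinese in examples[:k]:
        lines.append(f"Uyghur: {uyghur}")
        lines.append(f"Chinese: {chinese}")
    # The sentence to translate goes last, with the target side left open.
    lines.append(f"Uyghur: {source}")
    lines.append("Chinese:")
    return "\n".join(lines)


# Placeholder demonstration pairs (hypothetical); a real experiment would
# draw these from a Uyghur-Chinese parallel corpus.
demo_pairs = [
    (f"<Uyghur sentence {i}>", f"<Chinese translation {i}>") for i in range(1, 21)
]
prompt_10_shot = build_few_shot_prompt(demo_pairs, "<Uyghur test sentence>", k=10)
prompt_zero_shot = build_few_shot_prompt(demo_pairs, "<Uyghur test sentence>", k=0)
```

Varying `k` over a small grid and scoring each setting with the same metric is one way to look for the kind of rise-then-fall trend the abstract reports.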

摘要：

维吾尔语机器翻译作为我国低资源机器翻译研究的重要任务之一，其发展与应用可以更好地促进不同地区和民族之间的文化交流与贸易往来。然而，维吾尔语作为一种黏着性语言，在机器翻译领域存在形态复杂、语料稀缺等问题。近年来，在维吾尔语机器翻译发展的不同阶段，研究人员针对其特点在算法和模型上不断优化与创新，取得了一定的研究成果，但缺乏系统性的综述。全面回顾维吾尔语机器翻译的相关研究，并根据方法的不同将其分为基于规则和实例的维吾尔语机器翻译、基于统计的维吾尔语机器翻译以及基于神经网络的维吾尔语机器翻译3种类型，同时对相关学术活动和语料库资源进行汇总。为进一步探索维吾尔语机器翻译的潜力，采用ChatGPT模型对维吾尔语-汉语机器翻译任务进行初步研究，实验结果表明，在Few-shot情景下，随着示例数的增加，翻译性能先升后降，在10-shot时表现最佳。此外，思维链方法在维吾尔语机器翻译任务中并未展示出更优的翻译能力。最后对维吾尔语机器翻译未来的研究方向进行了展望。

关键词: 维吾尔语, 基于规则和实例的机器翻译, 统计机器翻译, 神经机器翻译, 大语言模型

Halidanmu ABUDUKELIMU, Yutao HOU, Dengfeng YAO, Abudukelimu ABULIZI, Jishang CHEN. Survey of Uyghur Machine Translation Research[J]. Computer Engineering, 2024, 50(1): 1-16.

哈里旦木·阿布都克里木, 侯钰涛, 姚登峰, 阿布都克力木·阿布力孜, 陈吉尚. 维吾尔语机器翻译研究综述[J]. 计算机工程, 2024, 50(1): 1-16.


URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0068124

http://www.ecice06.com/EN/Y2024/V50/I1/1

Figures/Tables 18

Fig.1 Rule- and example-based Uyghur machine translation

Fig.2 Statistics-based Uyghur machine translation

Fig.3 Uyghur NMT training framework

Fig.4 Model of Uyghur machine translation based on attention mechanisms

Fig.5 Uyghur machine translation based on back translation

Fig.6 Uyghur machine translation based on transfer learning

Fig.7 Uyghur machine translation based on the BERT-fused model


