[1] SUTSKEVER I, VINYALS O, LE Q V. Sequence to sequence learning with neural networks[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2014: 3104-3112.
[2] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[EB/OL]. (2016-05-19)[2020-09-10]. https://arxiv.org/pdf/1409.0473.pdf.
[3] MENG F, ZHANG J. DTMT: a novel deep transition architecture for neural machine translation[C]//Proceedings of 2019 AAAI Conference on Artificial Intelligence. [S.l.]: AAAI Press, 2019: 224-231.
[4] GEHRING J, AULI M, GRANGIER D, et al. Convolutional sequence to sequence learning[C]//Proceedings of the 34th International Conference on Machine Learning. New York, USA: ACM Press, 2017: 1243-1252.
[5] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2017: 6000-6010.
[6] 哈里旦木·阿布都克里木, 刘洋, 孙茂松. 神经机器翻译系统在维吾尔语汉语翻译中的性能对比[J]. 清华大学学报(自然科学版), 2017, 57(8): 878-883. ABUDUKELIMU H, LIU Y, SUN M S. Performance comparison of neural machine translation systems in Uyghur-Chinese translation[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(8): 878-883. (in Chinese)
[7] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. (2019-05-24)[2020-09-10]. https://arxiv.org/pdf/1810.04805.pdf.
[8] LIU Y, OTT M, GOYAL N, et al. RoBERTa: a robustly optimized BERT pretraining approach[EB/OL]. (2019-07-26)[2020-09-10]. https://arxiv.org/pdf/1907.11692v1.pdf.
[9] RADFORD A, WU J, CHILD R, et al. Language models are unsupervised multitask learners[J]. OpenAI Blog, 2019, 1(8): 9.
[10] 李俊, 吕学强. 融合BERT语义加权与网络图的关键词抽取方法[J]. 计算机工程, 2020, 46(9): 89-94. LI J, LÜ X Q. Keyword extraction method based on BERT semantic weighting and network graph[J]. Computer Engineering, 2020, 46(9): 89-94. (in Chinese)
[11] RAJPURKAR P, JIA R, LIANG P. Know what you don't know: unanswerable questions for SQuAD[EB/OL]. (2018-06-11)[2020-09-10]. https://arxiv.org/pdf/1806.03822.pdf.
[12] ZHANG H, XU J, WANG J. Pretraining-based natural language generation for text summarization[EB/OL]. (2019-02-25)[2020-09-10]. https://arxiv.org/pdf/1902.09243v2.pdf.
[13] CLINCHANT S, JUNG K W, NIKOULINA V. On the use of BERT for neural machine translation[EB/OL]. (2019-09-27)[2020-09-10]. https://arxiv.org/pdf/1909.12744.pdf.
[14] LI L, JIANG X, LIU Q. Pretrained language models for document-level neural machine translation[EB/OL]. (2019-11-08)[2020-09-10]. https://arxiv.org/pdf/1911.03110.pdf.
[15] ZHU J, XIA Y, WU L, et al. Incorporating BERT into neural machine translation[EB/OL]. (2020-02-17)[2020-09-10]. https://arxiv.org/pdf/2002.06823.pdf.
[16] BERT-base-multilingual-uncased model[EB/OL]. [2020-09-10]. https://storage.googleapis.com/bert_models/2018_11_03/multilingual_L-12_H-768_A-12.zip.
[17] BERT-base-Chinese model[EB/OL]. [2020-09-10]. https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip.
[18] BERT-wwm-ext model[EB/OL]. [2020-09-10]. https://drive.google.com/file/d/1iNeYFhCBJWeUsIlnW_2K6SMwXkM4gLb_/view.
[19] CUI Y, CHE W, LIU T, et al. Pre-training with whole word masking for Chinese BERT[EB/OL]. (2020-02-17)[2020-09-10]. https://arxiv.org/pdf/1906.08101v2.pdf.
[20] RoBERTa-wwm-large-ext model[EB/OL]. [2020-09-10]. https://drive.google.com/open?id=1-2vEZfIFCdM1-vJ3GD6DlSyKT4eVXMKq.
[21] RoBERTa-wwm-ext model[EB/OL]. [2020-09-10]. https://drive.google.com/open?id=1eHM3l4fMo6DsQYGmey7UZGiTmQquHw25.
[22] RBTL3 model[EB/OL]. [2020-09-10]. https://drive.google.com/open?id=1qs5OasLXXjOnR2XuGUh12NanUl0pkjEv.
[23] JAWAHAR G, SAGOT B, SEDDAH D. What does BERT learn about the structure of language?[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. [S.l.]: Association for Computational Linguistics, 2019: 3651-3657.
[24] McCLOSKEY M, COHEN N J. Catastrophic interference in connectionist networks: the sequential learning problem[J]. Psychology of Learning and Motivation, 1989, 24: 109-165.
[25] SENNRICH R, HADDOW B, BIRCH A. Neural machine translation of rare words with subword units[EB/OL]. (2016-06-03)[2020-09-10]. https://arxiv.org/pdf/1508.07909v4.pdf.
[26] OTT M, EDUNOV S, BAEVSKI A, et al. fairseq: a fast, extensible toolkit for sequence modeling[EB/OL]. (2019-04-01)[2020-09-10]. https://arxiv.org/pdf/1904.01038.pdf.
[27] PAPINENI K, ROUKOS S, WARD T, et al. BLEU: a method for automatic evaluation of machine translation[C]//Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. [S.l.]: Association for Computational Linguistics, 2002: 311-318.