[1] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. [2021-05-10]. https://arxiv.org/abs/1810.04805.
[2] SANH V, DEBUT L, CHAUMOND J, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter[EB/OL]. [2021-05-10]. https://arxiv.org/abs/1910.01108.
[3] JIAO X Q, YIN Y C, SHANG L F, et al. TinyBERT: distilling BERT for natural language understanding[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: Association for Computational Linguistics, 2020: 4163-4174.
[4] SUN Z Q, YU H K, SONG X D, et al. MobileBERT: a compact task-agnostic BERT for resource-limited devices[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: Association for Computational Linguistics, 2020: 2158-2170.
[5] TANG R, LU Y, LIU L Q, et al. Distilling task-specific knowledge from BERT into simple neural networks[EB/OL]. [2021-05-10]. https://arxiv.org/abs/1903.12136.
[6] CHEN Y B, XU L H, LIU K, et al. Event extraction via dynamic multi-pooling convolutional neural networks[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: Association for Computational Linguistics, 2015: 409-419.
[7] NGUYEN T H, CHO K, GRISHMAN R. Joint event extraction via recurrent neural networks[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, USA: Association for Computational Linguistics, 2016: 300-309.
[8] SHA L, QIAN F, CHANG B, et al. Jointly extracting event triggers and arguments by dependency-bridge RNN and tensor-based argument interaction[C]//Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI, 2018: 5916-5923.
[9] LIU X, LUO Z C, HUANG H Y. Jointly multiple events extraction via attention-based graph information aggregation[C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: Association for Computational Linguistics, 2018: 1247-1256.
[10] HE R F, DUAN S Y. Joint Chinese event extraction based on multi-task learning[J]. Journal of Software, 2019, 30(4): 1015-1030. (in Chinese)
[11] WANG X, HAN X, LIU Z, et al. Adversarial training for weakly supervised event detection[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, USA: Association for Computational Linguistics, 2019: 998-1008.
[12] WANG X, WANG Z, HAN X, et al. HMEAE: hierarchical modular event argument extraction[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: Association for Computational Linguistics, 2019: 5781-5787.
[13] YANG S, FENG D W, QIAO L B, et al. Exploring pre-trained language models for event extraction and generation[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: Association for Computational Linguistics, 2019: 5284-5294.
[14] LIU J, CHEN Y, LIU K, et al. Event extraction as machine reading comprehension[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: Association for Computational Linguistics, 2020: 1641-1651.
[15] LI F Y, PENG W H, CHEN Y G, et al. Event extraction as multi-turn question answering[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: Association for Computational Linguistics, 2020: 829-838.
[16] HINTON G, VINYALS O, DEAN J. Distilling the knowledge in a neural network[EB/OL]. [2021-05-10]. https://arxiv.org/abs/1503.02531.
[17] AHN S, HU S X, DAMIANOU A, et al. Variational information distillation for knowledge transfer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2019: 9155-9163.
[18] ZAGORUYKO S, KOMODAKIS N. Paying more attention to attention: improving the performance of convolutional neural networks via attention transfer[EB/OL]. [2021-05-10]. https://arxiv.org/abs/1612.03928.
[19] LIAO S L, JI J M, YU C, et al. Intention classification method based on BERT model and knowledge distillation[J]. Computer Engineering, 2021, 47(5): 73-79. (in Chinese)
[20] LAN Z Z, CHEN M D, GOODMAN S, et al. ALBERT: a lite BERT for self-supervised learning of language representations[EB/OL]. [2021-05-10]. https://arxiv.org/abs/1909.11942.
[21] PENNINGTON J, SOCHER R, MANNING C. GloVe: global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: Association for Computational Linguistics, 2014: 1532-1543.
[22] KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. [2021-05-10]. https://arxiv.org/abs/1609.02907.