
计算机工程 (Computer Engineering) ›› 2023, Vol. 49 ›› Issue (4): 256-262. doi: 10.19678/j.issn.1000-3428.0064432

• Development Research and Engineering Application •

基于注意力机制特征融合的中文命名实体识别

廖列法, 谢树松   

  1. 江西理工大学 信息工程学院, 江西 赣州 341000
  • 收稿日期:2022-04-11 修回日期:2022-05-27 发布日期:2023-04-07
  • About the authors: LIAO Liefa (born 1975), male, professor, Ph.D.; main research interests: urban computing, e-commerce, personalized recommendation, and natural language processing. XIE Shusong, master's student.
  • Funding:
    National Natural Science Foundation of China (71761018).

Chinese Named Entity Recognition Based on Attention Mechanism Feature Fusion

LIAO Liefa, XIE Shusong   

  1. School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China
  • Received:2022-04-11 Revised:2022-05-27 Published:2023-04-07

摘要: 命名实体识别是自然语言处理领域中信息抽取、信息检索、知识图谱等任务的基础。在命名实体识别任务中,Transformer编码器更加关注全局语义,对位置和方向信息不敏感,而双向长短期记忆(BiLSTM)网络可以提取文本中的方向信息,但缺少全局语义信息。为同时获得全局语义信息和方向信息,提出使用注意力机制动态融合Transformer编码器和BiLSTM的模型。使用相对位置编码和修改注意力计算公式对Transformer编码器进行改进,利用改进的Transformer编码器提取全局语义信息,并采用BiLSTM捕获方向信息。结合注意力机制动态调整权重,深度融合全局语义信息和方向信息以获得更丰富的上下文特征。使用条件随机场进行解码,实现实体标注序列预测。此外,针对Word2Vec等传统词向量方法无法表示词的多义性问题,使用RoBERTa-wwm预训练模型作为模型的嵌入层提供字符级嵌入,获得更多的上下文语义信息和词汇信息,增强实体识别效果。实验结果表明,该方法在中文命名实体识别数据集Resume和Weibo上F1值分别达到96.68%和71.29%,相比ID-CNN、BiLSTM、CAN-NER等方法,具有较优的识别效果。
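The abstract states that the Transformer encoder is improved with relative position encoding and a modified attention calculation formula, but does not give the exact equations. The following is only a minimal single-head sketch of that kind of modification (an un-scaled, TENER-style attention with a learnable relative-position term); the class name, the additive content-position score, and the choice to drop the 1/sqrt(d) scaling are illustrative assumptions, not the authors' formulation.

```python
import torch
import torch.nn as nn


class RelativeSelfAttention(nn.Module):
    """Single-head self-attention with relative position information (sketch)."""

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # One learnable embedding per relative offset in [-(max_len-1), max_len-1].
        self.rel_pos = nn.Embedding(2 * max_len - 1, d_model)
        self.max_len = max_len

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        seq_len = x.size(1)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Relative offsets j - i for every (i, j) pair, shifted to be non-negative.
        pos = torch.arange(seq_len, device=x.device)
        rel = pos[None, :] - pos[:, None] + self.max_len - 1      # (seq_len, seq_len)
        r = self.rel_pos(rel)                                     # (seq_len, seq_len, d_model)

        # Content-content term plus content-position term; the scores are left
        # un-scaled (no division by sqrt(d_model)), a sharpening often used for NER.
        content_scores = torch.matmul(q, k.transpose(-1, -2))     # (batch, seq_len, seq_len)
        position_scores = torch.einsum("bld,lmd->blm", q, r)      # (batch, seq_len, seq_len)
        attn = torch.softmax(content_scores + position_scores, dim=-1)
        return torch.matmul(attn, v)                               # (batch, seq_len, d_model)
```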

关键词: 注意力机制, Transformer编码器, 特征融合, 中文命名实体识别, 预训练模型

Abstract: Named Entity Recognition (NER) underpins information extraction, information retrieval, knowledge graphs, and other tasks in Natural Language Processing (NLP). In NER, the Transformer encoder attends mainly to global semantics and is insensitive to position and direction information, whereas a Bidirectional Long Short-Term Memory (BiLSTM) network captures directional information from text but lacks global semantic information. To obtain both at once, a model is proposed that uses an attention mechanism to dynamically fuse the Transformer encoder and the BiLSTM. The Transformer encoder is improved with relative position encoding and a modified attention calculation formula; the improved encoder extracts global semantic information, while the BiLSTM captures directional information. The attention mechanism dynamically adjusts the fusion weights, deeply merging the global semantic and directional information into richer contextual features. A Conditional Random Field (CRF) then decodes these features to predict the entity label sequence. Furthermore, because traditional word-vector methods such as Word2Vec cannot represent polysemy, the RoBERTa-wwm pretrained model is used as the embedding layer to provide character-level embeddings, supplying richer contextual semantic and lexical information and improving entity recognition. Experimental results show that the proposed method reaches F1 values of 96.68% and 71.29% on the Chinese NER datasets Resume and Weibo, respectively, outperforming ID-CNN, BiLSTM, CAN-NER, and other methods.
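The abstract describes an attention mechanism that dynamically weights Transformer-encoder features against BiLSTM features before CRF decoding. Below is a minimal sketch assuming a per-token sigmoid gate computed from the concatenated features; the actual weighting scheme, layer names, and dimensions in the paper may differ, and the CRF step is only indicated in a comment.

```python
import torch
import torch.nn as nn


class AttentionFeatureFusion(nn.Module):
    """Per-token gated fusion of Transformer-encoder and BiLSTM features (sketch)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.score = nn.Linear(2 * d_model, 1)

    def forward(self, h_transformer: torch.Tensor, h_bilstm: torch.Tensor) -> torch.Tensor:
        # Both inputs: (batch, seq_len, d_model), e.g. each encoder run over
        # RoBERTa-wwm character embeddings and projected to a common size.
        alpha = torch.sigmoid(self.score(torch.cat([h_transformer, h_bilstm], dim=-1)))
        # Convex combination: alpha controls how much global-semantic versus
        # directional information enters the fused representation per token.
        return alpha * h_transformer + (1.0 - alpha) * h_bilstm


# Hypothetical usage: fuse the two encoder outputs, then map them to tag
# emission scores; a CRF layer (e.g. pytorch-crf) would decode these
# emissions into the final entity-label sequence.
batch, seq_len, d_model, num_tags = 2, 16, 256, 9
fusion = AttentionFeatureFusion(d_model)
fused = fusion(torch.randn(batch, seq_len, d_model), torch.randn(batch, seq_len, d_model))
emissions = nn.Linear(d_model, num_tags)(fused)  # input to the CRF decoder
```

The sigmoid gate keeps the fused vector a convex combination of the two feature streams, so neither encoder's contribution can be lost entirely; this is one simple way to realize the "dynamically adjusted weights" the abstract mentions.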

Key words: attention mechanism, Transformer encoder, feature fusion, Chinese Named Entity Recognition(NER), pretraining model
