
Computer Engineering ›› 2022, Vol. 48 ›› Issue (7): 66-72. doi: 10.19678/j.issn.1000-3428.0061432

• Artificial Intelligence and Pattern Recognition •

Chinese Named Entity Recognition Model Based on Transformer Encoder

SI Yichen, GUAN Youqing   

  1. School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  • Received: 2021-04-25  Revised: 2021-08-13  Online: 2022-07-15  Published: 2021-08-19

  • About the authors: SI Yichen (born 1996), male, master's student; his main research interest is natural language processing. GUAN Youqing, associate professor, holds a master's degree.
  • Funding:
    Natural Science Research Project of Jiangsu Higher Education Institutions (05KJD520146).

Abstract: Named Entity Recognition(NER) is an important task in Natural Language Processing(NLP), and Chinese NER is generally more difficult than its English counterpart. Traditional Chinese entity recognition models typically use deep neural networks to assign a label to every character in the text and then identify named entities from the resulting label sequence; however, such character-based labeling methods have difficulty capturing word-level information. To address this problem, this paper proposes a Chinese NER model based on the Transformer encoder. In the character embedding layer, a dictionary-enhanced vector encoding method is used so that each character vector also carries word information. Meanwhile, to solve the problem that the Transformer encoder loses the relative position information of characters during attention computation, the attention calculation of the Transformer encoder is modified and a relative position encoding method is introduced. Finally, a Conditional Random Field(CRF) model is applied to obtain the optimal tag sequence. The experimental results show that the F1 score of this model reaches 94.7% on the Resume dataset and 58.2% on the Weibo dataset, improving on traditional NER models based on a Bidirectional Long Short-Term Memory(BiLSTM) network and an Iterated Dilated Convolutional Neural Network(ID-CNN), with better recognition performance and faster convergence.
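
To make the relative-position idea concrete, the following is a minimal single-head PyTorch sketch of self-attention with a learned relative position bias added to the attention scores. The clipping distance, the exact score formula, and the single-head simplification are assumptions for illustration; the abstract does not give the paper's precise formulation.

    import torch
    import torch.nn as nn

    class RelativePositionAttention(nn.Module):
        """Single-head self-attention with learned relative position embeddings.

        Instead of adding absolute position encodings to the input, a bias
        derived from the relative distance between characters is injected
        into the attention score, so distance information survives the
        dot-product. The clipping distance `max_dist` is an assumed
        hyperparameter, not taken from the paper.
        """

        def __init__(self, d_model: int, max_dist: int = 16):
            super().__init__()
            self.d_model = d_model
            self.max_dist = max_dist
            self.q_proj = nn.Linear(d_model, d_model)
            self.k_proj = nn.Linear(d_model, d_model)
            self.v_proj = nn.Linear(d_model, d_model)
            # One embedding per clipped relative distance in [-max_dist, max_dist].
            self.rel_emb = nn.Embedding(2 * max_dist + 1, d_model)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq_len, d_model)
            b, n, d = x.shape
            q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

            # Relative distance matrix rel[i, j] = j - i, clipped and
            # shifted to index into the embedding table.
            pos = torch.arange(n, device=x.device)
            rel = (pos[None, :] - pos[:, None]).clamp(-self.max_dist, self.max_dist)
            r = self.rel_emb(rel + self.max_dist)            # (n, n, d)

            # Content-content term plus content-position term.
            scores = torch.einsum("bid,bjd->bij", q, k)      # (b, n, n)
            scores = scores + torch.einsum("bid,ijd->bij", q, r)
            attn = torch.softmax(scores / d ** 0.5, dim=-1)
            return attn @ v                                  # (b, n, d)

    # Usage: RelativePositionAttention(128)(torch.randn(2, 10, 128)) -> (2, 10, 128)

Because the bias depends only on the distance j - i, the attention score changes when two characters swap order, which is exactly the sensitivity that plain scaled dot-product attention lacks.
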

Key words: Natural Language Processing(NLP), Chinese Named Entity Recognition(NER), Transformer encoder, Conditional Random Field(CRF), relative position encoding
