
计算机工程 (Computer Engineering) ›› 2023, Vol. 49 ›› Issue (4): 256-262. doi: 10.19678/j.issn.1000-3428.0064432

• Development Research and Engineering Application •

基于注意力机制特征融合的中文命名实体识别

廖列法, 谢树松   

  1. 江西理工大学 信息工程学院, 江西 赣州 341000
  • 收稿日期:2022-04-11 修回日期:2022-05-27 发布日期:2023-04-07
  • About the authors: LIAO Liefa (born 1975), male, professor, Ph.D.; main research interests: urban computing, e-commerce, personalized recommendation, and natural language processing. XIE Shusong, master's student.
  • Funding:
    National Natural Science Foundation of China (71761018).

Chinese Named Entity Recognition Based on Attention Mechanism Feature Fusion

LIAO Liefa, XIE Shusong   

  1. School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China
  • Received:2022-04-11 Revised:2022-05-27 Published:2023-04-07

摘要: 命名实体识别是自然语言处理领域中信息抽取、信息检索、知识图谱等任务的基础。在命名实体识别任务中,Transformer编码器更加关注全局语义,对位置和方向信息不敏感,而双向长短期记忆(BiLSTM)网络可以提取文本中的方向信息,但缺少全局语义信息。为同时获得全局语义信息和方向信息,提出使用注意力机制动态融合Transformer编码器和BiLSTM的模型。使用相对位置编码和修改注意力计算公式对Transformer编码器进行改进,利用改进的Transformer编码器提取全局语义信息,并采用BiLSTM捕获方向信息。结合注意力机制动态调整权重,深度融合全局语义信息和方向信息以获得更丰富的上下文特征。使用条件随机场进行解码,实现实体标注序列预测。此外,针对Word2Vec等传统词向量方法无法表示词的多义性问题,使用RoBERTa-wwm预训练模型作为模型的嵌入层提供字符级嵌入,获得更多的上下文语义信息和词汇信息,增强实体识别效果。实验结果表明,该方法在中文命名实体识别数据集Resume和Weibo上F1值分别达到96.68%和71.29%,相比ID-CNN、BiLSTM、CAN-NER等方法,具有较优的识别效果。
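The abstract states that the Transformer encoder is improved with relative position encoding and a modified attention calculation formula, but does not give the exact equations. The following is only a minimal single-head sketch of that kind of modification (an un-scaled, TENER-style attention with a learnable relative-position term); the class name, the additive content-position score, and the choice to drop the 1/sqrt(d) scaling are illustrative assumptions, not the authors' formulation.

```python
import torch
import torch.nn as nn


class RelativeSelfAttention(nn.Module):
    """Single-head self-attention with relative position information (sketch)."""

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # One learnable embedding per relative offset in [-(max_len-1), max_len-1].
        self.rel_pos = nn.Embedding(2 * max_len - 1, d_model)
        self.max_len = max_len

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        seq_len = x.size(1)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Relative offsets j - i for every (i, j) pair, shifted to be non-negative.
        pos = torch.arange(seq_len, device=x.device)
        rel = pos[None, :] - pos[:, None] + self.max_len - 1      # (seq_len, seq_len)
        r = self.rel_pos(rel)                                     # (seq_len, seq_len, d_model)

        # Content-content term plus content-position term; the scores are left
        # un-scaled (no division by sqrt(d_model)), a sharpening often used for NER.
        content_scores = torch.matmul(q, k.transpose(-1, -2))     # (batch, seq_len, seq_len)
        position_scores = torch.einsum("bld,lmd->blm", q, r)      # (batch, seq_len, seq_len)
        attn = torch.softmax(content_scores + position_scores, dim=-1)
        return torch.matmul(attn, v)                               # (batch, seq_len, d_model)
```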

关键词: 注意力机制, Transformer编码器, 特征融合, 中文命名实体识别, 预训练模型

Abstract: Named Entity Recognition (NER) underpins information extraction, information retrieval, knowledge graphs, and other tasks in Natural Language Processing (NLP). In NER, the Transformer encoder attends mainly to global semantics and is insensitive to position and direction information, whereas a Bidirectional Long Short-Term Memory (BiLSTM) network captures directional information from text but lacks global semantic information. To obtain both at once, a model is proposed that uses an attention mechanism to dynamically fuse the Transformer encoder and the BiLSTM. The Transformer encoder is improved with relative position encoding and a modified attention calculation formula; the improved encoder extracts global semantic information, while the BiLSTM captures directional information. The attention mechanism dynamically adjusts the fusion weights, deeply merging the global semantic and directional information into richer contextual features. A Conditional Random Field (CRF) then decodes these features to predict the entity label sequence. Furthermore, because traditional word-vector methods such as Word2Vec cannot represent polysemy, the RoBERTa-wwm pretrained model is used as the embedding layer to provide character-level embeddings, supplying richer contextual semantic and lexical information and improving entity recognition. Experimental results show that the proposed method reaches F1 values of 96.68% and 71.29% on the Chinese NER datasets Resume and Weibo, respectively, outperforming ID-CNN, BiLSTM, CAN-NER, and other methods.
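The abstract describes an attention mechanism that dynamically weights Transformer-encoder features against BiLSTM features before CRF decoding. Below is a minimal sketch assuming a per-token sigmoid gate computed from the concatenated features; the actual weighting scheme, layer names, and dimensions in the paper may differ, and the CRF step is only indicated in a comment.

```python
import torch
import torch.nn as nn


class AttentionFeatureFusion(nn.Module):
    """Per-token gated fusion of Transformer-encoder and BiLSTM features (sketch)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.score = nn.Linear(2 * d_model, 1)

    def forward(self, h_transformer: torch.Tensor, h_bilstm: torch.Tensor) -> torch.Tensor:
        # Both inputs: (batch, seq_len, d_model), e.g. each encoder run over
        # RoBERTa-wwm character embeddings and projected to a common size.
        alpha = torch.sigmoid(self.score(torch.cat([h_transformer, h_bilstm], dim=-1)))
        # Convex combination: alpha controls how much global-semantic versus
        # directional information enters the fused representation per token.
        return alpha * h_transformer + (1.0 - alpha) * h_bilstm


# Hypothetical usage: fuse the two encoder outputs, then map them to tag
# emission scores; a CRF layer (e.g. pytorch-crf) would decode these
# emissions into the final entity-label sequence.
batch, seq_len, d_model, num_tags = 2, 16, 256, 9
fusion = AttentionFeatureFusion(d_model)
fused = fusion(torch.randn(batch, seq_len, d_model), torch.randn(batch, seq_len, d_model))
emissions = nn.Linear(d_model, num_tags)(fused)  # input to the CRF decoder
```

The sigmoid gate keeps the fused vector a convex combination of the two feature streams, so neither encoder's contribution can be lost entirely; this is one simple way to realize the "dynamically adjusted weights" the abstract mentions.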

Key words: attention mechanism, Transformer encoder, feature fusion, Chinese Named Entity Recognition(NER), pretraining model
