Computer Engineering ›› 2021, Vol. 47 ›› Issue (1): 58-65,71. doi: 10.19678/j.issn.1000-3428.0055986

• Artificial Intelligence and Pattern Recognition •

Entity Name Recognition Method Based on Dilated Convolutional Iterative and Attention Mechanism

LÜ Jianghai, DU Junping, ZHOU Nan, XUE Zhe

  1. Beijing Key Laboratory of Intelligent Communication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2019-09-11  Revised: 2019-11-18  Published: 2019-12-11

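  • About the authors: LÜ Jianghai (born 1993), male, master's student, whose main research interests are deep learning, information retrieval, and machine learning; DU Junping (corresponding author), professor; ZHOU Nan, Ph.D. candidate; XUE Zhe, lecturer.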
  • Supported by:
    National Natural Science Foundation of China (61772083, 61532006); Guangxi Science and Technology Major Project (AA18118054).

Abstract: Traditional entity name recognition methods fail to balance the effectiveness of feature extraction from text sequences against the training speed of neural network models. To address this problem, this paper proposes an entity name recognition method based on an Iterated Dilated Convolutional Neural Network and an Attention Mechanism (IDCNN-ATT). The IDCNN exploits GPU parallel computing while retaining the key property of Long Short-Term Memory (LSTM) networks: recording as much input information as possible with a simple structure. It therefore extracts text sequence features accurately while greatly accelerating model training. In addition, the Attention Mechanism (ATT) uses the grammatical information of the text and the part-of-speech information of the words to select, from many text features, those most critical to entity name recognition, which improves the accuracy of text feature extraction. Experimental results show that the proposed attention-based method improves the evaluation metrics of entity name recognition by about 2% compared with the traditional method without an attention mechanism.
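The idea that dilated convolutions let a simple structure take in a wide context can be illustrated with a minimal sketch. It is not the paper's model: the dilation rates `[1, 2, 4]`, the all-ones kernel of width 3, and the function names are assumptions chosen only to show how the receptive field of one output position grows as layers with increasing dilation are stacked.

```python
def dilated_conv1d(xs, kernel, dilation):
    """1-D dilated convolution with zero padding (output length == input length).

    The kernel must have odd width; tap k reads xs[t + (k - center) * dilation].
    """
    center = len(kernel) // 2
    out = []
    for t in range(len(xs)):
        acc = 0.0
        for k, w in enumerate(kernel):
            i = t + (k - center) * dilation
            if 0 <= i < len(xs):  # positions outside the sequence count as zero
                acc += w * xs[i]
        out.append(acc)
    return out


def receptive_field(kernel_width, dilations):
    """Number of input positions that can influence one output position."""
    return 1 + sum((kernel_width - 1) * d for d in dilations)


# Hypothetical settings for illustration; not taken from the paper.
DILATIONS = [1, 2, 4]
KERNEL = [1.0, 1.0, 1.0]

# Push a unit impulse through the stack. The positions with nonzero output
# are the ones reached by the impulse, which mirrors the receptive field.
signal = [0.0] * 21
signal[10] = 1.0
for d in DILATIONS:
    signal = dilated_conv1d(signal, KERNEL, d)

touched = [i for i, v in enumerate(signal) if v != 0.0]
print(len(touched), receptive_field(len(KERNEL), DILATIONS))
```

Every position in a layer is computed independently of the others, so a layer can be evaluated in parallel across the whole sequence, whereas a recurrent network must process tokens one after another.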

Key words: entity name recognition, Attention Mechanism(ATT), dilated convolution, Long Short-Term Memory(LSTM) network, Conditional Random Field(CRF)

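Abstract (translated from the Chinese): To address the problem that traditional entity name recognition methods cannot balance effective feature extraction from text sequences with the training speed of neural network models, an entity name recognition method based on an Iterated Dilated Convolutional Neural Network (IDCNN) and an Attention Mechanism (ATT) is proposed. IDCNN exploits the optimization capability of GPU parallel computing and retains the characteristic of the long short-term memory network, namely recording as much input information as possible with a simple structure, so it speeds up model training while accurately extracting text sequence features. ATT is introduced to use text grammar information and word part-of-speech information to select, from many text features, those most critical to entity name recognition, thereby improving the accuracy of text feature extraction. Experimental results on a news dataset and a Weibo dataset show that the training speed of the neural network model is significantly higher than that of the traditional bidirectional long short-term memory network, and that the evaluation metrics of the attention-based entity name recognition method are about 2% higher than those of the traditional method without an attention mechanism.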

