
Computer Engineering ›› 2022, Vol. 48 ›› Issue (6): 89-94,106. doi: 10.19678/j.issn.1000-3428.0061630

• Artificial Intelligence and Pattern Recognition •

  • About the authors: LI Junhuai (born 1969), male, professor, Ph.D.; his research interests include activity recognition, cloud computing, and big data. CHEN Miaomiao, M.S. candidate; WANG Huaijun, associate professor, Ph.D.; CUI Ying'an, lecturer, Ph.D.; ZHANG Aihua, engineer.
  • Funding:
    National Key Research and Development Program of China (2018YFB1703000); Fund of the Shaanxi Provincial Water Resources Department (2020slkj-17).

Chinese Named Entity Recognition Method Based on ALBERT-BGRU-CRF

LI Junhuai1, CHEN Miaomiao1, WANG Huaijun1, CUI Ying'an1, ZHANG Aihua2   

  1. School of Computer Science and Engineering, Xi'an University of Technology, Xi'an 710048, China;
    2. Sapa Chalco Aluminium Products (Chongqing) Co., Ltd., Chongqing 401326, China
  • Received: 2021-05-12  Revised: 2021-07-26  Published: 2021-08-11



Abstract: Named Entity Recognition (NER) is an important basis for upper-level natural language processing tasks such as knowledge graph construction, search engines, and recommendation systems. Chinese NER labels and classifies proper nouns or specific named entities in a text sequence. To address the inability of existing Chinese NER methods to effectively extract long-distance semantic information and to handle polysemy, this study proposes a Chinese NER method based on the ALBERT pre-trained language model, a Bidirectional Gated Recurrent Unit (BGRU), and a Conditional Random Field (CRF), called the ALBERT-BGRU-CRF model. First, the ALBERT pre-trained language model performs word embedding on the input text to obtain dynamic word vectors, which effectively resolves the polysemy problem. Second, the BGRU extracts contextual semantic features to further capture the semantics between long-distance words. Finally, the concatenated vectors are fed into the CRF layer and decoded with the Viterbi algorithm, reducing the probability of outputting incorrect labels. The entity annotations are thus obtained, completing the Chinese NER task. The experimental results show that the precision and recall of the ALBERT-BGRU-CRF model on the MSRA corpus reach 95.16% and 94.58%, respectively, and that its F1 value is 4.43 and 3.78 percentage points higher than that of the fragment neural network model and the CNN-BiLSTM-CRF model, respectively.
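The final decoding step described in the abstract can be illustrated with a minimal pure-Python sketch of Viterbi decoding over per-token tag scores and a tag-to-tag transition matrix, as performed by the CRF layer. The tag set, emission scores, and transition values below are illustrative assumptions, not values from the paper:

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence for one sentence.

    emissions: list of per-token score lists, emissions[t][j] = score of
               tag j at token t (in the paper, produced by the BGRU layer).
    transitions: transitions[i][j] = score of moving from tag i to tag j
                 (in the paper, learned by the CRF layer).
    """
    num_tags = len(emissions[0])
    score = list(emissions[0])          # best score ending in each tag so far
    backptrs = []                       # best previous tag for each position
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(num_tags):
            # Best previous tag i to transition into tag j.
            best_i = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backptrs.append(ptrs)
    # Backtrack from the best final tag.
    last = max(range(num_tags), key=lambda j: score[j])
    path = [last]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    return path[::-1]


# Hypothetical 3-tag scheme: 0 = O, 1 = B, 2 = I.
# The transition matrix penalizes the illegal move O -> I, so the decoder
# prefers a well-formed B-I-O sequence even when token scores are ambiguous.
transitions = [
    [0, 0, -10],   # from O
    [0, 0, 1],     # from B
    [0, 0, 1],     # from I
]
emissions = [
    [0, 5, 0],     # token 0: strongly B
    [1, 0, 1],     # token 1: O and I equally likely
    [3, 0, 0],     # token 2: strongly O
]
print(viterbi_decode(emissions, transitions))  # -> [1, 2, 0], i.e. B, I, O
```

Note how the transition scores, not the per-token scores alone, resolve the ambiguous middle token to I: this is exactly the error-label suppression the abstract attributes to the CRF layer.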

Key words: Named Entity Recognition(NER), pre-trained language model, Bidirectional Gated Recurrent Unit(BGRU), Conditional Random Field(CRF), word vector, deep learning

CLC Number: