
Computer Engineering, 2023, Vol. 49, Issue (12): 96-102. doi: 10.19678/j.issn.1000-3428.0066449

• Artificial Intelligence and Pattern Recognition •


Lattice Named Entity Recognition Based on Global Nodes and Multi-fragments

Jiangtao GUO, Furong PENG*   

  1. Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China
  • Received: 2022-12-05 Online: 2023-12-15 Published: 2023-12-14
  • Contact: Furong PENG
  • About the author: GUO Jiangtao (born 1996), male, M.S., whose main research interest is natural language processing
  • Funding: General Program of the National Natural Science Foundation of China (62276162)


Abstract:

A lattice Named Entity Recognition (NER) model that combines global nodes and multiple fragments is proposed to address two problems: the large amount of annotated data required by existing NER models, and the high annotation cost incurred by manual word segmentation in active learning-based NER models. The fully connected structure of the Transformer is replaced with a global-node and multi-fragment structure in which each node computes attention only with a constructed context vector; the global node and the fragment nodes capture global and local information, respectively, thereby reducing the amount of annotated data required. The Flat-Lattice structure is improved so that existing active learning strategies no longer require word segmentation, reducing the annotation cost while maintaining model performance. Experimental results on four datasets, namely MSRA, OntoNotes 5.0, Weibo, and PeopleDaily, show that compared with the FLAT model, the proposed model reduces the amount of annotated data required to reach the corresponding F1 threshold by 39.90%, 2.17%, 34.60%, and 35.67%, respectively.
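
Below is a minimal, hypothetical sketch (in Python/PyTorch, not the authors' released code) of the sparse attention pattern described in the abstract: each token attends only within its local fragment and to a single global node, while the global node attends to the whole sequence. The fragment length frag_len, the single-global-node layout, and all function names are illustrative assumptions.

import torch

def build_global_fragment_mask(seq_len: int, frag_len: int) -> torch.Tensor:
    """Boolean attention mask of shape (seq_len + 1, seq_len + 1).

    Index 0 is the global node; indices 1..seq_len are the tokens.
    mask[i, j] == True means position i may attend to position j.
    """
    total = seq_len + 1
    mask = torch.zeros(total, total, dtype=torch.bool)
    mask[0, :] = True   # the global node attends to every position
    mask[:, 0] = True   # every token attends to the global node
    for start in range(0, seq_len, frag_len):
        end = min(start + frag_len, seq_len)
        # tokens within the same fragment attend to each other (local information)
        mask[1 + start:1 + end, 1 + start:1 + end] = True
    return mask

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to the permitted positions."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

# Usage: 10 tokens plus 1 global node, fragments of 4 tokens each.
n, d = 10, 16
x = torch.randn(n + 1, d)              # row 0 is the global node embedding
mask = build_global_fragment_mask(n, 4)
out = masked_attention(x, x, x, mask)  # shape (n + 1, d)

Every row of the mask keeps at least one permitted position (the global node), so the softmax is always well defined; compared with full attention, the per-layer cost drops from O(n^2) to roughly O(n * frag_len).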

Key words: global node, multi-fragment, lattice, Named Entity Recognition (NER), active learning, Transformer structure
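
The abstract further states that the improved Flat-Lattice structure removes the need for manual word segmentation in active learning. The sketch below shows one plausible, segmentation-free acquisition step that scores unlabeled sentences by mean per-character label entropy and selects the most uncertain ones for annotation; the entropy criterion and all names here are assumptions, not the paper's confirmed strategy.

import torch

def sentence_uncertainty(char_label_probs: torch.Tensor) -> float:
    """Mean per-character entropy; input has shape (num_chars, num_labels)."""
    entropy = -(char_label_probs * char_label_probs.clamp_min(1e-12).log()).sum(-1)
    return entropy.mean().item()

def select_for_annotation(prob_pool, k: int):
    """Pick the k sentences whose character-level predictions are least certain."""
    scores = [(i, sentence_uncertainty(p)) for i, p in enumerate(prob_pool)]
    scores.sort(key=lambda t: t[1], reverse=True)
    return [i for i, _ in scores[:k]]

# Usage: three unlabeled sentences with per-character label distributions.
pool = [torch.softmax(torch.randn(n, 5), dim=-1) for n in (8, 12, 6)]
print(select_for_annotation(pool, k=2))  # indices of the 2 most uncertain sentences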