Named Entity Recognition Method for Unsafe Underground Behaviors in Coal Mines

doi:10.19678/j.issn.1000-3428.0069917

Abstract

To improve the efficiency of underground safety management and support safe coal mine production, a corpus of unsafe underground behaviors in coal mines, containing 8 entity categories and 2,359 samples, is constructed with the BIO labeling strategy, drawing on the relevant standards and norms of the coal mine industry and on domain knowledge of underground unsafe behavior. To address the insufficient use of semantic information, imbalanced entity distribution, and fuzzy entity boundaries in the named entity recognition task for unsafe behavior in coal mines, this study proposes a named entity recognition model based on Global Pointer and adversarial training. First, an improved hierarchical RoBERTa model draws on multi-layer semantic information to enhance the vectorization of underground unsafe behavior text, and adversarial training perturbs the word embedding layer to alleviate data imbalance and improve model robustness. Second, a Bidirectional Gated Recurrent Unit (BiGRU) in the feature extraction layer captures the contextual semantic features of the corpus more effectively and strengthens the semantic association of the text. Finally, a Global Pointer is constructed in the decoding layer to obtain more accurate entity boundaries. The proposed model is evaluated on a self-built, small-sample dataset of unsafe underground behaviors in coal mines. The results show that the precision, recall, and F1 value of the proposed model are 78.77%, 78.20%, and 78.48%, respectively, which are 2.27, 0.63, and 1.45 percentage points higher than those of the BERT-Global Pointer model. The findings provide a basis for constructing a knowledge graph of unsafe underground behavior in coal mines.
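
The reported scores can be sanity-checked with a few lines of arithmetic. The sketch below assumes the first reported score is precision (the quantity from which F1 is defined) and that F1 is the usual harmonic mean of precision and recall. The baseline values are derived here by subtracting the reported gains; they are not quoted from the paper, and the function and variable names are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Scores reported in the abstract for the proposed model, in percent.
proposed_p, proposed_r, proposed_f1 = 78.77, 78.20, 78.48
# Reported gains over BERT-Global Pointer, in percentage points.
gain_p, gain_r, gain_f1 = 2.27, 0.63, 1.45

# Implied baseline scores (assumption: obtained by simple subtraction).
base_p = proposed_p - gain_p
base_r = proposed_r - gain_r
base_f1 = proposed_f1 - gain_f1

print(round(f1_score(proposed_p, proposed_r), 2))  # 78.48
print(round(f1_score(base_p, base_r), 2), round(base_f1, 2))  # 77.03 77.03
```

Both the reported F1 and the implied baseline F1 agree with the harmonic mean of the corresponding precision and recall to two decimal places, so the three reported improvements are mutually consistent.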

Key words: unsafe underground behavior, named entity recognition, RoBERTa model, adversarial training, Global Pointer model
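
To make the relationship between the BIO-labeled corpus and span-based decoding concrete, the following is a minimal sketch, not the authors' implementation. It converts BIO tags into (type, start, end) spans and decodes spans from a per-type start-by-end score matrix, which is the general form of output a Global Pointer style decoder produces. The entity types `PERSON` and `ACTION`, the toy tag sequence, the helper names, and the zero threshold are all assumptions for illustration; the abstract does not list the 8 entity categories.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (entity type, start index, inclusive end index)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Collect entity spans from a BIO tag sequence."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last entity
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((label, start, i - 1))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def spans_to_scores(spans: List[Span], labels: List[str], length: int):
    """Build a toy score tensor: +1 for gold spans, -1 everywhere else."""
    scores = [[[-1.0] * length for _ in range(length)] for _ in labels]
    for label, s, e in spans:
        scores[labels.index(label)][s][e] = 1.0
    return scores


def decode_span_scores(scores, labels: List[str], threshold: float = 0.0) -> List[Span]:
    """Keep every (start, end) pair with end >= start whose score exceeds the threshold."""
    spans: List[Span] = []
    for t, matrix in enumerate(scores):
        for s, row in enumerate(matrix):
            for e in range(s, len(row)):  # upper triangle only
                if row[e] > threshold:
                    spans.append((labels[t], s, e))
    return spans


labels = ["PERSON", "ACTION"]  # hypothetical entity types
tags = ["B-PERSON", "I-PERSON", "O", "B-ACTION", "I-ACTION", "I-ACTION"]
gold = bio_to_spans(tags)
decoded = decode_span_scores(spans_to_scores(gold, labels, len(tags)), labels)
print(gold)  # [('PERSON', 0, 1), ('ACTION', 3, 5)]
print(sorted(decoded) == sorted(gold))  # True
```

Because each candidate span is scored as a whole rather than assembled from per-token tags, a span-based decoder assigns start and end boundaries jointly, which is the property the abstract relies on for more accurate boundary recognition.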


FU Yan, LIU Peiyi, YE Ou. Named Entity Recognition Method for Unsafe Underground Behaviors in Coal Mines[J]. Computer Engineering, 2026, 52(4): 424-432.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0069917

https://www.ecice06.com/EN/Y2026/V52/I4/424

