[1]ZHANG Lei,ZHANG Yi.Big data analysis by infinite deep neural networks[J].Journal of Computer Research and Development,2016,53(1):68-79.
[2]BOSCO A,LAGANÀ D,MUSMANNO R,et al.Modeling and solving the mixed capacitated general routing problem[J].Optimization Letters,2013,7(7):1451-1469.
[3]MIKOLOV T,SUTSKEVER I,CHEN K,et al.Distributed representations of words and phrases and their compositionality[J].Advances in Neural Information Processing Systems,2013,26:3111-3119.
[4]CHIU J P C,NICHOLS E.Named entity recognition with bidirectional LSTM-CNNs[EB/OL].[2018-09-30].https://arxiv.org/pdf/1511.08308.pdf.
[5]ZHANG Suxiang,WANG Xiaojie.Automatic recognition of Chinese organization name based on conditional random fields[C]//Proceedings of International Conference on Natural Language Processing and Knowledge Engineering.Washington D.C.,USA:IEEE Press,2007:229-233.
[6]BORTHWICK A E.A maximum entropy approach to named entity recognition[D].New York,USA:New York University,1999.
[7]BIKEL D M,MILLER S,SCHWARTZ R,et al.Nymble:a high-performance learning name-finder[C]//Proceedings of the Fifth Conference on Applied Natural Language Processing.Stroudsburg,USA:Association for Computational Linguistics,1997:194-201.
[8]ASAHARA M,MATSUMOTO Y.Japanese named entity extraction with redundant morphological analysis[C]//Proceedings of NAACL’03.Stroudsburg,USA:Association for Computational Linguistics,2003:8-15.
[9]MCCALLUM A,LI Wei.Early results for named entity recognition with conditional random fields,feature induction and Web-enhanced lexicons[C]//Proceedings of CONLL’03.Stroudsburg,USA:Association for Computational Linguistics,2003:188-191.
[10]CHO K,VAN MERRIENBOER B,GULCEHRE C,et al.Learning phrase representations using RNN encoder-decoder for statistical machine translation[EB/OL].[2018-09-30].http://anthology.aclweb.org/D/D14/D14-1179.pdf.
[11]SANTOS C N D,GATTI M.Deep convolutional neural networks for sentiment analysis of short texts[EB/OL].[2018-09-30].http://www.aclweb.org/anthology/C14-1008.
[12]LI Jiwei,GALLEY M,BROCKETT C,et al.A diversity-promoting objective function for neural conversation models[EB/OL].[2018-09-25].https://arxiv.org/pdf/1510.03055.pdf.
[13]KARJALA T W,HIMMELBLAU D M,MIIKKULAINEN R.Data rectification using recurrent (Elman) neural networks[C]//Proceedings of International Joint Conference on Neural Networks.Washington D.C.,USA:IEEE Press,1992:901-906.
[14]GRAVES A.Long short-term memory[M]//GRAVES A.Supervised sequence labelling with recurrent neural networks.Berlin,Germany:Springer,2012:37-45.
[15]ZHOU Guobing,WU Jianxin,ZHANG Chenlin,et al.Minimal gated unit for recurrent neural networks[J].International Journal of Automation and Computing,2016,13(3):226-234.
[16]MIKOLOV T,CHEN Kai,CORRADO G,et al.Efficient estimation of word representations in vector space[EB/OL].[2018-09-10].http://export.arxiv.org/pdf/1301.3781.
[17]BENGIO Y,SCHWENK H,SENÉCAL J S,et al.Neural probabilistic language models[J].Journal of Machine Learning Research,2003,3(6):1137-1155.
[18]MIKOLOV T,KARAFIÁT M,BURGET L,et al.Recurrent neural network based language model[EB/OL].[2018-09-25].http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf.
[19]MIKOLOV T,ZWEIG G.Context dependent recurrent neural network language model[C]//Proceedings of 2012 IEEE Spoken Language Technology Workshop.Washington D.C.,USA:IEEE Press,2012:234-239.
[20]MIKOLOV T,DEORAS A,POVEY D,et al.Strategies for training large scale neural network language models[C]//Proceedings of 2011 IEEE Workshop on Automatic Speech Recognition and Understanding.Washington D.C.,USA:IEEE Press,2011:196-201.
[21]GRAVES A,SCHMIDHUBER J.Framewise phoneme classification with bidirectional LSTM and other neural network architectures[J].Neural Networks,2005,18(5):602-610.
[22]HOCHREITER S,SCHMIDHUBER J.Long short-term memory[J].Neural Computation,1997,9(8):1735-1780.
[23]GRAVES A,JAITLY N,MOHAMED A R.Hybrid speech recognition with deep bidirectional LSTM[C]//Proceedings of 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.Washington D.C.,USA:IEEE Press,2013:273-278.
[24]BAHDANAU D,CHO K,BENGIO Y.Neural machine translation by jointly learning to align and translate[EB/OL].[2018-09-15].https://arxiv.org/pdf/1409.0473.pdf.
[25]RATNAPARKHI A.A maximum entropy model for part-of-speech tagging[C]//Proceedings of Conference on Empirical Methods in Natural Language Processing.Washington D.C.,USA:IEEE Press,1996:133-142.
[26]MCCALLUM A,FREITAG D,PEREIRA F C N.Maximum entropy Markov models for information extraction and segmentation[C]//Proceedings of the 17th International Conference on Machine Learning.[S.l.]:Morgan Kaufmann Publishers Inc.,2000:591-598.
[27]LAFFERTY J D,MCCALLUM A,PEREIRA F C N.Conditional random fields:probabilistic models for segmenting and labeling sequence data[C]//Proceedings of the 18th International Conference on Machine Learning.[S.l.]:Morgan Kaufmann Publishers Inc.,2001:282-289.