Question Similarity Calculation Model Based on Multi-Attention CNN

doi:10.19678/j.issn.1000-3428.0052098

Computer Engineering ›› 2019, Vol. 45 ›› Issue (9): 284-290. doi: 10.19678/j.issn.1000-3428.0052098

FENG Xingjie, ZHANG Le, ZENG Yunze

College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China

Received: 2018-07-13  Revised: 2018-08-19  Online: 2019-09-15  Published: 2019-09-03
About the authors: FENG Xingjie (born 1969), male, professor, Ph.D.; main research interests: intelligent information processing and intelligent algorithms. ZHANG Le (corresponding author) and ZENG Yunze, master's students.
Funding: Young Scientists Fund of the National Natural Science Foundation of China (61301245, 61201414); CERNET Next Generation Internet Technology Innovation Project (NGII20160605).

Abstract

Abstract: In intelligent customer service question-answering systems, user questions tend to have complex consultation intent, weak contextual relevance, and a colloquial style. These traits lower the accuracy of question similarity calculation and lead to irrelevant answers. To address this, a similarity calculation model based on a Convolutional Neural Network (CNN), named MA-CNN, is proposed. Through two different attention mechanisms, the model attends simultaneously to the semantic information between words and to the overall semantic information between sentences, which improves the customer service system's ability to understand questions. Experimental results show that, compared with models based on word vectors and on Recurrent Neural Networks (RNN), the MA-CNN model distinguishes questions more effectively, with an F1 value reaching up to 0.501.

Key words: intelligent customer service, text similarity, word semantics, sentence semantics, Convolutional Neural Network (CNN), attention mechanism
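
The abstract reports model quality as an F1 value. As a reminder of what that number measures, the following is a minimal sketch of F1 for a binary "similar / not similar" labeling. It assumes the positive class is "similar" (label 1). The evaluation protocol actually used in the paper is not described on this page, and the labels below are toy values, not data from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class, from paired gold and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = question pair is similar, 0 = not similar.
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 1, 0, 1, 0]
print(round(f1_score(gold, pred), 3))
```

Because F1 is the harmonic mean of precision and recall, a model cannot score well by favoring one at the expense of the other. This matters when similar question pairs are much rarer than dissimilar ones.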

CLC number: TP391

FENG Xingjie, ZHANG Le, ZENG Yunze. Question Similarity Calculation Model Based on Multi-Attention CNN[J]. Computer Engineering, 2019, 45(9): 284-290.

https://www.ecice06.com/CN/Y2019/V45/I9/284
