Knowledge Base Question Answering Method Based on Weak Dependency Information
(基于弱依赖信息的知识库问答方法)

Computer Engineering (计算机工程), 2021, Vol. 47, Issue 6: 76-82. doi: 10.19678/j.issn.1000-3428.0058312

WU Tianbo¹, LIU Luping¹, LUO Xiaodong¹, QING Linbo¹,², HE Xiaohai¹
(吴天波, 刘露平, 罗晓东, 卿粼波, 何小海)

1. College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China;
2. Key Laboratory of Wireless Power Transmission, Ministry of Education, Chengdu 610065, China

Received: 2020-05-13 Revised: 2020-06-16 Published: 2020-06-17
About the authors: WU Tianbo (born 1996), male, master's student, main research interest natural language processing; LIU Luping and LUO Xiaodong, doctoral students; QING Linbo, associate professor; HE Xiaohai (corresponding author), professor.
Funding: National Natural Science Foundation of China (61871278); Sichuan Science and Technology Program (2018HH0143); Chengdu Industrial Cluster Collaborative Innovation Project (2016-XT00-00015-GX).
Contact e-mail: nic5602@scu.edu.cn

Abstract

Abstract: Traditional automatic question answering methods usually rely on prior information such as predicates to perform knowledge base question answering (KBQA), which is labor-intensive and generalizes poorly. To address this problem, this paper proposes a KBQA method based on weak dependency information. The method combines BERT with a BiLSTM-CRF network to extract the named entity from a question. It then locates the triples related to that entity in the knowledge base and uses an answer matching network to assign a similarity score to each answer in the triple set. Finally, a threshold selection strategy selects the answers that meet the requirement, and these are sorted in descending order of similarity score before being presented to the user. Experimental results show that the method weakens the dependence on prior information and reduces manual intervention while maintaining question answering quality. It achieves an F1 score of 87.05% on the NLPCC-ICCPOL-2016KBQA dataset.

Key words: weak dependency information, knowledge base question answering, named entity recognition, answer matching, threshold selection
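
The last steps of the pipeline described in the abstract (locating the triples for a recognized entity, scoring each candidate answer, applying a threshold, and sorting by score) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Triple` type, the function names, the exact-match lookup, and the fixed absolute cutoff are all hypothetical, and the `scorer` argument is a placeholder for the answer matching network, whose architecture the abstract does not specify.

```python
from typing import Callable, NamedTuple


class Triple(NamedTuple):
    """One knowledge base fact: (subject entity, predicate, answer object)."""
    subject: str
    predicate: str
    obj: str


def lookup_triples(kb: list[Triple], entity: str) -> list[Triple]:
    """Return every triple whose subject equals the recognized entity.

    Assumption: exact string match; a real system may need entity linking.
    """
    return [t for t in kb if t.subject == entity]


def select_answers(
    question: str,
    triples: list[Triple],
    scorer: Callable[[str, Triple], float],
    threshold: float,
) -> list[tuple[str, float]]:
    """Score each candidate, keep those at or above the threshold,
    and return them sorted by score from high to low.

    Assumption: the threshold is a fixed absolute cutoff.
    """
    scored = [(t.obj, scorer(question, t)) for t in triples]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy usage with made-up scores standing in for the matching network.
toy_kb = [
    Triple("Sichuan University", "location", "Chengdu"),
    Triple("Sichuan University", "type", "university"),
]
toy_scores = {"Chengdu": 0.9, "university": 0.2}  # hypothetical values
answers = select_answers(
    "Where is Sichuan University?",
    lookup_triples(toy_kb, "Sichuan University"),
    lambda question, triple: toy_scores[triple.obj],
    threshold=0.5,
)
print(answers)
```

Passing the scorer in as a function keeps the selection logic independent of the model, so a trained matching network could replace the lookup table without changing `select_answers`.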

CLC number: TP391

WU Tianbo, LIU Luping, LUO Xiaodong, QING Linbo, HE Xiaohai. Knowledge Base Question Answering Method Based on Weak Dependency Information[J]. Computer Engineering, 2021, 47(6): 76-82.

https://www.ecice06.com/CN/Y2021/V47/I6/76

