
Computer Engineering


Research on Answer Generation Based on Knowledge Base Question Answering


Published: 2024-04-09


Abstract: Knowledge base question answering aims to answer users' questions using a pre-constructed knowledge base. Existing knowledge base question answering research mainly ranks candidate entities and relation paths and finally returns the tail entity of a triple as the answer. After a user's question passes through an entity recognition model and an entity disambiguation model, it can be linked to candidate entities in the knowledge base that are related to the answer. Using the generation ability of a language model, the answer can then be expanded into a full sentence before it is returned, which is friendlier to users. To improve the generalization ability of the model and bridge the gap between question text and structured knowledge, the candidate entities and their one-hop relation subgraphs are organized by a prompt template and fed into the generation model, which generates plain, fluent text under the guidance of an answer template. Experimental results on the NLPCC 2016 CKBQA and KgCLUE Chinese datasets show that, on average, the proposed method outperforms the BART-large model by 2.8%, 2.3%, and 1.5% on the BLEU, METEOR, and ROUGE families of metrics, respectively. On the Perplexity metric, the method performs comparably to ChatGPT's responses.
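The pipeline the abstract describes can be pictured with a short sketch: a linked candidate entity and its one-hop relation subgraph are serialized through a prompt template and handed to a seq2seq generation model, which produces a sentence-level answer. The sketch below is a minimal illustration assuming a Hugging Face Transformers setup; the template wording, the fnlp/bart-base-chinese checkpoint, the example question and triples, and the build_prompt helper are illustrative assumptions, not the authors' exact configuration, and the trailing answer slot in the prompt merely stands in for the paper's answer-template guidance.

# Minimal sketch of the prompt-template idea: serialize a candidate entity
# and its one-hop subgraph into a prompt, then let a seq2seq model generate
# a fluent sentence-level answer. Checkpoint, template wording, and example
# data are assumptions for illustration, not the paper's exact setup.
from transformers import BertTokenizer, BartForConditionalGeneration

def build_prompt(question, entity, one_hop_triples):
    # Flatten the one-hop subgraph into "head relation tail" fragments.
    subgraph = "; ".join(f"{h} {r} {t}" for h, r, t in one_hop_triples)
    # The final "回答:" slot cues the model to produce the answer sentence.
    return f"问题: {question} 实体: {entity} 子图: {subgraph} 回答:"

# fnlp's Chinese BART checkpoints ship with a BERT-style tokenizer.
tokenizer = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")
model = BartForConditionalGeneration.from_pretrained("fnlp/bart-base-chinese")

prompt = build_prompt(
    "姚明的妻子是谁?",
    "姚明",
    [("姚明", "妻子", "叶莉"), ("姚明", "职业", "篮球运动员")],
)
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(ids[0], skip_special_tokens=True))

In this framing, the fine-tuning objective would teach the model to turn the serialized triples into a natural-language sentence rather than echo the bare tail entity, which is the user-friendliness gain the abstract claims.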
