
Computer Engineering (计算机工程) ›› 2025, Vol. 51 ›› Issue (2): 94-101. doi: 10.19678/j.issn.1000-3428.0068433

• Artificial Intelligence and Pattern Recognition •


Research on Answer Generation Based on Knowledge Base Question Answering

RAO Dongning (饶东宁)1, XU Zhenghui (许正辉)1, LIANG Ruishi (梁瑞仕)2,*

  1. School of Computer Science, Guangdong University of Technology, Guangzhou 510000, Guangdong, China
    2. School of Computer Science, University of Electronic Science and Technology of China, Zhongshan Institute, Zhongshan 528400, Guangdong, China
  • Received: 2023-09-21  Online: 2025-02-15  Published: 2024-04-09
  • Contact: LIANG Ruishi
  • Supported by: General Program of the Natural Science Foundation of Guangdong Province (2021A1515012556); Zhongshan Major Science and Technology Project (2021A1003); Zhongshan Major Science and Technology Project (2023AJ002); Guangdong Enterprise Science and Technology Commissioner Program (GDKTP2021025700); Guangdong First-Class Undergraduate Course Construction Project (YLKC202202)


Abstract:

Knowledge base question answering aims to answer user questions with a pre-constructed knowledge base. Existing research mainly ranks candidate entities and relation paths and then returns the tail entity of the top-ranked triple as the answer. After the user's question passes through an entity recognition model and an entity disambiguation model, it can be linked to candidate entities in the knowledge base that are related to the answer. Using the generation capability of a language model, the answer can then be expanded into a complete sentence, which is friendlier to users. To improve the generalization ability of the model and bridge the gap between the question text and structured knowledge, the candidate entities and their one-hop relation subgraphs are organized by a prompt template and fed into the generation model, which produces an accessible, fluent answer under the guidance of an answer template. Experimental results on the NLPCC 2016 CKBQA and KgCLUE Chinese datasets show that, on average, the proposed method outperforms the BART-large model by 2.8, 2.3, and 1.5 percentage points on the Bilingual Evaluation Understudy (BLEU), Metric for Evaluation of Translation with Explicit Ordering (METEOR), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) series metrics, respectively, and its responses are comparable to those of ChatGPT in terms of perplexity.

Key words: knowledge base question answering, prompt, entity linking, pre-trained language model, answer generation
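
As an illustration of the pipeline summarized in the abstract, the following Python sketch serializes a linked candidate entity and its one-hop subgraph through a prompt template and hands the result to a seq2seq generation model. The template wording, the example triples, and the commented-out fnlp/bart-base-chinese checkpoint are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): build a prompt from a candidate
# entity and its one-hop subgraph, then generate a fluent sentence answer.
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def build_prompt(question: str, entity: str, subgraph: List[Triple]) -> str:
    """Organize the question, the linked candidate entity, and its one-hop
    triples into a single prompt string; the trailing answer template nudges
    the model toward a complete, fluent sentence."""
    facts = "; ".join(f"{h} 的 {r} 是 {t}" for h, r, t in subgraph)
    return (
        f"问题: {question} "
        f"候选实体: {entity} "
        f"相关知识: {facts} "
        "请用一句通顺的话回答: "
    )


if __name__ == "__main__":
    question = "《红楼梦》的作者是谁?"
    entity = "红楼梦"
    subgraph = [("红楼梦", "作者", "曹雪芹"), ("红楼梦", "类别", "长篇小说")]
    print(build_prompt(question, entity, subgraph))

    # Hypothetical generation step with a public Chinese BART checkpoint
    # (any encoder-decoder model fine-tuned on such prompts would do):
    # from transformers import BertTokenizer, BartForConditionalGeneration
    # tok = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")
    # model = BartForConditionalGeneration.from_pretrained("fnlp/bart-base-chinese")
    # ids = tok(build_prompt(question, entity, subgraph), return_tensors="pt").input_ids
    # out = model.generate(ids, max_new_tokens=64)
    # print(tok.decode(out[0], skip_special_tokens=True))
```

In the full system, the subgraph would come from the entity recognition and disambiguation stages described in the abstract, and the generation model would be fine-tuned so that its output stays faithful to the retrieved triples.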