
Computer Engineering ›› 2022, Vol. 48 ›› Issue (3): 315-320. doi: 10.19678/j.issn.1000-3428.0060474

• Development Research and Engineering Application •

Short Text Entity Linking Method Based on Multi-Task Learning

ZHAN Fei1,2, ZHU Yanhui1,2, LIANG Wentong1,2, ZHANG Xu1,2, OUYANG Kang1,2, KONG Lingwei1,2, HUANG Yalin1,2   

  1. School of Computer, Hunan University of Technology, Zhuzhou, Hunan 412008, China;
    2. Hunan Key Laboratory of Intelligent Information Perception and Processing Technology, Zhuzhou, Hunan 412008, China
  • Received: 2021-01-04  Revised: 2021-02-23  Published: 2021-03-02
  • About the authors: ZHAN Fei (1993-), male, M.S. candidate, whose research interests include natural language processing and knowledge engineering; ZHU Yanhui (corresponding author), professor; LIANG Wentong, ZHANG Xu, OUYANG Kang, KONG Lingwei, and HUANG Yalin, M.S. candidates.
  • Supported by:
    National Natural Science Foundation of China (61871432); Natural Science Foundation of Hunan Province (2020JJ6089); Key Project funded by the Education Department of Hunan Province (19A133).




Abstract: Entity linking is an important means of clarifying entity mentions in text and a key technology for constructing knowledge graphs, playing an important role in fields such as question answering and information retrieval. However, existing short text entity linking methods achieve low accuracy because short texts lack rich context and exhibit informal expression and incomplete grammatical structure. To address this problem, a new short text entity linking method based on multi-task learning is proposed. Specifically, multi-task learning is introduced into the short text entity linking process, and a multi-task learning model is constructed that takes short text entity linking as the main task and introduces entity classification as an auxiliary task, which alleviates the problem of insufficient information in short text entity linking. On this basis, the model is encouraged to learn more general underlying representations, improving its generalization ability and optimizing its performance on the short text entity linking task. Experimental results on the dataset provided by the China Conference on Knowledge Graph and Semantic Computing (CCKS) 2020 Evaluation Task 2 show that the introduction of the auxiliary task alleviates the problem of insufficient information in short text entity linking, and that the F value of the proposed model reaches 0.894 9, outperforming a single-task entity linking model based on Bidirectional Encoder Representations from Transformers (BERT).
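The hard-parameter-sharing setup described in the abstract (a shared encoder feeding a main entity-linking head and an auxiliary entity-classification head, trained on a joint loss) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the shared "encoder" is a single linear layer standing in for BERT, and all dimensions, labels, and the auxiliary-loss weight `aux_weight` are assumptions.

```python
import numpy as np

# Illustrative sketch of multi-task learning with a shared encoder:
# the total loss sums the main-task loss (entity linking, treated here
# as a binary mention-candidate match) and a weighted auxiliary-task
# loss (entity type classification). All sizes are arbitrary choices.

rng = np.random.default_rng(0)

HIDDEN = 16        # shared representation size (assumed)
NUM_TYPES = 4      # number of entity types for the auxiliary task (assumed)

# Shared "encoder": one linear layer standing in for BERT in this sketch.
W_shared = rng.normal(size=(32, HIDDEN))
# Task-specific heads on top of the shared representation.
w_link = rng.normal(size=HIDDEN)              # entity-linking score head
W_cls = rng.normal(size=(HIDDEN, NUM_TYPES))  # entity-classification head

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def joint_loss(x, link_label, type_label, aux_weight=0.5):
    """Joint multi-task loss for one example."""
    h = np.tanh(x @ W_shared)                  # shared underlying representation
    # Main task: binary match score (sigmoid cross-entropy).
    p_link = 1.0 / (1.0 + np.exp(-(h @ w_link)))
    loss_link = -(link_label * np.log(p_link)
                  + (1 - link_label) * np.log(1 - p_link))
    # Auxiliary task: entity type classification (softmax cross-entropy).
    p_type = softmax(h @ W_cls)
    loss_cls = -np.log(p_type[type_label])
    # Both gradients flow into W_shared, pushing it toward more general features.
    return loss_link + aux_weight * loss_cls

x = rng.normal(size=32)   # stand-in for an encoded short-text mention
loss = joint_loss(x, link_label=1, type_label=2)
```

Because both heads backpropagate through `W_shared`, the auxiliary entity-classification signal regularizes the shared representation, which is the mechanism the abstract credits for the improved generalization.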

Key words: short text entity linking, multi-task learning, entity classification, auxiliary task, underlying representation
