
Computer Engineering, 2024, Vol. 50, Issue (4): 294-302. doi: 10.19678/j.issn.1000-3428.0067595

• Development Research and Engineering Application •

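Similarity Computation of Patent Phrases Based on Knowledge Injection Prompt Learning

DENG Yuanfei1, LI Jiawei1, JIANG Yuncheng1,2

  1. School of Computer Science, South China Normal University, Guangzhou 510631, Guangdong, China;
  2. School of Artificial Intelligence, South China Normal University, Foshan 528225, Guangdong, China
  • Received: 2023-05-10 Revised: 2023-07-04 Published: 2023-07-27
  • Corresponding author: DENG Yuanfei, E-mail: dengyf@m.scnu.edu.cn
  • Funding: National Natural Science Foundation of China (61772210, U1911201).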

Abstract: A patent is a legal right granted to inventors to protect their inventions for a limited period, and it plays an important role in present-day social activities. However, existing methods have not been adapted or optimized for patent similarity data, so they perform poorly on the task of patent phrase similarity matching. Previous research has shown that, in low-resource scenarios, prompt learning takes text fragments (i.e., templates) as input and transforms a classification problem into a masked language modeling problem; a key step is constructing a projection between the label space and the label word space. This study proposes a prompt learning method based on knowledge injection and applies it to patent phrase similarity matching. To address the insufficient information carried by patent phrases, the method exploits the similarity label information of patent phrases and uses external knowledge to enrich both the phrases and the label information. First, entity linking is used to associate patent phrases with external knowledge. Second, a neighborhood information filtering mechanism based on entity influence is designed to alleviate the lack of information in patent phrases. Finally, considering how different kinds of external knowledge affect patent phrase similarity computation, several knowledge-enhanced prompt texts are designed for patent phrases. Experimental results show that the proposed method improves the Pearson Correlation Coefficient (PCC) and the Spearman Rank Correlation Coefficient (SRC) by 6.8% and 5.7%, respectively, over the second-best baseline.

Key words: patent phrase, similarity computation, knowledge injection, prompt learning, prompt text

CLC number: