
Computer Engineering ›› 2024, Vol. 50 ›› Issue (12): 151-162. doi: 10.19678/j.issn.1000-3428.0068536

• Artificial Intelligence and Pattern Recognition •


Self-Supervised Taxonomy Completion Based on Prior Knowledge-Guided Prompt Learning

CHEN Zhiqiang (陈志强)1, QIU Yu (仇瑜)2, ZHU Yu (朱宇)1,*, WANG Xiaoying (王晓英)1

  1. Department of Computer Technology and Application, Qinghai University, Xining 810000, Qinghai, China
    2. Beijing ZhiPu HuaZhang Technology Limited Company, Beijing 100084, China
  • Received: 2023-10-10  Online: 2024-12-15  Published: 2024-03-11
  • Contact: ZHU Yu
  • Supported by: the National Natural Science Foundation of China (62166032, 62162053), the Natural Science Foundation of Qinghai Province (2022-ZJ-961Q), and the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0114102)


Abstract:

Seed taxonomies in various domains are incomplete, and large numbers of new domain terms emerge over time, so these taxonomies must be completed automatically. Existing self-supervised taxonomy completion methods rely on graph embedding techniques; they do not fully exploit the rich semantic information provided by pre-trained language models, and they focus only on local node relationships in the graph while ignoring the information carried by the overall graph structure. To address these problems, a self-supervised taxonomy completion model based on prior knowledge-guided prompt learning, named Pro-tax, is proposed. The model integrates the semantic information of a pre-trained language model with the structural information of the seed taxonomy. First, based on the coarse-grained triplets that a query node forms on its vertical path, the construction strategy for the self-supervised dataset is improved. Second, for large samples, matching is performed under a pre-training and fine-tuning paradigm. During the fine-tuning stage, to strengthen the attention of the pre-trained language model to the true hypernym, prior knowledge attention over synonyms or abbreviations of the true hypernym is integrated into the prompt, so that the prompt guides the fine-tuning of the pre-trained language model more effectively. During the matching stage, soft beam search rules are adopted to reduce time complexity: on the local graph structure, the node embeddings generated under prompt guidance are used to evaluate the query confidence with respect to sibling nodes at the same level, whereas on the global graph structure, a vertical-path walk is used for path interception and ranking-based filtering. Third, in the few-shot setting, matching is based on prompt learning, and different template combinations and in-context demonstrations are used to fine-tune the pre-trained language model. Finally, experimental results on large public datasets from four different domains show that, compared with the baseline models, Pro-tax improves the Mean Rank (MR), Mean Reciprocal Rank (MRR), and Hit@10 metrics by 15%, 0.057, and 0.030, respectively.
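To make the pipeline above concrete, the following is a minimal Python sketch of the first two steps: extracting coarse-grained (query, parent, grandparent) triplets from the vertical paths of a toy seed taxonomy, and assembling a cloze prompt that injects synonym/abbreviation prior knowledge. It is reconstructed from this abstract alone; all identifiers (PARENT, SYNONYMS, build_triplets, make_prompt) are hypothetical, and the plain-text hint merely stands in for the paper's prior-knowledge attention mechanism.

```python
# A minimal sketch, assuming a toy taxonomy; not the authors' code.
from typing import Dict, List, Tuple

# Toy seed taxonomy, child -> parent (a tree for simplicity).
PARENT: Dict[str, str] = {
    "convolutional neural network": "neural network",
    "neural network": "machine learning",
    "machine learning": "artificial intelligence",
}

# Prior knowledge: synonyms/abbreviations of candidate hypernyms (invented).
SYNONYMS: Dict[str, List[str]] = {
    "neural network": ["NN", "neural net"],
    "machine learning": ["ML"],
    "artificial intelligence": ["AI"],
}

def vertical_path(node: str) -> List[str]:
    """Walk the vertical (ancestor) path from a node up to the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def build_triplets(node: str) -> List[Tuple[str, str, str]]:
    """Coarse-grained (query, parent, grandparent) triplets taken from the
    vertical path, used as self-supervised positive samples."""
    path = vertical_path(node)
    return [(path[i], path[i + 1], path[i + 2]) for i in range(len(path) - 2)]

def make_prompt(query: str, candidate: str) -> str:
    """Cloze-style prompt; the synonym/abbreviation hint is a plain-text
    stand-in for the paper's prior-knowledge attention."""
    hints = SYNONYMS.get(candidate, [])
    hint = f" (also known as {', '.join(hints)})" if hints else ""
    return f"Question: is {candidate}{hint} the hypernym of {query}? Answer: [MASK]."

if __name__ == "__main__":
    print(build_triplets("convolutional neural network"))
    print(make_prompt("convolutional neural network", "neural network"))
```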

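The soft beam search of the matching stage can likewise be illustrated with a stub scorer in place of the fine-tuned language model. The sketch below walks the taxonomy top-down along vertical paths, smooths each candidate's score with the query's would-be siblings on the local structure, and keeps only the beam_width best nodes per level; CHILDREN, plm_confidence, and the 0.1 smoothing weight are illustrative assumptions, not the paper's exact rules.

```python
# A minimal sketch of soft beam search over a toy taxonomy; assumptions only.
import heapq
from typing import Dict, List, Tuple

# Toy taxonomy as parent -> children; a stub scorer replaces the PLM.
CHILDREN: Dict[str, List[str]] = {
    "artificial intelligence": ["machine learning", "knowledge representation"],
    "machine learning": ["neural network", "decision tree"],
    "neural network": [],
    "decision tree": [],
    "knowledge representation": [],
}

def plm_confidence(query: str, candidate: str, siblings: List[str]) -> float:
    """Stub for the prompt-guided PLM score: toy lexical overlap, smoothed
    with the query's would-be siblings on the local graph structure."""
    overlap = lambda a, b: len(set(a.split()) & set(b.split()))
    return overlap(query, candidate) + 0.1 * sum(overlap(query, s) for s in siblings)

def soft_beam_search(query: str, root: str, beam_width: int = 2) -> List[Tuple[float, str]]:
    """Walk vertical paths top-down, keeping only the beam_width best
    candidate parents per level instead of scoring the whole taxonomy."""
    beam = [(plm_confidence(query, root, []), root)]
    results = list(beam)
    while beam:
        level: List[Tuple[float, str]] = []
        for _, node in beam:
            for child in CHILDREN[node]:
                siblings = [c for c in CHILDREN[node] if c != child]
                level.append((plm_confidence(query, child, siblings), child))
        beam = heapq.nlargest(beam_width, level)  # soft per-level pruning
        results.extend(beam)
    return heapq.nlargest(beam_width, results)

if __name__ == "__main__":
    # Attaching a new term: "neural network" should rank as the best parent.
    print(soft_beam_search("graph neural network", "artificial intelligence"))
```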
Key words: taxonomy completion, prior knowledge, prompt learning, self-supervised, pre-trained language model
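For the few-shot mode, the abstract mentions combining different templates with in-context demonstrations; a minimal sketch of that prompt assembly is given below. The two templates and the demonstration pairs are invented for illustration, and the paper's actual template set and example selection are not specified here.

```python
# A minimal sketch of few-shot prompt assembly; templates are assumptions.
from typing import List, Tuple

# Invented templates and labelled demonstration pairs, for illustration only.
TEMPLATES = [
    "{hyper} is a hypernym of {query}.",
    "{query} is a kind of {hyper}.",
]

DEMONSTRATIONS: List[Tuple[str, str]] = [
    ("convolutional neural network", "neural network"),
    ("decision tree", "machine learning"),
]

def few_shot_prompt(query: str, candidate: str, template_id: int = 0) -> str:
    """Prepend labelled demonstrations so the PLM can be adapted from only a
    handful of examples; varying template_id yields the template combinations
    mentioned in the abstract."""
    template = TEMPLATES[template_id]
    demos = " ".join(template.format(query=q, hyper=h) for q, h in DEMONSTRATIONS)
    return f"{demos} {template.format(query=query, hyper=candidate)}"

if __name__ == "__main__":
    print(few_shot_prompt("graph neural network", "neural network", template_id=1))
```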