Few-Shot Joint Recognition Method of Intent and Slot Based on Cloze

doi:10.19678/j.issn.1000-3428.0069714

Abstract

As the core module of a task-oriented dialogue system, Natural Language Understanding (NLU) converts natural-language user input into a structured representation and is generally decomposed into two subtasks: intent recognition and slot filling. Because the two subtasks are closely related, explicit joint modeling of intents and slots has become the standard solution. In resource-scarce few-shot scenarios, however, the relationship between intents and slots is difficult to extract from a small number of support set samples, and, owing to domain gaps, the general knowledge learned from resource-rich source domains cannot be transferred directly to target domains. Inspired by the English cloze test, this paper takes the average vector of the non-slot words in a sentence (those labeled "O") as the sentence pattern representation and proposes a Sentence Pattern Adaptive Prototype Network (SPAPN). In resource-rich source domains, the model learns cross-domain semantic knowledge of sentence patterns and uses the sentence pattern as a hub to model the relationship between intents and slots indirectly. In low-resource target domains, it adopts a meta-learning training mode and uses an attention mechanism to learn the correlations among the intent, slot, and sentence pattern prototypes, which yields enhanced semantic representations of the intent and slot prototypes. Combined with Contrastive Alignment Learning (CAL), the model then assigns intent and slot labels to query samples according to the vector similarity between each query sample and these prototypes. Experiments on Chinese and English benchmark datasets show that, with or without fine-tuning, the proposed method outperforms the best existing baselines in intent accuracy, slot filling F1 score, and joint accuracy.

Key words: task-oriented dialogue system, intent recognition, slot filling, few-shot learning, attention mechanism
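
The ideas summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 2-dimensional embeddings, the function names, the use of cosine similarity as the "vector similarity", and the parameter-free residual attention are all assumptions made for illustration. The sketch computes a sentence pattern vector as the mean of the "O"-labeled token vectors, builds class prototypes as means of support vectors, and labels a query by its most similar prototype.

```python
from math import exp, sqrt


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sentence_pattern(token_vectors, slot_labels):
    """Average the vectors of non-slot tokens (slot label "O")."""
    o_vectors = [v for v, lab in zip(token_vectors, slot_labels) if lab == "O"]
    # Assumption: fall back to all tokens when a sentence has no "O" token.
    return mean_vector(o_vectors or token_vectors)


def cosine(a, b):
    """Cosine similarity (an assumed choice of vector similarity)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_prototypes(vectors, labels):
    """Prototype of each class = mean of the support vectors with that label."""
    grouped = {}
    for vector, label in zip(vectors, labels):
        grouped.setdefault(label, []).append(vector)
    return {label: mean_vector(group) for label, group in grouped.items()}


def nearest_prototype(query, prototypes):
    """Return the label whose prototype is most similar to the query vector."""
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


def attend(prototype, context_prototypes):
    """Hypothetical enhancement step: residual dot-product attention of one
    prototype over context prototypes, with no learned projections."""
    scores = [sum(p * c for p, c in zip(prototype, ctx)) for ctx in context_prototypes]
    top = max(scores)
    weights = [exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(weights)
    mixed = [
        sum(w / total * ctx[i] for w, ctx in zip(weights, context_prototypes))
        for i in range(len(prototype))
    ]
    return [p + m for p, m in zip(prototype, mixed)]


# Toy support set with made-up embeddings.
play_pattern = sentence_pattern(
    [[1.0, 0.0], [0.5, 0.5], [0.8, 0.2]],  # "play", "jazz", "music"
    ["O", "B-genre", "O"],
)
book_pattern = sentence_pattern(
    [[0.0, 1.0], [0.2, 0.8], [0.5, 0.5]],  # "book", "a", "table"
    ["O", "O", "B-object"],
)
prototypes = build_prototypes(
    [play_pattern, book_pattern], ["PlayMusic", "BookRestaurant"]
)

# Prototype enhanced by attending over both sentence pattern vectors.
enhanced = attend(prototypes["PlayMusic"], [play_pattern, book_pattern])

predicted = nearest_prototype([0.95, 0.05], prototypes)
print(predicted)  # PlayMusic
```

In this toy setup the query vector lies close to the "play ... music" pattern, so the nearest prototype is `PlayMusic`. A real system would obtain token vectors from a pretrained encoder and learn the attention parameters during meta-training.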

BI Ran, YANG Fengyi, ZHOU Xi, YANG Yating, Abibulla Atawulla. Few-Shot Joint Recognition Method of Intent and Slot Based on Cloze[J]. Computer Engineering, 2025, 51(10): 79-86.

毕然, 杨奉毅, 周喜, 杨雅婷, 艾比布拉·阿塔伍拉. 基于完形填空的小样本意图槽位联合识别方法[J]. 计算机工程, 2025, 51(10): 79-86.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0069714

https://www.ecice06.com/EN/Y2025/V51/I10/79
