Unknown Intent Detection for Task-Oriented Dialogs Based on Reconstruction Error

doi:10.19678/j.issn.1000-3428.0063847

Abstract

Existing unknown intent detection models map utterances to a vector space and use the Local Outlier Factor (LOF) algorithm to label low-density feature points as unknown intents. However, the known-intent feature clusters learned with cross-entropy loss are long and narrow, and the spacing, density, and dispersion within the clusters are uneven, which makes detection harder. To address this problem, this study proposes an unknown intent detection model based on autoencoder reconstruction error. During training, the model uses a joint loss function that incorporates label knowledge to train a known-intent classifier, which drives the known-intent features toward small intraclass distances and large interclass distances. These features are then used to train an autoencoder that captures only known-intent information. During testing, the autoencoder flags samples with large reconstruction errors as unknown intents, and the remaining samples are classified normally as known intents. Experiments on the SNIPS dataset show that, when the proportion of known intents is 25%, 50%, and 75%, the Macro F1 score of the proposed model is 16.93%, 1.14%, and 2.37% higher, respectively, than that of the best-performing baseline, the Semantic-Enhanced large-margin Gaussian mixture loss (SEG) model. The proposed model also detects more unknown-intent samples and performs well on the ATIS dataset, in which the intent distribution is highly unbalanced.

Key words: intent identification, task-oriented dialog, unknown intent detection, loss function, autoencoder, reconstruction error
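
The test-stage idea in the abstract (fit an autoencoder on known-intent features only, then treat samples with large reconstruction error as unknown) can be illustrated with a minimal sketch. The abstract does not specify the autoencoder architecture, the feature extractor, or how the error threshold is chosen, so everything below is an assumption: a linear autoencoder fitted by SVD stands in for the paper's autoencoder, the features are synthetic, and the threshold is set at the 99th percentile of training errors. All function names are hypothetical.

```python
import numpy as np


def fit_linear_autoencoder(features, n_components):
    """Fit a linear autoencoder (equivalent to PCA) on known-intent features.

    Assumption: this stands in for the neural autoencoder in the paper.
    """
    mean = features.mean(axis=0)
    # Rows of vt are the principal directions of the centered features.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(features, mean, components):
    """Squared reconstruction error of each sample."""
    centered = features - mean
    codes = centered @ components.T       # encode into the bottleneck
    reconstructed = codes @ components    # decode back to feature space
    return np.sum((centered - reconstructed) ** 2, axis=1)


def detect_unknown(features, mean, components, threshold):
    """Flag samples whose reconstruction error exceeds the threshold."""
    return reconstruction_error(features, mean, components) > threshold


rng = np.random.default_rng(0)

# Synthetic stand-in for classifier features: known intents lie close to a
# 2-D subspace of an 8-D feature space; unknown intents do not.
basis = rng.normal(size=(2, 8))
known_train = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 8))
known_test = rng.normal(size=(100, 2)) @ basis + 0.01 * rng.normal(size=(100, 8))
unknown_test = rng.normal(size=(100, 8))

mean, components = fit_linear_autoencoder(known_train, n_components=2)

# Assumed thresholding rule: 99th percentile of errors on known training data.
train_errors = reconstruction_error(known_train, mean, components)
threshold = np.percentile(train_errors, 99)

known_flags = detect_unknown(known_test, mean, components, threshold)
unknown_flags = detect_unknown(unknown_test, mean, components, threshold)

print("known samples flagged as unknown:  ", known_flags.mean())
print("unknown samples flagged as unknown:", unknown_flags.mean())
```

In this sketch, samples that are not flagged would be passed on to the known-intent classifier. The percentile-based threshold is only one plausible choice; a validation set could be used to tune it instead.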

CLC Number: TP391.1

BI Ran, WANG Yi, ZHOU Xi. Unknown Intent Detection for Task-Oriented Dialogs Based on Reconstruction Error[J]. Computer Engineering, 2023, 49(2): 54-60.

URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0063847

http://www.ecice06.com/EN/Y2023/V49/I2/54

Figures/Tables 12
