Research on App User Intent Recognition Based on Fusion Model and Semantic Network

doi:10.19678/j.issn.1000-3428.0068206

Abstract:

With the popularity of mobile applications (Apps), a large number of unstructured Chinese user reviews have appeared in application markets. Identifying App user intent from these reviews helps developers maintain and improve their Apps in a targeted way. To recognize user intent accurately, this study proposes FSAUIR, an App user intent recognition method based on a fusion model and a semantic network. First, FSAUIR uses the Baidu tool Senta to determine the sentiment polarity of each review. It then builds a fusion intent classification model based on the Robustly optimized BERT pretraining approach (RoBERTa), named RoBERTa-BiGRU-Multi-head Self-Attention+SoftPool (RBMS). RBMS uses RoBERTa to transform user reviews into semantic feature representations and feeds them into a Bidirectional Gated Recurrent Unit (BiGRU) to extract the global contextual semantic information of the reviews. A multi-head self-attention mechanism and SoftPool then select the key feature information while retaining the main features, and a Softmax layer normalizes the result to produce the intent classification. On top of the classification results, FSAUIR employs the PositionRank model to extract keywords from the reviews in each intent category, calculates the co-occurrence relationships between keywords, and constructs a keyword semantic network to recognize user intent at a finer granularity. Experimental results show that, compared with BERT, RoBERTa, RoBERTa-CNN, and other models, RBMS achieves better classification performance on a manually labeled dataset, with accuracy, precision, recall, and F1 of 87.75%, 88.09%, 87.80%, and 87.88%, respectively. In addition, the semantic network that FSAUIR constructs from the intent classification results efficiently mines valuable information from user reviews.
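As a rough illustration of the pooling and normalization steps mentioned in the abstract, the sketch below applies SoftPool (a softmax-weighted sum of activations) along the sequence axis and then a Softmax over class scores. It is plain Python on made-up numbers; the function names, the toy feature matrix, the choice of pooling axis, and the classifier weights `W` are assumptions for illustration and are not taken from the paper's RBMS implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softpool(activations):
    """SoftPool over one pooling region: softmax-weighted sum of the activations."""
    weights = softmax(activations)
    return sum(w * a for w, a in zip(weights, activations))


def softpool_over_sequence(features):
    """Pool a (seq_len x hidden) matrix into one hidden-sized vector, column by column."""
    hidden = len(features[0])
    return [softpool([row[d] for row in features]) for d in range(hidden)]


# Toy stand-in for the attention output: 3 tokens, 2 feature dimensions (made-up values).
features = [[0.0, 1.0], [0.0, 3.0], [0.0, 2.0]]
pooled = softpool_over_sequence(features)

# Hypothetical linear classifier for 2 intent classes, followed by Softmax.
W = [[1.0, 0.5], [-1.0, 0.2]]
logits = [sum(w * p for w, p in zip(row, pooled)) for row in W]
probs = softmax(logits)
print(pooled, probs)
```

Because the weights grow with the activation value, each pooled value lies between the mean and the maximum of its column, which is how SoftPool keeps the dominant features while still using the weaker ones.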

Keywords: intent recognition, intent classification, RoBERTa model, Bidirectional Gated Recurrent Unit (BiGRU), PositionRank model, multi-head self-attention mechanism
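The keyword co-occurrence step can be sketched as follows. This is a minimal example that assumes reviews are already tokenized and that two keywords co-occur when they appear in the same review; the abstract does not state the co-occurrence window, and the function name, the `min_count` threshold, and the toy reviews are hypothetical. The keyword list stands in for the output of a keyword extractor such as PositionRank.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(reviews, keywords, min_count=1):
    """Count how often each keyword pair appears in the same review.

    reviews: list of token lists (already segmented).
    keywords: iterable of keywords, e.g. from a keyword extractor.
    Returns {(kw_a, kw_b): count} with kw_a < kw_b, keeping counts >= min_count.
    """
    keyword_set = set(keywords)
    edges = Counter()
    for tokens in reviews:
        # Keywords present in this review, sorted so each pair has one canonical order.
        present = sorted(keyword_set.intersection(tokens))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return {pair: n for pair, n in edges.items() if n >= min_count}


# Toy reviews for one intent category (made-up data).
reviews = [
    ["login", "crash", "after", "update"],
    ["update", "crash", "again"],
    ["login", "slow"],
]
keywords = ["login", "crash", "update", "slow"]

network = build_cooccurrence_network(reviews, keywords)
strong_edges = build_cooccurrence_network(reviews, keywords, min_count=2)
print(network)
print(strong_edges)
```

Each key of the returned dictionary is an edge of the semantic network and its value is the edge weight; raising `min_count` prunes weak links.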

Han CHEN, Chunlei ZHAO, Haoda JIANG, Chundong WANG. Research on App User Intent Recognition Based on Fusion Model and Semantic Network[J]. Computer Engineering, 2024, 50(8): 50-63.

https://www.ecice06.com/CN/Y2024/V50/I8/50

Figures/Tables (19)

Fig.1 Overall framework of FSAUIR

Fig.2 Usage example of Senta sentiment analysis

Fig.3 RBMS fusion intent classification model

Fig.4 Logical structure of the input layer

Fig.5 RoBERTa model structure

Fig.6 Basic unit of the GRU neural network

Fig.7 BiGRU structure

Fig.8 PositionRank keyword extraction process

Fig.9 Distribution of positive and negative reviews in each App review category

Fig.10 Negative semantic networks of Apps by category


