Research on Imbalance Fraud Detection Based on Graph Neural Network

doi:10.19678/j.issn.1000-3428.0066262

Computer Engineering, 2023, Vol. 49, Issue (11): 150-159. doi: 10.19678/j.issn.1000-3428.0066262


Anqi CHEN¹, Rui CHEN¹,*, Zhufang KUANG¹, Huajun HUANG²

1. School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China
2. School of Information Technology and Management, Hunan University of Finance and Economics, Changsha 410205, China

Received: 2022-11-15 Published: 2023-11-15 Online: 2023-11-08
Contact: Rui CHEN
About the authors:
Anqi CHEN (born 1998), female, master's student; main research interest: fraud detection
Zhufang KUANG, professor, Ph.D.
Huajun HUANG, professor, Ph.D.
Funding:
National Key Research and Development Program of China (2019YFE0122600); National Natural Science Foundation of China (62072477); National Natural Science Foundation of China (61309027); Key Research and Development Program of Hunan Province (63223008)

Abstract:

Graph Neural Networks (GNNs) are now widely used in fraud detection. However, fraud detection data are typically class-imbalanced, which degrades the performance of GNN-based models. To address this problem, an imbalanced fraud detection model based on GNN is proposed. The model distinguishes two notions of imbalance in graph-structured data: neighborhood imbalance and center imbalance. For neighborhood imbalance, the distance (similarity) in non-Euclidean space between a central node and its neighborhood nodes is first measured using a Multilayer Perceptron (MLP) and a Gaussian kernel function. The sampling threshold is then updated dynamically via Markov decision to perform multi-layer adaptive undersampling of the neighborhood nodes. Finally, in each layer, only the central node's original features and the hidden embedding from the previous layer are aggregated to obtain its target embedding. For center imbalance, a weighted cross-entropy loss function assigns a dynamic weight to the loss of each central node to achieve center balance. Experimental results on the Yelp and Amazon datasets show that the model significantly outperforms the best baseline model in terms of Area Under the Curve (AUC) and Recall. AUC and Recall on the two datasets increased by 5.52% and 5.42%, and by 1.57% and 4.31%, respectively.
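The neighborhood step described in the abstract can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the MLP is replaced by pre-computed embedding vectors, the kernel uses the standard form exp(-||u - v||² / (2σ²)), which may differ from the one in the paper, and `sigma` and the fixed `threshold` are hypothetical parameters. The Markov-decision update of the threshold is omitted because the abstract does not specify it.

```python
import math


def gaussian_similarity(u, v, sigma=1.0):
    """Standard Gaussian kernel on two embedding vectors; value lies in (0, 1]."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def undersample_neighbors(center, neighbors, threshold, sigma=1.0):
    """Return ids of neighbors whose similarity to the center is >= threshold.

    `center` is an embedding vector; `neighbors` maps node id -> embedding.
    Assumption: embeddings are given and the threshold is fixed, whereas the
    paper learns embeddings with an MLP and updates the threshold dynamically.
    """
    return [
        node_id
        for node_id, emb in neighbors.items()
        if gaussian_similarity(center, emb, sigma) >= threshold
    ]


# Toy example: two neighbors lie close to the center, one lies far away.
center = [0.0, 0.0]
neighbors = {"a": [0.1, 0.0], "b": [3.0, 4.0], "c": [0.0, 0.5]}
kept = undersample_neighbors(center, neighbors, threshold=0.5)
print(kept)
```

In this toy setting, raising `threshold` discards more dissimilar neighbors, which is the lever a dynamic threshold update would adjust from layer to layer.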

Key words: Graph Neural Network(GNN), fraud detection, class imbalance, Markov decision, weighted cross-entropy loss function
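For the center-imbalance part, a weighted cross-entropy loss might look like the sketch below. The abstract does not say how the dynamic weights are computed, so the inverse-class-frequency scheme in `inverse_frequency_weights` is an assumption chosen only for illustration, and both function names are hypothetical.

```python
import math


def inverse_frequency_weights(labels):
    """Assumed weighting: n_samples / (n_classes * class_count) per class."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return {y: n / (k * c) for y, c in counts.items()}


def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Weighted mean of -log p(true class) over the central nodes.

    `probs[i]` is the predicted fraud probability of node i (label 1 = fraud).
    """
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        p_true = p if y == 1 else 1.0 - p
        w = weights[y]
        total += -w * math.log(max(p_true, eps))
        weight_sum += w
    return total / weight_sum


# Toy batch: three benign nodes and one fraud node that the model under-scores.
labels = [0, 0, 0, 1]
probs = [0.1, 0.2, 0.1, 0.3]
weights = inverse_frequency_weights(labels)
print(weights)
print(weighted_cross_entropy(probs, labels, weights))
```

Because the minority (fraud) class receives the larger weight, a misclassified fraud node contributes more to the loss than it would under unweighted cross-entropy.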


Anqi CHEN, Rui CHEN, Zhufang KUANG, Huajun HUANG. Research on Imbalance Fraud Detection Based on Graph Neural Network[J]. Computer Engineering, 2023, 49(11): 150-159.

http://www.ecice06.com/CN/Y2023/V49/I11/150

Figures/Tables: 8

Fig.1 NCI-GNN model framework

Fig.2 Effectiveness analysis of different sampling methods

Fig.3 Effectiveness analysis of the number of layers

Fig.4 Effectiveness analysis of different aggregation methods

Fig.5 Influence of different dimensions on model performance


