Exploring the Impact of Skip Connection Structures on the Deep Neural Networks Feature Extraction Based on Feature Visualization

doi:10.19678/j.issn.1000-3428.0068885

Abstract

Deep neural networks without skip connections become difficult to train beyond a certain depth, so most existing deep neural network models adopt skip connection structures to ease optimization and improve generalization. However, how skip connections affect feature extraction has received little study, and in most cases these models are still treated as black boxes. To analyze this effect from the perspective of feature visualization, this study takes perturbation-based methods as its starting point and proposes Grid-Shuffled Blurring (GSB), a perturbation method that weakens the fine details of an image while keeping its overall color distribution and contours essentially unchanged. Combining the Activation Maximization (AM) method from feature visualization with the proposed GSB perturbation, the study analyzes three classic image classification networks with different degrees of skip connections: VGG 19, ResNet 50, and DenseNet 201. The experimental results show that networks without skip connections extract only the stronger features in an image, and therefore relatively few features, whereas networks with skip connections extract more features, although these features are relatively weak. Skip connections also make a model attend more to the local color distribution and the global contours of an image rather than rely heavily on fine details, and the denser the skip connections, the stronger this tendency.

Keywords: deep neural network, skip connection structure, feature visualization, Activation Maximization (AM), perturbation method, interpretability
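The abstract does not spell out how GSB is computed, so the following is only a minimal sketch of one plausible reading of the name: split the image into non-overlapping ks x ks cells and randomly permute the pixels inside each cell. The function name `grid_shuffle_blur`, its parameters, and the cell-wise shuffling rule are assumptions made for illustration, not the authors' published procedure. Under this reading, every cell keeps exactly the same set of pixel colors (so the local color distribution is preserved), coarse contours survive at the scale of the grid, and detail finer than one cell is destroyed.

```python
import numpy as np

def grid_shuffle_blur(image, ks, rng=None):
    """Hypothetical GSB-style perturbation: shuffle pixels within each ks x ks cell.

    image: array of shape (H, W) or (H, W, C); ks: cell size in pixels.
    Returns a perturbed copy; the input array is left untouched.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()
    h, w = image.shape[:2]
    for top in range(0, h, ks):
        for left in range(0, w, ks):
            cell = out[top:top + ks, left:left + ks]
            ch, cw = cell.shape[:2]          # border cells may be smaller than ks
            flat = cell.reshape(ch * cw, -1)  # one row per pixel, channels kept together
            order = rng.permutation(ch * cw)
            out[top:top + ks, left:left + ks] = flat[order].reshape(cell.shape)
    return out
```

With `ks = 1` each cell holds a single pixel and the image is returned unchanged; larger `ks` values remove progressively coarser detail, which matches the role the perturbation parameter ks plays in the figure captions.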

GUO Peilin, ZHANG De, WANG Huaixiu. Exploring the Impact of Skip Connection Structures on the Deep Neural Networks Feature Extraction Based on Feature Visualization[J]. Computer Engineering, 2025, 51(4): 149-157.

https://www.ecice06.com/CN/Y2025/V51/I4/149

Figures/Tables: 14

Fig. 1 GSB perturbation method

Fig. 2 An original image from the echidna class and new images generated with different values of the perturbation parameter ks

Fig. 3 Visualization results based on the AM method
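As background for the AM visualizations, activation maximization searches for an input that maximizes the activation of a chosen unit, usually by gradient ascent on the input with some regularization. The sketch below shows the idea on a toy unit a(x) = tanh(w · x) with an L2 penalty on x. The toy unit, the function name `activation_maximization`, and all hyperparameters are assumptions chosen for illustration; the networks, layers, and regularizers used in the paper are not specified in the abstract.

```python
import numpy as np

def activation_maximization(w, steps=200, lr=0.1, weight_decay=0.5, seed=0):
    """Gradient ascent on the input x of a toy unit a(x) = tanh(w @ x).

    Maximizes tanh(w @ x) - (weight_decay / 2) * ||x||^2, starting from small noise.
    Returns the optimized input and the unit's final activation.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(scale=0.01, size=w.shape)
    for _ in range(steps):
        act = np.tanh(w @ x)
        # d/dx tanh(w @ x) = (1 - tanh^2) * w; the L2 penalty contributes -weight_decay * x.
        grad = (1.0 - act ** 2) * w - weight_decay * x
        x += lr * grad
    return x, float(np.tanh(w @ x))
```

For this toy unit the optimized input ends up aligned with `w`, which is the pattern the unit responds to most strongly. In a real network the same loop would use backpropagated gradients with respect to the input image.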

Fig. 4 AD of each image category for different network models at different values of the perturbation parameter ks

Fig. 5 Accuracy degradation acc_dec of different image categories for different models when the GSB perturbation parameter ks is 5
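The caption refers to a per-category accuracy degradation value, acc_dec. A natural reading, assumed here because the page does not give the formula, is the accuracy on the original images of a class minus the accuracy on the perturbed images of that class. The helper name `acc_dec_per_class` and its argument layout are hypothetical.

```python
import numpy as np

def acc_dec_per_class(labels, pred_orig, pred_pert):
    """Per-class accuracy drop, assumed to be acc(original) - acc(perturbed).

    labels:    true class index of each image
    pred_orig: predicted class on the original images
    pred_pert: predicted class on the perturbed images
    """
    labels = np.asarray(labels)
    pred_orig = np.asarray(pred_orig)
    pred_pert = np.asarray(pred_pert)
    result = {}
    for c in np.unique(labels):
        mask = labels == c
        acc_orig = np.mean(pred_orig[mask] == c)
        acc_pert = np.mean(pred_pert[mask] == c)
        result[int(c)] = float(acc_orig - acc_pert)
    return result
```

Under this definition, a larger value means the model relied more on the detail that the perturbation removed for that class.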

Fig. 6 An original image from the brambling class and the new image obtained when the perturbation parameter ks is 5

Fig. 7 Score-CAM heatmaps of the original and perturbed brambling images obtained with the VGG 19, ResNet 50, and DenseNet 201 models
