A Novel Activation Function Based on a Base-Gate Mechanism: Adaptive Parametric Softplus-Sigmoid

doi:10.19678/j.issn.1000-3428.0253416

Abstract

In recent years, deep learning has achieved growing success in research fields such as computer vision, and activation functions play an important role in strengthening the nonlinear fitting capability of deep neural networks. As research has progressed, however, existing activation functions such as ReLU and SiLU have exposed a growing number of problems, including vanishing gradients, dead neurons, and a lack of adaptive regulation in the negative region. Targeting the problem of which salient features to retain or discard in common object detection and recognition tasks, this paper proposes a new activation function, Adaptive Parametric Softplus-Sigmoid (APSS), which aims to accurately extract and learn fused multi-scale target features from complex backgrounds. The function is based on the base-gate combination mechanism from biological neuroscience. The base term ensures the learnability of basic features and gradient stability, while the gate term suppresses invalid features by dynamically adjusting the response intensity in the negative region; together, the two terms balance the network's ability to retain and suppress features. To verify the advantages of this activation function, comparative experiments were conducted against several typical object detection and recognition network prototypes on three datasets: SoccerNet, UA-DETRAC, and BEEF24. The results show that the proposed APSS activation function significantly outperforms the activation functions used in the original network models and offers better target feature extraction and fitting capability.
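The abstract describes the base-gate idea only in words and does not state the APSS formula. As a rough illustration of what a base-gate combination can look like in general, the sketch below multiplies a Softplus base by a Sigmoid gate. The functional form, the name `base_gate_activation`, and the parameters `alpha` (gate slope) and `beta` (base sharpness) are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus, log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x: float) -> float:
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def base_gate_activation(x: float, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Hypothetical base-gate activation (NOT the paper's APSS definition).

    ASSUMPTION: the base is a Softplus with sharpness `beta`, and the gate is
    a Sigmoid with slope `alpha`. In a real network both would be learnable
    parameters; here they are plain floats.
    """
    base = softplus(beta * x) / beta   # smooth, strictly positive base response
    gate = sigmoid(alpha * x)          # closes toward 0 for negative inputs
    return base * gate


if __name__ == "__main__":
    # Compare a gentle gate (alpha=1) with a sharper gate (alpha=4).
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(x, base_gate_activation(x, alpha=1.0), base_gate_activation(x, alpha=4.0))
```

Under these assumptions, a larger `alpha` pushes negative-region outputs closer to zero while large positive inputs pass through almost unchanged, which mirrors the retain-versus-suppress balance the abstract attributes to the gate term.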

Lu Xiaochen, Wang Shenglan, Zhong Yan, Zhang Jingjing, Zhang Lei. A Novel Activation Function Regarding Substrate-Gate Mechanism--Adaptive Parameterized Softplus-Sigmoid Function[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0253416.

