Defect Detection Network Combining Large-Kernel Attention and Information-Preserving Sampling

doi:10.19678/j.issn.1000-3428.0260187

Abstract

To address the low contrast and large scale variation of steel surface defects in industrial settings, together with the limited computing power of edge devices, this paper proposes ISA-DETR, an efficient detection network with full-path collaborative enhancement. First, an Information-Preserving Downsampling (IPD) structure is constructed. It replaces conventional strided downsampling with a pixel reorganization based on spatial-channel rearrangement, preserving fine-grained spatial information while reducing feature-map resolution and thereby alleviating the loss of information about tiny defects during feature extraction. Second, an SLK-HG (Large Separable Kernel Attention-Hybrid Group Block) module that integrates a large-kernel separable attention mechanism is designed. Through the joint optimization of grouped and separable convolutions, it builds a very large receptive field at near-linear computational complexity, strengthening the network's ability to model long-range spatial dependencies and irregular defect shapes. Finally, an Adaptive Dynamic Sampling (ADS) operator is introduced, which aligns cross-scale features precisely through content-driven offset prediction, reducing localization deviation against complex backgrounds and improving detection robustness. Experiments on the NEU-DET steel surface defect dataset show that ISA-DETR reaches 75.2% mAP@0.5 with only 20.67M parameters and 77.5 GFLOPs. Compared with the baseline model, the number of parameters and the computational cost are reduced by 35.4% and 25.1%, respectively, while detection accuracy improves by 3.2%. In addition, transfer experiments on a PCB defect dataset further verify the method's generalization capability. Overall, the proposed algorithm strikes an effective balance between detection performance and deployment efficiency, providing an efficient and reliable solution for intelligent quality inspection on industrial edge devices.
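The abstract describes IPD only as a spatial-channel pixel rearrangement that replaces strided downsampling; the actual layer is not specified here. As a rough illustration of that general idea, the following NumPy sketch shows a generic space-to-depth rearrangement. The function names `space_to_depth` and `depth_to_space` and the scale factor of 2 are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange (C, H, W) into (C*scale*scale, H/scale, W/scale), keeping every value."""
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("height and width must be divisible by scale")
    # Split each spatial axis into (coarse index, offset inside a scale x scale block).
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    # Bring the two in-block offsets to the front, then fold them into the channel axis.
    x = x.transpose(2, 4, 0, 1, 3)
    return x.reshape(scale * scale * c, h // scale, w // scale)


def depth_to_space(y: np.ndarray, scale: int = 2) -> np.ndarray:
    """Inverse of space_to_depth; used here only to show that nothing was discarded."""
    cs, h, w = y.shape
    c = cs // (scale * scale)
    y = y.reshape(scale, scale, c, h, w)
    # Put each in-block offset back next to its coarse spatial index.
    y = y.transpose(2, 3, 0, 4, 1)
    return y.reshape(c, h * scale, w * scale)


# Toy feature map: 2 channels, 4 x 6 pixels (illustrative values only).
x = np.arange(2 * 4 * 6).reshape(2, 4, 6)
y = space_to_depth(x)

print(y.shape)                                 # halved resolution, 4x the channels
print(np.array_equal(depth_to_space(y), x))    # the rearrangement is invertible
print(np.array_equal(y[:2], x[:, ::2, ::2]))   # stride-2 sampling keeps only this slice
```

A stride-2 subsampling keeps one of every four pixels, which corresponds to just the first group of channels produced above; the rearrangement keeps all four groups. In a detection backbone, a convolution would presumably follow to mix and reduce the enlarged channel dimension, but that step is an assumption and is not shown.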

SHEN Xueli (沈学利), QIN Qingjie (秦庆杰). Defect Detection Network Combining Large-Kernel Attention and Information-Preserving Sampling[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0260187.

