Target Detection Algorithm for Remote Sensing Images with Multi-Scale Information Enhancement

doi:10.19678/j.issn.1000-3428.0070252

Abstract

In remote sensing images with complex backgrounds, densely packed small targets and large variations in target scale make feature extraction difficult and reduce detection accuracy. To address these problems, this paper proposes a multi-scale information-enhanced target detection algorithm based on YOLOv5s: Deep Learning YOLO (DL-YOLO). First, at the top of the backbone network, the improved algorithm uses a dilated-convolution fast spatial pyramid pooling module designed on the basis of Spatial Pyramid Pooling-Fast (SPPF). Its Receptive Field Enhancement Block (RFEB) fuses the detailed information of multi-scale targets with semantic information, which improves the feature extraction capability of the network. Second, the original detection head is replaced with a Lightweight and Efficient Detection Head (LEDH), which is based on the Decoupled Head (DH) of YOLOv6s. The LEDH contains a lightweight dilated Global Depth Convolution (GDConv) module that strengthens learning of the information shared between the classification and regression tasks, and it uses lightweight convolutions to keep the head small. Together, these changes improve detection accuracy for targets at every scale while reducing the number of parameters in the decoupled head. Experiments on the DIOR dataset show that, compared with YOLOv5s, the proposed DL-YOLO algorithm increases precision, recall, mAP@0.5, and mAP by 1.6, 2.1, 2.1, and 4.7 percentage points, respectively. Its overall performance surpasses that of several strong existing target detection algorithms, which makes it practical for multi-scale target detection in remote sensing images.

Key words: remote sensing images, complex background, YOLOv5s algorithm, multi-scale target detection, Decoupled Head (DH)
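
The abstract relies on two receptive-field ideas: SPPF-style pooling and dilated convolution. The Python sketch below illustrates only the standard arithmetic behind them and is not the authors' RFEB or GDConv implementation. It assumes stride-1 layers, and the dilation rates 1, 2, and 3 are hypothetical values chosen for illustration; the abstract does not state the rates used in DL-YOLO.

```python
def effective_kernel(kernel_size: int, dilation: int = 1) -> int:
    """Span, in input pixels, covered by one kernel with the given dilation."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers) -> int:
    """Receptive field of stride-1 layers applied one after another.

    `layers` is an iterable of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective kernel - 1).
        rf += effective_kernel(kernel_size, dilation) - 1
    return rf


# Chaining one, two, and three 5x5 stride-1 max pools, as SPPF does,
# gives the same window sizes as parallel 5x5, 9x9, and 13x13 pools.
sppf_fields = [stacked_receptive_field([(5, 1)] * n) for n in (1, 2, 3)]

# A 3x3 kernel at hypothetical dilation rates 1, 2, and 3 (assumed values).
dilated_fields = {d: effective_kernel(3, d) for d in (1, 2, 3)}

print(sppf_fields)     # [5, 9, 13]
print(dilated_fields)  # {1: 3, 2: 5, 3: 7}
```

Dilation enlarges the area a 3x3 kernel covers without adding weights, which is the general reason dilated convolutions are attractive for widening the receptive field in a lightweight module.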

CLC Number: TP751

YANG Lu, LIU Junjie, YU Xiang. Target Detection Algorithm for Remote Sensing Images with Multi-Scale Information Enhancement[J]. Computer Engineering, 2026, 52(4): 200-213.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0070252

https://www.ecice06.com/EN/Y2026/V52/I4/200
