A Study on Improved Faster R-CNN Model for Multi-Object Detection in Remote Sensing Images

doi:10.19678/j.issn.1000-3428.0068856

Abstract

Complex backgrounds, diverse target types, and large scale variations in remote sensing images lead to missed and false detections. To address these issues, this study proposes an improved Faster R-CNN multi-object detection model. First, the ResNet-50 backbone network is replaced with a Swin Transformer to enhance the model's feature extraction capability. Second, a Balanced Feature Pyramid (BFP) module is introduced to fuse shallow and deep semantic information, further strengthening feature fusion. Finally, a dynamic weighting mechanism is incorporated into the classification and regression branches so that the network focuses more on high-quality candidate boxes during training, improving the precision of target localization and classification. Experimental results on the RSOD dataset show that, compared with the Faster R-CNN model, the proposed model substantially reduces the number of floating-point operations (FLOPs) while improving mAP@0.5:0.95 by 10.7 percentage points and Average Recall (AR) by 10.6 percentage points. Compared with other mainstream detection models, the proposed model achieves higher accuracy while reducing the missed detection rate. These results indicate that the proposed model significantly improves detection accuracy in remote sensing images with complex backgrounds.
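The feature fusion step can be illustrated with a minimal NumPy sketch of the integrate-and-redistribute idea behind a balanced feature pyramid. This is an illustration under stated assumptions, not the authors' implementation: the function names `resize_nearest` and `balanced_fuse`, the nearest-neighbour resizing, and the omission of the refinement step are all choices made here for brevity.

```python
import numpy as np

def resize_nearest(x, size):
    """Nearest-neighbour resize of a (C, H, W) array to (C, size[0], size[1])."""
    _, h, w = x.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return x[:, rows[:, None], cols[None, :]]

def balanced_fuse(levels, ref_index=1):
    """Fuse pyramid levels at one resolution, then add the result back to each level.

    `levels` is a list of (C, H, W) feature maps with the same channel count.
    `ref_index` (an assumption of this sketch) picks the intermediate level
    whose resolution is used for integration.
    """
    ref_size = levels[ref_index].shape[1:]
    # Integrate: bring every level to the reference resolution and average.
    balanced = np.mean([resize_nearest(f, ref_size) for f in levels], axis=0)
    # A refinement step (for example a convolution) would normally go here;
    # it is left out of this sketch.
    # Redistribute: resize the balanced map to each level and add it as a residual.
    return [f + resize_nearest(balanced, f.shape[1:]) for f in levels]

# Three toy pyramid levels with constant values 1, 2, and 3.
levels = [np.ones((2, 8, 8)), 2 * np.ones((2, 4, 4)), 3 * np.ones((2, 2, 2))]
fused = balanced_fuse(levels)
```

Because every level receives the same averaged map, shallow and deep features contribute equally to the fused result, which is the balancing effect the abstract refers to.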

Keywords: remote sensing images, multi-object detection, Faster R-CNN, Swin Transformer module, Balanced Feature Pyramid (BFP), dynamic weighting mechanism
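The figure list mentions SmoothL1 loss curves for different β values. As a hedged illustration of why β matters for a dynamic weighting mechanism, the sketch below uses the standard SmoothL1 definition. Tying β to the paper's mechanism is an assumption here, and the function names are hypothetical.

```python
import numpy as np

def smooth_l1(error, beta=1.0):
    """Standard SmoothL1: quadratic for |error| < beta, linear otherwise."""
    abs_err = np.abs(error)
    return np.where(abs_err < beta, 0.5 * abs_err ** 2 / beta, abs_err - 0.5 * beta)

def smooth_l1_grad(error, beta=1.0):
    """Gradient of SmoothL1 with respect to the regression error."""
    return np.where(np.abs(error) < beta, error / beta, np.sign(error))

# For the same small error, a smaller beta gives a larger gradient.
small_error = np.array(0.1)
grad_wide = smooth_l1_grad(small_error, beta=1.0)
grad_narrow = smooth_l1_grad(small_error, beta=0.2)
```

Shrinking β as training progresses increases the gradient contributed by well-localized boxes, which is one way a regression branch can be made to focus on high-quality candidates.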

MIAO Ru, LI Yi, ZHOU Ke, ZHANG Yanna, CHANG Ranran, MENG Geng. A Study on Improved Faster R-CNN Model for Multi-Object Detection in Remote Sensing Images[J]. Computer Engineering, 2025, 51(8): 292-304.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0068856

https://www.ecice06.com/EN/Y2025/V51/I8/292

Figures/Tables 28

Fig.1 Overall technical procedure

Fig.2 Architecture of FSBD R-CNN network

Fig.3 Swin Transformer module

Fig.4 Multi-head self-attention operation based on shifted window

Fig.5 Balanced feature pyramid module

Fig.6 Structure of feature pyramid combined with balanced feature pyramid

Fig.7 Procedure of dynamic label assignment

Fig.8 Curve of L1 loss and regression error

Fig.9 Curves of SmoothL1 loss and regression error for different β values

Fig.10 Examples of RSOD dataset

Fig.11 Number of object class labels

Fig.12 Loss curves versus number of iterations

Fig.13 Comparison of P-R curves of network models before and after improvement

Fig.14 Detection results of different models on objects with unclear semantic information

Fig.15 Comparison of target missed detection and false detection results

Fig.16 Comparison of detection results in complex backgrounds
