
Computer Engineering ›› 2025, Vol. 51 ›› Issue (8): 292-304. doi: 10.19678/j.issn.1000-3428.0068856

• Graphics and Image Processing •

A Study on Improved Faster R-CNN Model for Multi-Object Detection in Remote Sensing Images

MIAO Ru1,2, LI Yi1,2,3, ZHOU Ke1,2,3,*, ZHANG Yanna1, CHANG Ranran1,2,3, MENG Geng1,2,3

  1. College of Computer and Information Engineering, Henan University, Kaifeng 475004, Henan, China
  2. Henan Province Engineering Research Center of Spatial Information Processing, Kaifeng 475004, Henan, China
  3. Henan Provincial Spatio-Temporal Big Data Technology Innovation Center, Kaifeng 475004, Henan, China
  • Received: 2023-11-16 Revised: 2024-04-01 Online: 2025-08-15 Published: 2025-08-25
  • Contact: ZHOU Ke

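  • Funding:
    National Science and Technology Major Project of the High-Resolution Earth Observation System (Civil Part) (80-Y50G19-9001-22/23); Science and Technology Research Project of Henan Province (222102210061)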

Abstract:

Complex backgrounds, diverse target types, and large scale variations in remote sensing images cause missed and false detections. To address these issues, this study proposes an improved Faster R-CNN multi-object detection model. First, the ResNet-50 backbone network is replaced with the Swin Transformer to enhance the model's feature extraction capability. Second, a Balanced Feature Pyramid (BFP) module is introduced to fuse shallow and deep semantic information, further strengthening feature fusion. Finally, a dynamic weighting mechanism is incorporated into the classification and regression branches so that the network focuses more on high-quality candidate boxes during training, improving the precision of target localization and classification. Experimental results on the RSOD dataset show that, compared with the Faster R-CNN model, the proposed model substantially reduces the number of floating-point operations (FLOPs) while improving mAP@0.5:0.95 by 10.7 percentage points and Average Recall (AR) by 10.6 percentage points. Compared with other mainstream detection models, the proposed model achieves higher accuracy while reducing the missed-detection rate. These results indicate that the proposed model significantly enhances detection accuracy in remote sensing images with complex backgrounds.

Key words: remote sensing images, multi-object detection, Faster R-CNN, Swin Transformer module, Balanced Feature Pyramid (BFP), dynamic weighting mechanism
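
For readers unfamiliar with the mAP@0.5:0.95 metric reported in the abstract, the following minimal sketch shows the IoU computation and the ten IoU thresholds (0.50 to 0.95 in steps of 0.05) over which average precision is conventionally averaged. It assumes boxes in (x1, y1, x2, y2) format. The helper names `iou` and `matched_thresholds` are hypothetical and not taken from the paper, and the sketch omits the per-class precision-recall integration that a full mAP evaluation requires.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds behind "0.5:0.95": 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which the prediction counts as a correct localization."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    pred = (0, 0, 10, 10)
    gt = (2, 0, 12, 10)
    print(iou(pred, gt))                 # overlap of 80 over a union of 120
    print(matched_thresholds(pred, gt))  # thresholds this prediction satisfies
```

A prediction with an IoU of about 0.67 counts as a hit at only the four loosest thresholds, which is why gains in mAP@0.5:0.95 reflect tighter localization rather than detection alone.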
