Computer Engineering ›› 2024, Vol. 50 ›› Issue (11): 284-296. doi: 10.19678/j.issn.1000-3428.0068647

• Graphics and Image Processing •

Lightweight Small-Object Detection for Remote Sensing Images Integrating Super-Resolution and Feature Enhancement

YANG Yudi*, GE Haibo, XIN Shiao, XUE Zihan, YUAN Hao

  1. School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China
  • Received: 2023-10-20 Published: 2024-11-15 Released online: 2024-04-01
  • Corresponding author: YANG Yudi
  • Funding:
    Natural Science Foundation of Shaanxi Province (2011JM8038); Key Industry Innovation Chain (Group) Project of Shaanxi Province (S2019-YF-ZDCXL-0098)

Abstract:

To address the challenges of small objects occupying few pixels, complex backgrounds, and limited hardware resources in remote sensing image object detection, a small-object detection model that integrates Super-Resolution (SR) and feature enhancement is proposed. Ghost convolution layers from GhostNet replace the conventional convolution (Conv) layers in the You Only Look Once v8 (YOLOv8) network, reducing the number of parameters and the computational cost of the model without compromising detection accuracy. A Super-Resolution Assisted Enhancement (SRAE) module is built into the backbone network to improve image resolution and feature extraction capability. A Three-layer Feature Fusion (TFF) module obtains spatial features from the lower layers of the backbone network, compensating for the insufficient spatial feature extraction of the Spatial Pyramid Pooling Fast (SPPF) layer and improving the spatial localization of small objects. A Self-Attention information Transfer (SAT) module is designed to enhance the semantic and global information of small objects while keeping the model lightweight. Experimental results show that the improved model achieves a mean Average Precision at an IoU threshold of 0.5 (mAP@0.5) of 90.5% on the DIOR dataset, with 15.1×10⁶ parameters and 30.3×10⁹ floating-point operations (FLOPs); compared with other models, it is more lightweight while improving small-object detection accuracy.
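As a rough illustration of why replacing standard convolutions with Ghost convolutions reduces the parameter count, the sketch below compares the weight counts of the two layer types. It is not the authors' implementation. The layer sizes (128 input channels, 256 output channels, 3×3 kernel) are hypothetical, and the Ghost settings (ratio of 2, 5×5 depthwise cheap operation) are assumed defaults rather than values taken from the paper. Bias and batch-normalization parameters are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias and BatchNorm ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, ratio=2, cheap_k=5):
    """Weights in a Ghost convolution (bias and BatchNorm ignored).

    A primary k x k convolution produces c_out // ratio intrinsic channels;
    a depthwise cheap_k x cheap_k convolution then derives the remaining
    channels from them. ratio=2 and cheap_k=5 are assumed defaults, and
    c_out is assumed to be divisible by ratio.
    """
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    # Depthwise: each of the (ratio - 1) * intrinsic derived channels
    # uses a single cheap_k x cheap_k filter.
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(128, 256, 3)
ghost = ghost_conv_params(128, 256, 3)
print(standard, ghost, round(standard / ghost, 2))
```

Under these assumptions the Ghost layer needs roughly half the weights of the standard layer, which matches the direction of the parameter reduction described in the abstract.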

Key words: object detection, Super-Resolution (SR), remote sensing image, YOLOv8 network, attention mechanism, feature fusion