
Computer Engineering (计算机工程) ›› 2022, Vol. 48 ›› Issue (10): 230-237, 244. doi: 10.19678/j.issn.1000-3428.0063316

• Graphics and Image Processing •

Detail-Enhanced RGB-IR Multichannel Feature Fusion Network for Semantic Segmentation

XIE Shuchun1, CHEN Zhihua1, SHENG Bin2

  1. College of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China;
  2. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2021-11-23 Revised: 2022-01-06 Published: 2022-01-10
  • About the authors: XIE Shuchun (b. 1996), male, M.S. candidate; his research interests include digital image processing and computer graphics. CHEN Zhihua and SHENG Bin, professors, Ph.D.
  • Funding:
    General Program of the National Natural Science Foundation of China, "Research on Illumination-Consistent Stereoscopic Video Editing and Synthesis" (61672228); Joint Fund of the Ministry of Education for Equipment Pre-Research, "Research on Target Recognition, Early Warning, and Associated Decision-Making Based on Remote Sensing Data" (6141A02022373).


Abstract: Existing semantic segmentation methods based on deep learning segment the edges of ground objects in remote sensing images inaccurately, perform poorly on small ground objects, and are highly sensitive to the quality of the RGB images. This study proposes a detail-enhanced RGB-IR multichannel feature fusion network for semantic segmentation, named MFFNet. A detail feature extraction module obtains detail features from the RGB and infrared images and fuses them, generating a more distinctive feature representation and compensating for the information that RGB images lack relative to infrared images. While fusing the detail features with high-level semantic features, a feature fusion attention module adaptively generates a different attention weight for each feature map, yielding optimized feature maps with accurate semantic information and prominent detail information. The detail feature extraction and feature fusion attention modules at the same level are designed to correspond to each other, so that fusion with high-level semantic features suppresses the influence of interfering or irrelevant detail information and highlights the key detail features. In addition, a channel attention module is embedded in the feature fusion attention module to further strengthen the fusion of high- and low-level features and generate a more discriminative feature representation, thereby improving the feature expression ability of the network. Experiments on the public Potsdam dataset show that MFFNet achieves a mean intersection over union of 70.54%, an improvement of 3.95 and 4.85 percentage points over MFNet and RTFNet, respectively, and that its segmentation of edges and small ground objects is significantly better.
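The channel-attention gating described in the abstract can be illustrated with a minimal NumPy sketch: a squeeze-and-excitation-style gate rescales each channel of an additively fused RGB/IR feature map. The function names, the additive fusion, and the MLP weights are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def channel_attention(fmap, w1, w2):
    """Squeeze-and-excitation-style channel attention (illustrative sketch).

    fmap: (C, H, W) feature map; w1: (C//r, C) and w2: (C, C//r) MLP weights.
    Returns the feature map rescaled by per-channel attention weights in (0, 1).
    """
    squeezed = fmap.mean(axis=(1, 2))               # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)         # ReLU bottleneck -> (C//r,)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gate -> (C,)
    return fmap * weights[:, None, None]            # rescale each channel

def fuse_rgb_ir(rgb_feat, ir_feat, w1, w2):
    """Fuse same-shape RGB and IR feature maps, then apply channel attention."""
    fused = rgb_feat + ir_feat                      # simple additive fusion
    return channel_attention(fused, w1, w2)

# Toy usage with C=8 channels and reduction ratio r=2.
rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 16, 16))
ir = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((4, 8)) * 0.1
w2 = rng.standard_normal((8, 4)) * 0.1
out = fuse_rgb_ir(rgb, ir, w1, w2)
print(out.shape)  # (8, 16, 16)
```

Because the sigmoid gate lies in (0, 1), each channel of the fused map is attenuated rather than amplified; in the paper's setting the learned weights would instead be trained end to end with the rest of the network.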

Key words: remote sensing image, deep learning, semantic segmentation, RGB-IR multichannel, detail feature extraction, feature fusion attention
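The reported metric, mean intersection over union, is computed per class as intersection divided by union of the predicted and ground-truth label masks, then averaged. A minimal sketch (the label maps below are toy data, not from the Potsdam benchmark):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection over union from integer label maps of equal shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 0, 1],
                 [1, 2, 2]])
target = np.array([[0, 1, 1],
                   [1, 2, 2]])
print(mean_iou(pred, target, 3))  # (1/2 + 2/3 + 1) / 3 ≈ 0.722
```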
