Computer Engineering ›› 2024, Vol. 50 ›› Issue (8): 239-248. doi: 10.19678/j.issn.1000-3428.0068259

• Graphics and Image Processing •

Feature Fusion Network with Parallel Structure for Building Segmentation

Wanqiu ZHAO, Junhu ZHANG*, Haitao LI

  1. School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266000, Shandong, China
  • Received: 2023-08-18 Published: 2024-08-15 Online: 2024-08-09
  • Contact: Junhu ZHANG
  • Supported by:
    Key Research and Development Program of Shandong Province (Science and Technology Demonstration Project) (2021SFGC0701); Qingdao Marine Science and Technology Innovation Special Project (22-3-3-hygg-3-hy)

Abstract:

Remote sensing building segmentation is the pixel-level segmentation of buildings in remote sensing images. Its goal is to accurately extract building regions, including building outlines and internal details. Because of the particular characteristics of remote sensing images, shadows that resemble buildings in color tend to cause under-segmentation, whereas factors such as tree occlusion tend to cause over-segmentation. To address incomplete segmentation of building outlines, strong shadow interference, and visibly jagged segmentation edges in remote sensing images, a multi-branch feature fusion network with a parallel structure (MFF-Net) is proposed. The network uses ResNet-50 as its backbone, and its decoder consists of multiple parallel structures containing dual-channel mask branches, each of which restores feature maps at a different scale. Within each branch, an improved Convolutional Block Attention Module (CBAM) strengthens important edge features, and the dual-channel mask structure adjusts channel interaction. The branch outputs are then fused. Experimental results on the ISPRS Potsdam and ISPRS Vaihingen datasets show that, compared with existing mainstream segmentation networks, MFF-Net improves global accuracy, precision, recall, F1 value, and mean Intersection over Union (mIoU) to varying degrees. On the Vaihingen dataset, precision reaches 96.22%, the F1 value 95.55%, and mIoU 92.16%. On the Potsdam dataset, precision reaches 96.95%, the F1 value 96.32%, and mIoU 93.40%. The extracted building contours are complete and clear, and the network is more resistant to interference.

Key words: remote sensing image, feature fusion, building segmentation, dual-channel mask, attention
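
The abstract reports global accuracy, precision, recall, F1 value, and mIoU but does not say how they are computed. The sketch below shows the usual confusion-matrix definitions for binary building/background segmentation. It is an illustration only, not the authors' evaluation code: the function names are hypothetical, and averaging IoU over the two classes (building and background) to obtain mIoU is an assumption.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN, treating label 1 as building and 0 as background."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a class is absent."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(pred, truth):
    """Compute pixel-level metrics from flattened binary masks."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    iou_building = safe_div(tp, tp + fp + fn)
    # For the background class, the roles of TP and TN are swapped.
    iou_background = safe_div(tn, tn + fp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Assumption: mIoU is the mean over the two classes.
        "miou": (iou_building + iou_background) / 2,
    }


if __name__ == "__main__":
    # Toy flattened masks, not data from the paper.
    predicted = [1, 1, 1, 0, 0, 0, 1, 0]
    reference = [1, 1, 0, 0, 0, 1, 1, 0]
    for name, value in segmentation_metrics(predicted, reference).items():
        print(f"{name}: {value:.4f}")
```

In practice the masks would be flattened prediction and ground-truth images, and the counts would be accumulated over the whole test set before the ratios are taken.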