Computer Engineering, 2023, Vol. 49, Issue (4): 233-239. doi: 10.19678/j.issn.1000-3428.0064711

• Graphics and Image Processing •

Deep Learning-Based UAV Image Semantic Segmentation Algorithm Research

BAI Junqing, HAN Boxun, ZHANG Fengxia

  1. School of Computer Science, Xi'an Shiyou University, Xi'an 710065, China
  • Received: 2022-05-16 Revised: 2022-06-17 Published: 2023-04-07
  • About the authors: BAI Junqing (born 1983), female, associate professor, Ph.D.; her main research interests are computer vision, artificial intelligence, digital circuit design, and FPGA applications. HAN Boxun and ZHANG Fengxia are master's students.
  • Funding:
    National Natural Science Foundation of China, Young Scientists Fund (41301480); Xi'an Shiyou University Graduate Innovation and Practical Ability Training Program (YCS21213254).

Abstract: Most existing image semantic segmentation algorithms for UAV vision target remote sensing images, which cannot represent ground details, and this hinders real-time autonomous environment perception by UAVs in low-altitude flight missions. To address this issue, a real-time image semantic segmentation method for low-altitude UAVs is proposed. A new hyper-network architecture is designed in which a context head weight generation module is added to the last layer of the encoder; it generates the weights of each block in the decoder before encoding finishes, reducing the number of network parameters and the amount of computation at prediction time and enabling real-time segmentation. In the decoder, a dynamic patch-wise convolution algorithm is designed using the local connection layer mechanism. It fully considers contextual semantic information when handling large objects that span multiple patches, so that the weights of each convolution kernel in the decoder vary with the spatial position in the input feature map. The dynamic weights are also used to segment different objects in a targeted manner, maximizing the adaptability of the network. Experimental results on a low-altitude UAV vision image dataset show that the method achieves a mean Intersection over Union (mIoU) of 66.3% on categories such as buildings, roads, and static cars, with a prediction speed of 37.9 frames/s. Compared with the MSD and ABCNet algorithms, its segmentation accuracy is higher by 9.3 and 2.5 percentage points, respectively.
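
To make the idea of position-dependent decoder weights concrete, the following is a minimal sketch and not the authors' implementation. It assumes a 1×1 (pointwise) kernel and takes the per-patch weights as a ready-made argument, whereas in the paper they would come from the context head weight generation module. The names `patchwise_pointwise_conv`, `feat`, and `weights` are hypothetical.

```python
from typing import List

Feature = List[List[List[float]]]          # indexed as [H][W][C_in]
PatchWeights = List[List[List[List[float]]]]  # indexed as [grid_h][grid_w][C_out][C_in]


def patchwise_pointwise_conv(feat: Feature, weights: PatchWeights) -> Feature:
    """Apply a different 1x1 convolution kernel to each patch of a feature map.

    Assumption: the feature map divides evenly into a grid_h x grid_w grid,
    and every pixel inside one patch shares that patch's kernel.
    """
    height, width = len(feat), len(feat[0])
    grid_h, grid_w = len(weights), len(weights[0])
    if height % grid_h or width % grid_w:
        raise ValueError("feature map must divide evenly into the patch grid")
    patch_h, patch_w = height // grid_h, width // grid_w

    out: Feature = []
    for y in range(height):
        row = []
        for x in range(width):
            # Select the kernel by spatial position: this is what makes
            # the weights vary across the feature map.
            kernel = weights[y // patch_h][x // patch_w]  # [C_out][C_in]
            pixel = feat[y][x]
            row.append([sum(k * v for k, v in zip(k_row, pixel)) for k_row in kernel])
        out.append(row)
    return out
```

With a 2×4 single-channel map of ones and a 1×2 patch grid whose kernels are 2.0 and 3.0, the left half of the output is 2.0 and the right half is 3.0, which shows the same input value being treated differently depending on where it lies.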

Key words: UAV vision, real-time semantic segmentation, hyper-network, local connection layer, transfer learning
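
The mIoU figure reported in the abstract is a standard metric: per-class intersection over union, averaged over classes. The sketch below shows one common way to compute it, assuming flattened integer label sequences and skipping classes that appear in neither the prediction nor the ground truth. The function name `mean_iou` and that skipping convention are assumptions for illustration; the paper's exact evaluation protocol is not given here.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean Intersection over Union over flattened per-pixel class labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    inter = [0] * num_classes  # true positives per class
    union = [0] * num_classes  # TP + FP + FN per class
    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class

    # Assumption: classes absent from both pred and target are ignored.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], 2)` gives per-class IoUs of 1/2 and 2/3, so the mean is 7/12.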

CLC Number: