Deep Learning-Based UAV Image Semantic Segmentation Algorithm Research

doi:10.19678/j.issn.1000-3428.0064711

Abstract: Most existing image semantic segmentation algorithms for UAV vision target remote sensing images, which do not capture ground-level detail, and this hinders real-time autonomous environment perception by UAVs in low-altitude flight missions. To address this problem, a real-time image semantic segmentation method for low-altitude UAVs is proposed. A new hyper-network architecture is designed: a context head weight generation module is added to the last layer of the encoder and generates the weights of every block in the decoder before encoding finishes, which reduces the number of parameters and the amount of computation at prediction time and enables real-time segmentation. In the decoder, a dynamic patch-wise convolution algorithm is designed using the locally connected layer mechanism. It takes contextual semantic information fully into account when large objects span multiple patches, lets the weights of each convolution kernel in the decoder vary with the spatial position in the input feature map, and uses these dynamic weights to segment different objects in a targeted manner, maximizing the adaptability of the network. Experimental results on a low-altitude UAV vision image dataset show that the method achieves a mean Intersection over Union (mIoU) of 66.3% over categories such as buildings, roads, and static cars, with a prediction speed of 37.9 frames/s. Compared with the MSD and ABCNet algorithms, its segmentation accuracy is higher by 9.3 and 2.5 percentage points, respectively.

Key words: UAV vision, real-time semantic segmentation, hyper-network, locally connected layer, transfer learning
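
The accuracy figures in the abstract are reported as mean Intersection over Union (mIoU). As a reminder of what that metric measures, the following is a minimal sketch of the standard definition: per-class IoU is TP / (TP + FP + FN), and mIoU averages it over the classes that occur. It is not the authors' evaluation code; the function name `mean_iou`, the integer class IDs, and the toy label sequences are assumptions made for illustration only.

```python
def mean_iou(pred, target, num_classes):
    """Return (per-class IoU list, mIoU) for two flat sequences of class IDs.

    A class that appears in neither sequence gets IoU None and is
    left out of the mean (an assumption; conventions differ).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where both agree on class c (TP)
    pred_count = [0] * num_classes     # pixels predicted as class c (TP + FP)
    target_count = [0] * num_classes   # pixels labelled as class c (TP + FN)

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        ious.append(intersection[c] / union if union > 0 else None)

    valid = [v for v in ious if v is not None]
    miou = sum(valid) / len(valid) if valid else float("nan")
    return ious, miou


# Toy example with three hypothetical classes (0, 1, 2) on six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
ious, miou = mean_iou(pred, target, num_classes=3)
print(ious, miou)
```

Because mIoU is itself a percentage, the gains over MSD and ABCNet are stated in percentage points, that is, as absolute differences between mIoU values rather than relative improvements.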

CLC number: TP391

BAI Junqing, HAN Boxun, ZHANG Fengxia. Deep Learning-Based UAV Image Semantic Segmentation Algorithm Research[J]. Computer Engineering, 2023, 49(4): 233-239.

http://www.ecice06.com/CN/Y2023/V49/I4/233

Figures/Tables: 10
