Clothing Image Segmentation Network Based on Improved Deeplab v3+

Computer Engineering, 2022, Vol. 48, Issue (7): 284-291. doi: 10.19678/j.issn.1000-3428.0062392

HU Xinrong^1,2,3, GONG Chuang^1,2,3, ZHANG Zili^1,2,3, ZHU Qiang^1,2,3, PENG Tao^1,2,3, HE Ruhan^1,2,3

1. Engineering Research Center of Hubei Province for Clothing Information, Wuhan 430200, China;
2. Hubei Provincial Engineering Research Center for Intelligent Textile and Fashion, Wuhan 430200, China;
3. School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan 430200, China

Received: 2021-08-18 Revised: 2021-10-21 Online: 2022-07-15 Published: 2021-10-25

About the authors: HU Xinrong (born 1973), female, professor, Ph.D.; main research interests are natural language processing, graphics and image processing, and virtual reality. GONG Chuang, master's student. ZHANG Zili (corresponding author) and ZHU Qiang, lecturers, Ph.D. PENG Tao, associate professor, Ph.D. HE Ruhan, professor, Ph.D.

Funding: Outstanding Young and Middle-aged Science and Technology Innovation Team Program of Hubei Provincial Universities (T201807).

Abstract: To address rough clothing-edge segmentation, unsatisfactory segmentation accuracy, and insufficient extraction of deep semantic features in clothing image segmentation, this study embeds the Coordinate Attention (CA) mechanism and a Semantic Feature Enhancement Module (SFEM) into the Deeplab v3+ network, which already performs well on semantic segmentation, and proposes the CA_SFEM_Deeplab v3+ network for clothing image segmentation. To strengthen the learning of effective features in clothing images, the CA mechanism is embedded into resnet101, the backbone of the Deeplab v3+ network, and the feature map produced by the atrous spatial pyramid pooling network is fed into the SFEM for feature enhancement, thereby improving segmentation accuracy. Experimental results show that, on the DeepFashion2 dataset, the mean Intersection over Union (mIoU) and Mean Pixel Accuracy (MPA) of the CA_SFEM_Deeplab v3+ network are 0.557 and 0.671, respectively, which are 2.1% and 2.3% higher than those of the Deeplab v3+ network. Compared with the Deeplab v3+ network, the proposed CA_SFEM_Deeplab v3+ network yields finer clothing contours and better segmentation performance.
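The two metrics reported in the abstract can be illustrated with a small sketch. It assumes the commonly used definitions (per-class IoU = TP / (TP + FP + FN) and per-class pixel accuracy = TP / (TP + FN), each averaged over classes); the paper's exact evaluation protocol is not given here, and the function names and toy labels below are illustrative only.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average TP / (TP + FP + FN) over classes that occur in truth or prediction."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # truth is c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp     # predicted c, truth is otherwise
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


def mean_pixel_accuracy(cm):
    """Average TP / (TP + FN) over classes present in the ground truth."""
    accs = [cm[c][c] / sum(cm[c]) for c in range(len(cm)) if sum(cm[c]) > 0]
    return sum(accs) / len(accs)


# Toy example (hypothetical data): flattened label maps with classes 0, 1, 2.
truth = [0, 0, 1, 1, 1, 2, 2, 2]
pred = [0, 1, 1, 1, 2, 2, 2, 0]
cm = confusion_matrix(truth, pred, num_classes=3)
print(mean_iou(cm), mean_pixel_accuracy(cm))
```

For these toy labels the mIoU is 4/9 and the MPA is 11/18. In practice the confusion matrix would be accumulated over every pixel of the test set before averaging.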

Key words: clothing image, semantic segmentation, Deeplab v3+ network, Coordinate Attention mechanism, semantic feature enhancement module
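The pyramid pooling stage mentioned in the abstract is built on atrous (dilated) convolution, in which the kernel taps are spaced apart so that the same number of weights covers a wider input window. The following is a minimal 1D sketch of that idea only, not the network's actual 2D layers; the function name `dilated_conv1d`, the kernel, and the dilation rates are illustrative assumptions.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D cross-correlation with taps spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation + 1  # input window covered by one output value
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + i * dilation] for i, w in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

# dilation=1 covers 3 neighbouring samples; dilation=2 covers 5 with the same 3 weights.
dense = dilated_conv1d(signal, kernel, dilation=1)
atrous = dilated_conv1d(signal, kernel, dilation=2)

# A pyramid applies the same kind of filter at several rates in parallel.
pyramid = {rate: dilated_conv1d(signal, kernel, rate) for rate in (1, 2, 3)}
print(pyramid)
```

An atrous spatial pyramid pooling block runs several such branches at different rates on the same feature map and merges their outputs, which lets the network combine context at multiple scales without reducing resolution.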

CLC number: TP391.41

HU Xinrong, GONG Chuang, ZHANG Zili, ZHU Qiang, PENG Tao, HE Ruhan. Clothing Image Segmentation Network Based on Improved Deeplab v3+[J]. Computer Engineering, 2022, 48(7): 284-291.

https://www.ecice06.com/CN/Y2022/V48/I7/284

Figures/Tables: 10

