Survey of Research in Image Semantic Segmentation Based on Deep Neural Network

doi:10.19678/j.issn.1000-3428.0058018

Abstract

With the rapid development of deep learning and its widespread application to semantic segmentation, segmentation quality has improved significantly. This paper reviews and analyzes mainstream image semantic segmentation methods based on deep neural networks. According to how the network is trained, existing methods are divided into fully supervised and weakly supervised methods. The performance, advantages, and disadvantages of representative algorithms in each category are compared and analyzed, and the contributions of deep neural networks to semantic segmentation are described. On this basis, the paper summarizes the current mainstream public datasets and remote sensing datasets, and compares the segmentation performance of the main image semantic segmentation methods. Finally, it discusses the challenges facing current semantic segmentation techniques and the outlook for their future development.

Keywords: deep neural network, image semantic segmentation, computer vision, fully supervised learning, weakly supervised learning

CLC number: TP391

JING Zhuangwei, GUAN Haiyan, PENG Daifeng, YU Yongtao. Survey of Research in Image Semantic Segmentation Based on Deep Neural Network[J]. Computer Engineering, 2020, 46(10): 1-17.

https://www.ecice06.com/CN/Y2020/V46/I10/1

Figures/Tables: 11


