[1] EIGEN D, PUHRSCH C, FERGUS R. Depth map prediction from a single image using a multi-scale deep network[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge, MA: MIT Press, 2014: 2366-2374.
[2] GODARD C, MAC AODHA O, FIRMAN M, et al. Digging into self-supervised monocular depth estimation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway, NJ: IEEE Press, 2019: 3828-3838.
[3] SHU C, YU K, DUAN Z X, et al. Feature-metric loss for self-supervised learning of depth and egomotion[C]//European Conference on Computer Vision. Berlin, Heidelberg: Springer, 2020: 572-588.
[4] EIGEN D, FERGUS R. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV). Piscataway, NJ: IEEE Press, 2015: 2650-2658.
[5] LAINA I, RUPPRECHT C, BELAGIANNIS V, et al. Deeper Depth Prediction with Fully Convolutional Residual Networks[C]//2016 Fourth International Conference on 3D Vision (3DV). Stanford, CA: IEEE Press, 2016: 239-248.
[6] ZHOU T H, BROWN M, SNAVELY N, et al. Unsupervised Learning of Depth and Ego-Motion from Video[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ: IEEE Press, 2017: 6612-6619.
[7] ZHOU H, GREENWOOD D, TAYLOR S. Self-Supervised Monocular Depth Estimation with Internal Feature Fusion[C]//British Machine Vision Conference. UK: British Machine Vision Association, 2021.
[8] WANG J D, SUN K, CHENG T H, et al. Deep High-Resolution Representation Learning for Visual Recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3349-3364.
[9] ZHAO C Q, ZHANG Y M, POGGI M, et al. MonoViT: Self-Supervised Monocular Depth Estimation with a Vision Transformer[C]//2022 International Conference on 3D Vision (3DV). Prague, Czechia: IEEE Press, 2022: 668-678.
[10] LEE Y W, KIM J H, WILLETTE J, et al. MPViT: Multi-Path Vision Transformer for Dense Prediction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE Press, 2022: 7277-7286.
[11] SHIM D, KIM H J. SwinDepth: Unsupervised Depth Estimation using Monocular Sequences via Swin Transformer and Densely Cascaded Network[C]//2023 IEEE International Conference on Robotics and Automation (ICRA). London, UK: IEEE Press, 2023: 4983-4990.
[12] LIU Z, LIN Y T, CAO Y, et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway, NJ: IEEE Press, 2021: 9992-10002.
[13] MASOUMIAN A, RASHWAN H A, ABDULWAHAB S, et al. GCNDepth: Self-supervised monocular depth estimation based on graph convolutional network[J]. Neurocomputing, 2023, 517: 81-92.
[14] 张玉亮, 赵智龙, 付炜平, 等. 融合边缘语义信息的单目深度估计[J]. 科学技术与工程, 2022, 22(7): 2761-2769.
ZHANG Y L, ZHAO Z L, FU W P, et al. Integrating edge semantic information for monocular depth estimation[J]. Science Technology and Engineering, 2022, 22(7): 2761-2769.
[15] LI Q M, HAN Z C, WU X M. Deeper insights into graph convolutional networks for semi-supervised learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence. New Orleans, LA, USA: AAAI, 2018: 3538-3545.
[16] WEI J S, PAN S G, GAO W, et al. LAM-Depth: Laplace-Attention Module-Based Self-Supervised Monocular Depth Estimation[J]. IEEE Transactions on Intelligent Transportation Systems, 2024, 25(10): 13706-13716.
[17] 曹明伟, 邢景杰, 程宜风, 等. LpDepth: 基于拉普拉斯金字塔的自监督单目深度估计[J]. 计算机科学, 2025, 52(3): 33-40.
CAO M W, XING J J, CHENG Y F, et al. LpDepth: Self-supervised Monocular Depth Estimation Based on Laplace Pyramid[J]. Computer Science, 2025, 52(3): 33-40.
[18] 曲熠, 陈莹. 基于边缘强化的无监督单目深度估计[J]. 系统工程与电子技术, 2024, 46(1): 71-79.
QU Y, CHEN Y. Unsupervised monocular depth estimation based on edge enhancement[J]. Systems Engineering and Electronics, 2024, 46(1): 71-79.
[19] SONG C, CHEN Q J, LI F W B, et al. Multi-feature fusion enhanced monocular depth estimation with boundary awareness[J]. The Visual Computer, 2024, 40: 4955-4967.
[20] HAN K, WANG Y H, GUO J Y, et al. Vision GNN: An Image is Worth Graph of Nodes[C]//Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans, LA, USA: MIT Press, 2022: 8291-8303.
[21] WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612.
[22] GODARD C, MAC AODHA O, BROSTOW G J. Unsupervised monocular depth estimation with left-right consistency[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ: IEEE Press, 2017: 6602-6611.
[23] ZHOU Z W, SIDDIQUEE M M R, TAJBAKHSH N, et al. UNet++: A Nested U-Net Architecture for Medical Image Segmentation[C]//Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Cham: Springer, 2018: 3-11.
[24] MUNIR M, AVERY W, RAHMAN M M, et al. GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ: IEEE Press, 2024: 6118-6127.
[25] CHU X X, TIAN Z, ZHANG B, et al. Conditional Positional Encodings for Vision Transformers[C]//International Conference on Learning Representations. Ithaca, NY: OpenReview.net, 2021.
[26] LI G H, MULLER M, THABET A, et al. DeepGCNs: Can GCNs Go As Deep As CNNs?[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway, NJ: IEEE Press, 2019: 9266-9275.
[27] HE K M, ZHANG X Y, REN S Q, et al. Deep Residual Learning for Image Recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ: IEEE Press, 2016: 770-778.
[28] GEIGER A, LENZ P, URTASUN R. Are we ready for autonomous driving? The KITTI vision benchmark suite[C]//2012 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE Press, 2012: 3354-3361.
[29] SAXENA A, SUN M, NG A. Make3D: Learning 3D Scene Structure from a Single Still Image[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(5): 824-840.
[30] DENG J, DONG W, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE Press, 2009: 248-255.
[31] YAN J X, ZHAO H, BU P H, et al. Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation[C]//2021 International Conference on 3D Vision (3DV). London, UK: IEEE Press, 2021: 464-473.
[32] KARPOV A, MAKAROV I. Exploring Efficiency of Vision Transformers for Self-supervised Monocular Depth Estimation[C]//2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). Singapore: IEEE Press, 2022: 711-719.
[33] BAE J, MOON S, IM S. Deep Digging into the Generalization of Self-supervised Monocular Depth Estimation[C]//AAAI Conference on Artificial Intelligence. Washington D.C., USA: AAAI, 2023: 187-196.
[34] ZHANG N, NEX F, VOSSELMAN G, et al. Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ: IEEE Press, 2023: 18537-18546.
[35] WU L H, WANG L H, WEI G H, et al. HPD-Depth: High performance decoding network for self-supervised monocular depth estimation[J]. Image and Vision Computing, 2025, 154: 105360.
[36] ZHOU M, YU H C, LI Z C, et al. Self-supervised Monocular Depth Estimation Based on Differential Attention[J]. Algorithms, 2025, 18(9): 590.