[1] 李兆亮, 贾令尧, 张冰冰, 等. 基于自监督学习和二阶表示的小样本图像分类[J/OL]. 计算机学报, 1-16 [2025-01-10]. http://kns.cnki.net/kcms/detail/11.1826.tp.20241217.1617.008.html.
Li Zhaoliang, Jia Lingyao, Zhang Bingbing, et al. Few-shot Image Classification Based on Self-Supervised Learning and Second-Order Representations[J/OL]. Chinese Journal of Computers, 1-16 [2025-01-10]. http://kns.cnki.net/kcms/detail/11.1826.tp.20241217.1617.008.html.
[2] Dosovitskiy A, Springenberg J T, Riedmiller M, et al.
Discriminative unsupervised feature learning with
convolutional neural networks[J]. Advances in neural
information processing systems, 2014, 27.
[3] Doersch C, Gupta A, Efros A A. Unsupervised visual
representation learning by context
prediction[C]//Proceedings of the IEEE international
conference on computer vision. 2015: 1422-1430.
[4] Chen T, Kornblith S, Norouzi M, et al. A simple
framework for contrastive learning of visual
representations[C]//International conference on machine
learning. PMLR, 2020: 1597-1607.
[5] Zhang R, Isola P, Efros A A. Colorful image
colorization[C]//Computer Vision–ECCV 2016: 14th
European Conference, Amsterdam, The Netherlands,
October 11-14, 2016, Proceedings, Part III 14. Springer
International Publishing, 2016: 649-666.
[6] Gidaris S, Singh P, Komodakis N. Unsupervised
representation learning by predicting image rotations[J].
arXiv preprint arXiv:1803.07728, 2018.
[7] Hotelling H. Relations between two sets of
variates[M]//Breakthroughs in statistics: methodology and
distribution. New York, NY: Springer New York, 1992:
162-190.
[8] Jia Y, Salzmann M, Darrell T. Factorized latent spaces with
structured sparsity[J]. Advances in neural information
processing systems, 2010, 23.
[9] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
[10] Pathak D, Krahenbuhl P, Donahue J, et al. Context
encoders: Feature learning by inpainting[C]//Proceedings
of the IEEE conference on computer vision and pattern
recognition. 2016: 2536-2544.
[11] Sermanet P, Lynch C, Chebotar Y, et al. Time-contrastive
networks: Self-supervised learning from video[C]//2018
IEEE international conference on robotics and automation
(ICRA). IEEE, 2018: 1134-1141.
[12] Misra I, Zitnick C L, Hebert M. Shuffle and learn:
unsupervised learning using temporal order
verification[C]//Computer Vision–ECCV 2016: 14th
European Conference, Amsterdam, The Netherlands,
October 11–14, 2016, Proceedings, Part I 14. Springer
International Publishing, 2016: 527-544.
[13] Godard C, Mac Aodha O, Firman M, et al. Digging into
self-supervised monocular depth
estimation[C]//Proceedings of the IEEE/CVF international
conference on computer vision. 2019: 3828-3838.
[14] Oord A, Li Y, Vinyals O. Representation learning with
contrastive predictive coding[J]. arXiv preprint
arXiv:1807.03748, 2018.
[15] Bachman P, Hjelm R D, Buchwalter W. Learning
representations by maximizing mutual information across
views[J]. Advances in neural information processing
systems, 2019, 32.
[16] He K, Fan H, Wu Y, et al. Momentum contrast for
unsupervised visual representation
learning[C]//Proceedings of the IEEE/CVF conference on
computer vision and pattern recognition. 2020: 9729-9738.
[17] Chen X, He K. Exploring simple siamese representation
learning[C]//Proceedings of the IEEE/CVF conference on
computer vision and pattern recognition. 2021:
15750-15758.
[18] 冯欣, 胡成杭. 一种自监督掩码图像建模的遮挡目标检测方法[J]. 重庆理工大学学报(自然科学), 2024, 38(06): 186-193.
Feng Xin, Hu Chenghang. A Self-Supervised Masked Image Modeling Method for Occluded Object Detection[J]. Journal of Chongqing University of Technology (Natural Science), 2024, 38(06): 186-193.
[19] He K, Chen X, Xie S, et al. Masked autoencoders are
scalable vision learners[C]//Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition.
2022: 16000-16009.
[20] Baevski A, Hsu W N, Xu Q, et al. Data2vec: A general
framework for self-supervised learning in speech, vision
and language[C]//International Conference on Machine
Learning. PMLR, 2022: 1298-1312.
[21] Chen X, Ding M, Wang X, et al. Context autoencoder for
self-supervised representation learning[J]. International
Journal of Computer Vision, 2024, 132(1): 208-223.
[22] Wei C, Fan H, Xie S, et al. Masked feature prediction for
self-supervised visual pre-training[C]//Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern
Recognition. 2022: 14668-14678.
[23] Wang R, Chen D, Wu Z, et al. Bevt: Bert pretraining of
video transformers[C]//Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition.
2022: 14733-14743.
[24] Wang L, Huang B, Zhao Z, et al. Videomae v2: Scaling
video masked autoencoders with dual
masking[C]//Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. 2023:
14549-14560.
[25] Li Y, Mao H, Girshick R, et al. Exploring plain vision
transformer backbones for object detection[C]//European
conference on computer vision. Cham: Springer Nature
Switzerland, 2022: 280-296.
[26] Fang Y, Yang S, Wang S, et al. Unleashing vanilla vision
transformer with masked image modeling for object
detection[C]//Proceedings of the IEEE/CVF International
Conference on Computer Vision. 2023: 6244-6253.
[27] Zhou L, Palangi H, Zhang L, et al. Unified
vision-language pre-training for image captioning and
vqa[C]//Proceedings of the AAAI conference on artificial
intelligence. 2020, 34(07): 13041-13049.
[28] Radford A, Kim J W, Hallacy C, et al. Learning
transferable visual models from natural language
supervision[C]//International conference on machine
learning. PMLR, 2021: 8748-8763.
[29] Ramesh A, Pavlov M, Goh G, et al. Zero-shot
text-to-image generation[C]//International conference on
machine learning. PMLR, 2021: 8821-8831.
[30] Muslea I, Minton S, Knoblock C A. Active learning with
strong and weak views: A case study on wrapper
induction[C]//IJCAI. 2003, 3: 415-420.
[31] Yu S, Krishnapuram B, Steck H, et al. Bayesian
co-training[J]. Advances in neural information processing
systems, 2007, 20.
[32] Kumar A, Rai P, Daume H. Co-regularized multi-view
spectral clustering[J]. Advances in neural information
processing systems, 2011, 24.
[33] Lanckriet G R G, Cristianini N, Bartlett P, et al. Learning
the kernel matrix with semidefinite programming[J].
Journal of Machine learning research, 2004, 5(Jan): 27-72.
[34] Sonnenburg S, Rätsch G, Schäfer C, et al. Large scale
multiple kernel learning[J]. The Journal of Machine
Learning Research, 2006, 7: 1531-1565.
[35] Subrahmanya N, Shin Y C. Sparse multiple kernel learning
for signal processing applications[J]. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 2009, 32(5):
788-798.
[36] Salzmann M, Ek C H, Urtasun R, et al. Factorized
orthogonal latent spaces[C]//Proceedings of the thirteenth
international conference on artificial intelligence and
statistics. JMLR Workshop and Conference Proceedings,
2010: 701-708.
[37] Carroll J D. Generalization of canonical correlation analysis to three or more sets of variables[C]//APA 76th Annual Convention, San Francisco, CA, August 30-September 3, 1968. 1968.
[38] Benton A, Khayrallah H, Gujral B, et al. Deep generalized
canonical correlation analysis[J]. arXiv preprint
arXiv:1702.02519, 2017.
[39] Chen Z, He Z, Lu Z M. DEA-Net: Single image dehazing based on detail-enhanced convolution and content-guided attention[J]. IEEE Transactions on Image Processing, 2024, 33.
[40] Wu B, Xiao Q, Liu S, et al. E2ENet: Dynamic sparse feature fusion for accurate and efficient 3D medical image segmentation[J]. 2023.
[41] Guermazi B, Khan N. DynaSeg: A deep dynamic fusion method for unsupervised image segmentation incorporating feature similarity and spatial continuity[J]. Image and Vision Computing, 2024, 150: 105206.
[42] Redmon J, Farhadi A. YOLOv3: An incremental improvement[J]. arXiv preprint arXiv:1804.02767, 2018.
[43] 关日鹏, 况立群, 焦世超, 熊风光, 韩燮. 多模态特征融合与词嵌入驱动的三维检索方法[J]. 计算机工程, 2023, 49(4): 101-107, 113.
GUAN Ripeng, KUANG Liqun, JIAO Shichao, XIONG Fengguang, HAN Xie. Retrieval Method of 3D Models Driven by Multi-modal Feature Fusion and Word Embedding[J]. Computer Engineering, 2023, 49(4): 101-107, 113.
[44] Doersch C, Zisserman A. Multi-task self-supervised visual
learning[C]//Proceedings of the IEEE international
conference on computer vision. 2017: 2051-2060.