[1] Song, Zhiwei, et al. Deformable YOLOX: Detection and rust warning method of transmission line connection fittings based on image processing technology[J]. IEEE Transactions on Instrumentation and Measurement 72 (2023): 1-21.
[2] W. Guo et al., AI-oriented smart power system transient stability: The rationality, applications, challenges and future opportunities[J]. Sustain. Energy Technol. Assess., vol. 56, Art. no. 102990, 2023.
[3] X. Liu, J. Du, X. Gao, et al. Electrical equipment classification via improved Faster Region-based Convolutional Neural Network[C]//Proceedings of the 36th Chin. Control Decision. Conf. (CCDC), Xi'an, China, 2024, pp. 5956–5961.
[4] 白翔,李巨川,王慧民,等.基于改进Swin Transformer的电力图像检索方法[J/OL].计算机应用,1-13[2025-11-21]. Bai Xiang, Li Juchuan, Wang Huimin, et al. Power image retrieval method based on improved Swin Transformer [J/OL]. Journal of Computer Applications, 1–13 [2025-11-21].
[5] Liu, Yue, and Xinbo Huang. Efficient cross-modality insulator augmentation for multi-domain insulator defect detection in UAV images[J]. Sensors 24.2 (2024): 428.
[6] Aitelhaj, Rita, Badr-Eddine Benelmostafa, and Hicham Medromi. APF-YOLOV8: Enhancing Multiscale Detection and Intra-Class Variance Handling for UAV-Based Insulator Power Line Inspections[J]. F1000Research 14 (2025): 141.
[7] 刘传洋,吴一全,刘景景.无人机航拍图像中绝缘子缺陷检测的深度学习方法研究进展[J].电工技术学报, 2025, 40(9):2897-2916 Liu Chuanyang, Wu Yiquan, Liu Jingjing. Research progress on deep learning methods for insulator defect detection in UAV aerial images[J]. Transactions of China Electrotechnical Society, 2025, 40(9):2897-2916.
[8] Liu Z, Lin Y, Cao Y, et al. Swin transformer: Hierarchical vi-sion transformer using shifted windows[C]//Proceedings of the IEEE/CVF international conference on computer vision. 2021: 10012-10022.
[9] Hu E J, Shen Y, Wallis P, et al. Lora: Low-rank adaptation of large language models[J/OL]. arXiv preprint arXiv:2106.09685, 2021.
[10] Lowe, David G. Distinctive image features from scale-invariant keypoints[J]. International journal of computer vision 60.2 (2004): 91-110.
[11] Bay, Herbert, et al. Speeded-up robust features (SURF)[J]. Computer vision and image understanding 110.3 (2008): 346-359.
[12] Zhang, Yingnan, Zhizhong Kang, and Zhen Cao. An Image Retrieval Method for Lunar Complex Craters Integrating Vis-ual and Depth Features[J]. Electronics 13.7 (2024): 1262.
[13] Babenko, Artem, et al. Neural codes for image retrieval[C]. European conference on computer vision. Cham: Springer International Publishing, 2014: 584-599.
[14] RADENOVIĆ F, TOLIAS G, CHUM O. Fine-tuning CNN image retrieval with no human annotation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018, 41(7): 1655-1668.
[15] Bhatnagar, Shubhang, and Narendra Ahuja. Potential Field Based Deep Metric Learning[C]//Proceedings of the Computer Vision and Pattern Recognition Conference. 2025: 25549-25559.
[16] Jiang, Xin, et al. Rethinking Vision Transformer for Large-Scale Fine-Grained Image Retrieval[J/OL].arXiv preprint, arXiv:2504.16691, 2025.
[17] Dosovitskiy A, Beyer L, Kolesnikov A, et al. An image is worth 16×16 words: transformers for image recognition at scale[C]//International Conference on Learning Representations (ICLR). 2021.
[18] 杨军,张金影,康玥.基于自注意力机制的高分遥感影像语义分割[J].哈尔滨工程大学学报,2025,46(02):344-354. Yang Jun, Zhang Jinying, Kang Yue. Semantic segmentation of high-resolution remote sensing images based on self-attention mechanism [J]. Journal of Harbin Engineering University, 2025, 46(02): 344–354.
[19] Kumar A, Yadav S P, Kumar A. An improved feature extrac-tion algorithm for robust Swin Transformer model in high-dimensional medical image analysis[J]. Computers in bi-ology and medicine, 2025, 188: 109822.
[20] Duan, Yingtao, et al. STMSF: Swin transformer with multi-scale fusion for remote sensing scene classification[J]. Remote Sensing 17.4 (2025): 668.
[21] Yoo, Dayeon, Jeesu Kim, and Jinwoo Yoo. FSwin Transformer: Feature-Space Window Attention Vision Transformer for Image Classification[J]. IEEE Access 12 (2024): 72598-72606.
[22] Qin, Haolin, et al. Factorization vision transformer: Modeling long-range dependency with local window cost[J]. IEEE Transactions on Neural Networks and Learning Systems (2023).
[23] Hou, Qibin, Daquan Zhou, and Jiashi Feng. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021: 13713-13722.
[24] Liu Z, Zhu J, Huang G. Collaborative Low-Rank Adaptation for Pre-Trained Vision Transformers [J/OL]. arXiv preprint, arXiv:2512.24603, 2025.
[25] Schroff F, Kalenichenko D, Philbin J. Facenet: A unified embedding for face recognition and clustering[C]//Proceedings of the IEEE conference on computer vision and pattern recog-nition. 2015: 815-823.
[26] Russakovsky O, Deng J, Su H, et al. Imagenet large scale visual recognition challenge[J]. International journal of com-puter vision, 2015, 115: 211-252.
[27] Oh Song H, Xiang Y, Jegelka S, et al. Deep metric learning via lifted structured feature embedding[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 4004-4012.
[28] He, Kaiming, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.
[29] Huang, Gao, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 4700-4708.
[30] Tan M, Le Q. Efficientnet: Rethinking model scaling for convolutional neural networks[C]//International conference on machine learning. PMLR, 2019: 6105-6114.
[31] Li, Qiang, et al. Research on image classification of power inspection using less sample learning technique[J]. International Journal of Low-Carbon Technologies 19 (2024): 2119-2126.
[32] Li, Xun, et al. TLINet: A defects detection method for insulators of overhead transmission lines using partially transformer block[J]. PloS One 20.6 (2025): e0327139.
[33] E. Ramzi, et al., Hierarchical average precision training for pertinent image retrieval[C]//European Conference on Com-puter Vision, Cham: Springer Nature Switzerland, 2022: 250-266.
[34] H. Xuan, A. Stylianou, X. Liu, and R. Pless,"Hard negative examples are hard, but useful[C]//European Conference on Computer Vision, Cham: Springer International Publishing, Aug. 2020, pp. 126–142.
[35] B. Cai, P. Xiong, and S. Tian, Center contrastive loss for metric learning[J/OL] arXiv preprint arXiv:2308.00458, 2023.
[36] Furusawa T. Mean field theory in deep metric learning[C]//International Conference on Learning Representations (ICLR), 2024.
|