[1] 罗旭东, 袁笛, 常晓军, 何震宇. 基于不确定性启发图像增强的水下目标跟踪[J]. 计算机工程, 2025, 51(1): 11-19.
LUO Xudong, YUAN Di, CHANG Xiaojun, HE Zhenyu. Underwater Target Tracking Based on Uncertainty-Inspired Image Enhancement[J]. Computer Engineering, 2025, 51(1): 11-19.
[2] ALABA S Y, NABI M M, SHAH C, et al. Class-Aware Fish Species Recognition Using Deep Learning for an Imbalanced Dataset[J]. Sensors, 2022, 22(21): 8268.
[3] 刘子健, 王兴梅, 陈伟京, 等. 基于硬负样本对比学习的水下图像生成方法[J]. 模式识别与人工智能, 2024, 37(10): 887-909.
LIU Zijian, WANG Xingmei, CHEN Weijing, et al. Underwater Image Generation Method Based on Contrastive Learning with Hard Negative Samples[J]. Pattern Recognition and Artificial Intelligence, 2024, 37(10): 887-909.
[4] HO J, JAIN A, ABBEEL P. Denoising diffusion probabilistic models[C] // Proc of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY, USA: Curran Associates, 2020: 6840-6851.
[5] 张云帆, 易尧华, 汤梓伟, 王新宇. 基于通道注意力机制的文本生成图像方法[J]. 计算机工程, 2022, 48(4): 206-212,222.
ZHANG Yunfan, YI Yaohua, TANG Ziwei, WANG Xinyu. Text-to-Image Synthesis Method Based on Channel Attention Mechanism[J]. Computer Engineering, 2022, 48(4): 206-212,222.
[6] WANG N, ZHOU Y B, HAN F L, et al. UWGAN: Underwater GAN for Real-world Underwater Color Restoration and Dehazing[EB/OL]. [2021-03-26]. https://arxiv.org/pdf/1912.10269.
[7] LI J, SKINNER K A, EUSTICE R M, et al. WaterGAN: Unsupervised Generative Network to Enable Real-Time Color Correction of Monocular Underwater Images[J]. IEEE Robotics and Automation Letters, 2018, 3(1): 387-394.
[8] KWON G, YE J C. Diffusion-based Image Translation Using Disentangled Style and Content Representation[C] // Proc of the 11th International Conference on Learning Representations. Kigali, Rwanda, 2023.
[9] ZHANG H Y, YAO F H, GONG Y F, et al. Marine Biology Image Generation Based on Diffusion-StyleGAN2[J]. IEEE Access, 2024.
[10] ZHANG L M, RAO A Y, AGRAWALA M. Adding Conditional Control to Text-to-Image Diffusion Models[C] // Proc of the IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE, 2023: 3813-3824.
[11] ROMBACH R, BLATTMANN A, LORENZ D, et al. High-Resolution Image Synthesis with Latent Diffusion Models[C] // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, LA, USA: IEEE, 2022: 10674-10685.
[12] PENG B H, WANG J, ZHANG Y C, et al. ControlNeXt: Powerful and efficient control for image and video generation[EB/OL]. [2024-08-12]. https://arxiv.org/pdf/2408.06070.
[13] LI M, YANG T J N, KUANG H F, et al. ControlNet++: Improving conditional controls with efficient consistency feedback[C] // Proc of the 18th European Conference on Computer Vision. Milan, Italy: Springer, 2024: 129-147.
[14] QIN C, ZHANG S, YU N, et al. UniControl: A unified diffusion model for controllable visual generation in the wild[EB/OL]. [2023-05-18]. https://arxiv.org/pdf/2305.11147.
[15] ZHAO S H, CHEN D D, CHEN Y C, et al. Uni-ControlNet: All-in-one control to text-to-image diffusion models[C] // Proc of the 37th International Conference on Neural Information Processing Systems. New Orleans, LA, USA: Curran Associates, 2023: 11124-11150.
[16] MOU C, WANG X T, XIE L B, et al. T2I-Adapter: Learning adapters to dig out more controllable ability for text-to-image diffusion models[EB/OL]. [2023-02-16]. https://arxiv.org/pdf/2302.08453.
[17] HU J, SHEN L, SUN G. Squeeze-and-Excitation Networks[C] // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 7132-7141.
[18] LI X, WANG W H, HU X L, et al. Selective Kernel Networks[C] // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019: 510-519.
[19] WANG Q L, WU B G, ZHU P F, et al. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks[C] // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA, USA: IEEE, 2020: 11531-11539.
[20] SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks[C] // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 4510-4520.
[21] OUYANG D L, HE S, ZHANG G Z, et al. Efficient Multi-Scale Attention Module with Cross-Spatial Learning[C] // Proc of the IEEE International Conference on Acoustics, Speech and Signal Processing. Rhodes Island, Greece: IEEE, 2023: 1-5.
[22] LI Y X, LI X, YANG J. Spatial Group-Wise Enhance: Enhancing Semantic Feature Learning in CNN[C] // Proc of the Asian Conference on Computer Vision. Macau SAR, China: Springer, 2022: 316-332.
[23] HU J, SHEN L, SUN G, et al. Gather-Excite: Exploiting feature context in convolutional neural networks[C] // Proc of the 32nd International Conference on Neural Information Processing Systems. Montréal, Canada: Curran Associates, 2018: 9423-9433.
[24] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional Block Attention Module[C] // Proc of the European Conference on Computer Vision. Munich, Germany: Springer, 2018: 3-19.
[25] LIU H J, LIU F Q, FAN X Y, et al. Polarized self-attention: Towards high-quality pixel-wise mapping[J]. Neurocomputing, 2022, 506: 158-167.
[26] HONG L, WANG X, ZHANG G, et al. USOD10K: A New Benchmark Dataset for Underwater Salient Object Detection[J]. IEEE Transactions on Image Processing, 2025, 34: 1602-1615. DOI: 10.1109/TIP.2023.3266163.
[27] LI J N, LI D X, XIONG C M, et al. BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation[C] // Proc of the 39th International Conference on Machine Learning. Baltimore, Maryland, USA: PMLR, 2022: 12888-12900.
[28] RANFTL R, LASINGER K, HAFNER D, et al. Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(3): 1623-1637. DOI: 10.1109/TPAMI.2020.3019967.
[29] YANG M, SOWMYA A. An Underwater Color Image Quality Evaluation Metric[J]. IEEE Transactions on Image Processing, 2015, 24(12): 6062-6071. DOI: 10.1109/TIP.2015.2491020.
[30] PANETTA K, GAO C, AGAIAN S. Human-Visual-System-Inspired Underwater Image Quality Measures[J]. IEEE Journal of Oceanic Engineering, 2016, 41(3): 541-551. DOI: 10.1109/JOE.2015.2469915.
[31] HESSEL J, HOLTZMAN A, FORBES M, et al. CLIPScore: A Reference-free Evaluation Metric for Image Captioning[C] // Proc of the Conference on Empirical Methods in Natural Language Processing. Punta Cana, Dominican Republic: ACL, 2021: 7514-7528.
[32] YANG L H, KANG B Y, HUANG Z L, et al. Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data[C] // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA, USA: IEEE, 2024: 10371-10381.