A Review of Generative Image Detection Based on Diffusion Models

doi:10.19678/j.issn.1000-3428.0253234

Abstract

In recent years, image generation based on diffusion models has made breakthrough progress, and text-to-image models such as Stable Diffusion, DALL-E, and Midjourney are now widely used in commercial and creative fields. However, highly realistic AI-generated images also threaten information authenticity and have given rise to social problems such as the spread of misinformation and copyright infringement. To address these challenges, this paper systematically reviews recent research on detecting images generated by diffusion models. First, it traces the development of diffusion models from principles and basic frameworks to large-scale applications. Second, it summarizes the evolution of dataset construction, noting that datasets are moving from a few generators and low resolutions toward multi-model integration and high-quality, multi-level filtering. Third, it analyzes three mainstream detection approaches: those based on implicit features, on explicit features, and on hybrid features. Finally, it discusses the main challenges facing current detection methods and outlines future research directions. This review offers researchers and practitioners a comprehensive overview of the technical landscape and a reference for development trends.

Luo Hao, Yiran Xin, Yunqi Tang. A Review of Generative Image Detection Based on Diffusion Models[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0253234.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0253234
