
Computer Engineering ›› 2025, Vol. 51 ›› Issue (12): 277-284. doi: 10.19678/j.issn.1000-3428.0252288

• Mobile Internet and Communication Technology •

Manipulation Mask Manufacturer for Arbitrary-Scale Super-Resolution Images

XU Xiong1, YANG Xinyu2, ZHU Xuekang2, DU Bo2, SU Lei2, TONG Bingkui2, LEI Zeyu2,3, ZHOU Jizhe2,4,*

  1. The 10th Research Institute of China Electronics Technology Group Corporation, Chengdu 610036, Sichuan, China
  2. College of Computer Science, Sichuan University, Chengdu 610207, Sichuan, China
  3. Department of Computer and Information Science, University of Macau, Macao 999078, China
  4. Engineering Research Center of Machine Learning and Industry Intelligence, Ministry of Education, Chengdu 610065, Sichuan, China
  • Received: 2025-04-03 Revised: 2025-07-25 Published: 2025-12-15 Online: 2025-09-18
  • Corresponding author: ZHOU Jizhe
  • Funding:
    Sichuan Province Science and Technology Innovation Cooperation Project with Hong Kong, Macao, and Taiwan (2024YFHZ0355)

Abstract:

In Image Manipulation Localization (IML), existing datasets are few in number and poor in quality, which limits the generalization and robustness of models. To address this, a Manipulation Mask Manufacturer (MMM) framework is proposed. It integrates a super-resolution module to mitigate the noise caused by differences in sharpness between the original and manipulated images, and it generates high-quality masks by concatenating feature embeddings and modeling context. Based on the MMM framework, the Manipulation Mask Manufacturer Dataset (MMMD) is constructed, containing 11 069 triplets of original image, manipulated image, and corresponding mask. MMMD covers diverse manipulation types, including copy-move, splicing, Deepfake, image inpainting, and style transfer. Experiments on the CASIAv2, NIST16, and IMD2020 datasets show that MMM performs strongly, with F1 values of up to 0.96 and an Intersection over Union (IoU) of 0.90 on CASIAv2, and that it generalizes well across multiple models. Furthermore, MVSS-Net and IML-ViT pretrained on MMMD achieve markedly higher F1 values on multiple datasets than the same models pretrained on conventional datasets, highlighting the value of MMMD for research in image forensics and manipulation detection.
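
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how pixel-level F1 and IoU can be computed for one pair of binary masks. It is an illustration only: the function name `pixel_f1_and_iou`, the list-of-rows mask representation, and the convention that two empty masks score 1.0 are assumptions made here, and the abstract does not state how the reported values were aggregated across images.

```python
from typing import Sequence

# Assumed representation: rows of 0/1 pixels, where 1 marks a manipulated pixel.
Mask = Sequence[Sequence[int]]


def pixel_f1_and_iou(pred: Mask, truth: Mask) -> tuple[float, float]:
    """Return (F1, IoU) for two binary masks of identical shape."""
    if len(pred) != len(truth) or any(
        len(p) != len(t) for p, t in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # manipulated pixel correctly flagged
            elif p:
                fp += 1      # authentic pixel wrongly flagged
            elif t:
                fn += 1      # manipulated pixel missed

    # Assumed convention: two empty masks count as a perfect match.
    if tp + fp + fn == 0:
        return 1.0, 1.0

    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return f1, iou


truth = [[0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0]]
f1, iou = pixel_f1_and_iou(pred, truth)
print(f"F1={f1:.2f} IoU={iou:.2f}")  # prints "F1=0.75 IoU=0.60"
```

For a single mask pair the two scores are tied by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU. Scores averaged over many images need not satisfy this identity exactly.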

Keywords: Image Manipulation Localization (IML), visual media dataset, dataset generation, arbitrary-scale, super-resolution