Abstract:
To address the difficulty of detecting micro-scale tampering and the insufficient fusion of multi-domain features, this paper proposes an image tampering detection framework that integrates Frequency Domain Enhancement (FDE) with a bottleneck aggregated attention module (CFAM). The method uses an RGB+DCT dual-stream design. FDE splits features into low-, mid-, and high-frequency sub-bands in the frequency domain, applies band attention to suppress redundant low frequencies and strengthen tampering-sensitive bands, and uses multi-scale convolution to capture boundaries and mid/high-frequency perturbations. The enhanced features are mapped back through the IDCT so that spatial and frequency information complement each other. In the fusion stage, CFAM models channel importance and spatial saliency in parallel inside a 1×1 bottleneck and aligns the two kinds of attention through linear aggregation in the same domain. Unlike the serial or single-dimension modeling of existing attention mechanisms such as SE and CBAM, this design lowers computational cost and reduces information loss during propagation, significantly improving the response to small targets and weak boundaries. A weighted loss and perturbation-based augmentation are used in training to mitigate class imbalance and improve robustness. Evaluation under a unified protocol and ablation experiments on several public datasets show that the method outperforms recent comparable methods in accuracy, robustness, and cross-domain generalization, and that FDE and CFAM yield synergistic gains. The method still produces high-precision tampering masks under strong perturbations such as recompression, blurring, and scaling, and it is efficient and deployable.
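The band-splitting idea behind FDE can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`dct2`, `idct2`, `band_of`, `enhance_block`), the band cut-offs at one third and two thirds of the maximum index sum, and the fixed per-band gains are all assumptions made for illustration. In the proposed framework the band weights come from a learned attention module and the transform operates on feature maps, whereas here a single square block is reweighted with hand-picked gains.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix: row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(block):
    """2-D DCT of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse 2-D DCT: C^T * Y * C (valid because C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def band_of(u, v, n):
    """Assign coefficient (u, v) to a band by its index sum (assumed cut-offs, n >= 2)."""
    s, s_max = u + v, 2 * (n - 1)
    if 3 * s < s_max:
        return "low"
    if 3 * s < 2 * s_max:
        return "mid"
    return "high"

def enhance_block(block, gains):
    """Reweight DCT sub-bands with fixed gains (a stand-in for learned band attention)."""
    n = len(block)
    coeffs = dct2(block)
    weighted = [[coeffs[u][v] * gains[band_of(u, v, n)] for v in range(n)]
                for u in range(n)]
    return idct2(weighted)  # map back to the spatial domain
```

With all gains set to 1.0 the block is reconstructed unchanged, which is a quick check that the transform pair is consistent. Lowering the `"low"` gain while keeping `"mid"` and `"high"` at or above 1.0 mimics suppressing redundant low frequencies in favour of the bands where tampering traces tend to appear.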