
Computer Engineering ›› 2023, Vol. 49 ›› Issue (4): 217-225. doi: 10.19678/j.issn.1000-3428.0066077

• Graphics and Image Processing •

Image-Scene Transformation Based on Generative Adversarial Networks

LUO Siqing, CHEN Hui   

  1. College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
  • Received: 2022-10-24  Revised: 2022-12-19  Published: 2023-02-08

  • About the authors: LUO Siqing (b. 1964), male, associate professor, Ph.D.; his main research interests are image processing and machine learning. CHEN Hui, master's student.
  • Funding:
    National Natural Science Foundation of China (62202092).

Abstract: Owing to constraints of time, location, photographic equipment, and other factors, it is difficult to obtain real-world images that share the same content but depict different scenes. One feasible approach is to use Generative Adversarial Networks (GAN) to transform the scenes in images without paired datasets. However, existing GAN-based image-scene transformation methods mainly address single-category, one-way transformations of scenes with simple structure. To achieve effective transformation of scenes with rich categories and highly complex semantic structure, this study proposes a GAN-based image-scene transformation model that converts images among scenes such as sunny, rainy, and foggy days. By combining a GAN with an attention module and a scene-segmentation module, the model accurately identifies and transforms Regions of Interest (ROI) while keeping the other regions unchanged. To further improve the diversity of the output, a new regularization loss is proposed to suppress latent noise. In addition, a noise-separation module is embedded in the discriminator to avoid mode collapse caused by the lack of noise constraints. Experimental results show that, compared with six baseline models including CycleGAN, UNIT, MUNIT, and NICE-GAN, the proposed model improves the Fréchet Inception Distance (FID) and Kernel Inception Distance (KID) scores by approximately 7.25% and 19% on average, respectively, and generates images with better visual quality across different scenes.
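The abstract only outlines the attention-guided design, so the following is a minimal, hypothetical PyTorch sketch of one common way such a model restricts transformation to the ROI: an attention branch predicts a soft mask that blends the translated image with the original input, so non-ROI pixels are passed through unchanged. All class names, layer widths, and the blending rule here are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class AttentionGatedGenerator(nn.Module):
    """Hypothetical sketch: a translation branch produces a target-scene image,
    an attention branch predicts a soft ROI mask, and the output blends the two
    so regions outside the ROI are copied from the input."""

    def __init__(self, channels=64):
        super().__init__()
        # Translation branch: maps the source-scene image to the target scene.
        self.translate = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1), nn.Tanh(),
        )
        # Attention branch: predicts a per-pixel mask in [0, 1].
        self.attention = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        translated = self.translate(x)   # candidate target-scene image
        mask = self.attention(x)         # soft ROI mask, shape (N, 1, H, W)
        # Transform only the ROI; pass the remaining regions through unchanged.
        return mask * translated + (1.0 - mask) * x, mask


# Usage sketch on a random batch of 256x256 RGB images in [-1, 1].
if __name__ == "__main__":
    gen = AttentionGatedGenerator()
    images = torch.rand(2, 3, 256, 256) * 2 - 1
    fake, roi_mask = gen(images)
    print(fake.shape, roi_mask.shape)   # (2, 3, 256, 256) and (2, 1, 256, 256)
```

The key design choice this sketch illustrates is that identity preservation of background regions is enforced structurally (through the mask-weighted blend) rather than only through a loss term, which is one plausible reading of how the attention and scene-segmentation modules keep non-ROI content unchanged.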

Key words: image processing, image transformation, Generative Adversarial Networks (GAN), scene transformation, attention mechanism

