Computer Engineering, 2024, Vol. 50, Issue (12): 224-232. doi: 10.19678/j.issn.1000-3428.0068490

• Graphics and Image Processing •

Edge-Preserving Image Restoration Based on Pixel-by-Pixel Reinforcement Learning

JIANG Min1, CHEN Fei1,*, CHENG Hang2, WANG Meiqing2

  1. School of Computer Science and Big Data, Fuzhou University, Fuzhou 350108, Fujian, China
  2. School of Mathematics and Statistics, Fuzhou University, Fuzhou 350108, Fujian, China
  • Received: 2023-10-07  Published: 2024-12-15  Online: 2024-04-25
  • Corresponding author: CHEN Fei
  • Funding:
    National Natural Science Foundation of China (61771141); Natural Science Foundation of Fujian Province (2021J01620)

Abstract:

High-intensity Gaussian noise tends to blur or destroy image details and structures, causing loss of edge information. To address this, an edge-preserving image restoration algorithm based on pixel-by-pixel reinforcement learning is proposed. First, a pixel-level agent is constructed for each pixel, and a side-window mean filter designed for edge regions is added to the action space. All pixel-level agents share the parameters of the advantage actor-critic algorithm, so the model can output the state transition probabilities for all positions simultaneously and select an appropriate policy for the state transition, thereby restoring the image. Second, coordinate attention is incorporated into the shared feature-extraction network to focus on the global information of all pixel positions across feature channels while retaining positional embedding information. Third, to alleviate the sparse-reward problem, an auxiliary loss based on graph Laplacian regularization is designed; it attends to the local smoothness of the image and penalizes locally non-smooth regions, which helps the pixel-level agents learn the correct policy more effectively and thus preserve edges. Experimental results show that the proposed algorithm achieves Peak Signal-to-Noise Ratios (PSNR) of 32.97 dB and 28.26 dB on the Middlebury2005 and MNIST datasets, respectively, which are 0.23 dB and 0.75 dB higher than those of the Pixel-RL algorithm. The number of parameters and the total training time are reduced by 44.9% and 18.2%, respectively, so the algorithm preserves edges while effectively reducing model complexity.

Key words: image restoration, deep reinforcement learning, pixel-by-pixel reinforcement learning, coordinate attention, graph Laplacian, edge-preserving
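
The abstract states only that the auxiliary loss is based on the graph Laplacian and penalizes locally non-smooth regions; it does not give the graph construction. As a rough illustration of the general idea, and not the authors' implementation, the sketch below evaluates the quadratic form x^T L x on a 4-connected pixel grid. The function name `graph_laplacian_penalty`, the Gaussian edge weights, the separate `guide` image, and the `sigma` value are all assumptions made for this example.

```python
import numpy as np


def graph_laplacian_penalty(img, guide, sigma=0.1):
    """Hypothetical helper: x^T L x on a 4-connected pixel graph.

    For L = D - W, the quadratic form equals the sum over undirected
    edges (i, j) of w_ij * (x_i - x_j)^2.
    Assumption: w_ij = exp(-(g_i - g_j)^2 / (2 * sigma^2)), computed from
    a guide image g, so differences across strong guide edges are
    penalized much less than differences inside flat regions.
    """
    x = np.asarray(img, dtype=np.float64)
    g = np.asarray(guide, dtype=np.float64)

    # Differences between horizontal and vertical neighbours.
    dx_h = x[:, 1:] - x[:, :-1]
    dx_v = x[1:, :] - x[:-1, :]
    dg_h = g[:, 1:] - g[:, :-1]
    dg_v = g[1:, :] - g[:-1, :]

    # Edge-aware weights: close to 1 in flat areas, close to 0 across edges.
    w_h = np.exp(-dg_h ** 2 / (2.0 * sigma ** 2))
    w_v = np.exp(-dg_v ** 2 / (2.0 * sigma ** 2))

    return float(np.sum(w_h * dx_h ** 2) + np.sum(w_v * dx_v ** 2))


if __name__ == "__main__":
    # A clean step edge versus the same image with Gaussian noise added.
    clean = np.zeros((8, 8))
    clean[:, 4:] = 1.0
    rng = np.random.default_rng(0)
    noisy = clean + 0.1 * rng.standard_normal(clean.shape)
    print("clean:", graph_laplacian_penalty(clean, clean))
    print("noisy:", graph_laplacian_penalty(noisy, clean))
```

Under these assumptions the penalty stays near zero for a sharp but clean edge and grows with noise in flat regions, which is the behaviour an edge-preserving smoothness term is meant to have. In a training setting the guide would have to come from something available at that point, such as the current estimate, which is a design choice the abstract does not specify.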