
Computer Engineering

   

A Dual-Stage HVI Color Space Transformer Network for Low-Light Image Enhancement

  

  • Published: 2026-05-12


Abstract: Low-light image enhancement (LLIE) is a key computer vision task that aims to recover rich visual information from degraded low-light images. However, existing LLIE methods are sensitive to the choice of color space and often produce color bias, and single-stage frameworks struggle to balance denoising against color fidelity. To address these problems, this paper proposes DHTNet, a Dual-Stage HVI-based Transformer Network for LLIE. DHTNet decouples the intensity (I) map from the color (HV) maps so that the two can be optimized independently and then enhanced jointly, which improves the visual quality of low-light images. In the first stage, a hierarchical Transformer network equipped with an Adaptive Guidance Interaction Module (AGIM) models long-range dependencies between I and HV features, achieving global noise suppression and accurate color calibration. In the second stage, a Multi-Scale Enhanced Synergistic Attention (MESA) module refines local color and detail features through synergistic optimization across the I and HV branches. The dual-stage framework thereby addresses the limitations of existing LLIE approaches, preserving complex image structure while improving visual realism. Experimental results show that DHTNet achieves the highest peak signal-to-noise ratio (PSNR) on both the SICE and SID datasets, surpassing the second-best method by 0.717 dB and 1.897 dB, respectively. On the LOLv1, LOLv2-Real, and LOLv2-Synthetic datasets, DHTNet attains PSNR values of 28.756 dB, 24.683 dB, and 25.950 dB, respectively, outperforming compared methods such as Retinexformer and CIDNet.
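The abstract relies on two ideas that a short example can make concrete: splitting a pixel into an intensity (I) value and a pair of color (HV) values, and scoring results by PSNR. The sketch below is illustrative only. The abstract does not state the exact HVI definition used by DHTNet, so the density term `c_k`, the parameter `k`, and the constant `eps` are assumptions based on one commonly described form of the HVI transform, and the function names `rgb_to_hvi` and `psnr` are hypothetical. It is not the authors' implementation.

```python
import colorsys
import math


def rgb_to_hvi(r, g, b, k=1.0, eps=1e-8):
    """Map one RGB pixel (floats in [0, 1]) to (h_map, v_map, intensity).

    Assumed formulation: intensity is max(R, G, B); hue is placed on a
    circle (cos/sin) and scaled by saturation and by a density term c_k
    that shrinks toward zero for dark pixels.
    """
    hue, sat, val = colorsys.rgb_to_hsv(r, g, b)  # val equals max(r, g, b)
    intensity = val
    # Assumed density term: collapses the color plane in dark regions,
    # where hue estimates are dominated by noise.
    c_k = (math.sin(math.pi * intensity / 2.0) + eps) ** (1.0 / k)
    h_map = c_k * sat * math.cos(2.0 * math.pi * hue)
    v_map = c_k * sat * math.sin(2.0 * math.pi * hue)
    return h_map, v_map, intensity


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for two equal-length value lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    bright = rgb_to_hvi(1.0, 0.0, 0.0)
    dark = rgb_to_hvi(0.1, 0.0, 0.0)
    print("bright red HV radius:", math.hypot(bright[0], bright[1]))
    print("dark red HV radius:  ", math.hypot(dark[0], dark[1]))
    print("PSNR:", psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1]))
```

Under these assumptions, a dark red pixel lands much closer to the origin of the HV plane than a bright red pixel of the same hue. This is one way to see why the intensity and color maps can be processed in separate branches. The PSNR helper uses the standard definition, 10·log10(peak² / MSE), which is the metric behind the dB figures reported above.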
