
Computer Engineering ›› 2024, Vol. 50 ›› Issue (7): 271-281. doi: 10.19678/j.issn.1000-3428.0068104

• Graphics and Image Processing •

Nighttime Semantic Segmentation with Attention and Low-Light Enhancement

Ci XIAO, Yang XU*(), Yongdan ZHANG, Mingwen FENG, Yiqian HUANG   

  1. College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, Guizhou, China
  • Received: 2023-07-19 Online: 2024-07-15 Published: 2023-12-05
  • Contact: Yang XU
  • Funding: Guizhou Provincial Science and Technology Program (黔科合支撑一般326)


Abstract:

With the development of deep learning technology and improvements in computing power, semantic segmentation of natural scene images captured during the day achieves good results. However, in nighttime image semantic segmentation tasks, models trained on daytime data often fail to deliver satisfactory performance owing to challenges such as imbalanced exposure and a lack of labeled data. To address these challenges, a new unsupervised nighttime image semantic segmentation network called AI-USeg is proposed. First, a lightweight Self-Calibrating Illumination (SCI) network is used to enhance nighttime images, thereby mitigating the impact of lighting variations on the subsequent semantic segmentation network. Next, a Domain Adaptation (DA) method is introduced to adapt the model from Cityscapes, which contains a large amount of labeled data, to Dark Zurich-D, addressing the lack of labeled data. Subsequently, AI-USeg introduces a Squeeze-and-Excitation Network (SENet) into a discriminator built upon a Fully Convolutional Network (FCN); through adversarial learning in the output space, this adapts the model to image features in low-light nighttime settings and improves the performance of nighttime image semantic segmentation. The experiments used 2 416 day-night image pairs from Cityscapes and Dark Zurich-train for unsupervised training. The results show that AI-USeg achieves Mean Intersection over Union (mIoU) values of 47.9% and 51.5% on the Dark Zurich-test and Nighttime Driving-test datasets, respectively, which are 5.4 and 2.1 percentage points higher than those of the MGCDA method. AI-USeg displays stronger adaptability to nighttime image features and higher robustness than previous segmentation models, providing an effective solution for image segmentation tasks in nighttime scenes.
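The squeeze-and-excitation step that the abstract describes adding to the FCN discriminator can be illustrated in isolation. The following is a minimal NumPy sketch, not the authors' implementation: the channel count, reduction ratio, and random weights are hypothetical stand-ins for learned parameters, and the real network would apply this inside a convolutional discriminator.

```python
import numpy as np

rng = np.random.default_rng(0)

def se_attention(x, w1, w2):
    """Squeeze-and-Excitation channel reweighting for a feature map x of shape (C, H, W).

    Squeeze: global average pooling collapses each channel to one scalar.
    Excitation: a reduction FC layer (ReLU) then an expansion FC layer (sigmoid)
    produce one weight in (0, 1) per channel, which rescales that channel.
    """
    z = x.mean(axis=(1, 2))               # squeeze: (C,)
    h = np.maximum(w1 @ z, 0.0)           # reduction FC + ReLU: (C//r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))   # expansion FC + sigmoid: (C,)
    return x * s[:, None, None]           # scale each channel by its weight

# Hypothetical sizes: 8 channels, reduction ratio r = 2
C, r = 8, 2
x = rng.standard_normal((C, 5, 5))
w1 = rng.standard_normal((C // r, C)) * 0.1   # stand-in for learned weights
w2 = rng.standard_normal((C, C // r)) * 0.1
y = se_attention(x, w1, w2)
print(y.shape)  # → (8, 5, 5)
```

Because the sigmoid keeps every channel weight in (0, 1), the block can only attenuate channels, letting the discriminator emphasize the channels most informative under low light.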

Key words: deep learning, semantic segmentation, autonomous driving, low-light image enhancement, attention mechanism
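The mIoU metric behind the 47.9% and 51.5% figures quoted above can be sketched as follows. This is a minimal NumPy illustration of the metric's definition, not the benchmark's evaluation code; the toy label maps and the two-class setting are invented for the example (Cityscapes-style evaluation uses 19 classes).

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean Intersection over Union: average per-class IoU over non-empty classes."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x3 label maps with 2 classes
gt   = np.array([[0, 0, 1], [1, 1, 1]])
pred = np.array([[0, 1, 1], [1, 1, 1]])
# class 0: inter=1, union=2 -> 0.50; class 1: inter=4, union=5 -> 0.80
print(mean_iou(pred, gt, 2))  # → 0.65
```

A perfect prediction yields an mIoU of 1.0, and averaging over classes (rather than pixels) keeps rare classes from being swamped by large ones such as road or sky.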