Domain Adaptive Multi-Discriminator Object Detection Based on Progressive Training

doi:10.19678/j.issn.1000-3428.0065821

Abstract

Domain adaptive object detection based on adversarial training aims to apply a detection model to a different dataset without additional annotation of the new data. However, existing algorithms struggle to balance the object detection and domain alignment tasks, and the commonly used single-discriminator structure tends to capture only a single mode of the data, which degrades the quality of domain alignment. This paper proposes a multi-discriminator domain adaptive object detection algorithm based on progressive training. To address the limitations of the traditional single-discriminator structure when aligning data with complex structure, a multi-discriminator structure is introduced into the instance-level domain adaptive head, so that the multimodal structure of the data is taken into account while domain-invariant information is learned, yielding higher-quality and more comprehensive domain alignment. To limit the model complexity added by multiple discriminators, the multi-discriminator structure is built on Dropout, so that the parameters of a single discriminator are reused. In addition, a progressive training strategy is introduced: the proportion and difficulty of the domain alignment task are gradually increased as training proceeds, which dynamically balances the weights of the object detection and domain alignment tasks. Experimental results show that the proposed algorithm achieves a mean average precision of 42.9% in the Cityscapes to Foggy Cityscapes domain adaptation scenario, at least 0.5 percentage points higher than recent algorithms in the field.

Key words: object detection, domain adaptation, adversarial training, multi-discriminator, progressive training strategy
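The progressive training idea in the abstract, in which the weight of the domain alignment task grows as training proceeds, can be illustrated with a short sketch. The abstract does not give the paper's actual schedule, so the code below assumes the sigmoid-shaped ramp commonly used in adversarial domain adaptation (as in unsupervised domain adaptation by backpropagation). The function names `progressive_lambda` and `total_loss`, and the parameters `gamma` and `lambda_max`, are hypothetical and chosen only for illustration.

```python
import math


def progressive_lambda(step, total_steps, gamma=10.0, lambda_max=1.0):
    """Return a domain-alignment weight that ramps up with training progress.

    ASSUMPTION: this sigmoid-shaped ramp is a common choice in adversarial
    domain adaptation; it is not taken from the paper itself.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Training progress p, clamped to [0, 1].
    p = min(max(step / total_steps, 0.0), 1.0)
    # Starts at 0 when p = 0 and approaches lambda_max as p approaches 1.
    return lambda_max * (2.0 / (1.0 + math.exp(-gamma * p)) - 1.0)


def total_loss(det_loss, align_loss, step, total_steps):
    """Combine detection and alignment losses with the progressive weight."""
    return det_loss + progressive_lambda(step, total_steps) * align_loss


if __name__ == "__main__":
    # Print the weight at a few points of a hypothetical 10000-step run.
    for step in (0, 2500, 5000, 10000):
        print(step, round(progressive_lambda(step, 10000), 3))
```

Early in training the alignment term contributes almost nothing, so the detector is fitted mainly on labeled source data; the adversarial alignment signal is phased in later. Any monotonically increasing schedule would serve the same purpose, and the steepness `gamma` is a tunable assumption.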

Huisen LI, Jin HOU, Hui DANG, Yuhang ZHOU. Domain Adaptive Multi-Discriminator Object Detection Based on Progressive Training[J]. Computer Engineering, 2023, 49(10): 202-211, 221.

http://www.ecice06.com/CN/Y2023/V49/I10/202

Figures and tables: 14

Fig.1 Overall framework of the algorithm in this paper

Fig.2 Image-level domain adaptive head

Fig.3 Instance-level domain classifier

Fig.4 Instance-level domain classifier based on Dropout

Fig.5 Multi-discriminator domain adaptation structure based on Dropout

Fig.6 Variation curve of the $\lambda$ parameter with the number of iterations

Fig.7 P-R curves of different algorithms

Fig.8 Comparison of detection results of different algorithms

Fig.9 Influence of the number of discriminators on model performance

Fig.10 Influence of the $\lambda$ parameter on model performance


