Control Strategy for Intersections with Distorted Traffic Signals Based on Hidden State Prediction

doi:10.19678/j.issn.1000-3428.0069416

Abstract:

Traffic signal control plays an important role in alleviating traffic congestion and improving urban commuting efficiency. In recent years, traffic signal control algorithms based on deep reinforcement learning, which take real-time traffic data as input, have achieved breakthroughs. However, traffic data in real-world scenarios are often distorted. Traditional approaches first repair the distorted data and then apply a reinforcement learning algorithm to control the signals. This has two drawbacks: the dynamics of the signal phases introduce additional uncertainty into distortion repair, and distortion repair is difficult to integrate with a deep reinforcement learning framework to improve performance. To address these issues, HCRL, a control model for signalized intersections with distorted traffic data based on hidden state prediction, is proposed. The HCRL model comprises an encoding sub-model, a control sub-model, and an encoding prediction sub-model. By introducing a hidden state representation of the signalized intersection, HCRL fits the deep reinforcement learning framework better and expresses the control state of the intersection effectively. In addition, HCRL uses a special transfer training method to keep data distortion from interfering with the control sub-model. Two real-world datasets are used to verify the impact of data distortion on intelligent signal control algorithms. The experimental results show that HCRL outperforms signal control models based on distortion repair in all distortion scenarios and at all distortion rates, and that it is more robust to data distortion than the other baseline models.

Key words: traffic signal control, intelligent transportation, deep reinforcement learning, hidden state, data distortion
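The abstract refers to distortion scenarios and distortion rates without defining them. As a purely illustrative sketch, the snippet below assumes one possible scenario, random missing data, in which a fixed fraction of cells in a binary vehicle-existence (VE) grid is lost. The function name `apply_random_missing`, the sentinel `MISSING = -1`, and the grid layout are hypothetical and are not taken from the paper.

```python
import random

MISSING = -1  # hypothetical sentinel marking a lost observation


def apply_random_missing(grid, rate, seed=None):
    """Return a copy of `grid` with round(rate * n_cells) cells set to MISSING.

    `grid` is a list of rows (one row per lane, say), each a list of 0/1
    vehicle-existence flags. `rate` is the assumed distortion rate in [0, 1].
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    rng = random.Random(seed)
    # Enumerate every cell so that exactly k distinct cells can be sampled.
    cells = [(r, c) for r, row in enumerate(grid) for c in range(len(row))]
    k = round(rate * len(cells))
    distorted = [list(row) for row in grid]  # copy; the input stays untouched
    for r, c in rng.sample(cells, k):
        distorted[r][c] = MISSING
    return distorted


if __name__ == "__main__":
    ve = [[1, 0, 0, 1],
          [0, 0, 1, 0]]
    print(apply_random_missing(ve, 0.25, seed=0))
```

Sampling exactly `k` distinct cells, rather than dropping each cell independently, makes the realized distortion rate match the requested one on small grids; either choice would be a reasonable assumption here.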

QIN Minhao, SUN Weiwei. Control Strategy for Intersections with Distorted Traffic Signals Based on Hidden State Prediction[J]. Computer Engineering, 2025, 51(9): 1-13.

https://www.ecice06.com/CN/Y2025/V51/I9/1

Figures/Tables: 13

Fig.1 Coupling of traffic data repair and signal control

Fig.2 Relevant concepts of signal intersections

Fig.3 DTSE data and VE data

Fig.4 Details of the HCRL model structure

Fig.5 Overview of the HCRL model structure

Fig.6 Impact of signal phase change on VE data repair

Fig.7 Data flow direction of the HCRL model during training and deployment

Fig.8 Variation of model effect with distortion rates in different distortion scenarios

Fig.9 Variation of training indicators for SCRL and HCRL models during the training process
