Multi-Scale Convolutional Vehicle Trajectory Prediction Integrating Spatiotemporal Attention Mechanism

doi:10.19678/j.issn.1000-3428.0068767

Abstract:

Vehicle trajectory prediction is a crucial component of autonomous driving, and improving its reliability and accuracy substantially improves driving safety. Because vehicle motion on the road is shaped by the traffic environment, this study accounts for environmental factors such as the motion of neighboring vehicles and their relative spatial positions. A spatiotemporal attention mechanism is added to a Long Short-Term Memory (LSTM) encoder-decoder model: a temporal attention layer attends to the historical trajectories of the target and neighboring vehicles, whereas a spatial attention layer attends to the relative spatial positions of the vehicles. To strengthen feature extraction and achieve more comprehensive feature fusion, multi-scale convolutional social pooling is used to enlarge the receptive field and fuse multi-scale features. Combining these components, the study proposes MCS-STA-LSTM, a vehicle trajectory prediction model built on the LSTM encoder-decoder architecture with multi-scale convolutional social pooling and a spatiotemporal attention mechanism. By learning the interdependencies among vehicle motions, the model produces a multi-modal predictive distribution over the target vehicle's future trajectory based on maneuver categories. The model is trained, validated, and tested on the public NGSIM dataset. Experiments show that, compared with other trajectory prediction models, MCS-STA-LSTM reduces the Root Mean Square Error (RMSE) by 9.35% on average within 3 s and by 5.53% on average within 5 s, improving prediction accuracy and showing a particular advantage in short- and medium-term prediction.

Keywords: multi-scale convolutional social pooling, trajectory prediction, Long Short-Term Memory (LSTM) neural network, spatiotemporal attention mechanism, multi-scale feature fusion
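
The abstract reports its results as average percentage reductions in RMSE over a prediction horizon. As a minimal sketch of how such figures could be computed, the Python below evaluates RMSE on Euclidean position error and averages the per-horizon relative reduction against a baseline. The function names, the averaging scheme (mean of per-horizon percentages), and the sample values are assumptions made for illustration; the abstract does not state how its averages were formed, and none of the numbers below come from the paper.

```python
import math


def rmse(predicted, actual):
    """RMSE of Euclidean position error over paired (x, y) samples."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


def mean_reduction(baseline_by_horizon, model_by_horizon, horizons):
    """Average the per-horizon percentage reductions (assumed scheme)."""
    reductions = [
        relative_reduction(baseline_by_horizon[h], model_by_horizon[h])
        for h in horizons
    ]
    return sum(reductions) / len(reductions)


if __name__ == "__main__":
    # Hypothetical RMSE values in metres, keyed by horizon in seconds.
    baseline = {1: 0.60, 2: 1.30, 3: 2.10}
    model = {1: 0.55, 2: 1.18, 3: 1.95}
    print(f"{mean_reduction(baseline, model, [1, 2, 3]):.2f}%")
```

Averaging percentages per horizon weights each horizon equally; averaging the raw RMSE values first and then taking a single percentage would weight long horizons more heavily, so the two schemes generally give different figures.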

YAN Jianhong, LIU Zhiyan, WANG Zhen. Multi-Scale Convolutional Vehicle Trajectory Prediction Integrating Spatiotemporal Attention Mechanism[J]. Computer Engineering, 2025, 51(8): 406-414.

https://www.ecice06.com/CN/Y2025/V51/I8/406

Figures and tables: 13

Fig.1 Inception structure

Fig.2 cs-LSTM model structure

Fig.3 Overall framework of the MCS-STA-LSTM model

Fig.4 Six lateral and longitudinal maneuver categories

Fig.5 Training loss

Fig.6 Accuracy of longitude coordinate prediction

Fig.7 Accuracy of latitude coordinate prediction

Fig.8 Visualization of the predicted trajectory distribution of the target vehicle, example 1

Fig.9 Visualization of the predicted trajectory distribution of the target vehicle, example 2
