
Computer Engineering ›› 2025, Vol. 51 ›› Issue (8): 406-414. doi: 10.19678/j.issn.1000-3428.0068767

• Development Research and Engineering Application •

Multi-Scale Convolutional Vehicle Trajectory Prediction Integrating Spatiotemporal Attention Mechanism

YAN Jianhong*, LIU Zhiyan, WANG Zhen

  1. School of Computer Science and Technology, Taiyuan Normal University, Jinzhong 030619, Shanxi, China
  • Received: 2023-11-03 Revised: 2024-02-19 Online: 2025-08-15 Published: 2025-08-15
  • Contact: YAN Jianhong
  • Funding:
    Shanxi Province Key Research and Development Program (202102010101008); Shanxi Provincial Key Laboratory of Intelligent Optimization Computing and Blockchain Technology Project

Abstract:

Vehicle trajectory prediction is a crucial component of autonomous driving, and improving its reliability and accuracy greatly benefits driving safety. Because vehicle movement on the road is shaped by the surrounding traffic, this study accounts for traffic environment factors such as the motion of neighboring vehicles and relative spatial positions. A spatiotemporal attention mechanism is introduced into a Long Short-Term Memory (LSTM) encoder-decoder model: a temporal attention layer attends to the historical trajectories of the target and neighboring vehicles, whereas a spatial attention layer attends to the relative spatial positions of the vehicles. To strengthen feature extraction and achieve more comprehensive feature fusion, multi-scale convolutional social pooling is used to enlarge the receptive field and fuse multi-scale features. Combining these components, the study proposes MCS-STA-LSTM, a vehicle trajectory prediction model that integrates multi-scale convolutional social pooling and a spatiotemporal attention mechanism into the LSTM encoder-decoder architecture. By learning the interdependencies of vehicle movements, the model produces a multi-modal predictive distribution over the target vehicle's future trajectory based on maneuver classes. The model is trained, validated, and tested on the public NGSIM dataset. Experimental results show that, compared with other trajectory prediction models, the method reduces the Root Mean Square Error (RMSE) by 9.35% on average within 3 s and by 5.53% on average within 5 s, which improves trajectory prediction accuracy and gives the model an advantage in short- and medium-term prediction.

Keywords: multi-scale convolutional social pooling, trajectory prediction, Long Short-Term Memory (LSTM) neural network, spatiotemporal attention mechanism, multi-scale feature fusion
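
The temporal attention layer mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the scoring function, the layer sizes, or how the attention output feeds the decoder, so the scaled dot-product score, the `query` vector, and the toy `history` values below are assumptions made for illustration only, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(hidden_states, query):
    """Weight the encoded time steps of one vehicle's history.

    hidden_states: list of T vectors (one per past time step), each of length d,
                   standing in for LSTM encoder outputs.
    query:         length-d vector; hypothetical, it would be learned in a real model.
    Returns (weights, context), where context is the weighted sum of hidden states.
    """
    d = len(query)
    # Assumed scoring function: scaled dot product between each state and the query.
    scores = [
        sum(h_k * q_k for h_k, q_k in zip(h, query)) / math.sqrt(d)
        for h in hidden_states
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum over time steps.
    context = [
        sum(w * h[k] for w, h in zip(weights, hidden_states))
        for k in range(d)
    ]
    return weights, context


# Toy encoder outputs for three past time steps (illustrative values only).
history = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8]]
weights, context = temporal_attention(history, query=[1.0, 1.0])
```

In this toy case the last time step receives the largest weight because its hidden state aligns most closely with the query. A spatial attention layer could reuse the same pattern with one vector per neighboring vehicle instead of one per time step, again under the same assumptions.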