Computer Engineering (计算机工程)

3D Object Detection Based on 4D Millimeter-Wave Radar and Vision Fusion

  • Published: 2024-11-05

Abstract: To address pedestrian and vehicle recognition and localization in autonomous driving scenarios, this paper proposes CDCAM-BEV, an algorithm that fuses 4D millimeter-wave radar with vision to improve detection accuracy. First, a radar pillar network encodes the 4D radar point cloud into a pseudo-image, and an orthogonal feature transformation converts the monocular image into bird's-eye view (BEV) features. Second, a common information extraction module (CICAM) and a differential information extraction module (DICAM), both based on cross-attention, are designed to mine the information that radar and image share and the information in which they differ. Finally, a BEV feature fusion module built on CICAM and DICAM performs feature-level fusion of image and radar information in BEV space. The algorithm is validated on the challenging VOD dataset and compared with five other 3D object detection algorithms. CDCAM-BEV outperforms the other algorithms in multiple evaluation modes: its average detection accuracy exceeds that of the second-ranked method by 3.65% in 3D mode (Part-A²), by 5.04% in BEV mode (PointPillars), and by 2.62% in AOS mode (Part-A²). These results indicate that CDCAM-BEV effectively fuses image and 4D radar point cloud features and significantly improves detection accuracy and reliability.
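The first step of the pipeline, encoding a radar point cloud as a BEV pseudo-image, can be illustrated with a minimal sketch. It is not the paper's radar network: the learned per-pillar encoder is replaced here by hand-crafted per-pillar means, and the function name `pillars_to_pseudo_image`, the point layout `(x, y, z, rcs, v_r)`, and the grid parameters are all assumptions made for illustration.

```python
def pillars_to_pseudo_image(points, x_range, y_range, cell):
    """Scatter radar points into a channels-first BEV pseudo-image.

    points  : iterable of (x, y, z, rcs, v_r) tuples; this field layout is
              an assumption for illustration, not the dataset's format.
    x_range : (x_min, x_max) extent in metres along the grid columns.
    y_range : (y_min, y_max) extent in metres along the grid rows.
    cell    : pillar side length in metres.

    Returns image[c][row][col] with channels
    0: mean z, 1: mean RCS, 2: mean radial velocity, 3: point count.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    n_cols = int(round((x_max - x_min) / cell))
    n_rows = int(round((y_max - y_min) / cell))
    n_channels = 4

    # Start from an all-zero image so that empty pillars stay zero.
    image = [[[0.0] * n_cols for _ in range(n_rows)] for _ in range(n_channels)]

    for x, y, z, rcs, v_r in points:
        # Drop points that fall outside the BEV extent.
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        col = min(int((x - x_min) // cell), n_cols - 1)
        row = min(int((y - y_min) // cell), n_rows - 1)
        # Accumulate per-pillar sums in channels 0-2 and a count in channel 3.
        image[0][row][col] += z
        image[1][row][col] += rcs
        image[2][row][col] += v_r
        image[3][row][col] += 1.0

    # Turn the sums into means wherever a pillar holds at least one point.
    for row in range(n_rows):
        for col in range(n_cols):
            count = image[3][row][col]
            if count > 0:
                for c in range(3):
                    image[c][row][col] /= count
    return image


# Hypothetical toy input: two points share a pillar, one lies outside the grid.
points = [
    (0.2, 0.3, 1.0, 2.0, -1.0),
    (0.4, 0.1, 3.0, 4.0, 1.0),
    (1.6, 0.7, 0.5, 1.0, 0.5),
    (9.0, 9.0, 0.0, 0.0, 0.0),
]
image = pillars_to_pseudo_image(points, (0.0, 2.0), (0.0, 1.0), 0.5)
print(len(image), len(image[0]), len(image[0][0]))  # prints: 4 2 4
```

Because the result is a dense grid in the same BEV plane as the transformed image features, a fusion module can combine the two position by position. In a learned variant, the per-pillar means would be replaced by features from a small network applied to the points in each pillar.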