3D Small Object Detection Algorithm Based on Dynamic Feature Enhancement

doi:10.19678/j.issn.1000-3428.0252879

Abstract

In 3D object detection from point clouds, the inherent sparsity of LiDAR data makes small objects particularly hard to detect. A small object returns few effective points, so its structural cues are weak and its boundaries blurry. Limited contextual awareness prevents the model from using the surrounding scene for spatial reasoning and semantic completion, which causes localization bias. In addition, imprecise spatial localization, weak channel expressiveness, and features that are easily swamped by the background jointly limit detection accuracy. To mitigate these problems, we propose a dynamic-aware 3D detection algorithm that combines dynamic feature extraction with feature-enhancement mapping and targets the two critical stages of small-object detection: feature extraction and candidate-box generation. First, we introduce a dynamic point-feature prediction network that adaptively predicts and supplements sampling points to strengthen structural perception of small objects. Second, we build a feature-enhancement mapping network that deeply fuses the original features with those produced by the dynamic prediction network and outputs context-rich 2D feature maps, compensating for missing context and improving small-object localization. Finally, we design a point-cloud feature-enhancement network that sharpens the network's focus on key small-object regions along both the channel and spatial dimensions. Experiments on the nuScenes dataset show that the proposed algorithm outperforms current mainstream detectors. Relative to the CenterPoint baseline, mean Average Precision (mAP) increases from 56.1% to 59.4%, and the nuScenes Detection Score (NDS) rises from 64.4% to 67.4%.
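For readers unfamiliar with the two reported metrics, the sketch below shows how NDS is assembled from mAP and the five mean true-positive errors under the published nuScenes definition, NDS = (1/10) * (5 * mAP + sum(1 - min(1, mTP))), and computes the absolute gains quoted in the abstract. The function names and the error values in `example_errors` are hypothetical and chosen only for illustration; the abstract does not report per-error results.

```python
# Illustrative sketch only: function names and example error values are
# assumptions, not taken from the paper.

def nds(map_score, tp_errors):
    """nuScenes Detection Score from mAP (0..1) and the five mean
    true-positive errors (mATE, mASE, mAOE, mAVE, mAAE)."""
    if len(tp_errors) != 5:
        raise ValueError("expected five TP error terms")
    # Each error is clipped to 1 and converted into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    # mAP carries weight 5; the five TP scores carry weight 1 each.
    return (5.0 * map_score + sum(tp_scores)) / 10.0


def gain_in_points(before_pct, after_pct):
    """Absolute improvement in percentage points, rounded to one decimal."""
    return round(after_pct - before_pct, 1)


# Gains reported in the abstract (CenterPoint baseline vs. proposed method).
map_gain = gain_in_points(56.1, 59.4)
nds_gain = gain_in_points(64.4, 67.4)

# Hypothetical TP errors, NOT from the paper, to show the formula in use.
example_errors = [0.30, 0.25, 0.40, 0.30, 0.20]
example_nds = nds(0.594, example_errors)

print(f"mAP gain: {map_gain} points, NDS gain: {nds_gain} points")
print(f"NDS for the hypothetical errors: {example_nds:.3f}")
```

Because mAP carries half of the total weight, a 3.3-point mAP gain alone would move NDS by about 1.65 points; the reported 3.0-point NDS gain therefore suggests that the true-positive error terms improved as well, although the abstract does not break them down.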


Li Luyang, Yan Jinlong, Fang Zeru, Jin Qiqi, Xue Hongxin. 3D Small Object Detection Algorithm Based on Dynamic Feature Enhancement[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0252879.



URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0252879

