Rethinking Voxelization and Classification for 3D Object Detection

Murhij, Youshaa; Golodkov, Alexander; Yudin, Dmitry

doi:10.1007/978-981-99-1645-0_39

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1793))

Included in the following conference series:

International Conference on Neural Information Processing

838 Accesses

Abstract

The main challenge in 3D object detection from LiDAR point clouds is achieving real-time performance without affecting the reliability of the network. In other words, the detecting network must be confident enough about its predictions. In this paper, we present a solution to improve network inference speed and precision at the same time by implementing a fast dynamic voxelizer that works on fast pillar-based models in the same way a voxelizer works on slow voxel-based models. In addition, we propose a lightweight detection sub-head model for classifying predicted objects and filter out false detected objects that significantly improves model precision in a negligible time and computing cost. The developed code is publicly available at: https://github.com/YoushaaMurhij/RVCDet.

This work was supported by the Russian Science Foundation (Project No. 21-71-00131).

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 99.00; Price excludes VAT (USA)

Softcover Book: USD 129.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Chen, X., Ma, H., Wan, J., Li, B., Xia, T.: Multi-view 3D object detection network for autonomous driving. In: Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 1907–1915 (2017)
Google Scholar
Chen, Y., Li, Y., Zhang, X., Sun, J., Jia, J.: Focal sparse convolutional networks for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5428–5437 (2022)
Google Scholar
Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., Tian, Q.: CenterNet: keypoint triplets for object detection (2019)
Google Scholar
Engelcke, M., Rao, D., Wang, D.Z., Tong, C.H., Posner, I.: Vote3Deep: fast object detection in 3D point clouds using efficient convolutional neural networks. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 1355–1361. IEEE (2017)
Google Scholar
Ge, W., Yang, S., Yu, Y.: Multi-evidence filtering and fusion for multi-label classification, object detection and semantic segmentation based on weakly supervised learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The kitti vision benchmark suite. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
Google Scholar
Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., Beijbom, O.: PointPillars: fast encoders for object detection from point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697–12705 (2019)
Google Scholar
Li, B.: 3D fully convolutional network for vehicle detection in point cloud. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1513–1518. IEEE (2017)
Google Scholar
Li, B., Zhang, T., Xia, T.: Vehicle detection from 3D lidar using fully convolutional network. arXiv preprint arXiv:1608.07916 (2016)
Murhij, Y., Yudin, D.: FMFNet: improve the 3D object detection and tracking via feature map flow. In: Proceedings of the IEEE International Joint Conference on Neural Network (IJCNN) (2022)
Google Scholar
Premebida, C., Carreira, J., Batista, J., Nunes, U.: Pedestrian detection combining rgb and dense lidar data. In: 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 4112–4117. IEEE (2014)
Google Scholar
Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: IEEE (2017)
Google Scholar
Riegler, G., Ulusoy, A.O., Geiger, A.: OctNet: learning deep 3D representations at high resolutions. In: IEEE (2017)
Google Scholar
Schwarz, K., Sauer, A., Niemeyer, M., Liao, Y., Geiger, A.: VoxGRAF: fast 3D-aware image synthesis with sparse voxel grids. arXiv preprint arXiv:2206.07695 (2022)
Shi, S., Guo, C., Jiang, L., Wang, Z., Shi, J., Wang, X., Li, H.: PV-RCNN: point-voxel feature set abstraction for 3D object detection. In: CVPR (2020)
Google Scholar
Song, S., Xiao, J.: Deep sliding shapes for amodal 3D object detection in RGB-D images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 808–816 (2016)
Google Scholar
Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. In: IEEE (2015)
Google Scholar
Sun, P., et al.: Scalability in perception for autonomous driving: Waymo open dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2446–2454 (2020)
Google Scholar
Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Xiao, X.T.J.: 3D shapeNets: a deep representation for volumetric shapes. In: IEEE (2015)
Google Scholar
Xu, M., Ding, R., Zhao, H., Qi, X.: PAConv: position adaptive convolution with dynamic kernel assembling on point clouds. In: IEEE/CVF (2021)
Google Scholar
Yin, T., Zhou, X., Krahenbuhl, P.: Center-based 3d object detection and tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11784–11793 (2021)
Google Scholar
Zheng, W., Tang, W., Chen, S., Jiang, L., Fu, C.W.: CIA-SSD: Confident IoU-aware single-stage object detector from point cloud. In: AAAI (2021)
Google Scholar
Zhou, Y., et al.: End-to-end multi-view fusion for 3D object detection in lidar point clouds (2019)
Google Scholar
Zhou, Y., Tuzel, O.: VoxelNet: end-to-end learning for point cloud based 3D object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4490–4499 (2018)
Google Scholar
Zhu, B., Jiang, Z., Zhou, X., Li, Z., Yu, G.: Class-balanced grouping and sampling for point cloud 3D object detection (2019)
Google Scholar

Download references

Author information

Authors and Affiliations

Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
Youshaa Murhij, Alexander Golodkov & Dmitry Yudin
Artificial Intelligence Research Institute, Moscow, Russia
Dmitry Yudin

Authors

Youshaa Murhij
View author publications
You can also search for this author in PubMed Google Scholar
Alexander Golodkov
View author publications
You can also search for this author in PubMed Google Scholar
Dmitry Yudin
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Youshaa Murhij .

Editor information

Editors and Affiliations

Indian Institute of Technology Indore, Indore, India
Mohammad Tanveer
Indian Institute of Information Technology - Allahabad, Prayagraj, India
Sonali Agarwal
Kobe University, Kobe, Japan
Seiichi Ozawa
Indian Institute of Technology Patna, Patna, India
Asif Ekbal
University of Innsbruck, Innsbruck, Austria
Adam Jatowt

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Murhij, Y., Golodkov, A., Yudin, D. (2023). Rethinking Voxelization and Classification for 3D Object Detection. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1793. Springer, Singapore. https://doi.org/10.1007/978-981-99-1645-0_39

Download citation

DOI: https://doi.org/10.1007/978-981-99-1645-0_39
Published: 14 April 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1644-3
Online ISBN: 978-981-99-1645-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Rethinking Voxelization and Classification for 3D Object Detection