Abstract
Automatic object detection plays a key role in developing real-time applications for robotics and autonomous vehicles. A recent research trend in computer vision is to detect objects in the 3D point cloud data produced by LiDAR (Light Detection and Ranging) sensors mounted on self-driving cars. This paper proposes modifications to the existing VoxelNet architecture for object detection. The proposed models perform direct 3D convolution on the point cloud data. The detection pipeline has three stages: first, the point cloud is encoded into a suitable format; next, feature maps are extracted from the encoder output; finally, object detection is performed using these learnt feature maps. Experimental results on the benchmark KITTI dataset show that the proposed modifications outperform the existing VoxelNet-based models and other fusion-based methods in terms of both accuracy and time.
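The first pipeline stage, encoding the point cloud into a suitable format, can be illustrated with a minimal voxel-grouping sketch in the spirit of VoxelNet. This is not the authors' implementation: the function name `voxelize`, the parameter names, and the per-voxel cap of 35 points are assumptions chosen for illustration, and the abstract does not specify the actual encoder settings.

```python
import numpy as np

def voxelize(points, voxel_size, range_min, range_max, max_points=35):
    """Group LiDAR points of shape (N, 4) = (x, y, z, reflectance) into voxels.

    Hypothetical helper for illustration; max_points=35 is an assumed cap.
    """
    points = np.asarray(points, dtype=np.float32)
    voxel_size = np.asarray(voxel_size, dtype=np.float32)
    range_min = np.asarray(range_min, dtype=np.float32)
    range_max = np.asarray(range_max, dtype=np.float32)

    # Keep only the points that fall inside the cropped detection range.
    xyz = points[:, :3]
    inside = np.all((xyz >= range_min) & (xyz < range_max), axis=1)
    points = points[inside]

    # Integer grid index of each remaining point.
    coords = np.floor((points[:, :3] - range_min) / voxel_size).astype(np.int64)

    # Occupied voxels, plus the voxel each point belongs to.
    voxel_coords, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)

    # Fixed-size buffer per voxel; points beyond max_points are dropped.
    voxels = np.zeros((len(voxel_coords), max_points, points.shape[1]),
                      dtype=np.float32)
    counts = np.zeros(len(voxel_coords), dtype=np.int64)
    for point, v in zip(points, inverse):
        if counts[v] < max_points:
            voxels[v, counts[v]] = point
            counts[v] += 1
    return voxels, voxel_coords, counts
```

In a pipeline like the one described, the returned per-voxel buffers would be passed to a feature encoder, whose output is arranged on the voxel grid for the 3D convolutional layers and the detection stage.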
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nikhil, G.N., Meraz, M., Javed, M. (2021). Automatic On-Road Object Detection in LiDAR-Point Cloud Data Using Modified VoxelNet Architecture. In: Singh, S.K., Roy, P., Raman, B., Nagabhushan, P. (eds) Computer Vision and Image Processing. CVIP 2020. Communications in Computer and Information Science, vol 1378. Springer, Singapore. https://doi.org/10.1007/978-981-16-1103-2_18
DOI: https://doi.org/10.1007/978-981-16-1103-2_18
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1102-5
Online ISBN: 978-981-16-1103-2
eBook Packages: Computer Science (R0)