Evaluation of Detection and Segmentation Tasks on Driving Datasets

Singh, Deepak; Rahane, Ameet; Mondal, Ajoy; Subramanian, Anbumani; Jawahar, C. V.

doi:10.1007/978-3-031-11346-8_44

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1567)

Included in the following conference series:

International Conference on Computer Vision and Image Processing

Abstract

Object detection, semantic segmentation, and instance segmentation form the basis for many computer vision tasks in autonomous driving. The complexity of these tasks increases as we shift from object detection to instance segmentation. State-of-the-art models are evaluated on standard datasets such as PASCAL VOC and MS-COCO, which do not consider the dynamics of road scenes. Driving datasets such as Cityscapes and Berkeley Deep Drive (BDD) are captured in structured environments with better road markings and fewer variations in the appearance of objects and background. However, the same does not hold for Indian roads. The Indian Driving Dataset (IDD) is captured in unstructured driving scenarios and is highly challenging for a model due to its diversity. This work presents a comprehensive evaluation of state-of-the-art models for object detection, semantic segmentation, and instance segmentation on road scene datasets. We present our analyses and compare the models' quantitative and qualitative performance on structured driving datasets (Cityscapes and BDD) and the unstructured driving dataset (IDD). Understanding model behavior on these datasets helps in addressing various practical issues and in creating real-life applications.

Acknowledgements

This work was partly funded by IHub-Data at IIIT-Hyderabad.

Author information

Authors and Affiliations

International Institute of Information Technology, Hyderabad, India
Deepak Singh, Ajoy Mondal & C. V. Jawahar
University of California, Berkeley, Berkeley, USA
Ameet Rahane
Intel, Bangalore, India
Anbumani Subramanian

Corresponding author

Correspondence to Deepak Singh.

Editor information

Editors and Affiliations

Indian Institute of Technology Roorkee, Roorkee, India
Balasubramanian Raman
Indian Institute of Technology Ropar, Ropar, India
Subrahmanyam Murala
Jadavpur University, Kolkata, India
Ananda Chowdhury
Indian Institute of Technology Ropar, Ropar, India
Abhinav Dhall
Indian Institute of Technology Ropar, Ropar, India
Puneet Goyal

Cite this paper

Singh, D., Rahane, A., Mondal, A., Subramanian, A., Jawahar, C.V. (2022). Evaluation of Detection and Segmentation Tasks on Driving Datasets. In: Raman, B., Murala, S., Chowdhury, A., Dhall, A., Goyal, P. (eds) Computer Vision and Image Processing. CVIP 2021. Communications in Computer and Information Science, vol 1567. Springer, Cham. https://doi.org/10.1007/978-3-031-11346-8_44

DOI: https://doi.org/10.1007/978-3-031-11346-8_44
Published: 24 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11345-1
Online ISBN: 978-3-031-11346-8
eBook Packages: Computer Science, Computer Science (R0)
