Abstract
The key to fully autonomous vehicles lies in how robustly the visual perception system generalizes to any scenario encountered while driving and to any weather condition. Despite the advances in semantic segmentation and the many existing datasets, models still perform poorly when input images are degraded (fog, rain, snow, or nighttime) or small-scale. In this study, we highlight the impact of choosing adequate coarse ground truth annotations on dataset preparation and training, with the aim of increasing the accuracy of the perception system in driverless cars. Our experiments show that the accuracy of semantic image segmentation with coarse ground truth annotations outperforms that with fine ground truth for the principal classes (car, road), which implies faster dataset preparation and tuning before practical deployment.
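The comparison described above hinges on scoring the same segmentation output against coarse and fine ground truth for a few principal classes. As a minimal sketch of how such a per-class comparison can be computed, the Python snippet below evaluates intersection-over-union (IoU) for the road and car classes; the label IDs, the NumPy-based scoring, and the random placeholder arrays are illustrative assumptions, not the authors' evaluation pipeline.

import numpy as np

# Hypothetical class IDs (Cityscapes-style); the paper's actual label mapping may differ.
CLASSES = {"road": 0, "car": 13}

def per_class_iou(pred, gt, class_ids):
    # pred and gt are (H, W) integer label maps of equal shape
    ious = {}
    for name, cid in class_ids.items():
        pred_mask = pred == cid
        gt_mask = gt == cid
        inter = np.logical_and(pred_mask, gt_mask).sum()
        union = np.logical_or(pred_mask, gt_mask).sum()
        ious[name] = float(inter) / union if union > 0 else float("nan")
    return ious

# Score the same prediction against fine and coarse annotations of one image.
# In practice pred, fine_gt and coarse_gt would be loaded from the dataset.
rng = np.random.default_rng(0)
pred = rng.integers(0, 19, size=(256, 512))
fine_gt = rng.integers(0, 19, size=(256, 512))
coarse_gt = fine_gt.copy()  # coarse labels are typically a relaxed version of the fine ones
print("vs. fine GT:  ", per_class_iou(pred, fine_gt, CLASSES))
print("vs. coarse GT:", per_class_iou(pred, coarse_gt, CLASSES))

Per-class IoU (or pixel accuracy) computed this way is what allows models trained under the coarse and fine annotation regimes to be compared on an equal footing.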
Acknowledgment
W. Jebrane acknowledges support from the National Center for Scientific and Technical Research (CNRST), Morocco.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Bouasria, I., Jebrane, W., El Akchioui, N. (2023). The Role of Ground Truth Annotation in Semantic Image Segmentation Performance for Autonomous Driving. In: Lazaar, M., En-Naimi, E.M., Zouhair, A., Al Achhab, M., Mahboub, O. (eds.) Proceedings of the 6th International Conference on Big Data and Internet of Things. BDIoT 2022. Lecture Notes in Networks and Systems, vol. 625. Springer, Cham. https://doi.org/10.1007/978-3-031-28387-1_23
DOI: https://doi.org/10.1007/978-3-031-28387-1_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-28386-4
Online ISBN: 978-3-031-28387-1
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)