Abstract
Amodal perception is the ability to hallucinate the full shapes of (partially) occluded objects. While this comes naturally to humans, learning-based perception methods often focus only on the visible parts of a scene. This limitation is critical for safe automated driving, since the detection capabilities of perception methods degrade under (partial) occlusions. Moreover, occlusions can give rise to corner cases of which the perception method remains oblivious. In this work, we investigate the joint prediction of amodal and visible semantic segmentation masks. More precisely, we examine whether both perception tasks benefit from a joint training approach. We report our findings on both the Cityscapes and the Amodal Cityscapes datasets. The proposed joint training outperforms separately trained networks in terms of mean intersection over union in the amodal areas of the masks by \(6.84\%\) absolute, while even slightly improving visible segmentation performance.
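To make the joint-prediction idea concrete, below is a minimal PyTorch sketch of one plausible realization: a shared encoder feeding two segmentation heads, trained under a weighted sum of per-task cross-entropy losses. All names (`JointAmodalVisibleSeg`, `joint_loss`, `alpha`) and the specific architecture are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class JointAmodalVisibleSeg(nn.Module):
    """Hypothetical sketch: joint amodal + visible semantic segmentation.

    A shared encoder feeds two decoder heads, one predicting the visible
    segmentation mask and one predicting the amodal mask. Decoding /
    upsampling to full image resolution is elided for brevity.
    """

    def __init__(self, encoder: nn.Module, feat_channels: int, num_classes: int):
        super().__init__()
        self.encoder = encoder  # e.g., an ERFNet- or Fast-SCNN-style backbone
        self.visible_head = nn.Conv2d(feat_channels, num_classes, kernel_size=1)
        self.amodal_head = nn.Conv2d(feat_channels, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor):
        feats = self.encoder(x)
        # Two sets of per-pixel class logits from the shared representation.
        return self.visible_head(feats), self.amodal_head(feats)


def joint_loss(visible_logits, amodal_logits, visible_gt, amodal_gt, alpha=0.5):
    """Joint objective: weighted sum of the two cross-entropy losses.

    `alpha` balances the visible and amodal tasks (assumed value);
    ignore_index=255 follows the common Cityscapes convention for
    unlabeled pixels.
    """
    ce = nn.CrossEntropyLoss(ignore_index=255)
    return alpha * ce(visible_logits, visible_gt) + (1.0 - alpha) * ce(amodal_logits, amodal_gt)
```

The intuition behind sharing the encoder is that features learned for the visible task can regularize amodal completion (and vice versa), which is one way joint training could yield the reported gains over separately trained networks.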
Acknowledgements
We mourn the loss of our co-author, colleague and friend Jonas Löhdefink. Without his valuable input this work would not have been possible. The research leading to these results is funded by the German Federal Ministry for Economic Affairs and Energy within the project “KI Data Tooling - Methods and tools for the generation and refinement of training, validation and safeguarding data for AI functions in autonomous vehicles.” The authors would like to thank the consortium for the successful cooperation.
Cite this paper
Breitenstein, J., Löhdefink, J., Fingscheidt, T. (2023). Joint Prediction of Amodal and Visible Semantic Segmentation for Automated Driving. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13801. Springer, Cham. https://doi.org/10.1007/978-3-031-25056-9_40