Abstract
This article addresses image semantic segmentation for the machine vision system of an autonomous Unmanned Ground Vehicle (UGV) moving in an off-road environment. Determining the meaning (semantics) of the areas visible in the recorded image provides a complete understanding of the scene surrounding the autonomous vehicle and is crucial for correctly determining a passable route. Semantic segmentation is nowadays generally solved with convolutional neural networks (CNNs), which take an image as input and output the segmented image. However, proper training of such networks requires large amounts of data, which becomes problematic given the low availability of large, dedicated image datasets covering varied off-road situations: driving on different types of roads, surrounded by diverse vegetation, and under various weather and lighting conditions. This study introduces a synthetic image dataset called “OffRoadSynth” to address the scarcity of training data for off-road scenarios. It is shown that pre-training the neural network on this synthetic dataset improves image segmentation accuracy compared with alternatives such as random weight initialization or pre-training on larger, generic datasets. The results suggest that using a smaller but domain-dedicated set of synthetic images to initialize network weights before training on the target real-world dataset may be an effective way to improve semantic segmentation, including for images from off-road environments.
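The two-stage training idea summarized in the abstract (pre-train a segmentation CNN on the synthetic OffRoadSynth images, then use those weights to initialize training on a real off-road dataset) can be illustrated with a minimal sketch. This is not the authors' released code: it assumes PyTorch with torchvision's DeepLabV3 (ResNet-50 backbone), and the dataset classes OffRoadSynthDataset and RealOffRoadDataset, the class count, file paths, and hyperparameters are hypothetical placeholders.

# Hypothetical sketch: pre-train on synthetic data, then fine-tune on real data.
# OffRoadSynthDataset and RealOffRoadDataset are placeholder Dataset classes
# returning (image_tensor, label_mask) pairs with a shared label set.
import torch
from torch import nn, optim
from torch.utils.data import DataLoader
from torchvision.models.segmentation import deeplabv3_resnet50

NUM_CLASSES = 12  # placeholder; must match the label set of both datasets

def train(model, loader, epochs, lr, device="cuda"):
    model.to(device).train()
    criterion = nn.CrossEntropyLoss(ignore_index=255)  # 255 = unlabeled pixels
    optimizer = optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for images, masks in loader:
            images, masks = images.to(device), masks.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images)["out"], masks)
            loss.backward()
            optimizer.step()
    return model

model = deeplabv3_resnet50(weights=None, num_classes=NUM_CLASSES)

# Stage 1: pre-train on the synthetic OffRoadSynth images.
synthetic_loader = DataLoader(OffRoadSynthDataset("offroadsynth/"),
                              batch_size=8, shuffle=True)
train(model, synthetic_loader, epochs=50, lr=1e-2)
torch.save(model.state_dict(), "offroadsynth_pretrained.pth")

# Stage 2: initialize from the synthetic weights and fine-tune on real images.
model.load_state_dict(torch.load("offroadsynth_pretrained.pth"))
real_loader = DataLoader(RealOffRoadDataset("real_offroad/"),
                         batch_size=8, shuffle=True)
train(model, real_loader, epochs=30, lr=1e-3)  # lower LR for fine-tuning

A common variant of this scheme lowers the learning rate (or freezes part of the backbone) in the second stage, as in the sketch, so that features learned from the synthetic images are adapted to the real-world data rather than overwritten.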
Data Availability
The OffRoadSynth (Open Synthetic Off-Road Image Dataset) for semantic segmentation, presented in this work, is publicly available at the following link: https://www.kaggle.com/datasets/konrmal94/synthetic-offroad-image-dataset
Code Availability
The authors will share the code used in this paper upon reasonable request.
Acknowledgements
Not applicable
Funding
The APC was funded by the Warsaw University of Technology. No funding was received to assist with the preparation of this manuscript.
Author information
Contributions
K.M.: methodology, software, formal analysis, investigation, resources, data curation, writing - original draft preparation. J.D.: conceptualization, supervision, project administration, funding acquisition, writing - review and editing. A.K.: conceptualization, methodology, validation, resources, writing - original draft preparation, writing - review and editing. P.H.: conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing - review and editing. K.K.: software, formal analysis, investigation, resources, writing - original draft preparation, writing - review and editing, visualization.
Ethics declarations
Conflicts of interest/Competing interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Ethics approval
Not applicable
Consent to participate
Not applicable
Consent for publication
Not applicable
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Małek, K., Dybała, J., Kordecki, A. et al. OffRoadSynth Open Dataset for Semantic Segmentation using Synthetic-Data-Based Weight Initialization for Autonomous UGV in Off-Road Environments. J Intell Robot Syst 110, 76 (2024). https://doi.org/10.1007/s10846-024-02114-2