Deep representation learning for road detection using Siamese network

  • Huafeng Liu
  • Xiaofeng Han
  • Xiangrui Li
  • Yazhou Yao
  • Pu Huang
  • Zhenmin Tang


Robust road detection is a key challenge in safe autonomous driving. With the rapid development of 3D sensors, a growing number of researchers are fusing information from different sensors to improve road detection performance. Although much successful work has been done in this field, data fusion under a deep learning framework remains an open problem. In this paper, we propose a Siamese deep neural network based on FCN-8s to detect the road region. Our method uses data collected from a monocular color camera and a Velodyne-64 LiDAR sensor. We project the LiDAR point clouds onto the image plane to generate LiDAR images and feed them into one branch of the network; the RGB images are fed into the other branch. The feature maps that the two branches extract at multiple scales are fused before each pooling layer via additional fusion layers. Extensive experimental results on the public KITTI ROAD dataset demonstrate the effectiveness of the proposed approach.
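The projection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the points are already expressed in the camera frame, that a 3 x 4 projection matrix `P` is available from calibration, and that the LiDAR image stores depth per pixel. The function name `project_lidar_to_image` and all parameter names are hypothetical.

```python
import numpy as np


def project_lidar_to_image(points, P, height, width):
    """Project N x 3 points (assumed to be in the camera frame) to a sparse depth image.

    P is an assumed 3 x 4 camera projection matrix; pixels with no LiDAR
    return are left at 0.
    """
    points = np.asarray(points, dtype=np.float64)
    P = np.asarray(P, dtype=np.float64)

    # Homogeneous coordinates (N x 4), then projection to N x 3.
    homo = np.hstack([points, np.ones((points.shape[0], 1))])
    proj = homo @ P.T

    # Discard points behind the camera before dividing by depth.
    proj = proj[proj[:, 2] > 0]
    depth = proj[:, 2]
    u = np.round(proj[:, 0] / depth).astype(int)  # column index
    v = np.round(proj[:, 1] / depth).astype(int)  # row index

    # Keep only points that land inside the image bounds.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, depth = u[inside], v[inside], depth[inside]

    # When several points hit the same pixel, keep the nearest one.
    image = np.full((height, width), np.inf)
    np.minimum.at(image, (v, u), depth)
    image[np.isinf(image)] = 0.0
    return image
```

The resulting single-channel array has the same height and width as the RGB frame, so it could be fed to a second network branch alongside the color image. Other per-pixel encodings (for example height or reflectance) would follow the same pattern.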


Road detection, Siamese network, Data fusion, Deep learning



This research was supported by the Major Special Project of Core Electronic Devices, High-end Generic Chips and Basic Software (Grant No. 2015ZX01041101), the National Defense Pre-research Foundation (Grant No. 41412010101) and the China Postdoctoral Science Foundation (Grant No. 2016M600433).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
  2. Jiangsu Key Laboratory of Big Data Security, Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing, China
