Deep Feature Learning for Acoustics-Based Terrain Classification

  • Abhinav Valada (email author)
  • Luciano Spinello
  • Wolfram Burgard
Part of the Springer Proceedings in Advanced Robotics book series (SPAR, volume 3)


For robots to navigate efficiently and safely in real-world environments, they need to be able to classify and characterize terrain. Most terrain classification techniques rely predominantly on visual features. However, as vision-based approaches are severely affected by appearance variations and occlusions, relying solely on them prevents a robot from functioning robustly in all conditions. In this paper, we propose an approach that uses sound from vehicle-terrain interactions for terrain classification. We present a new convolutional neural network architecture that learns deep features from spectrograms of extensive audio signals gathered from interactions with various indoor and outdoor terrains. Through exhaustive experiments, we demonstrate that our network significantly outperforms classification approaches that use traditional audio features, achieving state-of-the-art performance. Additional experiments show that the network is robust to inputs corrupted with varying amounts of white Gaussian noise and that fine-tuning with noise-augmented samples significantly boosts the classification rate. Furthermore, we demonstrate that our network performs exceptionally well even on samples recorded with a low-quality mobile phone microphone that adds a substantial amount of environmental noise.
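The two signal-level ideas mentioned in the abstract, spectrogram inputs and corruption with white Gaussian noise, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 16 kHz sampling rate, the 512-sample Hann window, the 256-sample hop, and the 10 dB signal-to-noise ratio are all assumptions chosen only for illustration, and a synthetic 440 Hz tone stands in for a real vehicle-terrain recording.

```python
import numpy as np


def add_white_gaussian_noise(signal, snr_db, rng=None):
    """Return the signal corrupted with white Gaussian noise at a target SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def log_spectrogram(signal, frame_len=512, hop=256, eps=1e-10):
    """Log-magnitude spectrogram from a Hann-windowed short-time Fourier transform.

    frame_len and hop are illustrative defaults, not values from the paper.
    Returns an array of shape (frame_len // 2 + 1, n_frames).
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return (20.0 * np.log10(magnitude + eps)).T


# Synthetic stand-in for a recording: one second of a 440 Hz tone at 16 kHz (assumed rate).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2.0 * np.pi * 440.0 * t)

# Noise-augmented copy at an assumed 10 dB SNR, with a fixed seed for repeatability.
noisy = add_white_gaussian_noise(clean, snr_db=10.0, rng=np.random.default_rng(0))

clean_spec = log_spectrogram(clean)
noisy_spec = log_spectrogram(noisy)
```

Each spectrogram is a two-dimensional array (frequency bins by time frames) that could be fed to a convolutional network like an image. Generating noisy copies at several SNR levels is one plausible way to build the kind of noise-augmented samples the abstract refers to.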



This work has been partly supported by the European Commission under grant numbers ERC-AGPE7-267686-LifeNav and FP7-610603-EUROPA2, by the Ministry of Science, Research and Arts of Baden-Württemberg (Az: 32-7545.24-9/1/1), and by the German Ministry for Research and Technology under grant ZAFH-AAL.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Abhinav Valada (1), email author
  • Luciano Spinello (1)
  • Wolfram Burgard (1)
  1. Department of Computer Science, University of Freiburg, Freiburg, Germany
