Abstract
For robots to navigate efficiently in real-world environments, they must be able to classify and characterize terrain for safe navigation. Most terrain classification techniques rely on visual features. However, because vision-based approaches are severely affected by appearance variations and occlusions, relying solely on them prevents a robot from functioning robustly in all conditions. In this paper, we propose an approach that uses sound from vehicle-terrain interactions for terrain classification. We present a new convolutional neural network architecture that learns deep features from spectrograms of extensive audio signals gathered from interactions with various indoor and outdoor terrains. Through exhaustive experiments, we demonstrate that our network significantly outperforms classification approaches that use traditional audio features, achieving state-of-the-art performance. Additional experiments show that the network is robust to inputs corrupted with varying amounts of white Gaussian noise and that fine-tuning with noise-augmented samples significantly boosts the classification rate. Furthermore, we demonstrate that our network performs exceptionally well even on samples recorded with a low-quality mobile phone microphone that adds a substantial amount of environmental noise.
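As an illustration of the two signal-level operations the abstract mentions, the following minimal sketch corrupts an audio clip with white Gaussian noise at a chosen signal-to-noise ratio and converts it to a magnitude spectrogram. It is not the preprocessing used in the chapter: the function names, the 16 kHz sample rate, the 256-sample Hann window, the 128-sample hop, and the synthetic 440 Hz tone standing in for a recording are all assumptions made for this example.

```python
import numpy as np


def add_white_gaussian_noise(signal, snr_db, rng):
    """Return `signal` plus white Gaussian noise at the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Short-time Fourier magnitudes, shaped (frames, frequency bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


rng = np.random.default_rng(0)
sample_rate = 16000  # assumed value, chosen only for this example
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2.0 * np.pi * 440.0 * t)  # synthetic stand-in for a recording

noisy = add_white_gaussian_noise(clean, snr_db=10.0, rng=rng)
spec = magnitude_spectrogram(noisy)
```

Lowering `snr_db` increases the corruption, so sweeping it over a range of values gives one way to generate noise-augmented samples of the kind the abstract refers to.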
Acknowledgements
This work has been partly supported by the European Commission under grant numbers ERC-AGPE7-267686-LifeNav and FP7-610603-EUROPA2, by the Ministry of Science, Research and Arts of Baden-Württemberg (Az: 32-7545.24-9/1/1), and by the German Ministry for Research and Technology under grant ZAFH-AAL.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Valada, A., Spinello, L., Burgard, W. (2018). Deep Feature Learning for Acoustics-Based Terrain Classification. In: Bicchi, A., Burgard, W. (eds) Robotics Research. Springer Proceedings in Advanced Robotics, vol 3. Springer, Cham. https://doi.org/10.1007/978-3-319-60916-4_2
DOI: https://doi.org/10.1007/978-3-319-60916-4_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60915-7
Online ISBN: 978-3-319-60916-4
eBook Packages: Engineering, Engineering (R0)