Abstract
Drawing on recent examples of the author’s work in Intelligent Speech, Music, and Sound Analysis, this chapter gives a comprehensive overview of the performance currently obtainable in Intelligent Audio Analysis. The overview covers discrete classification tasks, namely digit, spelling, phoneme, word, and non-linguistic vocalisation recognition, alongside recognition of writer sentiment and of speaker emotion, age, gender, intoxication, and sleepiness. It also covers continuous tasks: determination of writer sentiment, speaker interest, and speaker height, as well as prediction of the arousal and valence that sounds induce in listeners. Based on these performances, future perspectives are given.
We can only see a short distance ahead, but we can see plenty there that needs to be done.
—Alan Turing.
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Schuller, B. (2013). Vision. In: Intelligent Audio Analysis. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36806-6_14
DOI: https://doi.org/10.1007/978-3-642-36806-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36805-9
Online ISBN: 978-3-642-36806-6
eBook Packages: Engineering, Engineering (R0)