International Conference on Statistical Language and Speech Processing

Statistical Language and Speech Processing pp 111-122

A Comparison of Human and Machine Estimation of Speaker Age

Conference paper

DOI: 10.1007/978-3-319-25789-1_11

Part of the Lecture Notes in Computer Science book series (LNCS, volume 9449)
Cite this paper as:
Huckvale M., Webb A. (2015) A Comparison of Human and Machine Estimation of Speaker Age. In: Dediu AH., Martín-Vide C., Vicsi K. (eds) Statistical Language and Speech Processing. Lecture Notes in Computer Science, vol 9449. Springer, Cham


The estimation of the age of a speaker from his or her voice has both forensic and commercial applications. Previous studies have shown that human listeners are able to estimate the age of a speaker to within 10 years on average, while recent machine age estimation systems seem to show superior performance, with average errors as low as 6 years. However, the machine studies have used highly non-uniform test sets, for which knowledge of the age distribution offers a considerable advantage to the system. In this study we compare human and machine performance on the same test data, chosen to be uniformly distributed in age. We show that in this case human and machine accuracy are more similar, with average errors of 9.8 and 8.6 years respectively, although if panels of listeners are consulted, human accuracy can be improved to a value closer to 7.5 years. Both humans and machines have difficulty in accurately predicting the ages of older speakers.
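The point about non-uniform test sets can be illustrated with a small sketch. It assumes that "average error" means mean absolute error in years, which the abstract does not state explicitly, and it uses two made-up age lists (`uniform_ages` and `skewed_ages`) that are purely illustrative and not data from the paper. The baseline simply guesses the mean age of the test set for every speaker, which is a stand-in for "knowing the age distribution", not the authors' system.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true - predicted| over all speakers, in years."""
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def constant_baseline_mae(true_ages):
    """Error of a 'system' that ignores the voice and always guesses the mean age."""
    guess = sum(true_ages) / len(true_ages)
    return mean_absolute_error(true_ages, [guess] * len(true_ages))


# Hypothetical test sets (assumptions for illustration only).
uniform_ages = list(range(20, 81))                      # one speaker per year, 20 to 80
skewed_ages = [25] * 40 + [30] * 15 + [60] * 5 + [75]   # mostly young speakers

uniform_error = constant_baseline_mae(uniform_ages)
skewed_error = constant_baseline_mae(skewed_ages)

print(f"uniform test set: {uniform_error:.2f} years")
print(f"skewed test set:  {skewed_error:.2f} years")
```

On the skewed list the voice-blind baseline already achieves a much smaller error than on the uniform list, which is why error figures from differently distributed test sets are hard to compare directly.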


Keywords: Speaker profiling; Speaker age prediction; Computational paralinguistics

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Speech, Hearing and Phonetic Sciences, University College London, London, UK
