Abstract
In this paper, we consider the extraction of speaker identity (first name and last name) from audio recordings of broadcast news. Using an automatic speech recognition system, we present improvements to a method that extracts speaker identities from automatic transcripts and assigns them to speaker turns. The detected full names are taken as candidates for these assignments. All this information, which is often contradictory, is described and combined in the belief functions formalism, which gives a coherent knowledge representation of the problem. Belief function theory proves well suited to managing the uncertainty about speaker identity. Experiments are carried out on French broadcast news recordings from a French evaluation campaign of automatic speech recognition.
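To illustrate how contradictory name evidence can be combined with belief functions, here is a minimal sketch of Dempster's rule of combination, the standard combination rule of Dempster-Shafer theory. The abstract does not say which combination rule the paper uses, so this is an assumption made for illustration only. The speaker names, the mass values, and the function name `combine_dempster` are all hypothetical.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of candidate names to a mass,
    and the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass that the two sources assign to incompatible hypotheses.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


# Hypothetical frame of discernment: candidate full names for one speaker turn.
FRAME = frozenset({"Alice Martin", "Bob Durand", "Claire Petit"})

# Two hypothetical pieces of evidence from the transcript. Each puts some mass
# on one name and leaves the rest on the whole frame (ignorance).
evidence_1 = {frozenset({"Alice Martin"}): 0.6, FRAME: 0.4}
evidence_2 = {frozenset({"Bob Durand"}): 0.5, FRAME: 0.5}

fused = combine_dempster(evidence_1, evidence_2)

# Pick the single name with the highest fused mass.
singletons = {s: m for s, m in fused.items() if len(s) == 1}
best_name = next(iter(max(singletons, key=singletons.get)))
```

In this toy case the two sources disagree, and the rule keeps both names as candidates while leaving part of the mass on the whole set of names, which is how the formalism represents remaining ignorance rather than forcing an early decision.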
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Petitrenaud, S., Jousse, V., Meignier, S., Estève, Y. (2010). Identification of Speakers by Name Using Belief Functions. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Methods. IPMU 2010. Communications in Computer and Information Science, vol 80. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14055-6_19
DOI: https://doi.org/10.1007/978-3-642-14055-6_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14054-9
Online ISBN: 978-3-642-14055-6
eBook Packages: Computer Science (R0)