Abstract
This paper presents a speaker detection method that uses probabilistic prediction to avoid tuning thresholds when detecting a speaker in an audio stream. We introduce g-GEBI (generalized GEBI) as a generalization of BI (Bayesian Inference) and GEBI (Gibbs-distribution-based Extended BI) to perform iterative detection of a speaker in an audio stream containing utterances from more than one speaker. We then present a method of probabilistic prediction in multiclass classification for classifying the results of speaker detection. Through numerical experiments on recorded real speech data, we examine the properties and effectiveness of the present method. In particular, we show that g-GEBI and g-BI (generalized BI) are more effective than the conventional BI and GEBI in an incremental speaker detection task.
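As a rough illustration of the iterative detection idea, the sketch below implements only plain recursive Bayesian inference over a sequence of speech segments, i.e. something in the spirit of the conventional BI baseline. The abstract does not state the g-GEBI or GEBI update rules, so neither is reproduced here. The function name `sequential_posterior`, the uniform prior, and the likelihood values are all hypothetical and chosen purely for illustration.

```python
def sequential_posterior(likelihoods, prior=None):
    """Recursive Bayesian update over a fixed set of classes (speakers).

    likelihoods: list of rows; row t holds p(x_t | class k) for each class k.
    prior: optional list of prior class probabilities (uniform if omitted).
    Returns a list with the posterior over classes after each segment.

    Assumption: segments are conditionally independent given the class.
    This is a generic BI sketch, not the paper's g-GEBI.
    """
    n_classes = len(likelihoods[0])
    if prior is None:
        posterior = [1.0 / n_classes] * n_classes
    else:
        posterior = list(prior)

    history = []
    for row in likelihoods:
        # Bayes' rule: multiply the current belief by the new likelihoods,
        # then renormalize so the class probabilities sum to one.
        unnormalized = [p * l for p, l in zip(posterior, row)]
        total = sum(unnormalized)
        posterior = [u / total for u in unnormalized]
        history.append(posterior)
    return history


# Hypothetical likelihoods for two classes (target speaker, other speaker)
# over three consecutive segments.
segments = [[0.6, 0.4], [0.7, 0.3], [0.8, 0.2]]
for step, post in enumerate(sequential_posterior(segments), start=1):
    print(step, [round(p, 4) for p in post])
```

With these made-up values the posterior for the first class rises with each segment, which is the behaviour that makes a multistep decision more reliable than a single-segment one. How the final posterior is turned into a detection without a hand-tuned threshold is the subject of the paper and is not shown here.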
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Sakata, K., Sakashita, S., Matsuo, K., Kurogi, S. (2016). Speaker Detection in Audio Stream via Probabilistic Prediction Using Generalized GEBI. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9950. Springer, Cham. https://doi.org/10.1007/978-3-319-46681-1_37
DOI: https://doi.org/10.1007/978-3-319-46681-1_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46680-4
Online ISBN: 978-3-319-46681-1
eBook Packages: Computer Science (R0)