Robust Bootstrapping of Speaker Models for Unsupervised Speaker Indexing

Fu ZhongHua

doi:10.1007/978-3-540-73417-8_19

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4577)

Included in the following conference series:

International Workshop on Multimedia Content Analysis and Mining

Abstract

Conventional approaches to bootstrapping speaker models in unsupervised speaker indexing are very sensitive to the duration of the bootstrapping segments. If the duration is insufficient to build a speaker model, as is common in telephone conversations and meetings, serious problems arise. We therefore propose a robust bootstrapping framework that employs a Multi-EigenSpace modeling technique based on Regression Class (RC-MES) to build speaker models from sparse data, together with short-segment clustering to prevent overly short segments from influencing the bootstrapping. On a real discussion archive with a total duration of 8 hours, we demonstrate the significant robustness of the proposed method: it improves speaker change detection performance and outperforms conventional bootstrapping methods, even when the average bootstrapping segment duration is less than 5 seconds.

Author information

Authors and Affiliations

School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, P.R. China
Fu ZhongHua

Editor information

Nicu Sebe, Yuncai Liu, Yueting Zhuang, Thomas S. Huang

Cite this paper

ZhongHua, F. (2007). Robust Bootstrapping of Speaker Models for Unsupervised Speaker Indexing. In: Sebe, N., Liu, Y., Zhuang, Y., Huang, T.S. (eds) Multimedia Content Analysis and Mining. MCAM 2007. Lecture Notes in Computer Science, vol 4577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73417-8_19

DOI: https://doi.org/10.1007/978-3-540-73417-8_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73416-1
Online ISBN: 978-3-540-73417-8
eBook Packages: Computer Science, Computer Science (R0)