
Robust Bootstrapping of Speaker Models for Unsupervised Speaker Indexing

  • Conference paper
Multimedia Content Analysis and Mining (MCAM 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4577)

Abstract

Conventional approaches to bootstrapping speaker models in unsupervised speaker indexing are highly sensitive to the duration of the bootstrapping segment. When that duration is too short to build a speaker model, as is common in telephone conversations and meetings, serious problems arise. We therefore propose a robust bootstrapping framework that employs a Multi-EigenSpace modeling technique based on Regression Class (RC-MES) to build speaker models from sparse data, together with short-segment clustering to prevent overly short segments from influencing the bootstrapping. On a real discussion archive with a total duration of 8 hours, we demonstrate the robustness of the proposed method: it improves speaker change detection performance and outperforms conventional bootstrapping methods, even when the average bootstrapping segment duration is less than 5 seconds.

Editor information

Nicu Sebe, Yuncai Liu, Yueting Zhuang, Thomas S. Huang

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

ZhongHua, F. (2007). Robust Bootstrapping of Speaker Models for Unsupervised Speaker Indexing. In: Sebe, N., Liu, Y., Zhuang, Y., Huang, T.S. (eds) Multimedia Content Analysis and Mining. MCAM 2007. Lecture Notes in Computer Science, vol 4577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73417-8_19

  • DOI: https://doi.org/10.1007/978-3-540-73417-8_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73416-1

  • Online ISBN: 978-3-540-73417-8

  • eBook Packages: Computer Science, Computer Science (R0)
