The Diarization System for an Unknown Number of Speakers
This paper presents a speaker diarization system that can be used when the number of speakers is unknown. The proposed system is based on agglomerative clustering in conjunction with factor analysis, the Total Variability approach, and linear discriminant analysis. We present the results of the proposed diarization system. The results demonstrate that the system can be used both when an answering machine or handset transfer is present in telephone recordings and when the speakers share a summed channel in telephone or meeting recordings.
Keywords: diarization, speaker segmentation, speaker recognition, clustering
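As an illustration of how agglomerative clustering can cope with an unknown number of speakers, the following is a minimal sketch and not the system described in the paper. It assumes each speech segment has already been reduced to a fixed-length vector (standing in for a Total Variability vector after linear discriminant analysis), compares cluster centroids by cosine distance, and stops merging once the closest pair is farther apart than a threshold. The function names, the centroid-based merging rule, and the `stop_threshold` value are all assumptions made for this example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two vectors (0 = same direction, 2 = opposite)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as uninformative
    return 1.0 - dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def agglomerative_diarization(segment_vectors, stop_threshold=0.5):
    """Bottom-up clustering of segments; returns one cluster label per segment.

    The number of clusters is not given in advance: merging stops when the
    closest pair of cluster centroids is farther apart than stop_threshold
    (a hypothetical value that would need tuning on development data).
    """
    clusters = [[i] for i in range(len(segment_vectors))]
    while len(clusters) > 1:
        best = None  # (distance, index_i, index_j) of the closest pair
        for i in range(len(clusters)):
            centroid_i = mean_vector([segment_vectors[k] for k in clusters[i]])
            for j in range(i + 1, len(clusters)):
                centroid_j = mean_vector([segment_vectors[k] for k in clusters[j]])
                d = cosine_distance(centroid_i, centroid_j)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > stop_threshold:
            break  # remaining clusters are treated as distinct speakers
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    labels = [0] * len(segment_vectors)
    for label, members in enumerate(clusters):
        for k in members:
            labels[k] = label
    return labels


if __name__ == "__main__":
    # Toy 2-D vectors; real segment vectors would have far more dimensions.
    segments = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [-1.0, -0.1]]
    print(agglomerative_diarization(segments))
```

The stopping threshold plays the role that a fixed cluster count would otherwise play, so the estimated number of speakers is simply the number of clusters left when merging stops.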