The Rich Transcription 2006 Spring Meeting Recognition Evaluation
We present the design and results of the Spring 2006 (RT-06S) Rich Transcription Meeting Recognition Evaluation, the fourth in a series of community-wide evaluations of language technologies in the meeting domain. For 2006, we supported three evaluation tasks in two meeting sub-domains: the Speech-To-Text (STT) transcription task and the “Who Spoke When” and “Speech Activity Detection” diarization tasks. The meetings were drawn from the Conference Meeting and Lecture Meeting sub-domains. In the multiple distant microphone condition, the lowest STT word error rate with up to four simultaneous speakers was 46.3% for the conference sub-domain and 53.4% for the lecture sub-domain. For the “Who Spoke When” task, the lowest diarization error rates for all speech were 35.8% and 24.0% for the conference and lecture sub-domains, respectively. For the “Speech Activity Detection” task, the lowest diarization error rates were 4.3% and 8.0% for the conference and lecture sub-domains, respectively.
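For readers unfamiliar with the STT metric quoted above, the following is a minimal sketch of how a word error rate can be computed for a single reference/hypothesis pair. It assumes plain whitespace tokenization and a standard single-stream Levenshtein alignment; it is not the evaluation's scoring tool and does not handle simultaneous speakers. The function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Simplified illustration: single speaker stream, whitespace tokens,
    no text normalization. Assumed for this sketch only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("starts" -> "start") and one insertion ("today")
    # against a five-word reference.
    print(word_error_rate("the meeting starts at noon",
                          "the meeting start at noon today"))
```

Because insertions count as errors while the denominator is the reference length, a word error rate computed this way can exceed 100%.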
Keywords: Evaluation Task, Word Error Rate, Microphone Array, Meeting Participant, Linguistic Data Consortium