The Rich Transcription 2006 Spring Meeting Recognition Evaluation

  • Jonathan G. Fiscus
  • Jerome Ajot
  • Martial Michel
  • John S. Garofolo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4299)

Abstract

We present the design and results of the Spring 2006 (RT-06S) Rich Transcription Meeting Recognition Evaluation, the fourth in a series of community-wide evaluations of language technologies in the meeting domain. For 2006, we supported three evaluation tasks across two meeting sub-domains: the Speech-To-Text (STT) transcription task, and the “Who Spoke When” and “Speech Activity Detection” diarization tasks. The meetings were drawn from the Conference Meeting and Lecture Meeting sub-domains. The lowest STT word error rate, scored with up to four simultaneous speakers in the multiple distant microphone condition, was 46.3% for the conference sub-domain and 53.4% for the lecture sub-domain. For the “Who Spoke When” task, the lowest diarization error rates for all speech were 35.8% and 24.0% for the conference and lecture sub-domains, respectively. For the “Speech Activity Detection” task, the lowest diarization error rates were 4.3% and 8.0% for the conference and lecture sub-domains, respectively.
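
For background, the metrics quoted above follow the standard NIST scoring definitions (community-standard formulas, not a contribution of this paper). Word error rate aligns system output to a reference transcript and counts substitution (S), deletion (D), and insertion (I) errors against the N reference words:

  WER = (S + D + I) / N

Diarization error rate is time-based: it is the fraction of scored speech time that is missed, falsely detected as speech, or attributed to the wrong speaker:

  DER = (T_miss + T_false-alarm + T_speaker-error) / T_scored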

Keywords

Evaluation Task, Word Error Rate, Microphone Array, Meeting Participant, Linguistic Data Consortium

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jonathan G. Fiscus (1)
  • Jerome Ajot (1)
  • Martial Michel (1, 2)
  • John S. Garofolo (1)
  1. National Institute of Standards and Technology, Gaithersburg, USA
  2. Systems Plus, Inc., Rockville, USA
