Captioning Multiple Speakers Using Speech Recognition to Assist Disabled People

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5105)

Abstract

Meetings and seminars in which many people speak can be among the hardest situations for deaf people to follow, and for people with physical, visual or cognitive disabilities to take notes in or to remember key points from. People may also be absent during important interactions, arrive late or leave early. Real-time captioning using phonetic keyboards can provide an accurate live and archived transcription of what has been said, but it is often unavailable because of the cost and shortage of highly skilled, trained stenographers. This paper describes the development of applications that use speech recognition to provide automatic real-time text transcriptions in situations where many people may be speaking.

Editor information

Klaus Miesenberger, Joachim Klaus, Wolfgang Zagler, Arthur Karshmer

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wald, M. (2008). Captioning Multiple Speakers Using Speech Recognition to Assist Disabled People. In: Miesenberger, K., Klaus, J., Zagler, W., Karshmer, A. (eds) Computers Helping People with Special Needs. ICCHP 2008. Lecture Notes in Computer Science, vol 5105. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70540-6_88

  • DOI: https://doi.org/10.1007/978-3-540-70540-6_88

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70539-0

  • Online ISBN: 978-3-540-70540-6

  • eBook Packages: Computer Science, Computer Science (R0)
