Computer Assisted Transcription of Speech

Rodríguez, Luis; Casacuberta, Francisco; Vidal, Enrique

doi:10.1007/978-3-540-72847-4_32

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4477)

Included in the following conference series:

Iberian Conference on Pattern Recognition and Image Analysis

Abstract

Speech recognition systems have proved useful in a wide range of tasks. Nevertheless, state-of-the-art speech technology cannot yet produce perfect transcriptions in most cases, so human intervention is needed to check and correct the output of such systems. We present a novel approach that addresses this problem by combining the efficiency of automatic speech recognition systems with the accuracy of a human transcriber. The result of this process is a cost-effective, perfect transcription of the input signal.

This work has been partially supported by the Spanish project iDoc TIN2006-15694-C02-01.

Author information

Authors and Affiliations

Departamento de Sistemas Informáticos, Universidad de Castilla-La Mancha
Luis Rodríguez
Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia
Francisco Casacuberta & Enrique Vidal

Editor information

Joan Martí, José Miguel Benedí, Ana Maria Mendonça, Joan Serrat

About this paper

Cite this paper

Rodríguez, L., Casacuberta, F., Vidal, E. (2007). Computer Assisted Transcription of Speech. In: Martí, J., Benedí, J.M., Mendonça, A.M., Serrat, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2007. Lecture Notes in Computer Science, vol 4477. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72847-4_32

DOI: https://doi.org/10.1007/978-3-540-72847-4_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72846-7
Online ISBN: 978-3-540-72847-4
eBook Packages: Computer Science, Computer Science (R0)
