Investigation of improved approaches to bayes risk decoding

Xu, Hai-hua; Zhu, Jie

doi:10.1007/s12204-011-1189-1

Published: 02 October 2011

Volume 16, pages 524–529, (2011)

Journal of Shanghai Jiaotong University (Science)

Hai-hua Xu (徐海华)¹ &
Jie Zhu (朱杰)¹

Abstract

Bayes risk (BR) decoding methods have been widely investigated in speech recognition because they are more flexible, though also more complex, than the maximum a posteriori (MAP) method with respect to minimum word error (MWE) optimization. This paper investigates two improved approaches to BR decoding, both aiming to minimize word error. Their novelty lies in the explicit optimization of the objective function, whose value is computed by an improved forward algorithm on the lattice. The first method obtains its result by an expectation-maximization (EM)-like iteration, whereas the second obtains it by traversing the confusion network (CN); both optimize the objective function value, but by distinct routes. Experimental results indicate that the proposed methods reduce errors in lattice rescoring compared with the traditional CN method.
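To make the contrast between MAP and BR decoding concrete, the following is a minimal sketch on an N-best list rather than on a lattice. It is not the paper's improved forward algorithm, EM-like iteration, or CN traversal; it only illustrates the general BR decision rule, in which the chosen hypothesis minimizes the expected word-level edit distance under the posterior. The function names (`edit_distance`, `map_decode`, `bayes_risk_decode`) and the toy hypotheses and posteriors are assumptions made for illustration.

```python
from typing import List, Tuple

Hypothesis = Tuple[List[str], float]  # (word sequence, posterior probability)


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def map_decode(nbest: List[Hypothesis]) -> List[str]:
    """MAP rule: pick the single most probable hypothesis."""
    return max(nbest, key=lambda item: item[1])[0]


def bayes_risk_decode(nbest: List[Hypothesis]) -> Tuple[List[str], float]:
    """BR rule: pick the hypothesis with the lowest expected edit distance.

    Every other hypothesis is treated as a possible reference, weighted by
    its (renormalized) posterior. Ties go to the earlier list entry.
    """
    total = sum(p for _, p in nbest)
    best_words: List[str] = []
    best_risk = float("inf")
    for candidate, _ in nbest:
        risk = sum(p / total * edit_distance(ref, candidate) for ref, p in nbest)
        if risk < best_risk:
            best_words, best_risk = candidate, risk
    return best_words, best_risk


# Toy posteriors (hypothetical): the top hypothesis is isolated, while the
# other two agree with each other on most words.
nbest = [
    ("we can see".split(), 0.40),
    ("the cat sat".split(), 0.35),
    ("the cat sits".split(), 0.25),
]

map_words = map_decode(nbest)
br_words, br_risk = bayes_risk_decode(nbest)
print("MAP:", " ".join(map_words))
print("BR: ", " ".join(br_words), f"(expected word errors = {br_risk:.2f})")
```

In this toy list the MAP choice has the highest individual posterior, but the BR choice has the lower expected word error because it is close to the remaining probability mass. Lattice- and CN-based methods such as those studied in the paper pursue the same objective over a far larger hypothesis space.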

Author information

Authors and Affiliations

Department of Electronic Engineering, Shanghai Jiaotong University, Shanghai, 200240, China
Hai-hua Xu (徐海华) & Jie Zhu (朱杰)

Corresponding author

Correspondence to Hai-hua Xu (徐海华).

About this article

Cite this article

Xu, Hh., Zhu, J. Investigation of improved approaches to bayes risk decoding. J. Shanghai Jiaotong Univ. (Sci.) 16, 524–529 (2011). https://doi.org/10.1007/s12204-011-1189-1

Received: 11 July 2010
Published: 02 October 2011
Issue Date: October 2011
DOI: https://doi.org/10.1007/s12204-011-1189-1

Key words

CLC number

TP 391.4
