Engine-Independent ASR Error Management for Dialog Systems

Choi, Junhwi; Lee, Donghyeon; Ryu, Seounghan; Lee, Kyusong; Kim, Kyungduk; Noh, Hyungjong; Lee, Gary Geunbae

doi:10.1007/978-3-319-21834-2_17

Part of the book series: Signals and Communication Technology (SCT)

Abstract

This paper describes an ASR (automatic speech recognition) engine-independent error correction method for dialog systems. The proposed method corrects ASR errors using only the text corpus that is used to train the target dialog system, which makes it independent of the ASR engine. We evaluated the method on two test corpora (Korean and English), each a parallel corpus of ASR results and their correct transcriptions. Overall, the results indicate that the method decreases the word error rate of the ASR results and recovers errors in the attributes that are important to the dialog system. The method is general and can also be applied to other speech-based applications, such as voice question-answering and speech information extraction systems.
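The abstract reports results as word error rate over parallel corpora of ASR results and correct transcriptions. As a rough illustration only, the sketch below computes word error rate in the commonly used way: word-level edit distance divided by the number of reference words. The function name `word_error_rate`, whitespace tokenization, and the example sentences are assumptions made for this illustration; the chapter's own evaluation procedure is not described in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes whitespace tokenization, which is a simplification
    (Korean text in particular may need a different tokenizer).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentence pair: one substituted word out of five.
print(word_error_rate("book a table for two", "book a cable for two"))  # 0.2
```

Comparing this score before and after correction on each sentence pair is one way to check whether a post-processing step reduces recognition errors.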

Notes

1.
Most commercial ASR engines are provided as complete systems in binary form.

Acknowledgments

This work was partly supported by the IT R&D program of MSIP/KEIT [10044508, Development of Non-Symbolic Approach-based Human-Like-Self-Taught Learning Intelligence Technology] and by the Quality of Life Technology (QoLT) development program of MKE [10036458, Development of Voice Word-processor and Voice-controlled Computer Software for Physical Handicapped Person].

Author information

Authors and Affiliations

Pohang University of Science and Technology, Pohang, Gyungbuk, Korea
Junhwi Choi, Donghyeon Lee, Seounghan Ryu, Kyusong Lee, Kyungduk Kim & Gary Geunbae Lee
Samsung Electronics, Suwon, Gyeonggi, Korea
Hyungjong Noh

Corresponding author

Correspondence to Junhwi Choi.

Editor information

Editors and Affiliations

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
Alexander Rudnicky
Cupertino, California, USA
Antoine Raux
Silicon Valley, Carnegie Mellon University, Moffett Field, California, USA
Ian Lane
Mountain View, California, USA
Teruhisa Misu

About this chapter

Cite this chapter

Choi, J. et al. (2016). Engine-Independent ASR Error Management for Dialog Systems. In: Rudnicky, A., Raux, A., Lane, I., Misu, T. (eds) Situated Dialog in Speech-Based Human-Computer Interaction. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-21834-2_17

DOI: https://doi.org/10.1007/978-3-319-21834-2_17
Published: 21 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21833-5
Online ISBN: 978-3-319-21834-2
eBook Packages: Engineering, Engineering (R0)
