Abstract
This chapter describes applications of multilingual phone recognition in code-switched and non-code-switched scenarios. It compares two approaches to multilingual phone recognition, the LID-Mono approach and the common multilingual phone-set approach, on both code-switched and non-code-switched test sets. The development and evaluation of multilingual phone recognition systems (Multi-PRSs) built with each approach are described, followed by an analysis and comparison of the results. Code-switched speech recognition using Multi-PRSs is studied on code-switched speech data in Kannada and Urdu.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Manjunath, K.E. (2022). Applications of Multilingual Phone Recognition in Code-Switched and Non-code-Switched Scenarios. In: Multilingual Phone Recognition in Indian Languages. SpringerBriefs in Speech Technology. Springer, Cham. https://doi.org/10.1007/978-3-030-80741-2_6
DOI: https://doi.org/10.1007/978-3-030-80741-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80740-5
Online ISBN: 978-3-030-80741-2
eBook Packages: Engineering, Engineering (R0)