Abstract
This chapter surveys existing language identification (LID) systems and highlights the language-specific features used in prior LID studies. It then explains why implicit LID systems have attracted interest and sets out the motivation for the present work.
© 2015 The Author(s)
About this chapter
Cite this chapter
Rao, K.S., Reddy, V.R., Maity, S. (2015). Literature Review. In: Language Identification Using Spectral and Prosodic Features. SpringerBriefs in Electrical and Computer Engineering. Springer, Cham. https://doi.org/10.1007/978-3-319-17163-0_2
DOI: https://doi.org/10.1007/978-3-319-17163-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17162-3
Online ISBN: 978-3-319-17163-0
eBook Packages: Engineering, Engineering (R0)