Literature Review

Rao, K. Sreenivasa; Reddy, V. Ramu; Maity, Sudhamay

doi:10.1007/978-3-319-17163-0_2

Chapter
First Online: 01 January 2015

Part of the book series: SpringerBriefs in Electrical and Computer Engineering (BRIEFSSPEECHTECH)

Abstract

This chapter provides an overview of existing language identification (LID) systems. It highlights the language-specific features used in earlier LID studies, explains why implicit LID systems have attracted interest, and finally discusses the motivation for the present work.

Author information

Authors and Affiliations

Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
K. Sreenivasa Rao
Innovation Lab Kolkata, Kolkata, West Bengal, India
V. Ramu Reddy
Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
Sudhamay Maity

Corresponding author

Correspondence to K. Sreenivasa Rao.

About this chapter

Cite this chapter

Rao, K.S., Reddy, V.R., Maity, S. (2015). Literature Review. In: Language Identification Using Spectral and Prosodic Features. SpringerBriefs in Electrical and Computer Engineering. Springer, Cham. https://doi.org/10.1007/978-3-319-17163-0_2

DOI: https://doi.org/10.1007/978-3-319-17163-0_2
Published: 01 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17162-3
Online ISBN: 978-3-319-17163-0
eBook Packages: Engineering, Engineering (R0)