
Sādhanā, 43:53

Spoken Indian language identification: a review of features and databases

  • BAKSHI AARTI
  • SUNIL KUMAR KOPPARAPU

Abstract

Spoken language is one of the distinctive characteristics of the human race. Spoken language processing is a branch of computer science that plays an important role in human–computer interaction (HCI), which has advanced remarkably in the last two decades. This paper reviews and summarizes the acoustic, phonetic and prosodic features that have been used for spoken language identification, specifically for Indian languages. We also review the speech databases that are already available for Indian languages and can be used for spoken language identification.

Keywords

SLID phonetic characteristics features 


Copyright information

© Indian Academy of Sciences 2018

Authors and Affiliations

  1. Department of Electronics and Communication, UMIT, SNDT University, Mumbai, India
  2. TCS Innovation Labs - Mumbai, TATA Consultancy Services, Thane, India
