Speech Processing for Hindi Dialect Recognition

  • Shweta Sinha
  • Aruna Jain
  • Shyam S. Agrawal
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 264)


In this paper, the authors use a two-layer feed-forward neural network for Hindi dialect recognition. A dialect is a pattern of pronunciation of a language used by a community of native speakers belonging to the same geographical region. In this work, speech features are explored to recognize four major dialects of Hindi. The dialects under consideration are Khariboli (spoken in West Uttar Pradesh, Delhi and some parts of Uttarakhand and Himachal Pradesh), Bhojpuri (spoken in East Uttar Pradesh, Bihar and Jharkhand), Haryanvi (spoken in Haryana and parts of Delhi, Uttar Pradesh and Uttarakhand) and Bagheli (spoken in Central India). The speech corpus for this work was collected from 15 speakers (both male and female) for each dialect. Syllables with a CVC (consonant-vowel-consonant) structure are used as the processing unit. Spectral features (MFCC) and prosodic features (duration and pitch contour) are extracted from the speech to discriminate between the dialects. System performance is observed with spectral features and with prosodic features as input. Results show that the system performs best when all the spectral and prosodic features are combined to form the input feature set during network training. With this combined feature set, the dialect recognition system achieves a recognition score of 79%.
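The pipeline summarized above (one combined spectral and prosodic feature vector per syllable, fed to a two-layer feed-forward network with one output per dialect) can be illustrated with a minimal sketch. It is not the authors' implementation. The feature sizes (13 MFCCs, 5 pitch-contour samples), the hidden-layer width, the tanh and softmax activations, and all function names are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random

DIALECTS = ["Khariboli", "Bhojpuri", "Haryanvi", "Bagheli"]


def combine_features(mfcc, duration, pitch_contour):
    """Concatenate spectral (MFCC) and prosodic (duration, pitch) features."""
    return list(mfcc) + [duration] + list(pitch_contour)


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer (assumed init)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Affine map: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden, output):
    """Two layers: tanh hidden layer, then softmax output over the dialects."""
    h = [math.tanh(v) for v in dense(x, *hidden)]
    return softmax(dense(h, *output))


def predict(x, hidden, output):
    """Return the dialect label with the highest output probability."""
    probs = forward(x, hidden, output)
    return DIALECTS[probs.index(max(probs))]


rng = random.Random(0)
# Hypothetical features for one CVC syllable; real values would come from
# a feature extractor and would normally be normalized before training.
mfcc = [rng.gauss(0.0, 1.0) for _ in range(13)]
pitch_contour = [1.20, 1.25, 1.31, 1.28, 1.22]  # pitch in units of 100 Hz
duration = 0.21                                  # syllable duration in seconds
x = combine_features(mfcc, duration, pitch_contour)

hidden = init_layer(len(x), 8, rng)              # hidden width 8 is an assumption
output = init_layer(8, len(DIALECTS), rng)
probs = forward(x, hidden, output)
label = predict(x, hidden, output)
```

Dropping the MFCC part or the prosodic part in `combine_features` gives the spectral-only and prosodic-only input configurations that the abstract compares against the combined feature set.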


Keywords: Hindi dialects, spectral features, prosodic features, feed-forward neural networks





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Birla Institute of Technology, Mesra, India
  2. KIIT College of Engineering, Gurgaon, India
