Dialect Identification Using Spectral and Prosodic Features on Single and Ensemble Classifiers

Chittaragi, Nagaratna B.; Prakash, Ambareesh; Koolagudi, Shashidhar G.

doi:10.1007/s13369-017-2941-0

Research Article - Special Issue - Computer Engineering and Computer Science
Published: 17 November 2017

Volume 43, pages 4289–4302, (2018)

Arabian Journal for Science and Engineering

Nagaratna B. Chittaragi¹,², Ambareesh Prakash³ & Shashidhar G. Koolagudi¹

Abstract

This paper investigates the significance of the spectral and prosodic characteristics of the speech signal for dialect identification. Spectral features such as cepstral coefficients, spectral flux, and entropy are extracted from shorter frames, while prosodic attributes such as pitch, energy, and duration are derived from longer frames. The proposed approach is evaluated on the IViE (Intonational Variation in English) speech corpus, which covers nine dialectal regions of the British Isles. Because the corpus is available in both read and semi-spontaneous modes, the influence of spectral and prosodic behavior on each mode is examined separately. Two distinct classification approaches are used to identify the nine dialects: a support vector machine (SVM), and an ensemble of decision trees along with the SVM. Dialect-discriminating information captured by both feature types is used to construct the feature vectors. Experiments were conducted on individual features and on their combinations. The ensemble methods achieve better dialect recognition performance than a single, independent SVM.
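The short-frame and long-frame feature extraction described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame lengths (20 ms and 200 ms), the hop sizes, the Hamming window, the function names, and the synthetic sine-wave input are all assumptions chosen for illustration, and only spectral flux, spectral entropy, and frame energy are shown.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def spectral_flux_and_entropy(x, sr, frame_ms=20, hop_ms=10):
    """Short-frame spectral features (frame sizes are assumed values)."""
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(x, frame_len, hop) * np.hamming(frame_len)
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Normalise each magnitude spectrum so it sums to 1.
    p = mag / (mag.sum(axis=1, keepdims=True) + 1e-12)
    # Flux: squared change between successive normalised spectra.
    flux = np.sum(np.diff(p, axis=0) ** 2, axis=1)
    # Entropy: Shannon entropy (in bits) of each normalised spectrum.
    entropy = -np.sum(p * np.log2(p + 1e-12), axis=1)
    return flux, entropy

def long_frame_energy(x, sr, frame_ms=200, hop_ms=100):
    """Long-frame mean-square energy, a simple prosodic attribute."""
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(x, frame_len, hop)
    return np.mean(frames ** 2, axis=1)

# Synthetic one-second input standing in for a speech recording.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 220 * t)

flux, entropy = spectral_flux_and_entropy(signal, sr)
energy = long_frame_energy(signal, sr)
print(flux.shape, entropy.shape, energy.shape)
```

In a full system, per-frame values like these would typically be summarised (for example by their mean and variance per utterance) and concatenated with cepstral and pitch features before being passed to a classifier; that step is omitted here.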

Author information

Authors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology Karnataka, Surathkal, 575025, India
Nagaratna B. Chittaragi & Shashidhar G. Koolagudi
Department of Information Science and Engineering, Siddaganga Institute of Technology, Tumkur, India
Nagaratna B. Chittaragi
Department of Mechanical Engineering, National Institute of Technology Karnataka, Surathkal, Karnataka, 575025, India
Ambareesh Prakash

Corresponding author

Correspondence to Nagaratna B. Chittaragi.

About this article

Cite this article

Chittaragi, N.B., Prakash, A. & Koolagudi, S.G. Dialect Identification Using Spectral and Prosodic Features on Single and Ensemble Classifiers. Arab J Sci Eng 43, 4289–4302 (2018). https://doi.org/10.1007/s13369-017-2941-0

Received: 03 May 2017
Accepted: 26 October 2017
Published: 17 November 2017
Issue Date: August 2018
DOI: https://doi.org/10.1007/s13369-017-2941-0
