Galo Lexical Tone Recognition Using Machine Learning Approach

  • Conference paper
Fourth International Conference on Image Processing and Capsule Networks (ICIPCN 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 798)

Abstract

In this paper, we present the development of a Galo speech corpus and the design of a Galo lexical tone recognizer. Galo is spoken by the Galo people of Arunachal Pradesh, a frontier state of North East India, and belongs to the Tani branch of the Tibeto–Burman language family. It has two discernible lexical tones: High/Plain, characterized by a moderately high-level pitch contour, and Low/Tense, characterized by a descending pitch contour. The speech database used in this study contains Galo phrases spoken by 27 individuals: 13 male and 14 female speakers aged between 20 and 50 years. Each speaker's recording covers a phonetically diverse script consisting of twenty Galo tonal words and all their possible tonal variations. The work aims to recognize these lexical tones automatically using seven features extracted from the fundamental frequency (F0) contour. Two machine learning classifiers, a support vector machine (SVM) and a random forest (RF), were used to identify the tones. The RF-based recognizer achieves a recognition accuracy of 90.42%, while the SVM-based recognizer achieves a higher accuracy of 92.31%.
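As a rough illustration of what "features extracted from the F0 contour" can look like, the sketch below computes seven simple descriptors of a pitch track. The abstract does not name the seven features used in the paper, so the function name `f0_features`, the particular descriptors, and the convention that unvoiced frames are marked with 0 are all assumptions made for this example. The contours are synthetic and are not data from the Galo corpus.

```python
from statistics import mean, pstdev


def f0_features(f0_hz):
    """Return seven illustrative descriptors of an F0 contour given in Hz.

    Hypothetical feature set: the paper's actual seven features are not
    listed in the abstract.
    """
    # Assumption: unvoiced frames are marked with 0 and are dropped here.
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")

    n = len(voiced)
    mu = mean(voiced)

    # Least-squares slope of F0 against the index of each voiced frame
    # (Hz per voiced frame); negative values indicate a falling contour.
    t_mean = (n - 1) / 2
    num = sum((i - t_mean) * (f - mu) for i, f in enumerate(voiced))
    den = sum((i - t_mean) ** 2 for i in range(n))

    return {
        "mean": mu,
        "min": min(voiced),
        "max": max(voiced),
        "range": max(voiced) - min(voiced),
        "std": pstdev(voiced),
        "slope": num / den,
        "end_minus_start": voiced[-1] - voiced[0],
    }


# Synthetic contours for illustration only.
level = [200.0, 201.0, 199.0, 200.0, 200.0]           # roughly level pitch
falling = [200.0, 190.0, 0.0, 180.0, 170.0, 160.0]    # descending pitch, one unvoiced frame

level_features = f0_features(level)
falling_features = f0_features(falling)
```

In a sketch like this, a level contour yields a slope near zero and a descending contour yields a clearly negative slope, which is the kind of contrast a classifier could exploit to separate the two tones. Feature vectors of this form could then be passed to off-the-shelf classifiers such as scikit-learn's `sklearn.svm.SVC` and `sklearn.ensemble.RandomForestClassifier`. Whether the authors used that library is not stated in the abstract.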

Corresponding author

Correspondence to Utpal Bhattacharjee.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kamdak, B., Taye, G., Bhattacharjee, U. (2023). Galo Lexical Tone Recognition Using Machine Learning Approach. In: Shakya, S., Tavares, J.M.R.S., Fernández-Caballero, A., Papakostas, G. (eds) Fourth International Conference on Image Processing and Capsule Networks. ICIPCN 2023. Lecture Notes in Networks and Systems, vol 798. Springer, Singapore. https://doi.org/10.1007/978-981-99-7093-3_23
