Abstract
We present the development of a Galo speech corpus and the design of a Galo lexical tone recognizer. Galo is spoken by the Galo people of Arunachal Pradesh, a frontier state in North East India, and belongs to the Tani branch of the Tibeto-Burman language family. The language has two discernible lexical tones: High/Plain, characterized by a moderately high, level pitch contour, and Low/Tense, characterized by a descending pitch contour. The speech database used in this study contains Galo phrases spoken by 27 individuals (13 male and 14 female) aged between 20 and 50 years. Each speaker's recording covers a phonetically diverse script consisting of twenty Galo tonal words and all their possible tonal variations. The work aims to recognize these lexical tones automatically from seven features extracted from the fundamental frequency (F0) contour. Two machine learning classifiers, a support vector machine (SVM) and a random forest (RF), are used to identify the tones. The RF-based recognizer achieves a recognition accuracy of 90.42%, while the SVM-based recognizer achieves 92.31%.
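As a rough illustration of how a tone can be summarized from its F0 contour, the sketch below computes seven summary features from a list of voiced-frame F0 values. The abstract does not name the seven features used in the paper, so this particular set (mean, standard deviation, minimum, maximum, range, slope, end-minus-start), the function name `f0_features`, and the 10 ms frame step are all assumptions made for illustration only.

```python
from statistics import mean, pstdev


def f0_features(f0_hz, frame_step_s=0.01):
    """Return seven illustrative summary features of a voiced F0 contour.

    f0_hz is a list of F0 values in Hz, one per voiced frame.
    The feature set is hypothetical; the paper's own set is not given
    in the abstract.
    """
    if len(f0_hz) < 2:
        raise ValueError("need at least two voiced frames")

    times = [i * frame_step_s for i in range(len(f0_hz))]
    t_mean = mean(times)
    f_mean = mean(f0_hz)

    # Least-squares slope of F0 against time, in Hz per second.
    num = sum((t - t_mean) * (f - f_mean) for t, f in zip(times, f0_hz))
    den = sum((t - t_mean) ** 2 for t in times)
    slope = num / den

    return {
        "mean": f_mean,
        "std": pstdev(f0_hz),
        "min": min(f0_hz),
        "max": max(f0_hz),
        "range": max(f0_hz) - min(f0_hz),
        "slope": slope,
        "end_minus_start": f0_hz[-1] - f0_hz[0],
    }


# Synthetic contours (not real Galo data): a level contour and a falling one.
level = [200.0] * 10
falling = [200.0 - 5.0 * i for i in range(10)]

level_feats = f0_features(level)
falling_feats = f0_features(falling)
```

On these synthetic contours the slope and end-minus-start features separate a level shape from a descending one, which matches the High/Plain versus Low/Tense description above. In a full recognizer, feature vectors of this kind would be passed to a classifier such as an SVM or a random forest.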
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kamdak, B., Taye, G., Bhattacharjee, U. (2023). Galo Lexical Tone Recognition Using Machine Learning Approach. In: Shakya, S., Tavares, J.M.R.S., Fernández-Caballero, A., Papakostas, G. (eds) Fourth International Conference on Image Processing and Capsule Networks. ICIPCN 2023. Lecture Notes in Networks and Systems, vol 798. Springer, Singapore. https://doi.org/10.1007/978-981-99-7093-3_23
DOI: https://doi.org/10.1007/978-981-99-7093-3_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7092-6
Online ISBN: 978-981-99-7093-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)