Abstract
This paper proposes a novel approach to connected spoken word recognition based solely on a relatively simple artificial neural network model. The model is a modified version of the previously proposed cascaded neuro-computational model and has a three-layered network structure, in which a non-linear metric is newly introduced into each second-layer unit so that pattern matching at the word-feature level is performed effectively. Simulations were conducted using connected speech data sets with a larger lexicon than those used in previous works; the data sets comprised naturally spoken strings, each consisting of 2–7 words selected from a total of 47 Japanese prefecture names. The simulation results show that the modified model achieves an overall word accuracy rate of 95.2%, which is comparable to the 98.1% obtained using a benchmark hidden Markov model approach with embedded training.
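The abstract does not state how the word accuracy rate is computed. One common convention, used for example by the HTK scoring tool HResults, is Acc = (N − S − D − I) / N × 100, where N is the number of reference words and S, D and I are the substitution, deletion and insertion counts from an alignment of each recognized string against its reference. The following Python sketch illustrates that convention only. It assumes uniform edit costs for the alignment, and the function names and the example word strings are hypothetical, not taken from the paper.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for a minimum-edit
    alignment of a reference word list against a hypothesis word list.
    Uniform edit costs are an assumption of this sketch."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = (total_edits, S, D, I) for ref[:i] versus hyp[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)      # all reference words deleted
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)      # all hypothesis words inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, k)            # match, no edit
            else:
                diag = (t + 1, s + 1, d, k)    # substitution
            t, s, d, k = cost[i - 1][j]
            dele = (t + 1, s, d + 1, k)        # deletion
            t, s, d, k = cost[i][j - 1]
            insr = (t + 1, s, d, k + 1)        # insertion
            # Tuples compare by total edits first, so min() picks a
            # minimum-edit path (ties broken deterministically).
            cost[i][j] = min(diag, dele, insr)
    _, s, d, k = cost[n][m]
    return s, d, k


def word_accuracy(pairs):
    """Word accuracy in percent over (reference, hypothesis) pairs."""
    n_ref = subs = dels = inss = 0
    for ref, hyp in pairs:
        s, d, k = align_counts(ref, hyp)
        n_ref += len(ref)
        subs += s
        dels += d
        inss += k
    return 100.0 * (n_ref - subs - dels - inss) / n_ref


# Hypothetical example strings (illustrative only, not the paper's data).
example_pairs = [
    (["tokyo", "osaka", "kyoto"], ["tokyo", "nara", "kyoto"]),  # 1 substitution
    (["aichi", "gifu"], ["aichi"]),                             # 1 deletion
]
accuracy = word_accuracy(example_pairs)
```

Because insertions are subtracted, this measure can fall below the plain percentage of correctly recognized words, and it can even become negative when a recognizer inserts many spurious words.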
Acknowledgements
The author would like to thank all the students who participated in the recording sessions and Mr. Hideki Shimizu for his partial involvement in the simulation study for this work.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work was presented in part at the 24th International Symposium on Artificial Life and Robotics, Beppu, Oita, January 23–25, 2019.
About this article
Cite this article
Hoya, T. A modified cascaded neuro-computational model applied to recognition of connected spoken Japanese prefecture words. Artif Life Robotics 24, 499–504 (2019). https://doi.org/10.1007/s10015-019-00551-z
DOI: https://doi.org/10.1007/s10015-019-00551-z