Abstract
The problem of automatically determining language similarity (or even defining a distance on the space of languages) can be approached in different ways: working with phonetic transcriptions, with speech recordings, or with both. For recordings, we propose and test an HMM-based method. In the first part of the article we successfully perform language detection; we then calculate distances between HMM-based models using different metrics and divergences. The Kullback-Leibler divergence is the only one that gave good results, in the sense that the calculated distances between languages correspond to the analytical understanding of their similarity. Although the method does not work very well, we conclude that it is usable, though some other methods may be more reasonable to use.
Keywords
- Distance between languages
- Hidden Markov models
- Kullback-Leibler divergence
Notes
- 1.
We chose women because we collected more female speech data in our expeditions, apparently because women live longer [12] and are more talkative (at least by our observations, although in research the difference in daily word use between women and men does not reach statistical significance, e.g., [13, 14]).
- 2.
This program is used to perform a single re-estimation of the parameters of a set of HMMs using an embedded training version of the Baum-Welch algorithm. Training data consists of one or more utterances each of which has a transcription in the form of a standard label file (segment boundaries are ignored). For each training utterance, a composite model is effectively synthesized by concatenating the phoneme models given by the transcription. [5]
- 3.
We are also concerned with the statistical problem of discrimination, by considering a measure of "distance" or "divergence" between statistical populations in terms of our measure of information. For the statistician two populations differ more or less according as to how difficult it is to discriminate between them with the best test. The particular measure we use has been considered by Jeffreys in another connection. He is primarily concerned with its use in providing an invariant density of a priori probability. A special case of this divergence is Mahalanobis' generalized distance. [6]
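The divergence described in note 3 can be illustrated on a toy case. The sketch below is not the HMM-level computation used in the paper (which is not reproduced here); it assumes two Gaussian densities with diagonal covariance, for which the Kullback-Leibler divergence has a closed form. The function names `kl_diag_gauss`, `jeffreys_divergence`, and `mahalanobis_squared` are hypothetical, chosen for illustration only. It shows that the symmetrized (Jeffreys) divergence reduces to the squared Mahalanobis distance when both densities share the same covariance.

```python
import math
from typing import Sequence


def kl_diag_gauss(mu_p: Sequence[float], var_p: Sequence[float],
                  mu_q: Sequence[float], var_q: Sequence[float]) -> float:
    """Closed-form KL(p || q) for Gaussians with diagonal covariance.

    Assumption for illustration: each density is given by a mean vector
    and a vector of per-dimension variances.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def jeffreys_divergence(mu_p, var_p, mu_q, var_q) -> float:
    """Symmetrized divergence J(p, q) = KL(p || q) + KL(q || p)."""
    return (kl_diag_gauss(mu_p, var_p, mu_q, var_q)
            + kl_diag_gauss(mu_q, var_q, mu_p, var_p))


def mahalanobis_squared(mu_p, mu_q, var) -> float:
    """Squared Mahalanobis distance for a shared diagonal covariance."""
    return sum((mp - mq) ** 2 / v for mp, mq, v in zip(mu_p, mu_q, var))


# Two toy densities with the same covariance: J equals Mahalanobis squared.
mu_a, mu_b, shared_var = [0.0, 1.0], [1.0, 3.0], [1.0, 4.0]
j_equal = jeffreys_divergence(mu_a, shared_var, mu_b, shared_var)
d2 = mahalanobis_squared(mu_a, mu_b, shared_var)

# With different variances, KL is asymmetric, which is why it is a
# divergence rather than a metric.
kl_pq = kl_diag_gauss([0.0], [1.0], [0.0], [4.0])
kl_qp = kl_diag_gauss([0.0], [4.0], [0.0], [1.0])
```

Because the one-directional divergence is asymmetric, a distance between languages built on it needs either the symmetrized form or an explicit convention about direction.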
References
Vinogradov, V.A.: Idiom. In: Yartseva, V.N. (ed.) Linguistic Encyclopedic Dictionary, p. 685. Sovetskaya Entsiklopediya, Moscow (1990). (in Russian)
Rabiner, L.: A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77(2), 262–286 (1989)
Kushnir, D.A.: An algorithm for forming the structure of a reference template for word-by-word speaker-independent recognition of commands from a limited vocabulary. Shtuchnyi Intelekt (Artificial Intelligence), no. 3, Kyiv (2006). (in Russian)
Berzini, A.: Inp’ormats’iis mopovebis prints’ipebi p’onogramebis avtomaturi analizist’vis [Principles of information collection for automated analysis of phonograms]. In: K’art’uli ena da t’anamedrove tek’nologiebi - 2011, pp. 39–46. Meridiani, Tbilisi (2011). (in Georgian)
Young, S., et al.: The HTK Book (for HTK Version 3.4). Cambridge University Engineering Department, Cambridge (2009)
Kullback, S., Leibler, R.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
Šimko, J., Suni, A., Hiovain, K., Vainio, M.: Comparing languages using hierarchical prosodic analysis. In: Proceedings of Interspeech 2017, pp. 1213–1217 (2017)
Nerbonne, J., Heeringa, W., van den Hout, E., van der Kooi, P., Otten, S., van de Vis, S.W.: Phonetic distance between Dutch dialects. In: CLIN VI, Papers from the Sixth CLIN Meeting. Antwerp: University of Antwerp, Center for Dutch Language and Speech, pp. 185–202
Tambovtsev, Y.: Phonological similarity between Basque and other world languages based on the frequency of occurrence of certain typological consonantal features. Prague Bull. Math. Linguist. 79–80, 121–126 (2003)
Berzinch, A.A.: La comparaison de typologie traditionnelle et de typologie phonolexique, basée sur la méthode des n-grammes, dans les dialectes baltes. In: Identification des langues et des variétés dialectales par les humains et par les machines. École Nationale Supérieure des Télécommunications, Paris (2004)
Berzin', A.U.: Measuring the phonomorpholexical distance between Latvian dialects using the Wagner-Fischer distance. In: Proceedings of the International Conference Dialog 2006. RGGU Publishing, Moscow (2006). (in Russian)
Demogrāfija 2018: statistisko datu krājums [Demography 2018: Collection of Statistical Data]. Centrālā statistikas pārvalde, Rīga (2018)
Mehl, M.R., Vazire, S., Ramírez-Esparza, N., Slatcher, R.B., Pennebaker, J.W.: Are women really more talkative than men? Science 317(5832), 82 (2007). American Association for the Advancement of Science, Washington
Liberman, M.: Sex-Linked Lexical Budgets. Language Log 2006/2007. http://itre.cis.upenn.edu/~myl/languagelog/archives/003420.html. Accessed 15 Sept 2019
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Bērziņš, A.A. (2019). Usage of HMM-Based Speech Recognition Methods for Automated Determination of a Similarity Level Between Languages. In: Ustalov, D., Filchenkov, A., Pivovarova, L. (eds) Artificial Intelligence and Natural Language. AINL 2019. Communications in Computer and Information Science, vol 1119. Springer, Cham. https://doi.org/10.1007/978-3-030-34518-1_8
DOI: https://doi.org/10.1007/978-3-030-34518-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34517-4
Online ISBN: 978-3-030-34518-1
eBook Packages: Computer Science, Computer Science (R0)