Usage of HMM-Based Speech Recognition Methods for Automated Determination of a Similarity Level Between Languages

Part of the Communications in Computer and Information Science book series (CCIS, volume 1119)


The problem of automatically determining language similarity (or even of defining a distance on the space of languages) can be approached in different ways: by working with phonetic transcriptions, with speech recordings, or with both. For recordings, we propose and test an HMM-based approach. In the first part of the article we successfully perform language detection; we then calculate distances between HMM-based models using different metrics and divergences. The Kullback-Leibler divergence is the only one that gave good results, meaning that the calculated distances between languages correspond to the analytical understanding of the similarity between them. Although it does not work very well, we conclude that the method is usable, but that other methods may be more reasonable to use.


  • Distance between languages
  • Hidden Markov models
  • Kullback-Leibler divergence

  • DOI: 10.1007/978-3-030-34518-1_8
  • Chapter length: 13 pages


  1. We chose women because we collected more female speech data in our expeditions, apparently because women live longer [12] and are more talkative (at least according to our observations, although in published research their lead in daily word use does not reach statistical significance; see, e.g., [13, 14]).

  2. This program is used to perform a single re-estimation of the parameters of a set of HMMs using an embedded training version of the Baum-Welch algorithm. Training data consists of one or more utterances each of which has a transcription in the form of a standard label file (segment boundaries are ignored). For each training utterance, a composite model is effectively synthesized by concatenating the phoneme models given by the transcription. [5]

  3. We are also concerned with the statistical problem of discrimination, by considering a measure of "distance" or "divergence" between statistical populations in terms of our measure of information. For the statistician two populations differ more or less according as to how difficult it is to discriminate between them with the best test. The particular measure we use has been considered by Jeffreys in another connection. He is primarily concerned with its use in providing an invariant density of a priori probability. A special case of this divergence is Mahalanobis' generalized distance. [6]
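
As a rough illustration of the quantities named in note 3, and not the computation used in the chapter (whose details are not shown on this page), the sketch below evaluates the Kullback-Leibler divergence and the symmetric divergence attributed to Jeffreys for two univariate Gaussians, a density commonly used for HMM emission distributions. The function names `kl_gauss` and `j_divergence` and the example parameters are assumptions made for this example. With equal variances, the symmetric divergence reduces to the squared Mahalanobis distance, the special case mentioned in the quotation.

```python
import math


def kl_gauss(m1: float, s1: float, m2: float, s2: float) -> float:
    """KL(N(m1, s1^2) || N(m2, s2^2)) in nats, for univariate Gaussians.

    Closed form: ln(s2/s1) + (s1^2 + (m1 - m2)^2) / (2 * s2^2) - 1/2.
    """
    if s1 <= 0 or s2 <= 0:
        raise ValueError("standard deviations must be positive")
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5


def j_divergence(m1: float, s1: float, m2: float, s2: float) -> float:
    """Symmetric divergence: KL(p || q) + KL(q || p)."""
    return kl_gauss(m1, s1, m2, s2) + kl_gauss(m2, s2, m1, s1)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the formulas.
    # KL is not symmetric: the two directions generally differ.
    print(kl_gauss(0.0, 1.0, 0.0, 2.0), kl_gauss(0.0, 2.0, 0.0, 1.0))

    # Equal variances: the symmetric divergence equals the squared
    # Mahalanobis distance ((m1 - m2) / s)^2.
    print(j_divergence(0.0, 2.0, 3.0, 2.0), ((0.0 - 3.0) / 2.0) ** 2)
```

Summing the two directions gives a symmetric quantity, which is convenient when the result is read as a distance between models; it still does not satisfy the triangle inequality in general, so it is a divergence rather than a metric.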


  1. Виноградов, В.А.: Идиом. Лингвистический энциклопедический словарь / Под ред. В.Н. Ярцевой, стр. 685. Советская энциклопедия, Москва (1990)

  2. Rabiner, L.: A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77(2), 262–286 (1989)

  3. Кушнир, Д.А.: Алгоритм формирования структуры эталона для пословного дикторо-независимого распознавания команд ограниченного словаря. Штучный інтелект № 3’2006, Київ (2006)

  4. [berzini, a.] [inp’ormats’iis mopovebis prints’ipebi p’onogramebis avtomaturi analizist’vis] = Принципы сбора информации для автоматизированного анализа фонограмм. - 2011 [k’art’uli ena da t’anamedrove tek’nologiebi - 2011] стр. 39–46. [meridiani], [t’bilisi] (2011)

  5. Young, S., et al.: The HTK Book (for HTK Version 3.4). Cambridge University Engineering Department, Cambridge (2009)

  6. Kullback, S., Leibler, R.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)

  7. Šimko, J., Suni, A., Hiovain, K., Vainio, M.: Comparing languages using hierarchical prosodic analysis. In: Proceedings of Interspeech 2017, pp. 1213–1217 (2017)

  8. Nerbonne, J., Heeringa, W., van den Hout, E., van der Kooi, P., Otten, S., van de Vis, S.W.: Phonetic distance between Dutch dialects. In: CLIN VI, Papers from the Sixth CLIN Meeting, pp. 185–202. University of Antwerp, Center for Dutch Language and Speech, Antwerp

  9. Tambovtsev, Y.: Phonological similarity between Basque and other world languages based on the frequency of occurrence of certain typological consonantal features. Prague Bull. Math. Linguist. 79–80, 121–126 (2003)

  10. Berzinch, A.A.: La comparaison de typologie traditionnelle et de typologie phonolexique, basée sur la méthode des n-grammes, dans les dialectes baltes. Identification des langues et des variétés dialectales par les humains et par les machines. École Nationale Supérieure des Télécommunications, Paris (2004)

  11. Берзинь, А.У.: Измерение фономорфолексического расстояния между латышскими наречиями путём применения расстояния Вагнера-Фишера. Труды международной конференции. Диалог 2006. М.: Издательство РГГУ (2006)

  12. Demogrāfija 2018: statistisko datu krājums. R.: Centrālā statistikas pārvalde (2018)

  13. Mehl, M.R., Vazire, S., Ramírez-Esparza, N., Slatcher, R.B., Pennebaker, J.W.: Are women really more talkative than men? Science 317(5832), 82 (2007). American Association for the Advancement of Science, Washington

  14. Liberman, M.: Sex-Linked Lexical Budgets. Language Log 2006/2007. Accessed 15 Sept 2019

Author information

Corresponding author

Correspondence to Ansis Ataols Bērziņš.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Bērziņš, A.A. (2019). Usage of HMM-Based Speech Recognition Methods for Automated Determination of a Similarity Level Between Languages. In: Ustalov, D., Filchenkov, A., Pivovarova, L. (eds) Artificial Intelligence and Natural Language. AINL 2019. Communications in Computer and Information Science, vol 1119. Springer, Cham.

  • DOI: 10.1007/978-3-030-34518-1_8

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34517-4

  • Online ISBN: 978-3-030-34518-1

  • eBook Packages: Computer Science, Computer Science (R0)