From Kratzenstein to the Soviet Vocoder: Some Results of a Historic Research Project in Speech Technology

  • Rüdiger Hoffmann
  • Peter Birkholz
  • Falk Gabriel
  • Rainer Jäckel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11096)


This paper demonstrates, by means of an example, how the historic collections of universities can be used in modern research and teaching. The project draws on the Historic Acoustic-phonetic Collection (HAPS) of the TU Dresden. Two “guiding fossils” from the history of speech technology serve to present selected results.


History of speech communication research · Mechanical speech synthesis · Vocoder



Supported by the German Federal Ministry of Education and Research (BMBF) in the project “Sprechmaschine”, FKZ 01UQ1601A.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Rüdiger Hoffmann (1, corresponding author)
  • Peter Birkholz (1)
  • Falk Gabriel (1)
  • Rainer Jäckel (1)
  1. Institut für Akustik und Sprachkommunikation, Technische Universität Dresden, Dresden, Germany
