An Approach to Intelligent Signal Processing

  • Matthias Wolff
  • Rüdiger Hoffmann
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7403)

Abstract

This paper describes an approach to intelligent signal processing. First, we propose a general signal model that applies to speech, music, biological, and technical signals. We formulate this model mathematically using a unification of hidden Markov models and finite state machines. We then identify the tasks of intelligent signal processing systems and derive a hierarchical architecture capable of solving them. We show the close relationship of our approach to cognitive dynamic systems. Finally, we give a number of application examples.

Keywords

intelligent signal processing; hidden Markov automata; hierarchical systems; cognitive systems; acoustic pattern recognition; audio processing

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Matthias Wolff (1)
  • Rüdiger Hoffmann (2)
  1. Lehrstuhl Kommunikationstechnik, Brandenburgische Technische Universität Cottbus, Cottbus, Germany
  2. Systemtheorie und Sprachtechnologie, Technische Universität Dresden, Dresden, Germany
