
Spracherkennung mit Hidden Control Neural Networks

Speech recognition using hidden control neural networks

  • Peer-reviewed original papers
e&i Elektrotechnik und Informationstechnik

Abstract

Hidden control neural networks (HCN networks) are suitable for a wide variety of pattern recognition tasks. This paper describes a speech recognition method for speaker-independent isolated-word recognition, which makes it possible to implement user interfaces that control devices via simple word commands. To evaluate the method, minimal pairs were used, that is, pairs of words that differ in only a single phoneme. The recognition rate was increased by giving greater weight to those time segments of the recording that proved relevant for recognition during training.
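The idea of weighting recognition-relevant time segments can be illustrated with a minimal sketch. It assumes that each word model yields one error value per time segment (lower meaning a better match) and that the recognizer picks the word with the lowest weighted mean error. The abstract does not specify the scoring or weighting scheme, so the function names `weighted_score` and `recognize`, the word pair, and all numbers below are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from typing import Dict, Sequence


def weighted_score(segment_errors: Sequence[float],
                   weights: Sequence[float]) -> float:
    """Weighted mean of per-segment errors; lower means a better match."""
    if len(segment_errors) != len(weights):
        raise ValueError("errors and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(e * w for e, w in zip(segment_errors, weights)) / total_weight


def recognize(errors_by_word: Dict[str, Sequence[float]],
              weights: Sequence[float]) -> str:
    """Return the word whose model has the lowest weighted error."""
    return min(errors_by_word,
               key=lambda word: weighted_score(errors_by_word[word], weights))


# Hypothetical minimal pair that differs only in the first phoneme.
# Invented error values for an utterance of "Maus": the "Haus" model
# fits the shared segments slightly better but fails on the first one.
errors = {
    "Haus": [0.90, 0.20, 0.20, 0.20],
    "Maus": [0.30, 0.45, 0.45, 0.45],
}

uniform = [1.0, 1.0, 1.0, 1.0]   # every segment counts equally
emphasis = [4.0, 1.0, 1.0, 1.0]  # first segment assumed to be the relevant one

print(recognize(errors, uniform))   # picks "Haus" (misrecognition)
print(recognize(errors, emphasis))  # picks "Maus"
```

With uniform weights the small differences in the shared segments outweigh the one segment that actually distinguishes the two words; emphasizing that segment reverses the decision. How the relevant segments are identified during training is left open here, since the abstract does not describe it.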

Additional information

This work was awarded a GIT Prize at the most recent general assembly of the ÖVE on 27 November 1997.

Cite this article

Hickersberger, H. Spracherkennung mit Hidden Control Neural Networks. Elektrotech. Inftech. 115, 245–250 (1998). https://doi.org/10.1007/BF03159578
