Representation and Recognition of Temporal Patterns

Abstract

How can a nervous system represent the temporal relations of the patterns that it knows? To label an auditory pattern, the nervous system must store the early portions of the pattern until the whole can be identified. Linguists and engineer-scientists have a similar need to record spoken words. This paper reviews three basic models for handling the information-collection problem that supports pattern recognition, whether by scientists or others. Many of these techniques have been implemented in connectionist networks. First, in linguistic models for words, there are only ordered symbols, i.e. either phonemic segments or words. Second, in engineering and speech science, time windows are built that store the entire signal and allow parametric description of time. But such windows are not plausible for nervous systems. A third alternative is a memory in the form of a dynamic system. Such a system is driven along a trajectory in state space by the input signal. Thus, the recognition process for familiar patterns produces a distinct trajectory through state space for each learned pattern. Among the advantages of such a system are that (1) it tends to recognize patterns despite changes in the rate of presentation, and (2) it can be run continuously yet will respond as quickly as possible at appropriate times. Evidence is reviewed about human auditory memory for complex tone sequences. The data suggest that human auditory memory exhibits many similarities to the dynamic model.
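
The idea of a memory that is driven through state space by its input can be illustrated with a minimal sketch. The code below is not the model described in the chapter; it is a toy system of three leaky units with a weak ring coupling, and the names SYMBOLS, LEAK and COUPLING, as well as their values, are assumptions chosen only for illustration. It shows that the same tones presented in a different order leave the system in a different state, so the state itself carries information about temporal order without any stored time window.

```python
import math

# All constants are illustrative assumptions, not values from the chapter.
SYMBOLS = "ABC"   # hypothetical alphabet of three tones
LEAK = 0.5        # fraction of each unit's state replaced on every step
COUPLING = 0.3    # weight of the ring connection from unit i-1 to unit i


def step(state, symbol):
    """Advance the state by one input symbol (leaky, weakly coupled units)."""
    n = len(SYMBOLS)
    drive = [
        (1.0 if SYMBOLS[i] == symbol else 0.0) + COUPLING * state[(i - 1) % n]
        for i in range(n)
    ]
    return [(1 - LEAK) * x + LEAK * math.tanh(d) for x, d in zip(state, drive)]


def trajectory(sequence):
    """Return the list of states visited, starting from the zero state."""
    states = [[0.0] * len(SYMBOLS)]
    for symbol in sequence:
        states.append(step(states[-1], symbol))
    return states


def distance(p, q):
    """Euclidean distance between two points in state space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    end_abc = trajectory("ABC")[-1]
    end_cba = trajectory("CBA")[-1]
    # The two orders end at clearly separated points in state space.
    print(distance(end_abc, end_cba))
```

In this sketch the most recent tone dominates the final state while earlier tones persist as decaying traces, which is one simple way for a state to encode order. Whether such a toy system tolerates changes in presentation rate is not tested here.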

The author is grateful to Svën Anderson for important contributions to the work described here. He is also grateful to Charles Watson, Gary R. Kidd, Michael Gasser, Jungyul Suh and John W. R. Merrill for helpful discussion of these ideas. This research was supported in part by the Air Force Office of Scientific Research, Grant 870089, and by the National Science Foundation, Grants DCR-8505635 and DCR-8518725.

Copyright information

© 1992 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Port, R.F. (1992). Representation and Recognition of Temporal Patterns. In: Sharkey, N. (ed.) Connectionist Natural Language Processing. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-2624-3_15

  • DOI: https://doi.org/10.1007/978-94-011-2624-3_15

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-5160-6

  • Online ISBN: 978-94-011-2624-3

  • eBook Packages: Springer Book Archive
