Representation and Recognition of Temporal Patterns

Port, Robert F.

doi:10.1007/978-94-011-2624-3_15

Abstract

How can a nervous system represent for itself the temporal relations of patterns that it knows? In order to label auditory patterns, the nervous system must store early portions in order to identify the whole. Both linguists and engineer-scientists have a similar need to record spoken words. This paper reviews three basic models for handling the information-collection problem that supports pattern recognition, whether by scientists or others. Many of these techniques have been implemented in connectionist networks. In linguistic models for words, there are only ordered symbols, i.e. either phonemic segments or words. In engineering and speech science, time windows are built that store the entire signal and allow parametric description of time. But such windows are not plausible for nervous systems. A third alternative is a memory in the form of a dynamic system. These models are driven through a trajectory in state space by the input signals. Thus, the recognition process for familiar patterns produces a distinct trajectory through state space for each learned pattern. Among the advantages of such a system are that (1) it tends to recognize patterns despite changes in the rate of presentation, and (2) the system can be run continuously yet will respond as quickly as possible at appropriate times. Evidence is reviewed about human auditory memory for complex tone sequences. The data suggest that human auditory memory exhibits many similarities to the dynamic model.
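
The third alternative, a memory that input signals drive through a trajectory in state space, can be illustrated with a toy example. The sketch below is not the chapter's model: the two-unit system, the weight values in W and U, the tanh squashing function, and the names step and trajectory are all assumptions chosen for illustration. It shows only that the same two inputs presented in different orders leave such a system in different states, so the state itself carries information about temporal order.

```python
import math

# Hypothetical fixed weights for a 2-unit recurrent system (illustrative only,
# not taken from the chapter).
W = [[0.5, -0.3], [0.3, 0.5]]   # state-to-state coupling
U = [[1.0, 0.0], [0.0, 1.0]]    # input-to-state coupling


def step(state, x):
    """Advance the state by one time step under the input vector x."""
    return [
        math.tanh(
            sum(W[i][j] * state[j] for j in range(2))
            + sum(U[i][k] * x[k] for k in range(2))
        )
        for i in range(2)
    ]


def trajectory(inputs, state=(0.0, 0.0)):
    """Return every state visited while the input sequence drives the system."""
    states = [list(state)]
    for x in inputs:
        states.append(step(states[-1], x))
    return states


# Two hypothetical input events, presented in both orders.
A = (1.0, 0.0)
B = (0.0, 1.0)
traj_ab = trajectory([A, B])
traj_ba = trajectory([B, A])

# The final states differ, so the order of presentation is recoverable
# from the state alone, without a stored time window.
print(traj_ab[-1])
print(traj_ba[-1])
```

A recognizer built on this idea would read out the current state at each step instead of buffering the whole signal, which is what allows it to run continuously. Tolerance to changes in presentation rate is not demonstrated by this sketch.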

The author is grateful to Svën Anderson for important contributions to the work described here. He is also grateful to Charles Watson, Gary R. Kidd, Michael Gasser, Jungyul Suh and John W. R. Merrill for helpful discussion of these ideas. This research was supported in part by the Air Force Office of Scientific Research, Grant 870089, and by the National Science Foundation, Grants DCR-8505635 and DCR-8518725.

Author information

Authors and Affiliations

Department of Linguistics, Department of Computer Science, Indiana University, Bloomington, IN, 47405, USA
Robert F. Port

Editor information

Editors and Affiliations

University of Exeter, UK
Noel Sharkey

About this chapter

Cite this chapter

Port, R.F. (1992). Representation and Recognition of Temporal Patterns. In: Sharkey, N. (eds) Connectionist Natural Language Processing. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-2624-3_15

DOI: https://doi.org/10.1007/978-94-011-2624-3_15
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-5160-6
Online ISBN: 978-94-011-2624-3
eBook Packages: Springer Book Archive