Abstract
How the brain recognizes complex patterns in the environment is a central but little-understood question in neuroscience. The problem is of great significance for a host of applications, such as biometric-based access control, autonomous robots and content-based information management. Although some headway has been made in these directions, current artificial systems do not match the robustness and versatility of their biological counterparts. Here I examine recognition tasks drawn from two different sensory modalities: face recognition and speaker/speech recognition. The goal is to characterize the present state of artificial recognition technologies for these tasks, the influence of neuroscience on the design of these systems and the key challenges they face.
Acknowledgements
For a broad-ranging review such as this, one has to draw upon the expertise of several colleagues. I am grateful for generous help from Keith Kluender, Steve Greenberg, Hynek Hermansky and Paul Griffin.
Cite this article
Sinha, P. Recognizing complex patterns. Nat Neurosci 5 (Suppl 11), 1093–1097 (2002). https://doi.org/10.1038/nn949
DOI: https://doi.org/10.1038/nn949
- Springer Nature America, Inc.