Abstract
In deep learning-based visual pattern recognition systems, the entire image is typically presented to the system at once. The human visual system, by contrast, scans a large visual object through sequential shifts of attention and integrates these glimpses for classification. Such sequential integration is also useful in artificial systems, particularly when the input image is too large to process whole. Previous studies based on Elman and Jordan networks have explored only fully connected layers with the full image as input, not convolutional layers operating on attention windows. To this end, we present the Convolutional Elman Jordan Neural Network (CEJNN), a novel recurrent architecture with spatiotemporal memory that integrates information from a series of small attention windows applied over the full image. Two variants of CEJNN were developed, one for reconstruction and one for classification. For both tasks, the network is trained on 48,000 images and tested on 10,000 images from the MNIST handwritten digit database. Our experiments show that the network captures the spatiotemporal correlations in the input well, achieving a mean squared error (MSE) of 0.012 on the reconstruction task and 97.62% accuracy on the classification test set.
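The chapter body is not included here, and the abstract does not specify layer sizes or update equations. As a rough illustration of the described idea, the sketch below combines a convolutional encoder over each attention window with Elman-style (hidden-state) and Jordan-style (output) feedback, so context accumulates across the scan. All names and dimensions (CEJNNSketch, window, hidden, n_out) are illustrative assumptions in PyTorch, not the authors' implementation.

```python
import torch
import torch.nn as nn


class CEJNNSketch(nn.Module):
    """Illustrative Convolutional Elman-Jordan network (not the authors' code).

    A small convolutional encoder processes each attention window in turn;
    an Elman-style loop feeds the previous hidden state back in, and a
    Jordan-style loop feeds the previous output back in, so the network
    integrates spatiotemporal information across the sequence of windows.
    """

    def __init__(self, window=8, hidden=128, n_out=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        feat = 32 * window * window          # flattened encoder output size
        # hidden state combines window features with both feedback paths
        self.hidden_fc = nn.Linear(feat + hidden + n_out, hidden)
        self.out_fc = nn.Linear(hidden, n_out)

    def forward(self, windows):
        # windows: (batch, steps, 1, W, W) -- a sequence of attention crops
        b, t = windows.shape[:2]
        h = windows.new_zeros(b, self.hidden_fc.out_features)  # Elman context
        y = windows.new_zeros(b, self.out_fc.out_features)     # Jordan context
        for i in range(t):
            feats = self.encoder(windows[:, i])
            h = torch.relu(self.hidden_fc(torch.cat([feats, h, y], dim=1)))
            y = self.out_fc(h)
        return y  # logits after integrating all windows


# Example: scan a digit as 16 non-overlapping 8x8 windows (a 28x28 MNIST
# image would need zero-padding to 32x32 for this tiling to be exact).
model = CEJNNSketch(window=8)
crops = torch.randn(16, 16, 1, 8, 8)  # batch of 16, 16 windows each
logits = model(crops)                 # -> (16, 10)
```

A reconstruction variant, as the abstract describes, would presumably replace the classification head with a decoder that emits the full image while sharing the same recurrent scanning scheme.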
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kumari, S., Aravindakshan, S., Jain, U., Srinivasa Chakravarthy, V. (2021). Convolutional Elman Jordan Neural Network for Reconstruction and Classification Using Attention Window. In: Sharma, M.K., Dhaka, V.S., Perumal, T., Dey, N., Tavares, J.M.R.S. (eds) Innovations in Computational Intelligence and Computer Vision. Advances in Intelligent Systems and Computing, vol 1189. Springer, Singapore. https://doi.org/10.1007/978-981-15-6067-5_20
DOI: https://doi.org/10.1007/978-981-15-6067-5_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-6066-8
Online ISBN: 978-981-15-6067-5
eBook Packages: Intelligent Technologies and Robotics (R0)