Abstract
This chapter introduces the first deep learning architecture of the book, convolutional neural networks. It starts by redefining the way a logistic regression accepts data, and defines 1D and 2D convolutional layers as a natural extension of logistic regression. The chapter also details how to connect the layers and how to handle the resulting dimensionality problems. The local receptive field is introduced as a core concept of any convolutional architecture, and its connection with the vanishing gradient problem is explored. The idea of padding is introduced in the visual setting, as is the stride of the local receptive field. Pooling is explored both in the general setting and as max-pooling. A complete convolutional neural network for classifying MNIST is then presented in Keras code, and all the details of the code are explained in comments and illustrations. The final section of the chapter presents the modifications needed to adapt convolutional networks, which are primarily visual classifiers, to work with text and language.
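The sliding of a local receptive field over an image, as described above, can be sketched in a few lines of plain Python. This is a minimal illustration only, not the chapter's Keras code; the image values and the 'diagonal detector' filter are made up for the example.

```python
# A minimal sketch of a 2D convolution with one 3x3 filter
# (no padding, stride 1). Values are illustrative only.

def conv2d(image, kernel):
    """Slide `kernel` over `image`; each output cell is the sum
    of elementwise products over the local receptive field."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(k) for b in range(k))
            row.append(s)
        output.append(row)
    return output

image = [[1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1]]
kernel = [[1, 0, 0],
          [0, 1, 0],
          [0, 0, 1]]   # a made-up 'diagonal detector'

feature_map = conv2d(image, kernel)   # a 2x2 feature map
print(feature_map)   # [[3, 0], [0, 3]]
```

Note how the 4 by 4 input shrinks to a 2 by 2 feature map when no padding is used; this is the dimensionality issue the chapter discusses.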
Notes
- 1.
Yann LeCun once said in an interview that he prefers the name ‘convolutional network’ to ‘convolutional neural network’.
- 2.
An image in this sense is any 2D array with values between 0 and 255. In Fig. 6.1 we have numbered the positions, and you may think of them as ‘cell numbers’, in the sense that they will contain some value, but the number on the image denotes only their order. In addition, note that if we have e.g. 100 by 100 RGB images, each image would be a 3D array (tensor) with dimensions (100, 100, 3). The last dimension of the array would hold the three channels, red, green and blue.
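The shapes described in the note above can be illustrated with nested Python lists; the 3 by 3 sizes here are toy values, not the 100 by 100 from the note.

```python
# Sketch: a grayscale image is a 2D array of values in 0..255;
# an RGB image adds a third dimension for the channels.
# Sizes here (3x3) are toy values for illustration.

grayscale = [[0, 128, 255],
             [64, 32, 16],
             [200, 100, 50]]          # shape (3, 3)

# An RGB pixel holds three channel values (red, green, blue),
# so a 3x3 RGB image has shape (3, 3, 3).
rgb = [[[r, 0, 0] for r in row] for row in grayscale]

rows, cols = len(rgb), len(rgb[0])
channels = len(rgb[0][0])
print(rows, cols, channels)   # 3 3 3
```

For the note's 100 by 100 RGB images the same structure simply grows to shape (100, 100, 3).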
- 3.
Here you might notice how important weight initialization is. We do have some techniques that are better than random initialization, but finding a good weight initialization strategy remains an important open research problem.
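One example of a better-than-random strategy mentioned in the note is Glorot (Xavier) initialization, which scales the random range by the layer sizes. The sketch below is an assumption of what such a strategy looks like, not a method from the chapter; the layer sizes 784 and 100 are borrowed from the MNIST setting.

```python
import random
import math

random.seed(0)

n_in, n_out = 784, 100

# Plain random initialization: weights uniform in [-1, 1].
plain = [random.uniform(-1, 1) for _ in range(n_in * n_out)]

# A common better-than-random strategy (Glorot/Xavier-style):
# scale the range by the layer sizes so the signal neither
# explodes nor shrinks as it passes through the layer.
limit = math.sqrt(6.0 / (n_in + n_out))
glorot = [random.uniform(-limit, limit) for _ in range(n_in * n_out)]

print(max(abs(w) for w in glorot) <= limit)   # True
```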
- 4.
If we use padding, we keep the same size but still expand the depth. Padding is useful when there is possibly important information on the edges of the image.
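The size-keeping effect of padding in the note above follows from the usual output-size arithmetic for a convolutional layer; the helper name below is ours, and the 28 by 28 input with a 3 by 3 filter is an illustrative choice.

```python
# Sketch of the standard output-size arithmetic for a conv layer:
# out = (in - filter + 2 * padding) // stride + 1
# The function name and the numbers are illustrative.

def conv_output_size(in_size, filter_size, padding=0, stride=1):
    return (in_size - filter_size + 2 * padding) // stride + 1

# Without padding a 28x28 image shrinks under a 3x3 filter...
print(conv_output_size(28, 3))              # 26
# ...but padding of 1 on each side keeps the 28x28 size.
print(conv_output_size(28, 3, padding=1))   # 28
```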
- 5.
You have everything you need in this book to get the array (tensor) with the feature maps, and even to squash it to 2D, but you might have to search the Internet to find out how to visualize the tensor as an image. Consider it a good (but advanced) Python exercise.
- 6.
If it has 100 neurons per layer, with only one output neuron, that makes a total of \(784\cdot 100 + 100\cdot 100+ 100\cdot 100 + 100\cdot 1 = 98500\) parameters, and that is without the biases!
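The count in the note above can be checked directly, and extending it with the biases (one per non-input neuron) shows how quickly fully connected layers accumulate parameters:

```python
# Re-doing the parameter count from the note: a fully connected
# network with layer sizes 784 -> 100 -> 100 -> 100 -> 1.

layers = [784, 100, 100, 100, 1]

# Each pair of adjacent layers contributes in_size * out_size weights.
weights = sum(a * b for a, b in zip(layers, layers[1:]))
print(weights)   # 98500, as in the note

# Adding the biases (one per non-input neuron) gives even more:
biases = sum(layers[1:])
print(weights + biases)   # 98801
```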
- 7.
Which is, mathematically speaking, a tensor.
- 8.
Remember how we can convert a 28 by 28 matrix into a 784-dimensional vector.
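The conversion recalled in the note above is a row-by-row flattening; a toy 2 by 3 matrix is shown first so the ordering is visible:

```python
# Flattening a matrix into a vector, row by row.
matrix = [[1, 2, 3],
          [4, 5, 6]]
vector = [value for row in matrix for value in row]
print(vector)   # [1, 2, 3, 4, 5, 6]

# The same comprehension turns a 28x28 image into a
# 784-dimensional vector (all-zero toy image here).
image = [[0] * 28 for _ in range(28)]
flat = [v for row in image for v in row]
print(len(flat))   # 784
```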
- 9.
Keras calls them ‘Dense’.
- 10.
Practically every paper has a ‘trickiest part’, and it is your job to learn how to decode it, since it is often the most important part of the paper.
- 11.
The whole alphabet will not fit on a page, but you can easily imagine how the example would expand to the full English alphabet.
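The reduced alphabet in the note above suggests a one-hot encoding of characters, which is the usual way character-level convolutional networks receive text; the four-letter alphabet and the helper below are illustrative, and the full 26-letter version is just a wider matrix.

```python
# Sketch of character-level one-hot encoding on a toy alphabet.
alphabet = "abcd"
index = {ch: i for i, ch in enumerate(alphabet)}

def one_hot(text):
    """Each character becomes a row with a single 1 in the
    column of that character's alphabet position."""
    return [[1 if i == index[ch] else 0
             for i in range(len(alphabet))]
            for ch in text]

encoded = one_hot("bad")
print(encoded)   # [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
```

The resulting matrix can then be treated much like a (very narrow) image by a convolutional layer.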
- 12.
A couple of hours each day—not a literal week.
© 2018 Springer International Publishing AG
Cite this chapter
Skansi, S. (2018). Convolutional Neural Networks. In: Introduction to Deep Learning. Undergraduate Topics in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-73004-2_6
Print ISBN: 978-3-319-73003-5
Online ISBN: 978-3-319-73004-2