Abstract
In this chapter, we first explained what classification problems are and what a decision boundary is. Then, we showed how to model a decision boundary using linear models. To build intuition, linear models were also studied from a geometrical perspective. A linear model must be trained on a training dataset, which requires a way to assess how well the model classifies the training samples. For this purpose, we thoroughly explained different loss functions, including the 0/1 loss, squared loss, hinge loss, and logistic loss. Then, methods for extending binary models to multiclass models, including one-versus-one and one-versus-rest, were reviewed. It is also possible to generalize a binary linear model directly into a multiclass model; this requires loss functions that can be applied to multiclass datasets. We showed how to extend the hinge loss and the logistic loss to multiclass datasets. The major issue with linear models is that they perform poorly on datasets in which the classes are not linearly separable. To overcome this problem, we introduced the idea of a feature transformation function and applied it to a toy example. Designing a feature transformation function by hand can be tedious, especially when it has to be applied to high-dimensional datasets. A better solution is to learn a feature transformation function directly from the training data and to train a linear classifier on top of it. We developed the idea of feature transformations from simple functions to compositional functions and explained how neural networks can be used to simultaneously learn a feature transformation function together with a linear classifier. Training a complex model such as a neural network requires computing the gradient of the loss function with respect to every parameter in the model. Computing gradients using the conventional chain rule might not be tractable.
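The binary loss functions named above can be sketched for a linear model with score \(\mathbf w^\top \mathbf x\) and label \(y \in \{-1, +1\}\). This is an illustrative sketch, not the chapter's own code; the function names and NumPy-based formulation are assumptions.

```python
import numpy as np

def zero_one_loss(w, x, y):
    # 0/1 loss: 1 if the sign of the score disagrees with the label, else 0.
    return 0.0 if y * np.dot(w, x) > 0 else 1.0

def hinge_loss(w, x, y):
    # Hinge loss: max(0, 1 - y * w.x); penalizes margins smaller than 1.
    return max(0.0, 1.0 - y * np.dot(w, x))

def logistic_loss(w, x, y):
    # Logistic loss: log(1 + exp(-y * w.x)); a smooth surrogate of 0/1 loss.
    return np.log1p(np.exp(-y * np.dot(w, x)))
```

Unlike the 0/1 loss, the hinge and logistic losses are continuous in \(\mathbf w\), which is what makes gradient-based training possible.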
We explained how to factorize the multivariate chain rule and reduce the number of arithmetic operations. Using this formulation, we explained the backpropagation algorithm for computing gradients on any computational graph. Next, we described different activation functions that can be used in designing neural networks and mentioned why ReLU activations are preferable to traditional activations such as the hyperbolic tangent. The role of bias in neural networks was also discussed in detail. Finally, we concluded the chapter by explaining how an image can be used as the input of a neural network.
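The preference for ReLU over the hyperbolic tangent can be illustrated by comparing their derivatives, since backpropagation multiplies these derivatives layer by layer. This is a minimal sketch under the standard definitions, not code from the chapter.

```python
import numpy as np

def relu(z):
    # ReLU: identity for positive inputs, zero otherwise.
    return np.maximum(0.0, z)

def relu_grad(z):
    # Derivative of ReLU: 1 for z > 0, 0 elsewhere. It never saturates on
    # the positive side, so gradients there are not shrunk during backprop.
    return np.where(z > 0, 1.0, 0.0)

def tanh_grad(z):
    # Derivative of tanh: 1 - tanh(z)^2, which approaches 0 for large |z|,
    # so gradients vanish as they flow backward through saturated units.
    return 1.0 - np.tanh(z) ** 2
```

For example, at \(z = 5\) the tanh derivative is already below \(10^{-3}\), while the ReLU derivative is still exactly 1.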
Notes
1. Implementations of the methods in this chapter are available at github.com/pcnn/.
2. You can read this formula as “\(N_K\) of \(\mathbf x _q\) given the dataset \(\mathscr {X}\)”.
Copyright information
© 2017 Springer International Publishing AG
Cite this chapter
Habibi Aghdam, H., Jahani Heravi, E. (2017). Pattern Classification. In: Guide to Convolutional Neural Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-57550-6_2
Print ISBN: 978-3-319-57549-0
Online ISBN: 978-3-319-57550-6