Abstract
We have a dataset that is a collection of d-dimensional vectors. This chapter introduces the nasty tricks that such data can play. A dataset like this is hard to plot, though Sect. 4.1 suggests some tricks that are helpful. Most readers will already know the mean as a summary (it’s an easy generalization of the 1D mean). The covariance matrix may be less familiar. This is a collection of all covariances between pairs of components. We use covariances, rather than correlations, because covariances can be represented in a matrix easily. High dimensional data has some nasty properties (it’s usual to lump these under the name “the curse of dimension”). The data isn’t where you think it is, and this can be a serious nuisance, making it difficult to fit complex probability models.
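The mean vector and covariance matrix described above can be sketched in a few lines of numpy. This is a minimal illustration, not code from the chapter; the small random dataset is a made-up example.

```python
import numpy as np

# Hypothetical dataset: 5 samples, each a 3-dimensional vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

# The mean is computed component-wise, generalizing the 1D mean.
mean = X.mean(axis=0)

# The covariance matrix collects the covariances between every pair of
# components: entry (i, j) is the covariance of components i and j.
cov = np.cov(X, rowvar=False)  # rows are samples, columns are components

print(mean.shape)  # (3,)
print(cov.shape)   # (3, 3)
```

Note that the covariance matrix is symmetric (the covariance of components i and j is the same as that of j and i), and its diagonal holds the variance of each component.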
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this chapter
Forsyth, D. (2019). High Dimensional Data. In: Applied Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-18114-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18113-0
Online ISBN: 978-3-030-18114-7
eBook Packages: Computer Science (R0)