Tools for Identification of Structure in Data

Gentle, James E.

doi:10.1007/978-0-387-98144-4_9

Chapter
First Online: 01 January 2009

Part of the book series: Statistics and Computing (SCO)

Abstract

In recent years, our increased ability to collect and store data has produced enormous datasets, which may consist of billions of observations and millions of variables. Some of the classical methods of statistical inference, in which a parametric model is studied, are neither feasible nor relevant for the analysis of these datasets. The objective instead is to identify interesting structures in the data, such as clusters of observations or relationships among the variables. Sometimes the structures allow a reduction in the dimensionality of the data. Many of the classical methods of multivariate analysis, such as principal components analysis, factor analysis, canonical correlations analysis, and multidimensional scaling, are useful in identifying interesting structures. These methods generally attempt to combine variables in such a way as to preserve information yet reduce the dimension of the dataset. Dimension reduction generally carries a loss of some information; whether the lost information is important is the major concern. Another set of methods for reducing the complexity of a dataset attempts to group observations together, in effect combining observations.
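
As a rough illustration of the trade-off described in the abstract, the following sketch (not taken from the chapter) finds the first principal component of a small synthetic dataset and reports the fraction of total variance retained when three variables are reduced to one. The function names, the synthetic data, and the use of power iteration are all assumptions made for illustration; the chapter itself may present these methods differently.

```python
import math
import random


def synthetic_rows(n, seed=0):
    """Hypothetical data: three variables driven by one latent factor plus noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        rows.append([x,
                     2.0 * x + 0.1 * rng.gauss(0.0, 1.0),
                     -x + 0.1 * rng.gauss(0.0, 1.0)])
    return rows


def center(rows):
    """Subtract the column means from every observation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def matvec(a, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def leading_component(cov, iters=200, seed=0):
    """Power iteration: approximate the leading eigenvector and eigenvalue.

    Assumes the covariance matrix is nonzero and the largest eigenvalue is
    well separated from the rest, which holds for the synthetic data above.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in cov]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v estimates the eigenvalue.
    eigenvalue = sum(x * y for x, y in zip(v, matvec(cov, v)))
    return v, eigenvalue


def retained_fraction(rows):
    """Share of total variance kept by projecting onto the first component."""
    cov = covariance(center(rows))
    _, eigenvalue = leading_component(cov)
    total_variance = sum(cov[i][i] for i in range(len(cov)))
    return eigenvalue / total_variance


if __name__ == "__main__":
    data = synthetic_rows(200)
    print("fraction of variance retained:", retained_fraction(data))
```

Because the synthetic variables are built from a single latent factor, one component keeps nearly all of the variance here. On real data the retained fraction can be much lower, which is the concern about lost information that the abstract raises.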

Author information

Authors and Affiliations

Department of Computational & Data Sciences, George Mason University, 4400 University Drive, Fairfax, VA 22030-4444, USA
James E. Gentle

Corresponding author

Correspondence to James E. Gentle.

About this chapter

Cite this chapter

Gentle, J.E. (2009). Tools for Identification of Structure in Data. In: Computational Statistics. Statistics and Computing. Springer, New York, NY. https://doi.org/10.1007/978-0-387-98144-4_9

DOI: https://doi.org/10.1007/978-0-387-98144-4_9
Published: 25 June 2009
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-98143-7
Online ISBN: 978-0-387-98144-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)