Variable Selection in Cell Classification Problems: A Strategy Based on Independent Component Analysis

Calò, Daniela G.; Galimberti, Giuliano; Pillati, Marilena; Viroli, Cinzia

doi:10.1007/3-540-27373-5_3

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

This paper addresses the problem of cell classification using gene expression data. A main feature of such data is the very large number of variables (genes) relative to the number of observations (cells), which makes most standard statistical classification methods difficult to apply. The proposed solution builds classification rules on subsets of genes whose behavior across the cells differs most from that of the remaining genes. This variable selection procedure is based on suitable linear transformations of the observed data; a strategy based on independent component analysis is explored. The proposal is compared with the nearest shrunken centroid method (Tibshirani et al., 2002) on three publicly available data sets.
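The selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how genes are ranked or thresholded, so everything here is an assumption made for illustration: the function name `select_extreme_genes`, the `n_sd` cut-off, and the rule of keeping genes whose score on any component lies in the tails of that component's distribution. The component scores are assumed to come from an independent component analysis fitted elsewhere; the toy numbers are invented.

```python
from statistics import mean, stdev


def select_extreme_genes(component_scores, n_sd=2.0):
    """Return sorted indices of genes with an extreme score on any component.

    component_scores: one list per component, holding one score per gene.
    n_sd: hypothetical cut-off, in sample standard deviations from the mean.
    """
    selected = set()
    for scores in component_scores:
        centre = mean(scores)
        spread = stdev(scores)
        if spread == 0:
            # A constant component cannot single out any gene.
            continue
        for gene, value in enumerate(scores):
            if abs(value - centre) > n_sd * spread:
                selected.add(gene)
    return sorted(selected)


# Toy input (invented): 2 components scored on 8 genes.
toy_scores = [
    [0.1, -0.2, 0.0, 5.0, 0.2, -0.1, 0.1, 0.0],
    [0.0, 0.1, -4.0, 0.1, -0.1, 0.2, 0.0, -0.2],
]
selected_genes = select_extreme_genes(toy_scores, n_sd=2.0)
print(selected_genes)
```

A classifier would then be trained on the selected genes only. The comparison method named in the abstract, nearest shrunken centroids, performs its own gene selection by shrinking class centroids instead.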

References

ALIZADEH, A.A., EISEN, M.B., DAVIS, R.E. et al. (2000): Distinct Types of Diffuse Large B-cell Lymphoma Identified by Gene Expression Profiling. Nature, 403, 503–511.
DUDOIT, S., FRIDLYAND, J. and SPEED, T.P. (2002): Comparison of Discrimination Methods for the Classification of Tumors using Gene Expression Data. Journal of the American Statistical Association, 97(457), 77–87.
GOLUB, T.R., SLONIM, D.K., TAMAYO, P. et al. (1999): Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science, 286, 531–537.
HYVÄRINEN, A., KARHUNEN, J. and OJA, E. (2001): Independent Component Analysis, Wiley, New York.
KHAN, J., WEI, J., RINGNER, M. et al. (2001): Classification and Diagnostic Prediction of Cancers Using Gene Expression Profiling and Artificial Neural Networks. Nature Medicine, 7, 673–679.
TIBSHIRANI, R., HASTIE, T., NARASIMHAN, B. and CHU, G. (2002): Diagnosis of Multiple Cancer Types by Shrunken Centroids of Gene Expression. Proceedings of the National Academy of Sciences, 99, 6567–6572.
VIROLI, C. (2003): Reflections on a Supervised Approach to Independent Component Analysis, in: Between Data Science and Applied Data Analysis, M. Schader, W. Gaul and M. Vichi (Eds.), Studies in Classification, Data Analysis, and Knowledge Organization, Springer, Berlin, 501–509.
WALL, M.E., RECHTSTEINER, A. and ROCHA, L.M. (2003): Singular Value Decomposition and Principal Component Analysis, in: A Practical Approach to Microarray Data Analysis, Berrar D.P., Dubitzky W. and Granzow M. (Eds.), Kluwer, Norwell, 91–109.

Author information

Authors and Affiliations

Dipartimento di Scienze Statistiche, Università di Bologna, Italy
Daniela G. Calò, Giuliano Galimberti, Marilena Pillati & Cinzia Viroli

Editor information

Editors and Affiliations

Aachen
H.-H. Bock
Karlsruhe
W. Gaul
Rome
M. Vichi
Newark
Ph. Arabie
Cottbus
D. Baier
Milton Keynes
F. Critchley
Bielefeld
R. Decker
Paris
E. Diday
Barcelona
M. Greenacre
Naples
C. Lauro
Leiden
J. Meulman
Bologna
P. Monari
Toronto
S. Nishisato
Tokyo
N. Ohsumi
Augsburg
O. Opitz
Passau
G. Ritter
Mannheim
M. Schader
Dortmund
C. Weihs
Department of Statistics, Probability and Applied Statistics, University of Rome “La Sapienza”, Piazzale Aldo Moro 5, 00185, Rome, Italy
Maurizio Vichi
Department of Statistical Sciences, University of Bologna, Via Belle Arti 41, 40126, Bologna, Italy
Paola Monari, Stefania Mignani & Angela Montanari

Cite this paper

Calò, D.G., Galimberti, G., Pillati, M., Viroli, C. (2005). Variable Selection in Cell Classification Problems: A Strategy Based on Independent Component Analysis. In: Bock, HH., et al. New Developments in Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-27373-5_3

DOI: https://doi.org/10.1007/3-540-27373-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23809-6
Online ISBN: 978-3-540-27373-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)