Multidimensional scaling with discrimination coefficients for supervised visualization of high-dimensional data

Berrar, Daniel; Ohmayer, Georg

doi:10.1007/s00521-010-0478-1

ISNN 2010
Published: 04 November 2010

Volume 20, pages 1211–1218, (2011)

Neural Computing and Applications

Daniel Berrar^1,2 & Georg Ohmayer^3


Abstract

Visualization techniques for high-dimensional data sets play a pivotal role in exploratory analysis in a wide range of disciplines. Gene expression data based on microarray technology pose a particularly challenging problem: the number of features (genes) typically exceeds 20,000, whereas the number of samples is frequently below 200. We investigated class-specific discrimination coefficients, computed for each feature and each pair of classes, as the basis for an effective nonlinear mapping to a lower-dimensional space. We applied the technique to three microarray data sets and compared the resulting two-dimensional projections with those of a conventional multidimensional scaling method, a score plot from principal component analysis, and projections from linear discriminant analysis. In the experiments, we observed that the discrimination coefficients allowed for an improved visualization of high-dimensional genomic data.
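The abstract does not define the discrimination coefficients, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a signal-to-noise style coefficient (absolute difference of class means divided by the sum of class standard deviations) for each feature and each pair of classes, and uses these coefficients as weights in a Euclidean dissimilarity between samples. The function names, the normalisation, and the handling of same-class sample pairs are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

EPS = 1e-12  # guards against division by zero for constant features


def discrimination_coefficients(X, y):
    """Return {(a, b): weights} for every unordered pair of classes.

    ASSUMPTION (not from the paper): the coefficient of feature j for
    classes a and b is |mean_a - mean_b| / (sd_a + sd_b), and the
    coefficients of each class pair are normalised to sum to 1.
    """
    classes = sorted(set(y))
    n_features = len(X[0])
    stats = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        # Per-feature (mean, population standard deviation) within class c.
        stats[c] = [(mean(col), pstdev(col)) for col in zip(*rows)]
    coeffs = {}
    for a, b in combinations(classes, 2):
        raw = [
            abs(stats[a][j][0] - stats[b][j][0])
            / (stats[a][j][1] + stats[b][j][1] + EPS)
            for j in range(n_features)
        ]
        total = sum(raw) + EPS
        coeffs[(a, b)] = [r / total for r in raw]
    return coeffs


def pair_weights(coeffs, a, b, n_features):
    """Pick the feature weights for a pair of samples with labels a and b."""
    if a != b:
        return coeffs[(a, b)] if (a, b) in coeffs else coeffs[(b, a)]
    # ASSUMPTION: same-class pairs use the mean of the coefficients over
    # all class pairs that involve this class.
    involved = [w for pair, w in coeffs.items() if a in pair]
    if not involved:  # only one class present: fall back to equal weights
        return [1.0 / n_features] * n_features
    return [mean(col) for col in zip(*involved)]


def supervised_dissimilarities(X, y):
    """Weighted Euclidean dissimilarity matrix using class-pair weights."""
    coeffs = discrimination_coefficients(X, y)
    n, p = len(X), len(X[0])
    D = [[0.0] * n for _ in range(n)]
    for i, k in combinations(range(n), 2):
        w = pair_weights(coeffs, y[i], y[k], p)
        d = sqrt(sum(wj * (xi - xk) ** 2 for wj, xi, xk in zip(w, X[i], X[k])))
        D[i][k] = D[k][i] = d
    return D


# Toy data: feature 0 separates the classes, feature 1 is mostly noise.
X = [[0.0, 5.0], [0.2, 1.0], [0.1, 3.0], [2.0, 4.0], [2.2, 2.0], [2.1, 0.0]]
y = ["A", "A", "A", "B", "B", "B"]
D = supervised_dissimilarities(X, y)
```

Under these assumptions, the matrix `D` could be passed to any multidimensional scaling routine that accepts precomputed dissimilarities in order to obtain a two-dimensional map. Because the weights depend on the class labels, such a map is a supervised visualization and should not be read as evidence of unsupervised class structure.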



Acknowledgments

This work was supported in part by the Japan Society for the Promotion of Science. We thank H. Kitano for his support and the anonymous reviewers for their valuable comments.

Author information

Authors and Affiliations

Systems Biology Research Group, School of Biomedical Sciences, University of Ulster, Coleraine, UK
Daniel Berrar
Department of Cancer Systems Biology, Cancer Institute, Japan Foundation for Cancer Research, Tokyo, Japan
Daniel Berrar
University of Applied Sciences, Weihenstephan, Germany
Georg Ohmayer

Corresponding author

Correspondence to Daniel Berrar.

About this article

Cite this article

Berrar, D., Ohmayer, G. Multidimensional scaling with discrimination coefficients for supervised visualization of high-dimensional data. Neural Comput & Applic 20, 1211–1218 (2011). https://doi.org/10.1007/s00521-010-0478-1

Received: 22 February 2010
Accepted: 19 October 2010
Published: 04 November 2010
Issue Date: November 2011
DOI: https://doi.org/10.1007/s00521-010-0478-1
