Visualization of Patient Samples by Dimensionality Reduction of Genome-Wide Measurements

  • Huilei Xu
  • Avi Ma'ayan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7058)


As the cost of genome-wide profiling decreases, it is becoming increasingly feasible to use such technologies for routine diagnostics and for classifying and stratifying patients in clinical settings. However, the high dimensionality of such data makes it difficult to interpret and visualize when comparing and contrasting patient samples. Here we propose two visualization methods that display the unsupervised clustering of genome-wide mRNA expression profiles from breast cancer tumors as images, which quickly reveal clusters of patients based on their expression profiles in the context of their clinical outcome. The first method converts expression profiles into a sparse network, whereas the second places patient samples on a hexagonal grid. Both methods use the first three coordinates from principal component analysis (PCA) to reduce the dimensionality of the data. Nodes in the network, or hexagons in the grid, are colored by clinical outcome or tumor estrogen receptor (ER) status. Such visualization methods could be useful for grouping patients in an unsupervised manner to predict outcome and tailor personalized therapeutics.
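The dimensionality-reduction step described above can be illustrated with a minimal sketch. It projects samples onto their first three principal components and then links each sample to its nearest neighbors in that space to obtain a sparse network. The abstract does not state how the network edges are chosen, so the k-nearest-neighbor rule, the function names `pca_first_three` and `knn_edges`, the value of `k`, and the synthetic expression matrix are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def pca_first_three(expr):
    """Project samples (rows) onto their first three principal components."""
    centered = expr - expr.mean(axis=0)  # center each gene (column)
    # SVD of the centered matrix; rows of vt are the principal axes
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T

def knn_edges(coords, k=3):
    """Link each sample to its k nearest neighbours (assumed edge rule)."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    edges = set()
    for i, row in enumerate(dist):
        for j in np.argsort(row)[:k]:
            # store undirected edges once, as (smaller index, larger index)
            edges.add((min(i, int(j)), max(i, int(j))))
    return sorted(edges)

# Hypothetical data: 20 samples x 50 genes, with two synthetic groups
rng = np.random.default_rng(0)
expr = rng.normal(size=(20, 50))
expr[10:, :5] += 4.0

coords = pca_first_three(expr)   # shape (20, 3)
edges = knn_edges(coords, k=3)   # sparse undirected edge list
```

In a sketch like this, node colors would come from a separate per-sample label such as clinical outcome or ER status, which plays no part in the projection or the edge selection, so the grouping stays unsupervised.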


Microarrays, Graph Theory, Hexagonal Grid, Principal Component Analysis, Dimensionality Reduction, Data Visualization





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Huilei Xu (1)
  • Avi Ma'ayan (1)
  1. Department of Pharmacology and Systems Therapeutics, Systems Biology Center New York (SBCNY), Mount Sinai School of Medicine, New York, USA
