Spectral Properties of Correlation Matrices – Towards Enhanced Spectral Clustering

Data Mining in Proteomics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 696))

Abstract

This chapter compiles properties of the eigenvalues and eigenvectors of correlation and other matrices constructed from uncorrelated as well as systematically correlated Gaussian noise. All results are based on simulations. The settings considered span two extremes of the possible scenarios for correlation analysis and clustering to which random matrix theory might contribute: time series analysis at one end and gene/protein profile analysis with micro-arrays at the other. The main difference between the two is the number of variables relative to the number of observations. To what extent results can be transferred between them is not yet clear. Random matrix theory as such makes statements about the statistical properties of eigenvalues and eigenvectors; the expectation is that these statements, if used properly, will improve the clustering of genes for the detection of functional groups. Throughout the scenarios, the relation and interchangeability between the concepts of time, experiment, and realisations of random variables play an important role. The mapping between a classical random matrix ensemble and the micro-array scenario is not yet obvious. In any case, we can make statements about pitfalls and sources of false conclusions. We also develop an improved spectral clustering algorithm based on the properties of eigenvalues and eigenvectors of correlation matrices. We found it necessary to review and analyse these properties from the bottom up, starting at one extreme of the scenarios and moving towards the micro-array scenario.
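The kind of simulation the abstract describes can be illustrated with a minimal sketch. It is not the authors' code: the function names, matrix sizes, and seed are assumptions chosen for illustration. It draws uncorrelated Gaussian noise, builds the Pearson correlation matrix, and compares the eigenvalue range with the Marchenko-Pastur edges (1 ± sqrt(N/T))^2, the standard random-matrix result for N variables and T observations with N ≤ T. A second run with far more variables than observations mimics the micro-array extreme, where most eigenvalues collapse to zero.

```python
import numpy as np


def noise_correlation_spectrum(n_vars, n_obs, seed=0):
    """Eigenvalues (ascending) of the Pearson correlation matrix of
    uncorrelated Gaussian noise; rows are variables, columns observations."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n_vars, n_obs))
    corr = np.corrcoef(data)          # n_vars x n_vars, unit diagonal
    return np.linalg.eigvalsh(corr)   # symmetric solver, sorted ascending


def marchenko_pastur_edges(n_vars, n_obs):
    """Asymptotic lower and upper edges of the noise spectrum (n_vars <= n_obs)."""
    q = n_vars / n_obs
    return (1.0 - np.sqrt(q)) ** 2, (1.0 + np.sqrt(q)) ** 2


# Time-series-like extreme: few variables, many observations (illustrative sizes).
eigvals = noise_correlation_spectrum(n_vars=100, n_obs=1000)
lo, hi = marchenko_pastur_edges(n_vars=100, n_obs=1000)
print("empirical range:", eigvals.min(), eigvals.max())
print("Marchenko-Pastur edges:", lo, hi)

# Micro-array-like extreme: many variables, few observations.
# Mean-centring each row leaves at most n_obs - 1 non-zero eigenvalues.
wide_eigvals = noise_correlation_spectrum(n_vars=200, n_obs=20)
n_nonzero = int((wide_eigvals > 1e-8).sum())
print("non-zero eigenvalues in the wide case:", n_nonzero)
```

In the first case, eigenvalues falling well outside the Marchenko-Pastur edges would hint at genuine correlation structure. In the second case the correlation matrix is rank deficient, so the noise spectrum looks very different, which is one reason results may not transfer directly between the two extremes.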

Author information

Corresponding author

Correspondence to Daniel Fulger.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Fulger, D., Scalas, E. (2011). Spectral Properties of Correlation Matrices – Towards Enhanced Spectral Clustering. In: Hamacher, M., Eisenacher, M., Stephan, C. (eds) Data Mining in Proteomics. Methods in Molecular Biology, vol 696. Humana Press. https://doi.org/10.1007/978-1-60761-987-1_25

  • DOI: https://doi.org/10.1007/978-1-60761-987-1_25

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-986-4

  • Online ISBN: 978-1-60761-987-1

  • eBook Packages: Springer Protocols
