Clustering High-Dimensional Data via Spectral Clustering Using Collaborative Representation Coefficients

  • Conference paper
Intelligent Computing Theories and Methodologies (ICIC 2015)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9226))

Abstract

Clustering high-dimensional data is challenging for traditional clustering methods. Spectral clustering is one of the most popular methods for clustering high-dimensional data, and the similarity matrix plays an important role in it. Recently, sparse representation coefficients have been used to construct the similarity matrix for spectral clustering, taking the cosine similarity between each pair of coefficient vectors, with promising results. However, sparse representation overemphasizes the role of \( \ell_{1} \)-norm sparsity and ignores the role of collaborative representation, which makes its computational cost very high. In this paper, we propose a spectral clustering method whose similarity matrix is constructed from collaborative representation coefficient vectors. Extensive experiments show that the proposed method is highly competitive in terms of both computational cost and clustering performance.
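The pipeline outlined in the abstract (code each sample over the others, compare coefficient vectors by cosine similarity, then run spectral clustering) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the formulation, so the sketch assumes the common ridge-regularised (\( \ell_{2} \)) least-squares form of collaborative representation, takes the absolute value of the cosine as the affinity, and uses a normalized-affinity embedding followed by a small k-means. The function names, the parameter `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np

def collaborative_coefficients(X, lam=0.1):
    """Code each sample over all other samples (assumed ridge form).

    X has shape (n_samples, n_features). Row i of the result holds the
    coefficients representing sample i; entry (i, i) stays zero.
    """
    n = X.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        D = X[others].T                        # dictionary, shape (d, n-1)
        G = D.T @ D + lam * np.eye(n - 1)      # regularised Gram matrix
        C[i, others] = np.linalg.solve(G, D.T @ X[i])
    return C

def cosine_affinity(C):
    """Absolute cosine similarity between coefficient vectors (rows of C)."""
    norms = np.linalg.norm(C, axis=1, keepdims=True)
    U = C / np.maximum(norms, 1e-12)
    S = np.abs(U @ U.T)                        # abs() is an assumption here
    np.fill_diagonal(S, 0.0)
    return S

def spectral_labels(S, k, n_iter=50):
    """Embed with the top-k eigenvectors of D^{-1/2} S D^{-1/2}, then k-means."""
    deg = S.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                # eigenvalues in ascending order
    Y = vecs[:, -k:]
    Y = Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)
    # Deterministic farthest-point initialisation of the k centres.
    centers = [Y[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):            # keep old centre if cluster empties
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

# Synthetic data: two clusters drawn from orthogonal 3-dimensional subspaces
# of a 50-dimensional space, plus a little noise.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((50, 6)))
A = rng.standard_normal((20, 3)) @ Q[:, :3].T
B = rng.standard_normal((20, 3)) @ Q[:, 3:].T
X = np.vstack([A, B]) + 1e-3 * rng.standard_normal((40, 50))

C = collaborative_coefficients(X, lam=0.1)
S = cosine_affinity(C)
labels = spectral_labels(S, k=2)
```

Under this assumed form, each coefficient vector comes from solving one linear system rather than running an iterative \( \ell_{1} \) solver, which is consistent with the lower computational cost claimed in the abstract. The loop above re-solves a system per sample for clarity, not speed.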

Acknowledgement

This work is supported by the National Natural Science Foundation of China (Grant nos. 61474267, 60973153 and 61471169) and by the Collaboration and Innovation Center for Digital Chinese Medicine of the 2011 Project of Colleges and Universities in Hunan Province.

Author information

Corresponding author

Correspondence to Shulin Wang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, S., Gu, J., Chen, F. (2015). Clustering High-Dimensional Data via Spectral Clustering Using Collaborative Representation Coefficients. In: Huang, DS., Jo, KH., Hussain, A. (eds) Intelligent Computing Theories and Methodologies. ICIC 2015. Lecture Notes in Computer Science, vol 9226. Springer, Cham. https://doi.org/10.1007/978-3-319-22186-1_25

  • DOI: https://doi.org/10.1007/978-3-319-22186-1_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22185-4

  • Online ISBN: 978-3-319-22186-1

  • eBook Packages: Computer Science, Computer Science (R0)
