Cognitive Computation, Volume 10, Issue 6, pp 1042–1050

Accelerating Infinite Ensemble of Clustering by Pivot Features

  • Xiao-Bo Jin (corresponding author)
  • Guo-Sen Xie
  • Kaizhu Huang
  • Amir Hussain


Infinite ensemble clustering (IEC) incorporates both ensemble clustering and representation learning by fusing infinitely many basic partitions, and it shows appealing performance in the unsupervised context. However, it requires solving a linear equation system with time complexity $O(d^3)$, where $d$ is the concatenated dimension of the many clustering results. Inspired by the cognitive characteristic of human memory, which can attend to pivot features in a more compressed data space, we propose an accelerated version of IEC (AIEC) that extracts the pivot features and learns multiple mappings to reconstruct them, so that the linear equation system can be solved with time complexity $O(dr^2)$ ($r \ll d$). Experimental results on standard datasets, including image and text datasets, show that AIEC greatly reduces the running time of IEC while achieving comparable clustering performance.
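
The complexity gap described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it swaps in plain ridge-regularized least squares for the marginalized denoising formulation, assumes that r denotes the number of pivot features, and assumes that pivots are simply the most frequently active features. The function names `select_pivots`, `full_mapping`, and `pivot_mappings`, the `reg` parameter, and the contiguous block split are all hypothetical choices made for illustration.

```python
import numpy as np

def select_pivots(X, r):
    """Pick the r most frequently active features (assumed pivot criterion)."""
    return np.argsort(-X.sum(axis=1), kind="stable")[:r]

def full_mapping(X, reg=1e-3):
    """Reconstruct all d features from all d features.

    Minimizes ||X - W X||^2 + reg ||W||^2, which needs one d x d solve,
    i.e. O(d^3) time. X has shape (d, n): features by samples.
    """
    d = X.shape[0]
    S = X @ X.T                                   # d x d scatter matrix
    # W = S (S + reg I)^{-1}; S is symmetric, so W^T = solve(S + reg I, S).
    return np.linalg.solve(S + reg * np.eye(d), S).T

def pivot_mappings(X, pivots, r, reg=1e-3):
    """Learn one mapping per feature block that reconstructs the pivot features.

    Features are split into contiguous blocks of at most r rows. Each block
    needs only an (at most) r x r solve, O(r^3), and there are about d / r
    blocks, so the solves cost O(d r^2) in total.
    """
    d = X.shape[0]
    target = X[pivots]                            # r x n pivot features
    mappings = []
    for start in range(0, d, r):
        block = X[start:start + r]                # b x n, with b <= r
        b = block.shape[0]
        A = block @ block.T + reg * np.eye(b)     # b x b system matrix
        B = block @ target.T                      # b x r right-hand sides
        mappings.append(np.linalg.solve(A, B).T)  # r x b, W @ block ~ target
    return mappings

# Toy data: d = 12 features, n = 50 samples, r = 4 pivot features.
rng = np.random.default_rng(0)
X = rng.random((12, 50))
pivots = select_pivots(X, 4)
W_full = full_mapping(X)
W_blocks = pivot_mappings(X, pivots, 4)
```

In this sketch the full mapping solves a single 12 x 12 system, while the pivot variant solves three 4 x 4 systems. The same pattern is what makes the cost grow linearly rather than cubically in d when r is held fixed.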


Keywords: Ensemble clustering; Infinite ensemble clustering; Pivot features; Reconstruction of features


Funding Information

This work was partially supported by the Fundamental Research Funds for the Henan Provincial Colleges and Universities in the Henan University of Technology (2016RCJH06), the National Key Research & Development Program 418 (2016YFD0400104-5), the National Basic Research Program of China (2012CB316301), and the National Natural Science Foundation of China (61103138 and 61473236).

Compliance with Ethical Standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Xiao-Bo Jin (1), corresponding author
  • Guo-Sen Xie (2, 3)
  • Kaizhu Huang (4)
  • Amir Hussain (5)

  1. College of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
  2. Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, UAE
  3. College of Information Science and Engineering, Henan University of Science and Technology, Luoyang, China
  4. Department of Electrical & Electronic Engineering, Xi’an Jiaotong-Liverpool University, Suzhou, China
  5. Division of Computing Science & Maths, School of Natural Sciences, University of Stirling, Stirling, UK
