TopSpin: TOPic Discovery via Sparse Principal Component INterference

Conference paper
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 279)


We propose a novel topic discovery algorithm for unlabeled images based on the bag-of-words (BoW) framework. We first extract a dictionary of visual words and then compute, for each image, a visual word occurrence histogram. We view these histograms as rows of a large matrix from which we extract sparse principal components (PCs). Each PC identifies a sparse combination of visual words which co-occur frequently in some images but seldom appear in others. Each sparse PC corresponds to a topic, and images whose interference with the PC is high belong to that topic, revealing the common parts shared by those images. We propose to solve the associated sparse PCA (SPCA) problems using an Alternating Maximization (AM) method, which we modify to extract multiple PCs efficiently in a deflation scheme. Our approach attacks the maximization problem in SPCA directly and scales to high-dimensional data. Experiments on automatic topic discovery and category prediction demonstrate encouraging performance of our approach. Our SPCA solver is publicly available.
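The pipeline described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' released solver. It is a minimal sketch under several assumptions: the sparse PCA problem is taken to be maximizing ||Ax||_2 subject to ||x||_2 = 1 and at most `s` nonzero entries, the alternating step is a generic hard-thresholded power-style update, deflation is done by projecting the found direction out of the data matrix, and the "interference" of an image with a PC is taken to be the absolute inner product of its histogram with that PC. The histogram matrix is used uncentered for simplicity. The function names `sparse_pc`, `discover_topics`, and `interference` are hypothetical.

```python
import numpy as np


def sparse_pc(A, s, iters=200, tol=1e-10):
    """Heuristically maximize ||A x||_2 with ||x||_2 = 1 and at most s nonzeros.

    Assumed formulation and update rule; illustrative only.
    """
    # Deterministic start: the unit vector of the largest-norm column (word).
    x = np.zeros(A.shape[1])
    x[np.argmax(np.linalg.norm(A, axis=0))] = 1.0
    prev = -np.inf
    for _ in range(iters):
        # Step 1: with x fixed, the best unit vector y is A x normalized.
        y = A @ x
        norm_y = np.linalg.norm(y)
        if norm_y == 0.0:
            break  # A carries no signal along x (e.g. A is all zeros)
        y /= norm_y
        # Step 2: with y fixed, keep the s largest-magnitude entries of A^T y.
        z = A.T @ y
        keep = np.argsort(np.abs(z))[-s:]
        x = np.zeros_like(z)
        x[keep] = z[keep]
        x /= np.linalg.norm(x)
        # Stop once the objective no longer improves.
        obj = np.linalg.norm(A @ x)
        if obj - prev <= tol:
            break
        prev = obj
    return x


def discover_topics(H, n_topics, s):
    """Extract n_topics sparse PCs from histogram matrix H (images x words)."""
    A = np.asarray(H, dtype=float).copy()
    pcs = []
    for _ in range(n_topics):
        x = sparse_pc(A, s)
        pcs.append(x)
        # Assumed deflation: remove the component of every row along x.
        A = A - np.outer(A @ x, x)
    return np.array(pcs)


def interference(H, pcs):
    """Assumed score: |<histogram, PC>| for every image/topic pair."""
    return np.abs(np.asarray(H, dtype=float) @ pcs.T)
```

Under these assumptions, calling `discover_topics(H, n_topics, s)` on an images-by-words count matrix returns one sparse unit vector per topic, and an image can be attached to the topics for which its `interference` score is high. The sparsity level `s` and the threshold for "high" are free parameters that the abstract does not specify.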


Keywords: Sparse PCA, Bag-of-words, Topic discovery, Hidden topic



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Lehigh University, Bethlehem, USA
  2. Singapore University of Technology and Design, Singapore, Singapore
  3. King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
  4. University of Edinburgh, Edinburgh, UK
  5. Moscow Institute of Physics and Technology, Dolgoprudny, Russia
