Computer Vision – ACCV 2012, pp. 648–659
Modeling Hidden Topics with Dual Local Consistency for Image Analysis
Abstract
Image representation is a crucial component of image analysis and understanding. However, in many situations the widely used low-level features cannot correctly represent the high-level semantic content of images, owing to the "semantic gap". To bridge this gap, in this paper we present a novel topic model that learns an effective and robust mid-level representation in the latent semantic space for image analysis. In our model, an ℓ1-graph is constructed to model the local image neighborhood structure, and word co-occurrence statistics are computed to capture local word consistency. This local information is then incorporated into the model for topic discovery. Finally, the generalized EM algorithm is used to estimate the parameters. Because our model considers both the local image structure and local word consistency simultaneously when estimating the probabilistic topic distributions, the image representations have stronger descriptive power in the learned latent semantic space. Extensive experiments on publicly available databases demonstrate the effectiveness of our approach.
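The ℓ1-graph construction mentioned above can be sketched as follows: each image (feature vector) is sparsely coded over all the other images, and the absolute values of the resulting coefficients become the edge weights of the graph. This is only a minimal illustration of the idea, not the paper's implementation; the `l1_graph` function, the ISTA solver standing in for a generic ℓ1 minimizer, and the parameter values are all assumptions for the sketch.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm: shrink each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_graph(X, lam=0.1, n_iter=200):
    """Hypothetical l1-graph sketch: sparsely code each row of X (one sample)
    over the remaining rows via a simple ISTA loop, then use |coefficients|
    as directed edge weights. X is an (n_samples, n_features) array."""
    n, _ = X.shape
    W = np.zeros((n, n))
    for i in range(n):
        A = np.delete(X, i, axis=0).T                 # dictionary: the other samples as columns
        x = X[i]
        step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)  # 1/L for the gradient step
        w = np.zeros(n - 1)
        for _ in range(n_iter):
            # ISTA: gradient step on the least-squares term, then l1 shrinkage.
            w = soft_threshold(w - step * (A.T @ (A @ w - x)), step * lam)
        W[i, np.arange(n) != i] = np.abs(w)           # no self-edge on the diagonal
    return W
```

The weight matrix `W` produced this way is generally asymmetric and sparse; in practice one would symmetrize it (e.g. `(W + W.T) / 2`) before using it as a neighborhood-consistency regularizer in the topic model.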
Keywords
Topic Model · Latent Dirichlet Allocation · Latent Topic · Locally Linear Embedding · Probabilistic Latent Semantic Analysis