International Journal of Machine Learning and Cybernetics, Volume 1, Issue 1, pp 43–52

Understanding bag-of-words model: a statistical framework

Original Article

DOI: 10.1007/s13042-010-0001-0

Cite this article as:
Zhang, Y., Jin, R. & Zhou, Z. Int. J. Mach. Learn. & Cyber. (2010) 1: 43. doi:10.1007/s13042-010-0001-0


The bag-of-words model is one of the most popular representation methods for object categorization. The key idea is to quantize each extracted key point into one of a set of visual words, and then represent each image by a histogram of the visual words. For this purpose, a clustering algorithm (e.g., K-means) is generally used for generating the visual words. Although a number of studies have shown encouraging results of the bag-of-words representation for object categorization, theoretical studies on the properties of the bag-of-words model are almost nonexistent, possibly due to the difficulty introduced by using a heuristic clustering process. In this paper, we present a statistical framework which generalizes the bag-of-words representation. In this framework, the visual words are generated by a statistical process rather than by a clustering algorithm, while the empirical performance remains competitive with clustering-based methods. A theoretical analysis based on statistical consistency is presented for the proposed framework. Moreover, based on the framework we develop two algorithms that do not rely on clustering, yet achieve competitive performance in object categorization when compared to clustering-based bag-of-words representations.
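As a rough illustration of the conventional clustering-based pipeline described above (not the statistical framework proposed in the paper), the following minimal sketch builds visual words with a plain K-means and represents an image as a normalized histogram of word assignments. All function names (`kmeans`, `nearest_word`, `bow_histogram`) and the choice of squared Euclidean distance are assumptions made for illustration; key-point descriptors are assumed to be given as lists of floats.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (quantization step)."""
    return min(range(len(vocabulary)),
               key=lambda k: squared_distance(descriptor, vocabulary[k]))


def kmeans(descriptors, num_words, num_iters=20, seed=0):
    """Plain Lloyd's K-means; returns the cluster centers used as visual words."""
    rng = random.Random(seed)
    # Initialize centers with distinct randomly chosen descriptors.
    centers = [list(d) for d in rng.sample(descriptors, num_words)]
    for _ in range(num_iters):
        clusters = [[] for _ in range(num_words)]
        for d in descriptors:
            clusters[nearest_word(d, centers)].append(d)
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                dim = len(members[0])
                centers[k] = [sum(m[j] for m in members) / len(members)
                              for j in range(dim)]
    return centers


def bow_histogram(image_descriptors, vocabulary):
    """Normalized histogram of visual-word counts for a single image."""
    counts = [0] * len(vocabulary)
    for d in image_descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]
```

In this sketch, `kmeans` would be run once on descriptors pooled from the training images, and `bow_histogram` would then be applied to each image separately to obtain its fixed-length representation.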


Object recognition, Bag of words model, Rademacher complexity

Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
  2. Department of Computer Science & Engineering, Michigan State University, East Lansing, USA