Advertisement

Visual Codebooks Survey for Video On-Line Processing

  • Vítězslav Beran
  • Pavel Zemčík
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6374)

Abstract

This paper explores techniques in the pipeline of image description based on visual codebooks suitable for video on-line processing. The pipeline components are (i) extraction and description of local image features, (ii) translation of each high-dimensional feature descriptor to several most appropriate visual words selected from the discrete codebook and (iii) combination of visual words into bag-of-words using hard or soft assignment weighting scheme. For each component, several state-of-the-art techniques are analyzed and discussed and their usability for video on-line processing is addressed. The experiments are evaluated on the standard Kentucky and Oxford building datasets using image retrieval framework. The results show the impact loosing the pipeline precision in the price of improving the time cost which is crucial for real-time video processing.

Keywords

Visual Word Inverse Document Frequency Video Retrieval Text Retrieval Object Retrieval 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Sivic, J., Zisserman, A.: Video Google: Efficient visual search of videos. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds.) Toward Category-Level Object Recognition. LNCS, vol. 4170, pp. 127–144. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  2. 2.
    Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60(2), 91–110 (2004)CrossRefGoogle Scholar
  3. 3.
    Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: CVPR 2006: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, pp. 2161–2168. IEEE Computer Society, Los Alamitos (2006)Google Scholar
  4. 4.
    Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Object retrieval with large vocabularies and fast spatial matching. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2007)Google Scholar
  5. 5.
    Jiang, Y.G., Ngo, C.W., Yang, J.: Towards optimal bag-of-features for object categorization and semantic video retrieval. In: CIVR 2007: Proceedings of the 6th ACM International Conference on Image and Video Retrieval, pp. 494–501. ACM, New York (2007)Google Scholar
  6. 6.
    Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Lost in quantization: Improving particular object retrieval in large scale image databases. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2008)Google Scholar
  7. 7.
    Uijlings, J.R.R., Smeulders, A.W.M., Scha, R.J.H.: Real-time bag of words, approximately. In: CIVR 2009: Proceeding of the ACM International Conference on Image and Video Retrieval, pp. 1–8. ACM, New York (2009)CrossRefGoogle Scholar
  8. 8.
    Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Gool, L.V.: A comparison of affine region detectors. Int. J. Comput. Vision 65(1-2), 43–72 (2005)CrossRefGoogle Scholar
  9. 9.
    Bay, H., Tuytelaars, T., Gool, L.V.: Surf: Speeded up robust features. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 404–417. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  10. 10.
    Rosten, E., Drummond, T.: Machine learning for high-speed corner detection. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 430–443. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  11. 11.
    Mikolajczyk, K., Schmid, C.: Scale & affine invariant interest point detectors. Int. J. Comput. Vision 60(1), 63–86 (2004)CrossRefGoogle Scholar
  12. 12.
    Dalal, N., Triggs, B., Schmid, C.: Human detection using oriented histograms of flow and appearance. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3952, pp. 428–441. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  13. 13.
    Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27(10), 1615–1630 (2005)CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Vítězslav Beran
    • 1
  • Pavel Zemčík
    • 1
  1. 1.Department of Computer Graphics and Multimedia Faculty of Information TechnologyBrno University of TechnologyBrno

Personalised recommendations