Pose Invariant Object Recognition Using a Bag of Words Approach

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 694)


Pose invariant object detection and classification play a critical role in robust image recognition systems and can be applied in a multitude of applications, ranging from simple monitoring to advanced tracking. This paper analyzes the use of the Bag of Words model for recognizing objects at different scales, orientations and perspective views within cluttered environments. The recognition system relies on image analysis techniques, such as feature detection, description and clustering, along with machine learning classifiers. To pinpoint the location of the target object, a multiscale sliding window approach followed by dynamic thresholding segmentation is proposed. The recognition system was tested with several configurations of feature detectors, descriptors and classifiers and achieved an accuracy of 87% when recognizing cars from an annotated dataset with 177 training images and 177 testing images.
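
As a rough illustration of two steps named in the abstract, the sketch below quantizes local descriptors against a visual vocabulary to build a Bag of Words histogram, and enumerates sliding windows at several scales. It is only a minimal sketch under stated assumptions and is not the authors' implementation: the function names (bow_histogram, sliding_windows), the base window size, the scale set and the stride fraction are all hypothetical, the codebook is assumed to come from a prior clustering step, and the feature detector, classifier and thresholding stage are omitted.

```python
import numpy as np


def bow_histogram(descriptors, codebook):
    """Return an L1-normalized visual-word histogram (hypothetical helper).

    descriptors: (n, d) array of local feature descriptors.
    codebook:    (k, d) array of cluster centers (the visual vocabulary).
    """
    # Squared Euclidean distance from every descriptor to every visual word.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Each descriptor votes for its nearest visual word.
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    total = hist.sum()
    # Guard against windows that contain no features at all.
    return hist / total if total > 0 else hist


def sliding_windows(image_shape, base_size=(64, 64),
                    scales=(1.0, 1.5, 2.0), stride_frac=0.5):
    """Yield (x, y, w, h) windows at several scales (assumed parameters)."""
    height, width = image_shape[:2]
    for s in scales:
        w, h = int(base_size[0] * s), int(base_size[1] * s)
        if w > width or h > height:
            continue  # window does not fit in the image at this scale
        step_x = max(1, int(w * stride_frac))
        step_y = max(1, int(h * stride_frac))
        for y in range(0, height - h + 1, step_y):
            for x in range(0, width - w + 1, step_x):
                yield (x, y, w, h)
```

In a full pipeline, each window's histogram would be passed to a trained classifier, and the per-window scores would then be combined to localize the object; those stages depend on choices the abstract does not specify.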


Object recognition; Image feature analysis; Clustering; Machine learning



This work is financed by the ERDF - European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme within project POCI-01-0145-FEDER-006961, and by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, as part of project UID/EEA/50014/2013.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. INESC TEC and Faculty of Engineering, University of Porto, Porto, Portugal
