Dog Breed Classification Using Part Localization

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7572)


We propose a novel approach to fine-grained image classification in which instances from different classes share common parts but have wide variation in shape and appearance. We use dog breed identification as a test case to show that extracting corresponding parts improves classification performance. This domain is especially challenging since the appearance of corresponding parts can vary dramatically, e.g., the faces of bulldogs and beagles are very different. To find accurate correspondences, we build exemplar-based geometric and appearance models of dog breeds and their face parts. Part correspondence allows us to extract and compare descriptors in like image locations. Our approach also features a hierarchy of parts (e.g., face and eyes) and breed-specific part localization. We achieve a 67% recognition rate on a large real-world dataset of 8,351 images covering 133 dog breeds, and experimental results show that accurate part localization significantly increases classification performance compared to state-of-the-art approaches.
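The idea of comparing descriptors "in like image locations" can be illustrated with a small sketch. This is not the authors' pipeline: it assumes part locations are already known, uses tiny grayscale images stored as nested lists, and substitutes a plain intensity histogram for the descriptors used in the paper. The names `patch_histogram`, `part_aligned_distance`, and `toy_image` are hypothetical and introduced only for this illustration.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]              # grayscale image, values in 0..255
Parts = Dict[str, Tuple[int, int]]   # part name -> (row, col), assumed already localized


def patch_histogram(image: Image, center: Tuple[int, int],
                    radius: int = 1, bins: int = 4) -> List[float]:
    """Normalized intensity histogram of the square patch around `center`."""
    rows, cols = len(image), len(image[0])
    r0, c0 = center
    counts = [0] * bins
    # Clip the patch to the image so parts near the border still work.
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            counts[min(image[r][c] * bins // 256, bins - 1)] += 1
    total = sum(counts)
    return [n / total for n in counts]


def part_aligned_distance(img_a: Image, parts_a: Parts,
                          img_b: Image, parts_b: Parts,
                          radius: int = 1, bins: int = 4) -> float:
    """Sum of L1 histogram distances over the parts both images share."""
    shared = sorted(set(parts_a) & set(parts_b))
    if not shared:
        raise ValueError("the two images have no parts in common")
    dist = 0.0
    for name in shared:
        ha = patch_histogram(img_a, parts_a[name], radius, bins)
        hb = patch_histogram(img_b, parts_b[name], radius, bins)
        dist += sum(abs(x - y) for x, y in zip(ha, hb))
    return dist


def toy_image(dark_center: Tuple[int, int], size: int = 8) -> Image:
    """Bright image with one dark 3x3 block standing in for a 'nose'."""
    r0, c0 = dark_center
    return [[10 if abs(r - r0) <= 1 and abs(c - c0) <= 1 else 200
             for c in range(size)] for r in range(size)]


# Two toy "dogs" whose noses sit at different image positions.
img_a, img_b = toy_image((2, 2)), toy_image((5, 5))

# Descriptors taken at each image's own nose location match.
aligned = part_aligned_distance(img_a, {"nose": (2, 2)}, img_b, {"nose": (5, 5)})

# Descriptors taken at the same fixed pixel location do not.
unaligned = part_aligned_distance(img_a, {"nose": (2, 2)}, img_b, {"nose": (2, 2)})
```

In this toy setting the aligned comparison sees the same structure in both images, while the fixed-location comparison ends up comparing a nose against background. That is the intuition behind localizing parts before extracting descriptors.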


Keywords: Face Detection; Query Image; Color Histogram; Part Localization; SIFT Feature



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Columbia University, USA
  2. University of Maryland, USA
