Face Detection

  • Raghuraman Gopalan
  • William R. Schwartz
  • Rama Chellappa
  • Ankur Srivastava


Face detection in still images and videos has been studied extensively over the last two decades. Owing to the recent proliferation of cameras in consumer applications, research in face detection has gradually moved toward more unconstrained settings, with the goal of achieving performance close to that of humans. This presents two main challenges: (i) in addition to modeling facial characteristics, understanding the information conveyed by the surrounding scene is important for resolving visual ambiguities, and (ii) the computation time needed for decision making should be compatible with real-time applications, since detection is primarily a front-end process on which further knowledge extraction is built. This chapter begins with a review of recent work on modeling face-specific information, including appearance-based methods used by sliding window classifiers, concepts from learning, and local interest-point descriptors, and then focuses on representing the contextual information shared by faces with the surrounding scene. To illustrate these concepts, we discuss a method for learning the semantic context shared by the face with other human body parts, which facilitates reasoning under occlusion, and then present an image representation that efficiently encodes contour information to enable fast detection of faces. We conclude the chapter by discussing some existing challenges.


Keywords: Face Detection, Image Representation, Integral Image, Weak Learner, Semantic Context



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Raghuraman Gopalan (1)
  • William R. Schwartz (2)
  • Rama Chellappa (3)
  • Ankur Srivastava (4)

  1. Department of Electrical and Computer Engineering, University of Maryland, College Park, USA
  2. Institute of Computing, University of Campinas, Campinas-SP, Brazil
  3. Department of Electrical and Computer Engineering, and UMIACS, University of Maryland, College Park, USA
  4. Department of Electrical and Computer Engineering, and Institute for Systems Research, University of Maryland, College Park, USA
