Robust Real-Time Face Detection

International Journal of Computer Vision

Abstract

This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
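The abstract does not spell out how the "Integral Image" is built, so the following is only a minimal sketch under the usual summed-area-table formulation (Crow, 1984): each entry stores the sum of all pixels above and to the left, so any rectangle sum needs four lookups. The function names (`integral_image`, `rect_sum`, `two_rect_feature`), the zero-padded layout, and the left-minus-right feature are illustrative assumptions, not the authors' implementation.

```python
def integral_image(img):
    """Build a zero-padded summed-area table.

    ii[y][x] holds the sum of img over rows 0..y-1 and columns 0..x-1,
    so the table has one extra row and column of zeros.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left pixel (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Hypothetical rectangle feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Tiny 3x4 "image" used only for illustration.
img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))          # 6 + 7 + 10 + 11 = 34
print(two_rect_feature(ii, 0, 0, 4, 3))  # 33 - 45 = -12
```

Once the table is built in a single pass, the cost of evaluating a rectangle feature is independent of the rectangle's size, which is why such features can be computed very quickly at any scale or position.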

References

  • Amit, Y. and Geman, D. 1999. A computational model for visual selection. Neural Computation, 11:1691–1715.

  • Crow, F. 1984. Summed-area tables for texture mapping. In Proceedings of SIGGRAPH, 18(3):207–212.

  • Fleuret, F. and Geman, D. 2001. Coarse-to-fine face detection. Int. J. Computer Vision, 41:85–107.

  • Freeman, W.T. and Adelson, E.H. 1991. The design and use of steerable filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(9):891–906.

  • Freund, Y. and Schapire, R.E. 1995. A decision-theoretic generalization of on-line learning and an application to boosting. In Computational Learning Theory: Eurocolt 95, Springer-Verlag, pp. 23–37.

  • Greenspan, H., Belongie, S., Goodman, R., Perona, P., Rakshit, S., and Anderson, C. 1994. Overcomplete steerable pyramid filters and rotation invariance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Itti, L., Koch, C., and Niebur, E. 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Patt. Anal. Mach. Intell., 20(11):1254–1259.

  • John, G., Kohavi, R., and Pfleger, K. 1994. Irrelevant features and the subset selection problem. In Machine Learning Conference Proceedings.

  • Osuna, E., Freund, R., and Girosi, F. 1997a. Training support vector machines: An application to face detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Osuna, E., Freund, R., and Girosi, F. 1997b. Training support vector machines: an application to face detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Papageorgiou, C., Oren, M., and Poggio, T. 1998. A general framework for object detection. In International Conference on Computer Vision.

  • Quinlan, J. 1986. Induction of decision trees. Machine Learning, 1:81–106.

  • Roth, D., Yang, M., and Ahuja, N. 2000. A SNoW-based face detector. In Neural Information Processing 12.

  • Rowley, H., Baluja, S., and Kanade, T. 1998. Neural network-based face detection. IEEE Patt. Anal. Mach. Intell., 20:22–38.

  • Schapire, R.E., Freund, Y., Bartlett, P., and Lee, W.S. 1997. Boosting the margin: A new explanation for the effectiveness of voting methods. In Proceedings of the Fourteenth International Conference on Machine Learning.

  • Schapire, R.E., Freund, Y., Bartlett, P., and Lee, W.S. 1998. Boosting the margin: A new explanation for the effectiveness of voting methods. Ann. Stat., 26(5):1651–1686.

  • Schneiderman, H. and Kanade, T. 2000. A statistical method for 3D object detection applied to faces and cars. In International Conference on Computer Vision.

  • Simard, P.Y., Bottou, L., Haffner, P., and LeCun, Y. 1999. Boxlets: A fast convolution algorithm for signal processing and neural networks. In M. Kearns, S. Solla, and D. Cohn (Eds.), Advances in Neural Information Processing Systems, vol. 11, pp. 571–577.

  • Sung, K. and Poggio, T. 1998. Example-based learning for view-based face detection. IEEE Patt. Anal. Mach. Intell., 20:39–51.

  • Tieu, K. and Viola, P. 2000. Boosting image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Tsotsos, J., Culhane, S., Wai, W., Lai, Y., Davis, N., and Nuflo, F. 1995. Modeling visual-attention via selective tuning. Artificial Intelligence Journal, 78(1/2):507–545.

  • Webb, A. 1999. Statistical Pattern Recognition. Oxford University Press: New York.

About this article

Cite this article

Viola, P., Jones, M.J. Robust Real-Time Face Detection. International Journal of Computer Vision 57, 137–154 (2004). https://doi.org/10.1023/B:VISI.0000013087.49260.fb

  • DOI: https://doi.org/10.1023/B:VISI.0000013087.49260.fb
