Robust Real-Time Face Detection

Viola, Paul; Jones, Michael J.

doi:10.1023/B:VISI.0000013087.49260.fb

Paul Viola¹ &
Michael J. Jones²

Abstract

This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
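
The abstract gives no implementation details, so the following is only a minimal sketch of two of the ideas it names, not the authors' code. It assumes the integral image is a summed-area table (as in Crow, 1984), which lets any rectangle sum be read with four lookups, and it models a cascade as a list of stages that rejects a window at the first stage whose score falls below its threshold. The names `integral_image`, `rect_sum`, `two_rect_feature`, and `cascade_accepts`, the left-minus-right feature layout, and the thresholds are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]


def integral_image(img: Image) -> Image:
    """Build a summed-area table padded with a leading row and column of zeros.

    ii[y][x] holds the sum of img over all rows < y and all columns < x.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Sum of the w-by-h rectangle with top-left pixel (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Hypothetical two-rectangle feature: left half minus right half (w assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


Stage = Tuple[Callable[[Image], float], float]  # (score function, threshold)


def cascade_accepts(ii: Image, stages: Sequence[Stage]) -> bool:
    """Run stages in order and reject at the first score below its threshold.

    Later stages are never evaluated for a rejected window, which is how a
    cascade avoids spending computation on easy background regions.
    """
    for score, threshold in stages:
        if score(ii) < threshold:
            return False
    return True


if __name__ == "__main__":
    img = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    ii = integral_image(img)
    print(rect_sum(ii, 1, 1, 2, 2))          # bottom-right 2x2 block: 5 + 6 + 8 + 9
    print(two_rect_feature(ii, 0, 0, 2, 3))  # column 0 minus column 1
```

Padding the table with a zero row and column is a design choice that keeps `rect_sum` free of boundary checks; an unpadded table would need special cases for rectangles touching the top or left edge.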

References

Amit, Y. and Geman, D. 1999. A computational model for visual selection. Neural Computation, 11:1691–1715.
Crow, F. 1984. Summed-area tables for texture mapping. In Proceedings of SIGGRAPH, 18(3):207–212.
Fleuret, F. and Geman, D. 2001. Coarse-to-fine face detection. Int. J. Computer Vision, 41:85–107.
Freeman, W.T. and Adelson, E.H. 1991. The design and use of steerable filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(9):891–906.
Freund, Y. and Schapire, R.E. 1995. A decision-theoretic generalization of on-line learning and an application to boosting. In Computational Learning Theory: Eurocolt 95, Springer-Verlag, pp. 23–37.
Greenspan, H., Belongie, S., Goodman, R., Perona, P., Rakshit, S., and Anderson, C. 1994. Overcomplete steerable pyramid filters and rotation invariance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Itti, L., Koch, C., and Niebur, E. 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Patt. Anal. Mach. Intell., 20(11):1254–1259.
John, G., Kohavi, R., and Pfleger, K. 1994. Irrelevant features and the subset selection problem. In Machine Learning Conference Proceedings.
Osuna, E., Freund, R., and Girosi, F. 1997a. Training support vector machines: An application to face detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Osuna, E., Freund, R., and Girosi, F. 1997b. Training support vector machines: an application to face detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Papageorgiou, C., Oren, M., and Poggio, T. 1998. A general framework for object detection. In International Conference on Computer Vision.
Quinlan, J. 1986. Induction of decision trees. Machine Learning, 1:81–106.
Roth, D., Yang, M., and Ahuja, N. 2000. A SNoW-based face detector. In Neural Information Processing 12.
Rowley, H., Baluja, S., and Kanade, T. 1998. Neural network-based face detection. IEEE Patt. Anal. Mach. Intell., 20:22–38.
Schapire, R.E., Freund, Y., Bartlett, P., and Lee, W.S. 1997. Boosting the margin: A new explanation for the effectiveness of voting methods. In Proceedings of the Fourteenth International Conference on Machine Learning.
Schapire, R.E., Freund, Y., Bartlett, P., and Lee, W.S. 1998. Boosting the margin: A new explanation for the effectiveness of voting methods. Ann. Stat., 26(5):1651–1686.
Schneiderman, H. and Kanade, T. 2000. A statistical method for 3D object detection applied to faces and cars. In International Conference on Computer Vision.
Simard, P.Y., Bottou, L., Haffner, P., and LeCun, Y. 1999. Boxlets: A fast convolution algorithm for signal processing and neural networks. In M. Kearns, S. Solla, and D. Cohn (Eds.), Advances in Neural Information Processing Systems, vol. 11, pp. 571–577.
Sung, K. and Poggio, T. 1998. Example-based learning for view-based face detection. IEEE Patt. Anal. Mach. Intell., 20:39–51.
Tieu, K. and Viola, P. 2000. Boosting image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Tsotsos, J., Culhane, S., Wai, W., Lai, Y., Davis, N., and Nuflo, F. 1995. Modeling visual-attention via selective tuning. Artificial Intelligence Journal, 78(1/2):507–545.
Webb, A. 1999. Statistical Pattern Recognition. Oxford University Press: New York.

Author information

Authors and Affiliations

Microsoft Research, One Microsoft Way, Redmond, WA, 98052, USA
Paul Viola
Mitsubishi Electric Research Laboratory, 201 Broadway, Cambridge, MA, 02139, USA
Michael J. Jones

About this article

Cite this article

Viola, P., Jones, M.J. Robust Real-Time Face Detection. International Journal of Computer Vision 57, 137–154 (2004). https://doi.org/10.1023/B:VISI.0000013087.49260.fb

Issue Date: May 2004
DOI: https://doi.org/10.1023/B:VISI.0000013087.49260.fb
