Abstract
Cascades of boosted ensembles have become popular in the object detection community following their highly successful introduction in the face detector of Viola and Jones. Since then, researchers have sought to improve upon the original approach by incorporating new methods along a variety of axes (e.g., alternative boosting methods and feature sets). Nevertheless, key decisions about how many hypotheses to include in an ensemble and how to balance detection and false positive rates in the individual stages are often made by user intervention or by an automatic method that produces unnecessarily slow detectors. We propose a novel method for making these decisions, which exploits the shape of the stage ROC curves in ways that have previously been ignored. The result is a detector that is significantly faster than the one produced by the standard automatic method. When this algorithm is combined with a recycling method for reusing the outputs of early stages in later ones and with a retracing method that inserts new early rejection points in the cascade, the detection speed matches that of the best hand-crafted detector. We also exploit joint distributions over several features in weak learning to improve overall detector accuracy, and explore ways to improve training time by aggressively filtering features.
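The trade-off between per-stage rates and detector speed can be illustrated with a small sketch. It is not the method proposed in the article; it only shows the standard cascade arithmetic, under the assumption that each stage's detection and false positive rates are measured on the windows that actually reach that stage. The function names, the `(detection_rate, false_positive_rate, num_features)` tuple layout, and the example numbers are hypothetical.

```python
from math import prod

def cascade_rates(stages):
    """Overall (detection rate, false positive rate) of a cascade.

    `stages` is a list of (detection_rate, false_positive_rate, num_features)
    tuples (a hypothetical layout). Rates are assumed to be conditional on a
    window having passed all earlier stages, so the overall rates are products.
    """
    detection = prod(d for d, _, _ in stages)
    false_positive = prod(f for _, f, _ in stages)
    return detection, false_positive

def expected_features_per_negative(stages):
    """Expected number of features evaluated on a negative window."""
    reach = 1.0   # probability that a negative window reaches the current stage
    cost = 0.0
    for _, false_positive_rate, num_features in stages:
        cost += reach * num_features      # stage is paid for only if reached
        reach *= false_positive_rate      # fraction of negatives that pass on
    return cost

# Illustrative numbers only, not taken from the article.
example_stages = [(0.99, 0.5, 2), (0.99, 0.5, 10), (0.99, 0.5, 50)]
overall_detection, overall_false_positive = cascade_rates(example_stages)
average_cost = expected_features_per_negative(example_stages)
```

Because most windows in an image are negatives, the average cost is dominated by the sizes and false positive rates of the earliest stages, which is why the choice of ensemble size and per-stage operating point affects speed so strongly.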
References
Amit, Y., & Geman, D. (1999). A computational model for visual selection. Neural Computation, 11, 1691–1715.
Anthony, M. (2004). Generalization error bounds for threshold decision lists. Journal of Machine Learning Research, 5, 189–217.
Baker, S., & Nayar, S. (1996). Algorithms for pattern rejection. In Proceedings of ICPR (Vol. 2, pp. 869–874).
Bartlett, M., Littlewort, G., Fasel, I., & Movellan, J. (2003). Real time face detection and facial expression recognition: development and application to human-computer interaction.
Blanchard, G., & Blanchard, D. (2005). Sequential testing designs for pattern recognition. Annals of Statistics, 33, 1155–1202.
Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Monterey: Wadsworth and Brooks.
Brubaker, S. C., Mullin, M. D., & Rehg, J. M. (2006). Towards optimal training of cascaded detectors. In ECCV (1) (pp. 325–337).
Chen, X., & Yuille, A. L. (2005). A time-efficient cascade for real-time object detection: with applications for the visually impaired. In CVPR (3) (pp. 20–26).
Elad, M., Hel-Or, Y., & Keshet, R. (2002). Pattern detection using a maximal rejection classifier. Pattern Recognition Letters, 23(12), 1459–1471.
Fleuret, F. (2004). Fast binary feature selection with conditional mutual information. Journal of Machine Learning Research, 5, 1531–1555.
Fleuret, F., & Geman, D. (2002). Fast face detection with precise pose estimation. In Proceedings of ICPR (Vol. 1, pp. 235–238).
Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 119–139.
Froba, B., & Ernst, A. (2004). Face detection with the modified census transform. In The sixth IEEE international conference on automatic face and gesture recognition (pp. 91–96).
Gangaputra, S., & Geman, D. (2006). A design principle for coarse-to-fine classification. In Proceedings of CVPR (Vol. 2, pp. 1877–1884).
Grossmann, E. (2004). Automatic design of cascaded classifiers. In International IAPR workshop on statistical pattern recognition, ICPR.
Grossmann, E., Kale, A., & Jaynes, C. (2005). Towards interactive generation of “ground-truth” in background subtraction from partially labeled examples. In Proceedings of ICCV VS-PETS workshop.
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
Heisele, B., Serre, T., Mukherjee, S., & Poggio, T. (2001). Feature reduction and hierarchy of classifiers for fast object detection in video images. In Proceedings of CVPR (Vol. 2, pp. 18–24).
Keren, D., Osadchy, M., & Gotsman, C. (2001). Antifaces: a novel, fast method for image detection. IEEE Transactions on PAMI, 23(7), 747–761.
Kienzle, W., Bakir, G., Franz, M., & Schölkopf, B. (2005). Face detection - efficient and rank deficient. In Weiss, Y. (Ed.), NIPS (Vol. 17, pp. 673–680). Cambridge: MIT Press.
Levi, K., & Weiss, Y. (2004). Learning object detection from a small number of examples: the importance of good features. In Proceedings of CVPR (Vol. 2).
Li, S. Z., & Zhang, Z. Q. (2004). Floatboost learning and statistical face detection. IEEE Transactions on PAMI, 26(9), 1112–1123.
Lienhart, R., Kuranov, A., & Pisarevsky, V. (2002). Empirical analysis of detection cascades of boosted classifiers for rapid object detection (Technical report). MRL, Intel Labs.
Liu, C., & Shum, H. (2003). Kullback-Leibler boosting. In Proceedings of CVPR (Vol. I, pp. 587–594).
Luo, H. (2005). Optimization design of cascaded classifiers. In Proceedings of CVPR (Vol. 1, pp. 480–485).
Mas-Colell, A., Whinston, M. D., & Green, J. R. (1995). Microeconomic theory. Oxford: Oxford University Press.
Opelt, A., Fussenegger, M., Pinz, A., & Auer, P. (2004). Weak hypotheses and boosting for generic object detection and recognition. In ECCV (2) (pp. 71–84).
Osadchy, R., Miller, M., & LeCun, Y. (2005). Synergistic face detection and pose estimation with energy-based model selection. In NIPS 17.
Rivest, R. (1987). Learning decision lists. Machine Learning, 2, 229–246.
Romdhani, S., Torr, P., Schoelkopf, B., & Blake, A. (2001). Computationally efficient face detection. In Proceedings of ICCV (pp. 695–700).
Rowley, H. A., Baluja, S., & Kanade, T. (1998). Neural network-based face detection. IEEE Transactions on PAMI, 20(1), 23–38.
Schapire, R. E., & Singer, Y. (1999). Improved boosting using confidence-rated predictions. Machine Learning, 37(3), 297–336.
Schneiderman, H. (2004). Feature-centric evaluation for efficient cascaded object detection. In Proceedings of CVPR (Vol. 2, pp. 29–36).
Šochman, J., & Matas, J. (2005). Waldboost-learning for time constrained sequential detection. In Proceedings of CVPR (Vol. 2, pp. 150–156).
Sun, J., Rehg, J. M., & Bobick, A. (2004). Automatic cascade training with perturbation bias. In Proceedings of CVPR (Vol. 2, pp. 276–283).
Sung, K., & Poggio, T. (1998). Example-based learning for view-based human face detection. IEEE Transactions on PAMI, 20(1), 39–51.
Tu, Z. (2005). Probabilistic boosting-tree: learning discriminative models for classification, recognition, and clustering. In ICCV (pp. 1589–1596).
Vidal-Naquet, M., & Ullman, S. (2003). Object recognition with informative features and linear classification. In Proceedings of ICCV (Vol. 1, pp. 281–288).
Viola, P., & Jones, M. (2002). Fast and robust classification using asymmetric AdaBoost and a detector cascade. In NIPS 14.
Viola, P., & Jones, M. J. (2004). Robust real-time face detection. IJCV, 57(2), 137–154.
Wu, J., Rehg, J. M., & Mullin, M. D. (2004). Learning a rare event detection cascade by direct feature selection. In NIPS 16.
Wu, J., Mullin, M. D., & Rehg, J. M. (2005). Linear asymmetric classifier for cascade detectors. In Proceedings of 22nd international conference on machine learning (pp. 993–1000).
Xiao, R., Zhu, L., & Zhang, H.-J. (2003). Boosting chain learning for object detection. In Proceedings of ICCV (Vol. I, pp. 709–715).
Cite this article
Brubaker, S.C., Wu, J., Sun, J. et al. On the Design of Cascades of Boosted Ensembles for Face Detection. Int J Comput Vis 77, 65–86 (2008). https://doi.org/10.1007/s11263-007-0060-1
DOI: https://doi.org/10.1007/s11263-007-0060-1