Feature Ranking Ensembles for Facial Action Unit Classification

  • Terry Windeatt
  • Kaushala Dias
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5064)


Recursive Feature Elimination (RFE) combined with feature ranking is an effective technique for eliminating irrelevant features. In this paper, an ensemble of multilayer perceptron (MLP) base classifiers is proposed, with feature ranking based on the magnitude of the MLP weights. This approach is compared experimentally with other popular feature-ranking methods and with a Support Vector Classifier (SVC). Experimental results on natural benchmark data and on a facial action unit classification problem demonstrate that the MLP ensemble is relatively insensitive to the feature-ranking method, and that simple ranking methods perform as well as more sophisticated schemes. The results are interpreted with the aid of a bias/variance analysis of the 0/1 loss function.
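The abstract does not give the exact ranking formula, so the following is only a minimal sketch of how RFE driven by weight magnitudes might look for a single-hidden-layer MLP. The scoring rule (sum over hidden units of |input weight| times |output weight|), the `train` callback, the `drop_fraction` parameter and the `toy_train` stand-in are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Callable, List, Tuple

Matrix = List[List[float]]


def weight_magnitude_scores(w_in: Matrix, w_out: Matrix) -> List[float]:
    """Score each input feature from the magnitudes of the MLP weights.

    w_in[h][i]  : weight from input feature i to hidden unit h
    w_out[o][h] : weight from hidden unit h to output unit o
    Assumed rule: score_i = sum_h |w_in[h][i]| * sum_o |w_out[o][h]|.
    """
    scores = [0.0] * len(w_in[0])
    for h, row in enumerate(w_in):
        out_strength = sum(abs(out_row[h]) for out_row in w_out)
        for i, w in enumerate(row):
            scores[i] += abs(w) * out_strength
    return scores


def recursive_feature_elimination(
    train: Callable[[List[int]], Tuple[Matrix, Matrix]],
    features: List[int],
    n_keep: int,
    drop_fraction: float = 0.2,
) -> List[int]:
    """Retrain on the surviving features, then drop the lowest-ranked ones.

    `train` is a hypothetical callback: given the active feature indices it
    returns (w_in, w_out) for an MLP trained on those features only.
    """
    active = list(features)
    while len(active) > n_keep:
        w_in, w_out = train(active)
        scores = weight_magnitude_scores(w_in, w_out)
        # Drop at least one feature per round, but never go below n_keep.
        n_drop = max(1, int(len(active) * drop_fraction))
        n_drop = min(n_drop, len(active) - n_keep)
        lowest = sorted(range(len(active)), key=lambda j: scores[j])[:n_drop]
        dropped = set(lowest)
        active = [f for j, f in enumerate(active) if j not in dropped]
    return active


def toy_train(active: List[int]) -> Tuple[Matrix, Matrix]:
    """Stand-in for real training: feature f gets weights proportional to f + 1."""
    w_in = [[float(f + 1) for f in active], [-0.5 * (f + 1) for f in active]]
    w_out = [[1.0, -2.0]]
    return w_in, w_out


print(recursive_feature_elimination(toy_train, [0, 1, 2, 3, 4], n_keep=2))  # [3, 4]
```

In an ensemble setting, one plausible extension (again an assumption) would be to average the scores over the base classifiers before each elimination step.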


Keywords: Feature Selection, Feature Subset, Base Classifier, Training Pattern, Machine Learn Research



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Terry Windeatt (1)
  • Kaushala Dias (1)
  1. Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, Guildford, United Kingdom
