Structured Output Ordinal Regression for Dynamic Facial Emotion Intensity Prediction

  • Minyoung Kim
  • Vladimir Pavlovic
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6313)


We consider the task of labeling facial emotion intensities in videos, where the emotion intensities to be predicted have ordinal scales (e.g., low, medium, and high) that change over time. A significant challenge is that the rates of increase and decrease differ substantially across subjects. Moreover, the absolute differences between intensity values carry little information; their relative order matters more. To solve the intensity prediction problem, we propose a new dynamic ranking model that treats the signal intensity at each time step as a label on an ordinal scale and links temporally proximal labels using dynamic smoothness constraints. This new model extends the successful static ordinal regression to a structured (dynamic) setting by using an analogy with Conditional Random Field (CRF) models in structured classification. We show that, although non-convex, the new model can be accurately learned using efficient gradient search. The predictions resulting from this dynamic ranking model show significant improvements over regular CRFs, which fail to consider ordinal relationships between predicted labels. We also observe substantial improvements over static ranking models that do not exploit temporal dependencies of ordinal predictions. We demonstrate the benefits of our algorithm on the Cohn-Kanade dataset for the dynamic facial emotion intensity prediction problem and illustrate its performance in a controlled synthetic setting.
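
The idea of combining per-frame ordinal scores with temporal smoothness can be illustrated with a small sketch. This is not the authors' model or training procedure: the abstract does not give the potentials, so the cumulative-logit node score, the absolute-difference edge penalty, the weight `lam`, and all function names below are assumptions made for illustration. Parameters are fixed by hand rather than learned by gradient search, and decoding uses standard Viterbi on a chain.

```python
import numpy as np

def ordinal_log_potentials(f, thresholds):
    """Per-frame log-probabilities of each ordinal level (assumed cumulative-logit form).

    f: (T,) latent scores, one per frame.
    thresholds: (R-1,) increasing cut points separating R ordinal levels.
    Returns an array of shape (T, R).
    """
    cuts = np.concatenate(([-np.inf], thresholds, [np.inf]))
    with np.errstate(over="ignore"):
        # cdf[t, c] = sigmoid(cuts[c] - f[t]); the infinite end points give 0 and 1.
        cdf = 1.0 / (1.0 + np.exp(-(cuts[None, :] - f[:, None])))
    probs = np.diff(cdf, axis=1)
    return np.log(np.clip(probs, 1e-12, None))

def smoothness_potentials(num_levels, lam):
    """Edge scores that penalise jumps between ordinal levels (an assumed form)."""
    levels = np.arange(num_levels)
    return -lam * np.abs(levels[:, None] - levels[None, :])

def viterbi(node, edge):
    """Highest-scoring label sequence for a chain with node and edge scores."""
    T, R = node.shape
    score = node[0].copy()
    back = np.zeros((T, R), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + edge          # cand[i, j]: previous level i, current level j
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(R)] + node[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Hypothetical per-frame scores for one video; 3 levels (low, medium, high).
scores = np.array([-2.0, -1.5, 1.2, -1.4, 0.1, 2.0, 2.5])
thresholds = np.array([-1.0, 1.0])

node = ordinal_log_potentials(scores, thresholds)
static_path = np.argmax(node, axis=1)                      # frame-by-frame ordinal prediction
smooth_path = viterbi(node, smoothness_potentials(3, 1.0))  # prediction with temporal coupling
print("static:", static_path)
print("smooth:", smooth_path)
```

Setting `lam` to zero removes the temporal coupling and reduces the decoder to the static per-frame prediction, while larger values increasingly discourage abrupt changes between neighbouring frames.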




Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Minyoung Kim (1)
  • Vladimir Pavlovic (1)
  1. Department of Computer Science, Rutgers University, Piscataway, USA
