Landmark calibration for facial expressions and fish classification


This paper considers the automatic labeling of emotions in face images found on social media. Facial landmarks are commonly used to classify the emotion shown in a face image. However, it is difficult to segment landmarks accurately for some faces and for subtle emotions. Previous authors refined the landmarks using a Gaussian prior, but their model often gets stuck in a local minimum. Instead, this paper proposes calibrating the landmarks with respect to the known emotion class label using principal component analysis. The face image is then generated from the calibrated landmarks using an image translation model. The proposed model is evaluated on facial expression classification and on underwater fish identification, and it outperforms the baselines in accuracy by over \(20\%\).
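As a rough illustration of the calibration idea, the sketch below fits a low-dimensional principal subspace to the landmark vectors of a single emotion class and projects a noisy landmark vector onto it. It is not the authors' implementation: the flattened landmark layout, the function names `fit_class_subspace` and `calibrate`, the number of components, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def fit_class_subspace(landmarks, n_components):
    """Fit a PCA subspace to the landmarks of ONE class (hypothetical helper).

    landmarks: array of shape (n_samples, 2 * n_points), each row a
    flattened set of (x, y) landmark coordinates.
    """
    mean = landmarks.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(landmarks - mean, full_matrices=False)
    return mean, vt[:n_components]


def calibrate(noisy, mean, components):
    """Project a landmark vector onto the class subspace and reconstruct it."""
    coeffs = (noisy - mean) @ components.T
    return mean + coeffs @ components


# Synthetic demo (assumed data): one class whose shapes vary along one mode.
rng = np.random.default_rng(0)
n_points = 5
base = rng.normal(size=2 * n_points)      # hypothetical mean shape
mode = rng.normal(size=2 * n_points)      # hypothetical mode of variation
mode /= np.linalg.norm(mode)
weights = rng.normal(size=(50, 1))
train = base + weights * mode             # clean training shapes

mean, comps = fit_class_subspace(train, n_components=1)

clean = base + 0.7 * mode                 # a valid shape of this class
noisy = clean + 0.1 * rng.normal(size=clean.shape)
calibrated = calibrate(noisy, mean, comps)

err_before = np.linalg.norm(noisy - clean)
err_after = np.linalg.norm(calibrated - clean)
print(f"error before: {err_before:.4f}, after: {err_after:.4f}")
```

Because the clean shape lies in the fitted subspace, the projection discards the part of the noise that is orthogonal to it, so the calibrated landmarks are never farther from the clean shape than the noisy ones. With real data the class label selects which subspace to use, and the number of components would have to be chosen by validation.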






This work is partially supported by the Computational Intelligence Lab at Nanyang Technological University and by Information Technology, College of Science and Engineering, at James Cook University.

Author information



Corresponding author

Correspondence to Iti Chaturvedi.


About this article


Cite this article

Chaturvedi, I., Chen, Q., Cambria, E. et al. Landmark calibration for facial expressions and fish classification. SIViP (2021).



Keywords

  • Facial expressions
  • Adversarial training
  • Singular value decomposition
  • Fish segmentation