Abstract
This paper proposes a deep learning method for human image processing that incorporates a mechanism for updating target-specific parameters. The aim is to improve system performance in situations where the target can be observed continuously. Network-based algorithms typically rely on offline training with large datasets, and the trained networks then operate in a one-shot fashion: each input image is processed independently by a static network. Many practical applications, however, involve continuous observation of a target rather than a single image. The proposed method exploits multiple observations dynamically to improve system performance. Its effectiveness is examined by applying the iterative update process to the task of face-pose estimation. The method consists of two separate processes: 1) sequential estimation and updating of face-shape parameters (the target-specific parameters) and 2) face-pose estimation for each individual image using the updated parameters. Experimental results indicate the effectiveness of the proposed method.
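The two-process structure described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the running-mean update rule, the linear stand-in for the pose network, and the names `IncrementalShapeEstimator`, `estimate_pose`, and `run_sequence` are all hypothetical. It shows only the control flow, in which the target-specific parameters are refined frame by frame and each frame's pose is then estimated with the latest parameters.

```python
import numpy as np


class IncrementalShapeEstimator:
    """Running mean of per-frame shape estimates (hypothetical update rule)."""

    def __init__(self, dim):
        self.shape = np.zeros(dim)  # current target-specific parameters
        self.count = 0              # number of frames observed so far

    def update(self, frame_estimate):
        # Incremental mean: s_n = s_{n-1} + (x_n - s_{n-1}) / n
        self.count += 1
        delta = np.asarray(frame_estimate, dtype=float) - self.shape
        self.shape += delta / self.count
        return self.shape


def estimate_pose(frame_features, shape, weights):
    """Toy stand-in for a pose network: a linear map of [features, shape]."""
    return weights @ np.concatenate([frame_features, shape])


def run_sequence(per_frame_shapes, per_frame_features, weights):
    """Alternate the two processes over a sequence of frames."""
    estimator = IncrementalShapeEstimator(dim=per_frame_shapes.shape[1])
    poses = []
    for shape_obs, feats in zip(per_frame_shapes, per_frame_features):
        shape = estimator.update(shape_obs)                  # process 1
        poses.append(estimate_pose(feats, shape, weights))   # process 2
    return np.array(poses), estimator.shape
```

In this sketch the pose for each frame depends on all shape observations seen so far, which is the property that distinguishes continuous observation from one-shot processing; a real system would replace both the update rule and the linear map with learned components.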
Acknowledgements
This work was supported by JSPS KAKENHI Grant Number JP18H03269, JP18K11383, JP20K21824.
Cite this article
Sei, M., Utsumi, A., Yamazoe, H. et al. Personalized face-pose estimation network using incrementally updated face shape parameters. Appl Intell 52, 11506–11516 (2022). https://doi.org/10.1007/s10489-021-02888-0
DOI: https://doi.org/10.1007/s10489-021-02888-0