Neural Networks in Video-Based Age and Gender Recognition on Mobile Platforms
Abstract
The paper considers the use of convolutional neural networks for the simultaneous recognition of a person's gender and age from video of their face. The emphasis is on incorporating the approach into mobile video analytics systems. We investigate the fusion of decisions obtained from processing each video frame, including a classifier committee based on Dempster-Shafer theory. We propose a novel age prediction method that computes the expectation over the most probable ages. We compare existing neural network models with a specially trained modification of the MobileNet convolutional network with two outputs. Experimental results are given for the Kinect, IJB-A, Indian Movie, and EmotiW datasets. Compared with conventional methods, our approach increases age and gender recognition accuracy by 2–5% and 5–10%, respectively.
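The two ideas named in the abstract, frame-level decision fusion and age prediction as an expectation over the most probable ages, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the plain averaging of per-frame posteriors (a simple sum rule standing in for the Dempster-Shafer committee), the function names `fuse_frames` and `expected_age_top_k`, and the parameters `k` and `min_age` are all assumptions made for illustration.

```python
def fuse_frames(frame_probs):
    """Average per-frame posterior vectors (simple sum-rule fusion).

    Assumption: this stands in for the paper's fusion step; it is not
    the Dempster-Shafer classifier committee mentioned in the abstract.
    """
    if not frame_probs:
        raise ValueError("at least one frame is required")
    n_frames = len(frame_probs)
    n_classes = len(frame_probs[0])
    return [sum(p[c] for p in frame_probs) / n_frames for c in range(n_classes)]


def expected_age_top_k(age_probs, k=3, min_age=0):
    """Expected age over the k most probable ages, with renormalized weights.

    Assumptions: index i of age_probs corresponds to age (min_age + i),
    and k is a hypothetical tuning parameter.
    """
    # Rank age indices by posterior probability, highest first.
    ranked = sorted(range(len(age_probs)), key=lambda a: age_probs[a], reverse=True)
    top = ranked[:k]
    total = sum(age_probs[a] for a in top)
    if total == 0:
        raise ValueError("the selected ages have zero total probability")
    # Weighted mean of the selected ages, weights renormalized to sum to 1.
    return sum((a + min_age) * age_probs[a] for a in top) / total


# Toy example: two frames, four age classes (ages 0..3).
frames = [[0.1, 0.6, 0.3, 0.0],
          [0.2, 0.4, 0.4, 0.0]]
fused = fuse_frames(frames)
predicted_age = expected_age_top_k(fused, k=2)
```

Restricting the expectation to the top `k` ages keeps low-probability tails from pulling the estimate toward the middle of the age range. Setting `k` to the number of classes gives the ordinary expectation over all ages.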
Keywords:
Facial gender and age recognition, classifier fusion, convolutional neural networks (CNN), Dempster-Shafer theory, mobile video analytics
ACKNOWLEDGMENTS
The paper was prepared within the framework of the Academic Fund Program at the National Research University Higher School of Economics (HSE) in 2017–2018 (grant 17-05-0007) and by the Russian Academic Excellence Project “5-100”.