Optical Memory and Neural Networks, Volume 27, Issue 1, pp. 23–31

Fuzzy Analysis and Deep Convolution Neural Networks in Still-to-video Recognition

  • A. V. Savchenko
  • N. S. Belova
  • L. V. Savchenko


We discuss the video classification problem based on matching feature vectors extracted from each frame using deep convolutional neural networks. We propose a novel recognition method based on representation of each frame as a sequence of fuzzy sets of reference classes whose degrees of membership are defined based on the asymptotic distribution of the Kullback–Leibler information divergence and its relation to the maximum likelihood method. To increase the classification accuracy, we perform the fuzzy intersection (product triangular norm) of these sets. An experimental study with the YTF (YouTube Faces) and IJB-A (IARPA Janus Benchmark A) video datasets and the VGGFace, ResFace and LightCNN descriptors shows that the proposed approach increases recognition accuracy by 2–6% compared with known classification methods.
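
The idea of intersecting per-frame fuzzy sets with the product triangular norm can be illustrated with a minimal Python sketch. It is not the authors' implementation: the membership formula used here (a softmax of the negative, scaled Kullback–Leibler divergence), the `scale` parameter, the function names, and the toy feature vectors are all assumptions made for illustration, since the abstract does not state the exact expressions.

```python
import math


def normalize(v):
    """Scale a non-negative feature vector so that it sums to one."""
    total = sum(v)
    return [x / total for x in v]


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def frame_memberships(frame, references, scale=1.0):
    """Fuzzy set for one frame: one membership degree per reference class.

    ASSUMPTION: membership is a softmax of -scale * KL divergence.
    This is a stand-in, not the formula derived in the paper.
    """
    p = normalize(frame)
    divs = [kl_divergence(p, normalize(r)) for r in references]
    low = min(divs)  # subtract the minimum for numerical stability
    weights = [math.exp(-scale * (d - low)) for d in divs]
    total = sum(weights)
    return [w / total for w in weights]


def classify_video(frames, references, scale=1.0):
    """Intersect the per-frame fuzzy sets with the product t-norm.

    The product is accumulated as a sum of logarithms to avoid underflow.
    Returns the index of the best class and the per-class log scores.
    """
    log_scores = [0.0] * len(references)
    for frame in frames:
        memberships = frame_memberships(frame, references, scale)
        for c, mu in enumerate(memberships):
            log_scores[c] += math.log(mu + 1e-300)
    best = max(range(len(references)), key=log_scores.__getitem__)
    return best, log_scores


# Hypothetical toy data: two reference classes, three video frames.
references = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
frames = [[0.6, 0.3, 0.1], [0.65, 0.2, 0.15], [0.3, 0.3, 0.4]]
best_class, scores = classify_video(frames, references, scale=10.0)
print(best_class, scores)
```

In this toy example the third frame alone is closer to the second class, but the product over all frames still favors the first class, which is the kind of error correction that aggregating frame-level decisions is meant to provide.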


Keywords: image recognition, video recognition, deep learning, convolutional neural networks, statistical pattern recognition, Kullback–Leibler divergence, fuzzy sets





Copyright information

© Allerton Press, Inc. 2018

Authors and Affiliations

  • A. V. Savchenko (1)
  • N. S. Belova (2)
  • L. V. Savchenko (1, 3)

  1. National Research University Higher School of Economics, Laboratory of Algorithms and Technologies for Network Analysis, Nizhny Novgorod, Russia
  2. National Research University Higher School of Economics, Moscow, Russia
  3. Nizhny Novgorod State Linguistic University, Nizhny Novgorod, Russia