Incremental Learning of People Identities

  • Federico Bartoli
  • Federico Pernici
  • Matteo Bruni
  • Alberto Del Bimbo (email author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11896)

Abstract

Face recognition in unconstrained open-world settings is a challenging problem. Unlike the closed-set and open-set face recognition scenarios, which assume that the face representations of known subjects have been manually enrolled in a gallery, the open-world scenario requires that the system learns identities incrementally from frame to frame, discriminates between known and unknown identities, and automatically enrolls every new identity in the gallery, so that it can recognize that identity every time it is observed again. Real deployments are likely to require performance that scales to a large number of identities. In this paper we discuss the problem and present a system designed to perform effective open-world face recognition in real time at both small-to-moderate and large scale.
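
The open-world loop described in the abstract (match against the gallery, decide known versus unknown, enroll new identities automatically) can be illustrated with a minimal sketch. This is not the authors' system: the class name `OpenWorldGallery`, the cosine-similarity matching rule, the fixed `threshold`, and the integer identity labels are all assumptions made for illustration, and the linear scan over the gallery would not meet the large-scale requirement the abstract mentions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class OpenWorldGallery:
    """Hypothetical incremental gallery; not the method from the paper."""

    def __init__(self, threshold=0.8):
        # Assumed decision rule: a face is "known" if its best cosine
        # similarity to any stored descriptor reaches this threshold.
        self.threshold = threshold
        self.gallery = {}   # identity id -> list of stored descriptors
        self._next_id = 0

    def observe(self, descriptor):
        """Process one face descriptor; return (identity_id, is_new)."""
        best_id, best_sim = None, -1.0
        # Linear scan over every stored descriptor (illustration only).
        for ident, stored in self.gallery.items():
            for d in stored:
                sim = cosine(descriptor, d)
                if sim > best_sim:
                    best_id, best_sim = ident, sim

        if best_id is not None and best_sim >= self.threshold:
            # Known identity: store the new descriptor for that identity.
            self.gallery[best_id].append(list(descriptor))
            return best_id, False

        # Unknown identity: enroll it automatically under a fresh label.
        new_id = self._next_id
        self._next_id += 1
        self.gallery[new_id] = [list(descriptor)]
        return new_id, True
```

In this sketch, each call to `observe` stands for one detected face in one frame. A descriptor close to an enrolled one returns the existing identity, while a dissimilar one creates a new gallery entry without manual enrollment.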

Keywords

Open-world recognition, Incremental learning, Large scale

Notes

Acknowledgments

This research has been partially supported by Leonardo SpA and TICOM, Consorzio Per Le Tecnologie Dell’informazione E Comunicazione, Italy.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Federico Bartoli (1)
  • Federico Pernici (1)
  • Matteo Bruni (1)
  • Alberto Del Bimbo (1, email author)
  1. MICC, Media Integration and Communication Center, Department of Information Engineering, University of Firenze, Florence, Italy
