Explainable Face Recognition

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12356)


Explainable face recognition (XFR) is the problem of explaining the matches returned by a facial matcher, in order to provide insight into why a probe was matched with one identity over another. In this paper, we provide the first comprehensive benchmark and baseline evaluation for XFR. We define a new evaluation protocol called the “inpainting game”, a curated set of 3648 triplets (probe, mate, nonmate) of 95 subjects, in which the nonmate is created by synthetically inpainting a chosen facial characteristic such as the nose, eyebrows or mouth. For each triplet, an XFR algorithm is tasked with generating a network attention map that best explains which regions in a probe image match with a mated image and not with an inpainted nonmate. This provides ground truth for quantifying which image regions contribute to face matching. Finally, we provide a comprehensive benchmark on this dataset comparing five state-of-the-art XFR algorithms on three facial matchers. This benchmark includes two new algorithms, subtree EBP and Density-based Input Sampling for Explanation (DISE), which outperform the state-of-the-art XFR by a wide margin.
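To make the role of the inpainted region as ground truth concrete, the following is a minimal sketch of how a saliency map could be scored against the known inpainted mask of one triplet. The function names and the top-k scoring rule are illustrative assumptions, not the metric defined in the paper.

```python
import numpy as np


def top_k_mask(saliency: np.ndarray, k: int) -> np.ndarray:
    """Return a boolean mask marking the k most salient pixels."""
    if not 1 <= k <= saliency.size:
        raise ValueError("k must be between 1 and the number of pixels")
    flat = saliency.ravel()
    # argpartition places the indices of the k largest values in the last k slots
    idx = np.argpartition(flat, -k)[-k:]
    mask = np.zeros(flat.shape, dtype=bool)
    mask[idx] = True
    return mask.reshape(saliency.shape)


def localization_score(saliency: np.ndarray, inpainted_region: np.ndarray, k: int) -> float:
    """Fraction of the k most salient pixels that fall inside the inpainted region.

    Hypothetical score for illustration only: `inpainted_region` is a boolean
    mask of the pixels that were synthetically altered to create the nonmate.
    """
    selected = top_k_mask(saliency, k)
    return float((selected & inpainted_region).sum()) / k


# Toy 4x4 example with distinct saliency values (no ties).
saliency = np.arange(16, dtype=float).reshape(4, 4)
region = np.zeros((4, 4), dtype=bool)
region[3, :] = True  # pretend the bottom row was inpainted
score = localization_score(saliency, region, k=4)
```

Under this assumed rule, a map that concentrates its highest values on the altered facial region scores close to 1, while a map that highlights unrelated regions scores close to 0.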



This research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA) under contract number 2019-19022600003. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.




Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Systems and Technology Research, Woburn, USA
  2. Visym Labs, Cambridge, USA
