
Choose Your Neuron: Incorporating Domain Knowledge Through Neuron-Importance

  • Ramprasaath R. Selvaraju
  • Prithvijit Chattopadhyay
  • Mohamed Elhoseiny
  • Tilak Sharma
  • Dhruv Batra
  • Devi Parikh
  • Stefan Lee
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11217)

Abstract

Individual neurons in convolutional neural networks supervised for image-level classification tasks have been shown to implicitly learn semantically meaningful concepts ranging from simple textures and shapes to whole or partial objects, forming a “dictionary” of concepts acquired through the learning process. In this work, we introduce a simple, efficient zero-shot learning approach based on this observation. Our approach, which we call Neuron Importance-Aware Weight Transfer (NIWT), learns to map domain knowledge about novel “unseen” classes onto this dictionary of learned concepts and then optimizes for network parameters that can effectively combine these concepts, in essence learning classifiers by discovering and composing learned semantic concepts in deep networks. Our approach improves over previous approaches on the Caltech-UCSD Birds (CUB) and AWA2 generalized zero-shot learning benchmarks. We demonstrate our approach on a diverse set of semantic inputs as external domain knowledge, including attributes and natural-language captions. Moreover, by learning inverse mappings, NIWT can provide visual and textual explanations for the predictions made by the newly learned classifiers, and can assign names to neurons. Our code is available at https://github.com/ramprs/neuron-importance-zsl.
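The abstract outlines a three-step procedure: measure per-neuron importances for seen classes, learn a mapping from class semantics (attributes or captions) to those importances, and then optimize new classifier weights so that the importances they induce match the predicted ones. The sketch below illustrates that flow in PyTorch on a toy setup. It is a minimal illustration under stated assumptions, not the authors' implementation (see the linked repository for that); the toy CNN, the random stand-in probe images and attribute vectors, and the global-average-pool head are all hypothetical placeholders.

```python
"""Sketch of the NIWT-style procedure summarized above (illustrative only)."""
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

C, ATTR_DIM, NUM_SEEN = 16, 8, 5  # toy dimensions, chosen arbitrarily

class ToyCNN(nn.Module):
    """Small conv trunk with a global-average-pool (GAP) + linear head."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, C, 3, padding=1), nn.ReLU(),
            nn.Conv2d(C, C, 3, padding=1), nn.ReLU())
        self.fc = nn.Linear(C, NUM_SEEN)  # head over seen classes

net = ToyCNN().eval()
images = torch.randn(4, 3, 32, 32)            # stand-in probe images
attrs_seen = torch.randn(NUM_SEEN, ATTR_DIM)  # stand-in seen-class attributes
attrs_unseen = torch.randn(ATTR_DIM)          # stand-in unseen-class attributes

def importances(score, acts):
    # Grad-CAM-style neuron importance: gradient of a class score w.r.t.
    # conv activations, averaged over batch and spatial positions.
    g, = torch.autograd.grad(score.sum(), acts, create_graph=True)
    return g.mean(dim=(0, 2, 3))  # one scalar per channel

# Step 1: measure importances of each *seen* class, then fit a linear map
# from attributes to importances (least squares via the pseudoinverse).
acts = net.features(images).detach().requires_grad_(True)
logits = net.fc(acts.mean(dim=(2, 3)))
alpha_seen = torch.stack(
    [importances(logits[:, k], acts).detach() for k in range(NUM_SEEN)])
W = torch.linalg.pinv(attrs_seen) @ alpha_seen  # [ATTR_DIM, C]

# Step 2: predict the unseen class's importances from its attributes, then
# optimize a new output-weight row so that the importances it induces on
# the probe images match that prediction.
alpha_hat = attrs_unseen @ W
w_new = torch.zeros(C, requires_grad=True)
opt = torch.optim.Adam([w_new], lr=0.1)
for _ in range(200):
    score = acts.mean(dim=(2, 3)) @ w_new     # logit of the new class
    loss = F.mse_loss(importances(score, acts), alpha_hat)
    opt.zero_grad(); loss.backward(); opt.step()
# w_new can now be appended to the classifier head to score the unseen class.
```

Note that for a GAP + linear head like this toy one, a channel's importance is essentially proportional to its output weight, so the optimization in step 2 converges almost immediately; the gradient-matching step earns its keep for heads where the relation between weights and importances is less direct.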

Keywords

Zero-shot learning · Interpretability · Grad-CAM

Acknowledgements

We thank Yash Goyal and Nirbhay Modhe for help with figures; Peter Vajda and Manohar Paluri for helpful discussions. This work was supported in part by NSF, AFRL, DARPA, Siemens, Google, Amazon, ONR YIPs and ONR Grants N00014-16-1-{2713,2793}. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government, or any sponsor.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Ramprasaath R. Selvaraju (1)
  • Prithvijit Chattopadhyay (1)
  • Mohamed Elhoseiny (2)
  • Tilak Sharma (2)
  • Dhruv Batra (1, 2)
  • Devi Parikh (1, 2)
  • Stefan Lee (1)

  1. Georgia Institute of Technology, Atlanta, USA
  2. Facebook, Menlo Park, USA
