A Neural-Symbolic Architecture for Inverse Graphics Improved by Lifelong Meta-learning

  • Michael Kissner
  • Helmut Mayer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11824)


We follow the idea of formulating vision as inverse graphics and propose a new type of element for this task, a neural-symbolic capsule. It is capable of de-rendering a scene into semantic information feed-forward, as well as rendering it feed-backward. An initial set of capsules for graphical primitives is obtained from a generative grammar and connected into a full capsule network. Lifelong meta-learning continuously improves this network’s detection capabilities by adding capsules for new and more complex objects it detects in a scene using few-shot learning. Preliminary results demonstrate the potential of our novel approach.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Institute for Applied Computer Science, Bundeswehr University Munich, Neubiberg, Germany
