Fostering Compositionality in Latent, Generative Encodings to Solve the Omniglot Challenge

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 12892)

Abstract

The ability to develop representations of components and to recombine them in new but compositionally meaningful ways is considered a hallmark of human cognition that machines have not yet attained. The Omniglot challenge targets this deficit by posing several one-shot and few-shot generation and classification tasks on handwritten character trajectories. In contrast to the original approach, which provided character components, we investigated how compositional representations can develop naturally within a generative LSTM model. We examine the network’s performance and the underlying mechanisms on the original Omniglot dataset and on our own, more representative dataset. We show that solving the challenge becomes possible because, during training, the designed LSTM network fosters the learning of compositional representations, which it can quickly reassemble into new, unseen but related character trajectories. Evidence comes from several experiments, including a t-SNE analysis of the system’s latent states that reveals the emergent compositional structures, and an evaluation of the network’s performance when training and test alphabets do or do not share components. Overall, we show how compositionality can be fostered in latent, generative encodings, thus improving machine learning by further aligning technical methods with cognitive mechanisms in humans.

Keywords

  • Omniglot challenge
  • Characters challenge
  • Compositionality
  • Efficient learning
  • Generative RNN
  • LSTM
  • One-shot inference mechanism

Notes

  1. Similar results were found for other test alphabets. A deeper analysis of these interactions is beyond the scope of this paper. Test Burmese: Training Balinese (0.273) < Greek (0.280) < Latin (0.299); Test Latin: Training Greek / Burmese (0.230) < Balinese (0.251); Test Greek: Training Latin (0.329) < Burmese (0.334) < Balinese (0.339).

Acknowledgements

We thank Marcel Molière for help with the t-SNE plots, Thilo Hagendorff for helpful comments on the manuscript, and Maximus Mutschler for maintaining the GPU cluster of the BMBF-funded project Training Center for Machine Learning, on which the results were computed. This research was funded by the German Research Foundation (DFG) within Priority Program SPP 2134, project “Development of the agentive self” (BU 1335/11-1, EL 253/8-1). MB is part of the Machine Learning Cluster of Excellence, EXC number 2064/1 – Project number 390727645.

Author information

Corresponding author

Correspondence to Sarah Fabi.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Fabi, S., Otte, S., Butz, M.V. (2021). Fostering Compositionality in Latent, Generative Encodings to Solve the Omniglot Challenge. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science, vol 12892. Springer, Cham. https://doi.org/10.1007/978-3-030-86340-1_42

  • DOI: https://doi.org/10.1007/978-3-030-86340-1_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86339-5

  • Online ISBN: 978-3-030-86340-1

  • eBook Packages: Computer Science, Computer Science (R0)