Machine Learning, Volume 108, Issue 8–9, pp 1653–1675

The teaching size: computable teachers and learners for universal languages

  • Jan Arne Telle
  • José Hernández-Orallo
  • Cèsar Ferri
Part of the following topical collections:
  1. Special Issue of the ECML PKDD 2019 Journal Track


Abstract

The theoretical hardness of machine teaching has usually been analyzed for a range of concept languages under several variants of the teaching dimension: the minimum number of examples that a teacher needs to figure out so that the learner identifies the concept. However, for languages where concepts have structure (and hence size), such as Turing-complete languages, a low teaching dimension can be achieved at the cost of using very large examples, which are hard for the learner to process. In this paper we introduce the teaching size, a more intuitive way of assessing the theoretical feasibility of teaching concepts for structured languages. In the most general case of universal languages, we show that, by focusing on the total size of a witness set rather than its cardinality, we can teach all total functions that are computable within some fixed time bound. We complement the theoretical results with a range of experimental results on a simple Turing-complete language, showing how teaching dimension and teaching size differ in practice. Quite remarkably, we found that witness sets are usually smaller than the programs they identify, which is an illuminating justification of why machine teaching from examples makes sense at all.
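The contrast between counting examples and measuring their total size can be illustrated with a minimal sketch. It is not the paper's formal definition: the names `cardinality` and `total_size` are hypothetical, and size is assumed here to be the plain number of symbols in each input and output string, whereas the paper may use a different encoding.

```python
from typing import List, Tuple

# An example is an (input, output) pair of strings.
Example = Tuple[str, str]


def cardinality(witness: List[Example]) -> int:
    """Number of examples: the quantity a teaching-dimension view counts."""
    return len(witness)


def total_size(witness: List[Example]) -> int:
    """Total symbol count over all inputs and outputs.

    Assumption: size is raw string length, chosen only for illustration.
    """
    return sum(len(inp) + len(out) for inp, out in witness)


# A witness set with a single, very long example.
few_but_long = [("a" * 40, "b" * 40)]

# A witness set with more examples, each of them short.
many_but_short = [("a", "b"), ("aa", "bb"), ("", "")]

print(cardinality(few_but_long), total_size(few_but_long))      # 1 80
print(cardinality(many_but_short), total_size(many_but_short))  # 3 6
```

Under this assumed size measure, the first set looks better by cardinality (1 versus 3) but worse by total size (80 versus 6), which is the kind of divergence the abstract describes.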


Keywords

Machine teaching, Teaching dimension, Teaching size, Compression, Universal languages, P″ programming language, Levin’s search



Acknowledgements

We would like to thank the anonymous referees for their helpful comments. This work was supported by the EU (FEDER) and the Spanish MINECO under grant RTI2018-094403-B-C32, and the Generalitat Valenciana PROMETEO/2019/098. This work was done while the first author visited Universitat Politècnica de València and also while the third author visited the University of Bergen (covered by Generalitat Valenciana BEST/2018/027 and University of Bergen). J. Hernández-Orallo is also funded by an FLI grant RFP2-152.



Copyright information

© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Informatics, University of Bergen, Bergen, Norway
  2. VRAIN, Universitat Politècnica de València, València, Spain
