Using MDL for Grammar Induction

  • Pieter Adriaans
  • Ceriel Jacobs
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4201)


In this paper we study the application of the Minimum Description Length principle (or two-part-code optimization) to grammar induction in the light of recent developments in Kolmogorov complexity theory. We focus on issues that are important for construction of effective compression algorithms. We define an independent measure for the quality of a theory given a data set: the randomness deficiency. This is a measure of how typical the data set is for the theory. It can not be computed, but it can in many relevant cases be approximated. An optimal theory has minimal randomness deficiency. Using results from [4] and [2] we show that:

– Shorter code not necessarily leads to better theories. We prove that, in DFA induction, already as a result of a single deterministic merge of two nodes, divergence of randomness deficiency and MDL code can occur.

– Contrary to what is suggested by the results of [6] there is no fundamental difference between positive and negative data from an MDL perspective.

– MDL is extremely sensitive to the correct calculation of code length: model code and data-to-model code.

These results show why the applications of MDL to grammar induction so far have been disappointing. We show how the theoretical results can be deployed to create an effective algorithm for DFA induction. However, we believe that, since MDL is a global optimization criterion, MDL based solutions will in many cases be less effective in problem domains where local optimization criteria can be easily calculated. The algorithms were tested on the Abbadingo problems ([10]). The code was in Java, using the Satin ([17]) divide-and-conquer system that runs on top of the Ibis ([18]) Grid programming environment.


Binary String Regular Language Kolmogorov Complexity Short Code Minimum Description Length Principle 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)MATHGoogle Scholar
  2. 2.
    Adriaans, P., Vitányi, P.M.B.: The power and perils of MDL, Human Computer Studies Lab, Universiteit van Amsterdam (2005)Google Scholar
  3. 3.
    Li, M., Vitányi, P.M.B.: An Introduction to Kolmogorov Complexity and Its Applications, 2nd edn. Springer, New York (1997)MATHGoogle Scholar
  4. 4.
    Vereshchagin, N.K., Vitányi, P.M.B.: Kolmogorov’s structure functions and model selection. IEEE Trans. Information Theory 50(12), 3265–3290 (2004)CrossRefGoogle Scholar
  5. 5.
    Grünwald, P.D., Langford, J.: Suboptimal behaviour of Bayes and MDL in classification under misspecification. In: Shawe-Taylor, J., Singer, Y. (eds.) COLT 2004. LNCS (LNAI), vol. 3120, pp. 331–347. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  6. 6.
    Gold, E.M.: Mark, Language Identification in the Limit. Information and Control 10(5), 447–474 (1967)MATHCrossRefGoogle Scholar
  7. 7.
    Pitt, L., Warmuth, M.K.: The Minimum Consistent DFA Problem Cannot be Approximated within any Polynomial. Journal of the ACM 40(1), 95–142 (1993)MATHCrossRefMathSciNetGoogle Scholar
  8. 8.
    Adriaans, P., Vervoort, M.: The EMILE 4.1 grammar induction toolbox, in Grammatical Inference: Algorithms and Applications. In: Adriaans, P.W., Fernau, H., van Zaanen, M. (eds.) ICGI 2002. LNCS (LNAI), vol. 2484, pp. 293–295. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  9. 9.
    Vervoort, M.: Games, walks and Grammars, Thesis University of Amsterdam (2000)Google Scholar
  10. 10.
    Lang, K.J., Pearlmutter, B.A., Price, R.A.: Results of the Abbadingo One DFA learning competition and a new evidence-driven state merging algorithm. In: Adriaans, P.W., Fernau, H., van Zaanen, M. (eds.) ICGI 2002. LNCS (LNAI), vol. 2484, pp. 1–12. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  11. 11.
    van Zaanen, M., Adriaans, P.: Alignment-Based Learning versus EMILE: A Comparison. In: Proceedings of the Belgian-Dutch Conference on Artificial Intelligence (BNAIC), Amsterdam, The Netherlands, pp. 315–322 (2001)Google Scholar
  12. 12.
    Solan, Z., Horn, D., Ruppin, E., Edelman, S.: Unsupervised learning of natural languages. PNAS 102(33), 11629–11634 (2005)CrossRefGoogle Scholar
  13. 13.
    Curnéjols, A., Miclet, L.: Apprentissage artificiel, concepts et algorithmes, Eyrolles (2003)Google Scholar
  14. 14.
    Wolff, J.G.: Computing As Compression: An Overview of the SP Theory and System. New Generation Comput. 13(2), 187–214 (1995)CrossRefGoogle Scholar
  15. 15.
    Wolff, J.G.: Information Compression by Multiple Alignment, Unification and Search as a Unifying Principle in Computing and Cognition. Journal of Artificial Intelligence Research 19(3), 193–230 (2003)CrossRefGoogle Scholar
  16. 16.
    de la Higuera, C., Adriaans, P. W., van Zaanen, M., Oncina, J.(eds.): Proceedings of the Workshop and Tutorial on Learning Context-Free Grammars held at the 14th European Conference on Machine Learning (ECML) and the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), Dubrovnik, Croatia (2003)Google Scholar
  17. 17.
    van Nieuwpoort, R.V., Maassen, J., Kielmann, T., Bal, H.E.: Simple and Efficient Java-based Grid Programming. Scalable Computing: Practice and Experience 6(3), 19–32 (2005)Google Scholar
  18. 18.
    van Nieuwpoort, R.V., Maassen, J., Wrzesinska, G., Hofman, R., Jacobs, C., Kielmann, T., Bal, H.E.: Ibis: a Flexible and Efficient Java based Grid Programming Environment. Concurrency and Computation: Practice and Experience 17(7-8), 1079–1107 (2005)CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Pieter Adriaans
    • 1
  • Ceriel Jacobs
    • 2
  1. 1.Department of Computer ScienceUniversity of AmsterdamAmsterdamThe Netherlands
  2. 2.Department of Computer ScienceVrije Universiteit AmsterdamAmsterdamThe Netherlands

Personalised recommendations