Inferring Grammar Rules of Programming Language Dialects
In this paper we address the problem of grammatical inference in the programming language domain. The grammar of a programming language is an important asset because it is used in developing many software engineering tools. Sometimes, grammars of languages are not available and have to be inferred from the source code; especially in the case of programming language dialects. We propose an approach for inferring the grammar of a programming language when an incomplete grammar along with a set of correct programs is given as input. The approach infers a set of grammar rules such that the addition of these rules makes the initial grammar complete. A grammar is complete if it parses all the input programs successfully. We also proposes a rule evaluation order, i.e. an order in which the rules are evaluated for correctness. A set of rules are correct if their addition makes the grammar complete. Experiments show that the proposed rule evaluation order improves the process of grammar inference.
KeywordsProgramming language grammars Dialects Minimum Description Length Principle
Unable to display preview. Download preview PDF.
- 1.Adriaans, P.W.: Language Learning for Categorial Perspective. PhD thesis, University of Amsterdam, Amsterdam, Netherlands (November 1992)Google Scholar
- 2.Aho, A.V., Sethi, R., Ullman, J.D.: Compilers Principles, Techniques, and Tools. Pearson Education (Singapore) Pte. Ltd, London (2002)Google Scholar
- 7.Dubey, A., Aggarwal, S.K., Jalote, P.: A technique for extracting keyword based rules from a set of programs. In: CSMR 2005: Proceedings of the Ninth European Conference on Software Maintenance and Reengineering (CSMR 2005), Manchester, UK, pp. 217–225. IEEE Computer Society, Los Alamitos (2005)CrossRefGoogle Scholar
- 8.Dubey, A., Jalote, P., Aggarwal, S.K.: A deterministic technique for extracting keyword based grammar rules from programs. In: Proceedings of 21st Annual ACM Symposium on Applied Computing, PL track, Dijon, France, pp. 1631–1632. ACM SIGAPP, New York (2006)Google Scholar
- 11.Grünwald, P.: A minimum description length approach to grammar inference. In: Connectionist, Statistical, and Symbolic Approaches to Learning for Natural Language Processing, pp. 203–216. Springer, London (1996)Google Scholar
- 17.Lee, L.: Learning of context-free languages: A survey of the literature. Technical Report TR-12-96, Harvard University (1996), ftp://deas-ftp.harvard.edu/techreports/tr-12-96.ps.gz
- 18.Mernik, M., Gerlic, G., Zumer, V., Bryant, B.: Can a parser be generated from examples? In: Proceedings of 18th ACM symposium on applied computing, pp. 1063–1067. ACM Press, New York (2003)Google Scholar
- 19.Parekh, R., Honovar, V.: Invited Chapter. In: Dale, Moisl, Somers (eds.) Grammar Inference, Automata Induction, and Language Acquision. Marcel Dekker, New York (2000)Google Scholar
- 20.van Zaanen, M.: ABL: Alignment-based learning. In: COLING 2000 - Proceedings of the 18th International Conference on Computational Linguistics, Saarbrücken, Germany, pp. 961–967 (August 2000)Google Scholar