codeQuest: Scalable Source Code Queries with Datalog

  • Elnar Hajiyev
  • Mathieu Verbaere
  • Oege de Moor
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4067)


Source code querying tools allow programmers to explore relations between different parts of the code base. This paper describes such a tool, named codeQuest. It combines two previous proposals, namely the use of logic programming and database systems.

As the query language we use safe Datalog, which was originally introduced in the theory of databases. It provides just the right level of expressiveness; in particular, recursion is indispensable for source code queries. Safe Datalog is like Prolog, but all queries are guaranteed to terminate, and there is no need for extra-logical annotations.
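To illustrate why recursion matters and why termination can be guaranteed, the following minimal sketch evaluates a two-rule recursive Datalog program bottom-up over a small call graph. The predicate names `calls` and `reaches`, the sample facts, and the naive fixpoint loop are assumptions made for illustration; they are not taken from codeQuest itself.

```python
# Hypothetical Datalog program being evaluated (illustrative names only):
#   reaches(X, Y) :- calls(X, Y).
#   reaches(X, Z) :- calls(X, Y), reaches(Y, Z).

def transitive_calls(calls):
    """Naive bottom-up evaluation of the two rules above.

    The loop stops at a fixpoint: only pairs of values already present in
    `calls` can ever be derived, so the result set is finite and the
    iteration must end, even when the call graph contains cycles.
    """
    reaches = set(calls)  # first rule: every direct call is reachable
    while True:
        # second rule: join calls(X, Y) with reaches(Y, Z)
        derived = {(x, z)
                   for (x, y) in calls
                   for (y2, z) in reaches
                   if y == y2}
        new = derived - reaches
        if not new:  # nothing new derived: fixpoint reached
            return reaches
        reaches |= new


# Sample facts (made up); note the cycle between parse and lex.
calls = {("main", "parse"), ("parse", "lex"), ("lex", "parse")}
result = transitive_calls(calls)
```

Despite the cycle between `parse` and `lex`, the evaluation stops once no new pairs appear, whereas a depth-first Prolog-style evaluation of the same rules could loop on such input.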

Our implementation of Datalog maps queries to a relational database system. We are thus able to capitalise on the query optimiser provided by such a system. For recursive queries we implement our own optimisations in the translation from Datalog to SQL. Experiments confirm that this strategy yields an efficient, scalable code querying system.
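As a rough sketch of what mapping a recursive Datalog query to SQL can look like, the example below expresses call-graph reachability as a recursive common table expression and runs it on SQLite through Python's standard `sqlite3` module. The table name `calls`, its columns, the sample rows, and the choice of SQLite are all assumptions for illustration; the translation scheme and optimisations actually used by codeQuest are not reproduced here.

```python
import sqlite3

# In-memory database holding a made-up call graph (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [("main", "parse"), ("parse", "lex"), ("lex", "parse")],
)

# Datalog rules being encoded (hypothetical predicate names):
#   reaches(X, Y) :- calls(X, Y).
#   reaches(X, Z) :- calls(X, Y), reaches(Y, Z).
# UNION (rather than UNION ALL) discards duplicate rows, so the recursion
# stops once no new pairs are produced, even on cyclic call graphs.
REACHES_SQL = """
WITH RECURSIVE reaches(caller, callee) AS (
    SELECT caller, callee FROM calls
    UNION
    SELECT c.caller, r.callee
    FROM calls AS c
    JOIN reaches AS r ON c.callee = r.caller
)
SELECT caller, callee FROM reaches ORDER BY caller, callee
"""

rows = conn.execute(REACHES_SQL).fetchall()
for caller, callee in rows:
    print(caller, "->", callee)
```

Delegating the join to the database engine is what lets its query optimiser do the heavy lifting; how the recursive step is best expressed and optimised is the part the paper addresses in its own translation.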


Keywords: Database System, Logic Programming, Query Language, Query Optimiser, Call Graph
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Elnar Hajiyev (1)
  • Mathieu Verbaere (1)
  • Oege de Moor (1)
  1. Programming Tools Group, Oxford University Computing Laboratory, Oxford, UK
