International Conference on Conceptual Modeling

Conceptual Modeling, pp 329-343

Gitana: A SQL-Based Git Repository Inspector

  • Valerio Cosentino
  • Javier Luis Cánovas Izquierdo
  • Jordi Cabot
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9381)


Software development projects are notoriously complex and difficult to manage. Several support tools, such as issue tracking, code review and Source Control Management (SCM) systems, have been introduced in recent decades to ease development activities. While such tools efficiently track the evolution of a given aspect of the project (e.g., bug reports), they provide only a partial view of the project and often lack advanced querying mechanisms, limiting themselves to command-line or simple GUI support. This is particularly true for projects that rely on Git, the most popular SCM system today.

In this paper, we propose a conceptual schema for Git and an approach that, given a Git repository, exports its data to a relational database in order to (1) promote data integration with other existing SCM tools and (2) enable writing queries on Git data using standard SQL syntax. To ensure efficiency, our approach comes with an incremental propagation mechanism that refreshes the database content with the latest modifications. We have implemented our approach in Gitana, an open-source tool available on GitHub.
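To make the idea of querying Git data with standard SQL concrete, here is a minimal sketch. It is not Gitana's schema or implementation: the single `commits` table, the function names (`open_db`, `import_commits`, `commits_per_author`), the use of SQLite, and the sample records are all assumptions made for illustration. Commit records are passed in as plain tuples standing in for data read from a repository. `INSERT OR IGNORE` on the primary key serves as a crude stand-in for incremental refresh: re-running the import stores only commits that are not yet in the database.

```python
import sqlite3

# Hypothetical, simplified schema: one table of commits keyed by SHA.
SCHEMA = """
CREATE TABLE IF NOT EXISTS commits (
    sha         TEXT PRIMARY KEY,
    author      TEXT NOT NULL,
    authored_at TEXT NOT NULL,
    message     TEXT NOT NULL
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def count_commits(conn):
    """Return the number of commits currently stored."""
    return conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0]


def import_commits(conn, commits):
    """Store commits that are not yet in the database.

    `commits` is an iterable of (sha, author, authored_at, message) tuples.
    Rows whose SHA is already present are skipped, so the function can be
    re-run on a growing history. Returns the number of rows added.
    """
    before = count_commits(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO commits (sha, author, authored_at, message) "
        "VALUES (?, ?, ?, ?)",
        commits,
    )
    conn.commit()
    return count_commits(conn) - before


def commits_per_author(conn):
    """Example query: number of commits per author, most active first."""
    return conn.execute(
        "SELECT author, COUNT(*) AS n FROM commits "
        "GROUP BY author ORDER BY n DESC, author"
    ).fetchall()


if __name__ == "__main__":
    # Invented sample data; a real importer would read these from Git.
    history = [
        ("a1", "alice", "2015-01-01", "initial import"),
        ("b2", "bob", "2015-01-02", "fix typo"),
    ]
    conn = open_db()
    print("added:", import_commits(conn, history))
    history.append(("c3", "alice", "2015-01-03", "add feature"))
    print("added on refresh:", import_commits(conn, history))
    print(commits_per_author(conn))
```

A schema covering more of Git (files, branches, references, file modifications) would need more tables and foreign keys; the single table here only shows the export-then-query pattern.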


Keywords: Git, SQL, Conceptual schema



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Valerio Cosentino (1)
  • Javier Luis Cánovas Izquierdo (1, 2)
  • Jordi Cabot (2, 3)

  1. AtlanMod Team, Inria, Mines Nantes, LINA, Nantes, France
  2. UOC, Barcelona, Spain
  3. ICREA, Barcelona, Spain
