Gitana: A SQL-Based Git Repository Inspector

  • Conference paper
Conceptual Modeling (ER 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9381)

Abstract

Software development projects are notoriously complex and difficult to deal with. Several support tools, such as issue tracking, code review and Source Control Management (SCM) systems, have been introduced in the past decades to ease development activities. While such tools efficiently track the evolution of a given aspect of the project (e.g., bug reports), they provide only a partial view of the project and often lack advanced querying mechanisms, limiting themselves to command-line or simple GUI support. This is particularly true for projects that rely on Git, the most popular SCM system today.

In this paper, we propose a conceptual schema for Git and an approach that, given a Git repository, exports its data to a relational database in order to (1) promote data integration with other existing SCM tools and (2) enable writing queries on Git data using standard SQL syntax. To ensure efficiency, our approach comes with an incremental propagation mechanism that refreshes the database content with the latest modifications. We have implemented our approach in Gitana, an open-source tool available on GitHub.
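
The idea of exporting commit data to a relational database, refreshing it incrementally and querying it with standard SQL can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the single `commits` table, its columns, the helper names `import_commits` and `commits_per_author`, and the hand-written sample commits are hypothetical and do not reproduce Gitana's actual schema or code. SQLite stands in for the target database, and the incremental step is approximated by skipping commits whose SHA is already stored.

```python
import sqlite3

# Hypothetical, heavily simplified schema (NOT Gitana's actual schema).
SCHEMA = """
CREATE TABLE IF NOT EXISTS commits (
    sha         TEXT PRIMARY KEY,
    author      TEXT NOT NULL,
    authored_at TEXT NOT NULL,
    message     TEXT NOT NULL
)
"""


def import_commits(conn, commits):
    """Store commits whose SHA is not yet in the database.

    Returns the number of newly added rows, so a repeated import only
    propagates commits that appeared since the previous run.
    """
    conn.execute(SCHEMA)
    before = conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0]
    # INSERT OR IGNORE skips rows that would violate the primary key.
    conn.executemany(
        "INSERT OR IGNORE INTO commits (sha, author, authored_at, message) "
        "VALUES (?, ?, ?, ?)",
        commits,
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0]
    return after - before


def commits_per_author(conn):
    """Example of a plain SQL query over the exported data."""
    return conn.execute(
        "SELECT author, COUNT(*) AS n FROM commits "
        "GROUP BY author ORDER BY n DESC, author"
    ).fetchall()


# Made-up sample data standing in for commits read from a repository.
first_snapshot = [
    ("a1", "alice", "2015-01-01", "initial import"),
    ("b2", "bob", "2015-01-02", "fix typo"),
]
later_snapshot = first_snapshot + [
    ("c3", "alice", "2015-01-03", "add feature"),
]

conn = sqlite3.connect(":memory:")
import_commits(conn, first_snapshot)   # full import
import_commits(conn, later_snapshot)   # only the unseen commit is added
for author, count in commits_per_author(conn):
    print(author, count)
```

In a real setting the tuples would come from a Git library rather than a literal list, and the schema would need further tables (files, branches, references) to be useful for integration with other tools.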

Notes

  1. GHTorrent and Gerrie schemas are available at: http://ghtorrent.org/files/schema.png and http://gerrie.readthedocs.org/en/latest/database/#schema, respectively.

  2. https://pypi.python.org/pypi/GitPython.

  3. The bus factor of a project is typically defined as the number of key developers who would need to be incapacitated, i.e., hit by a bus, to make the project unable to continue.

Author information

Correspondence to Valerio Cosentino.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Cosentino, V., Izquierdo, J.L.C., Cabot, J. (2015). Gitana: A SQL-Based Git Repository Inspector. In: Johannesson, P., Lee, M., Liddle, S., Opdahl, A., Pastor López, Ó. (eds) Conceptual Modeling. ER 2015. Lecture Notes in Computer Science, vol 9381. Springer, Cham. https://doi.org/10.1007/978-3-319-25264-3_24

  • DOI: https://doi.org/10.1007/978-3-319-25264-3_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25263-6

  • Online ISBN: 978-3-319-25264-3

  • eBook Packages: Computer Science, Computer Science (R0)
