Abstract
Software development projects are notoriously complex and difficult to manage. Several support tools, such as issue-tracking, code-review and Source Control Management (SCM) systems, have been introduced in the past decades to ease development activities. While such tools efficiently track the evolution of a given aspect of the project (e.g., bug reports), they provide only a partial view of the project and often lack advanced querying mechanisms, limiting themselves to command-line or simple GUI support. This is particularly true for projects that rely on Git, the most popular SCM system today.
In this paper, we propose a conceptual schema for Git and an approach that, given a Git repository, exports its data to a relational database in order to (1) promote data integration with other existing SCM tools and (2) enable writing queries on Git data using standard SQL syntax. To ensure efficiency, our approach comes with an incremental propagation mechanism that refreshes the database content with the latest modifications. We have implemented our approach in Gitana, an open-source tool available on GitHub.
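To make the idea concrete, the following is a minimal sketch of exporting commit data to a relational database, refreshing it incrementally, and querying it with standard SQL. It is not Gitana's implementation: the one-table schema, the `refresh` and `commits_per_author` helpers, and the sample commit records are all hypothetical, and the incremental step is approximated with SQLite's `INSERT OR IGNORE` on the commit hash rather than the propagation mechanism described in the paper.

```python
import sqlite3

# Hypothetical, heavily simplified schema (an assumption for illustration);
# the schema proposed in the paper is richer and differs from this one.
SCHEMA = """
CREATE TABLE IF NOT EXISTS commits (
    sha           TEXT PRIMARY KEY,
    author        TEXT NOT NULL,
    authored_date TEXT NOT NULL,
    message       TEXT
)
"""


def refresh(conn, commits):
    """Store only commits that are not in the database yet.

    Returns the number of newly added rows. Commits whose sha already
    exists are skipped, so re-running an import is cheap and idempotent.
    """
    conn.execute(SCHEMA)
    before = conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO commits VALUES (?, ?, ?, ?)", commits
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0]
    return after - before


def commits_per_author(conn):
    """Plain SQL over the exported data: commit count per author."""
    return conn.execute(
        "SELECT author, COUNT(*) AS n FROM commits "
        "GROUP BY author ORDER BY n DESC, author"
    ).fetchall()


# Invented sample data standing in for what would be read from a repository.
first_import = [
    ("a1", "alice", "2015-01-10", "Initial commit"),
    ("b2", "bob", "2015-01-11", "Add parser"),
]
later_import = first_import + [("c3", "alice", "2015-01-12", "Fix bug")]

conn = sqlite3.connect(":memory:")
added_first = refresh(conn, first_import)   # initial export
added_later = refresh(conn, later_import)   # only the new commit is stored
ranking = commits_per_author(conn)
```

Once the data sits in ordinary tables, questions such as "who commits most?" become a single `GROUP BY` query, and the same tables could be joined with data exported from other tools.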
Notes
- 1. GHTorrent and Gerrie schemas are available at http://ghtorrent.org/files/schema.png and http://gerrie.readthedocs.org/en/latest/database/#schema, respectively.
- 2.
- 3. The bus factor of a project is typically defined as the number of key developers who would need to be incapacitated (i.e., hit by a bus) to make the project unable to continue.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Cosentino, V., Izquierdo, J.L.C., Cabot, J. (2015). Gitana: A SQL-Based Git Repository Inspector. In: Johannesson, P., Lee, M., Liddle, S., Opdahl, A., Pastor López, Ó. (eds) Conceptual Modeling. ER 2015. Lecture Notes in Computer Science, vol 9381. Springer, Cham. https://doi.org/10.1007/978-3-319-25264-3_24
DOI: https://doi.org/10.1007/978-3-319-25264-3_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25263-6
Online ISBN: 978-3-319-25264-3
eBook Packages: Computer Science, Computer Science (R0)