EMF Patterns of Usage on GitHub

  • Conference paper
Modelling Foundations and Applications (ECMFA 2018)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10890)

Abstract

Mining software repositories is a common activity in software engineering with diverse use cases such as understanding project quality, technology usage, and developer profiles. Such mining activities involve, more often than not, a phase for data extraction from the source code in the repository with recurring tasks such as processing the folder structure (possibly on the timeline), classifying repository artifacts (e.g., in terms of the languages or technologies used), and extracting facts from the artifacts by parsing or otherwise. We describe a new approach for such data extraction; its key pillar is a declarative rule-based language for the uniform, inference-based extraction of facts from the repository (the file system), the artifacts in the repository (their content), and previously extracted facts. All inferred facts are maintained in a triple store. We describe a case study for the purpose of understanding the usage of EMF. To this end, we describe an emerging catalog of patterns of using EMF in repositories and we detect these patterns on GitHub. In our implementation, we use Apache Jena for which we provide dedicated language support tailored towards mining software repositories.
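
The inference-based extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' rule language and it does not use Apache Jena; it only shows the general idea of rules that derive triples from the file system, from file content, and from previously inferred facts until nothing new can be derived. The repository snapshot, the predicate names (`type`, `elementOf`, `refersTo`), and the classification by file extension are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical repository snapshot: path -> file content (illustrative only).
REPO = {
    "model/library.ecore": '<ecore:EPackage name="library"/>',
    "model/library.genmodel": '<genmodel:GenModel ecorePackage="library.ecore"/>',
    "src/library/Book.java": "package library; public interface Book {}",
}


def rule_files(facts, repo):
    """File-system rule: every path in the repository is a File."""
    return {(path, "type", "File") for path in repo}


def rule_classify(facts, repo):
    """Classification rule: tag known files by extension (assumed mapping)."""
    kinds = {".ecore": "EcoreModel", ".genmodel": "GenModel", ".java": "JavaSource"}
    out = set()
    for subject, predicate, obj in facts:
        if predicate == "type" and obj == "File":
            kind = kinds.get(PurePosixPath(subject).suffix)
            if kind:
                out.add((subject, "elementOf", kind))
    return out


def rule_genmodel_refers(facts, repo):
    """Content rule: a GenModel refers to an Ecore model whose file name it mentions."""
    ecores = {s for (s, p, o) in facts if p == "elementOf" and o == "EcoreModel"}
    out = set()
    for subject, predicate, obj in facts:
        if predicate == "elementOf" and obj == "GenModel":
            for ecore in ecores:
                if PurePosixPath(ecore).name in repo[subject]:
                    out.add((subject, "refersTo", ecore))
    return out


RULES = [rule_files, rule_classify, rule_genmodel_refers]


def infer(repo, rules):
    """Apply all rules repeatedly until no new triples are derived (fixpoint)."""
    facts = set()
    while True:
        new = set()
        for rule in rules:
            new |= rule(facts, repo)
        if new <= facts:
            return facts
        facts |= new


if __name__ == "__main__":
    for triple in sorted(infer(REPO, RULES)):
        print(triple)
```

Keeping every fact as a plain triple means each rule can read what earlier rules produced, which mirrors the abstract's point that facts about the file system, artifact content, and earlier inferences are handled uniformly.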

Notes

  1. https://github.com/softlang/qegal

  2. https://jena.apache.org/

  3. http://www.eclipse.org/Xtext/

Author information

Corresponding author

Correspondence to Ralf Lämmel.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Härtel, J., Heinz, M., Lämmel, R. (2018). EMF Patterns of Usage on GitHub. In: Pierantonio, A., Trujillo, S. (eds) Modelling Foundations and Applications. ECMFA 2018. Lecture Notes in Computer Science, vol 10890. Springer, Cham. https://doi.org/10.1007/978-3-319-92997-2_14

  • DOI: https://doi.org/10.1007/978-3-319-92997-2_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92996-5

  • Online ISBN: 978-3-319-92997-2

  • eBook Packages: Computer Science (R0)
