Extracting, Identifying and Visualisation of the Content, Users and Authors in Software Projects

  • Ivan Polášek
  • Marek Uhlár
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8160)

Abstract

The paper proposes a method for extracting, identifying and visualising topics, code tiers, users and authors in software projects. In addition to standard information retrieval techniques, we use abstract syntax trees (AST) to parse source code, the WordNet ontology to enrich the document vectors extracted from the parsed code, latent semantic indexing (LSI) to reduce their dimensionality, and swarm intelligence, in the form of bee-behaviour-inspired algorithms, to cluster the documents they represent. We extract topics from the identified clusters and visualise them in 3D graphs. Developers inside and outside the teams can use the visualised information extracted from the code and apply it to their own projects. This new level of aggregated 3D visualisation improves refactoring, source code reuse, the implementation of new features and knowledge exchange.
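
The first stage of such a pipeline, turning parsed code into weighted document vectors, can be sketched as follows. This is a minimal illustration and not the authors' implementation: Python's standard `ast` module stands in for whatever parser the paper uses, the corpus `SOURCES` and the helper names `split_identifier`, `extract_terms`, `tfidf` and `cosine` are hypothetical, and the WordNet enrichment, LSI reduction and bee-inspired clustering steps are left out entirely.

```python
import ast
import math
import re
from collections import Counter

# Hypothetical toy corpus: file names and contents are invented for illustration.
SOURCES = {
    "billing.py": "def compute_invoice_total(invoice_lines):\n"
                  "    return sum(line.price for line in invoice_lines)\n",
    "report.py": "def printInvoiceSummary(invoice):\n"
                 "    total = invoice.total_price\n"
                 "    return str(total)\n",
    "auth.py": "def check_user_password(user, password):\n"
               "    return user.password_hash == hash(password)\n",
}


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def extract_terms(source):
    """Walk the AST of `source` and count the terms found in its identifiers."""
    terms = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            identifier = node.name
        elif isinstance(node, ast.Name):
            identifier = node.id
        elif isinstance(node, ast.arg):
            identifier = node.arg
        elif isinstance(node, ast.Attribute):
            identifier = node.attr
        else:
            continue
        terms.update(split_identifier(identifier))
    return terms


def tfidf(term_counts):
    """Map {document: Counter} to {document: {term: weight}} using a smoothed IDF."""
    n_docs = len(term_counts)
    # Iterating a Counter yields each distinct term once per document.
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    return {
        doc: {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in counts.items()
        }
        for doc, counts in term_counts.items()
    }


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    vectors = tfidf({name: extract_terms(src) for name, src in SOURCES.items()})
    for other in ("report.py", "auth.py"):
        print("billing.py vs", other, round(cosine(vectors["billing.py"], vectors[other]), 3))
```

In this toy corpus the two invoice-related files share terms and the authentication file shares none with them, which is the kind of signal a later dimensionality-reduction and clustering stage would work from.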

Keywords

Software Project Visualisation; Source Code; WordNet Ontology; Topic Identification and Extraction; Latent Semantic Indexing; Bee Behaviour Inspired Algorithms; AST; Swarm Intelligence; Authorship

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ivan Polášek (1)
  • Marek Uhlár (1)
  1. Faculty of Informatics and Information Technology, Slovak University of Technology in Bratislava, Slovakia
