Transactions on Computational Science XXI, pp. 269–295
Extracting, Identifying and Visualisation of the Content, Users and Authors in Software Projects
Abstract
The paper proposes a method for extracting, identifying and visualising topics, code tiers, users and authors in software projects. In addition to standard information retrieval techniques, we use the abstract syntax tree (AST) of the source code and the WordNet ontology to enrich the document vectors extracted from the parsed code, Latent Semantic Indexing (LSI) to reduce the dimensionality of the resulting vector space, and swarm intelligence, in the form of bee behaviour inspired algorithms, to cluster the documents in that space. We extract topics from the identified clusters and visualise them in 3D graphs. Developers inside and outside the team can use the visualised information extracted from the code and apply it to their own projects. This new level of aggregated 3D visualisation improves refactoring, source code reuse, the implementation of new features and the exchange of knowledge.
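The first stage of such a pipeline, turning parsed code into comparable document vectors, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses Python's standard `ast` module as a stand-in for the parser used in the paper, the helper names (`split_identifier`, `extract_terms`, `tfidf_vectors`, `cosine`) and the sample sources are hypothetical, and the WordNet enrichment, LSI reduction and bee-inspired clustering steps are left out.

```python
import ast
import math
import re
from collections import Counter


def split_identifier(name):
    """Split snake_case and camelCase identifiers into lowercase terms."""
    parts = []
    for chunk in name.split("_"):
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    return [p.lower() for p in parts if p]


def extract_terms(source):
    """Walk the AST and collect terms from function, class, argument and variable names."""
    terms = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            terms.extend(split_identifier(node.name))
        elif isinstance(node, ast.Name):
            terms.extend(split_identifier(node.id))
        elif isinstance(node, ast.arg):
            terms.extend(split_identifier(node.arg))
    return terms


def tfidf_vectors(docs):
    """Map each document name to a sparse {term: weight} vector (smoothed TF-IDF)."""
    n = len(docs)
    df = Counter(t for terms in docs.values() for t in set(terms))
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        vectors[name] = {
            t: count * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical sample "project": three tiny source files.
SOURCES = {
    "billing": "def compute_invoice_total(invoice):\n"
               "    return sum(item.price for item in invoice.items)\n",
    "invoicing": "def printInvoice(invoice):\n"
                 "    print(invoice.total)\n",
    "geometry": "def circle_area(radius):\n"
                "    return 3.14159 * radius * radius\n",
}

VECTORS = tfidf_vectors({name: extract_terms(src) for name, src in SOURCES.items()})

if __name__ == "__main__":
    for other in ("invoicing", "geometry"):
        print("billing vs", other, round(cosine(VECTORS["billing"], VECTORS[other]), 3))
```

In this sketch the two invoice-related files share vocabulary and therefore get a non-zero similarity, while the geometry file shares no terms with them. A clustering step, whatever algorithm is chosen, would operate on vectors of this kind after dimensionality reduction.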
Keywords
Software Project Visualisation, Source Code, WordNet Ontology, Topic Identification and Extraction, Latent Semantic Indexing, Bee Behaviour Inspired Algorithms, AST, Swarm Intelligence, Authorship