Conclave: Ontology-Driven Measurement of Semantic Relatedness between Source Code Elements and Problem Domain Concepts

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8584)

Abstract

Software maintainers are often required to change source code in unfamiliar programs, either to improve the system or to eliminate defects. These tasks require a sufficient understanding of the system (or at least a small part of it). One of the most time-consuming steps in this process is locating the parts of the code responsible for a given functionality or feature. Feature (or concept) location techniques address this problem.

This paper introduces Conclave, an environment for software analysis, and in particular the Conclave-Mapper tool, which provides a feature location facility. The tool analyzes the natural language terms used in programs (e.g. function and variable names) and, using textual analysis and a collection of Natural Language Processing techniques, computes sets of synonymous terms. These sets are used to score the relatedness between program elements and search queries or problem domain concepts, producing ranked lists of the program elements that address the search criteria or concepts. An empirical study evaluating the underlying feature location technique is also discussed.
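To make the idea of synonym-set scoring concrete, the following is a minimal sketch and not the Conclave-Mapper implementation. It assumes a tiny hand-written synonym table (`SYNONYMS`), a regex-based identifier splitter, and Jaccard overlap as the relatedness score. All of these names and choices are hypothetical stand-ins for the textual analysis and NLP techniques the abstract mentions.

```python
import re

# Hypothetical toy synonym table (an assumption for illustration only);
# a real tool would derive such sets from language resources.
SYNONYMS = {
    "remove": {"delete", "erase"},
    "delete": {"remove", "erase"},
    "erase": {"remove", "delete"},
    "user": {"account"},
    "account": {"user"},
}


def split_identifier(name):
    """Split snake_case / camelCase identifiers (or plain text) into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def expand(terms):
    """Return the terms together with their known synonyms."""
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded


def relatedness(query, identifier):
    """Jaccard overlap between the expanded term sets (an assumed scoring choice)."""
    q = expand(split_identifier(query))
    e = expand(split_identifier(identifier))
    union = q | e
    return len(q & e) / len(union) if union else 0.0


def rank(query, identifiers):
    """Sort identifiers by descending relatedness to the query, ties by name."""
    scored = [(relatedness(query, name), name) for name in identifiers]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    for score, name in rank("delete user", ["print_report", "eraseFile", "removeAccount"]):
        print(f"{score:.2f}  {name}")
```

In this sketch the query "delete user" ranks `removeAccount` first even though the two share no literal term, which is the vocabulary-mismatch case that synonym expansion is meant to handle.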

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Carvalho, N.R., Almeida, J.J., Henriques, P.R., Pereira, M.J.V. (2014). Conclave: Ontology-Driven Measurement of Semantic Relatedness between Source Code Elements and Problem Domain Concepts. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2014. ICCSA 2014. Lecture Notes in Computer Science, vol 8584. Springer, Cham. https://doi.org/10.1007/978-3-319-09153-2_9

  • DOI: https://doi.org/10.1007/978-3-319-09153-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09152-5

  • Online ISBN: 978-3-319-09153-2

  • eBook Packages: Computer Science (R0)
