Abstract
Concept location in source code is an essential activity during software change. It starts with a change request and ends with the place in the source code where the change is to be implemented. As a program comprehension activity, it is also part of other software evolution tasks, such as bug localization, recovery of traceability links between software artifacts, and retrieval of software components for reuse. While concept location is primarily a human activity, tool support is necessary given the large amount of information encoded in source code. Many such tools rely on text retrieval techniques and help developers perform concept location much like document retrieval on the web. This paper presents and discusses the applications of text retrieval to support concept location in the context of software change.
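To make the analogy with document retrieval concrete, the following is a minimal sketch, not the chapter's own method: each method body is treated as a document, identifiers are split on camelCase, and methods are ranked against a change-request query by TF-IDF cosine similarity. The function names (`tokenize`, `rank`), the toy corpus, and the smoothed IDF formula are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split identifiers on camelCase boundaries, keep letter runs, lowercase."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def rank(query, corpus):
    """Rank corpus entries (name -> source text) by TF-IDF cosine similarity.

    The smoothed IDF used here is one common choice, assumed for illustration.
    """
    docs = {name: Counter(tokenize(text)) for name, text in corpus.items()}
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for counts in docs.values() for term in counts)
    idf = {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}

    def weigh(counts):
        # Terms unseen in the corpus carry no weight.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = weigh(Counter(tokenize(query)))
    scores = {name: cosine(q, weigh(counts)) for name, counts in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical corpus: three methods standing in for a code base.
corpus = {
    "saveDocument": "void saveDocument(File file) { writer.write(file, documentText); }",
    "printDocument": "void printDocument(Printer printer) { printer.print(documentText); }",
    "openFile": "Document openFile(String path) { return reader.read(path); }",
}

# The query plays the role of a change request written in natural language.
results = rank("print the document on a printer", corpus)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In this sketch the developer would inspect the top-ranked methods first and reformulate the query if none of them is relevant, which mirrors the search-and-refine loop familiar from web search.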
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Marcus, A., Haiduc, S. (2013). Text Retrieval Approaches for Concept Location in Source Code. In: De Lucia, A., Ferrucci, F. (eds) Software Engineering. ISSSE 2009, ISSSE 2010, ISSSE 2011. Lecture Notes in Computer Science, vol 7171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36054-1_5
DOI: https://doi.org/10.1007/978-3-642-36054-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36053-4
Online ISBN: 978-3-642-36054-1
eBook Packages: Computer Science (R0)