Abstract
Our current understanding of how programmers perform feature location during software maintenance is based on controlled studies or interviews, which are inherently limited in size, scope and realism. Replicating controlled studies in the field can both explore the findings of these studies in wider contexts and study new factors that have not been previously encountered in the laboratory setting. In this paper, we report on a field study of how software developers perform feature location within source code during their daily development activities. Our study is based on two complementary field data sets: one that reflects the complete IDE activity of 67 professional developers over approximately one month, and another that reflects usage of an IR-based code search tool by nearly 600 developers. Analyzing this data, we report results on how often developers use each type of code search tool, on the types of queries and retrieval strategies used by developers, and on patterns of developer feature location behavior following code search. The results of the study suggest that there is (1) a need for helping developers to devise better code search queries; (2) a lack of adoption of niche code search tools; (3) a need for code search tools to handle both lookup and exploratory queries; and (4) a need for better integration between code search, structured navigation, and debugging tools in feature location tasks.
Notes
In this paper, we use the Visual Studio term solution to refer to a software project or a code base. A solution is a container consisting of one or more Visual Studio projects, each of which, in turn, contains a number of source code files.
Sando usage data spanning from 05/2013 to 06/2014 was included in the dataset.
The interaction monitoring extension, called Blaze, is implemented by researchers at ABB, Inc. Its name is the reason we refer to this dataset as such.
Editing sessions were identified by applying the session clustering algorithm to editing events.
Acknowledgements
The authors gratefully acknowledge developers at ABB, Inc. and users of the Sando search tool who allowed anonymous data collection during their daily work. We also acknowledge Will Snipes for collecting and sharing the Blaze dataset and data collection tool with us.
Additional information
Communicated by: Andrea De Lucia
Appendix A: List of Relevant Events in Blaze and Sando Datasets
About this article
Cite this article
Damevski, K., Shepherd, D. & Pollock, L. A field study of how developers locate features in source code. Empir Software Eng 21, 724–747 (2016). https://doi.org/10.1007/s10664-015-9373-9
DOI: https://doi.org/10.1007/s10664-015-9373-9