Empirical Software Engineering, Volume 21, Issue 5, pp 2190–2231


Turning the IDE into a self-confident programming assistant
  • Luca Ponzanelli
  • Gabriele Bavota
  • Massimiliano Di Penta
  • Rocco Oliveto
  • Michele Lanza


Developers often require knowledge beyond what they possess, which leads them to ask co-workers for help or to consult additional sources of information, such as Application Programming Interface (API) documentation, forums, and Q&A websites. However, it takes time and energy to formulate one’s problem and to peruse and process the results. We propose a novel approach that, given a context in the Integrated Development Environment (IDE), automatically retrieves pertinent discussions from Stack Overflow, evaluates their relevance using a multi-faceted ranking model, and, if a given confidence threshold is surpassed, notifies the developer. We have implemented our approach in Prompter, an Eclipse plug-in. Prompter was evaluated in two empirical studies. The first study involved 33 participants and aimed at evaluating Prompter’s ranking model. The second study involved 12 participants and aimed at evaluating Prompter’s usefulness in supporting developers during development and maintenance tasks. Since Prompter uses “volatile information” crawled from the web, we also replicated Study I after one year to assess the impact of such “volatility” on recommenders like Prompter. Our results indicate that (i) Prompter’s recommendations were positively evaluated in 74% of the cases on average, (ii) Prompter significantly helps developers improve the correctness of their tasks, by 24% on average, but also (iii) 78% of the provided recommendations are “volatile” and can change over the course of one year. While Prompter proved to be effective, our studies also point out issues that arise when building recommenders based on information available in online forums.


Recommenders, Mining software repositories, Stack Overflow, Empirical studies



Luca Ponzanelli and Michele Lanza thank the Swiss National Science Foundation for its financial support through SNF Project “ESSENTIALS”, No. 153129.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Luca Ponzanelli (1)
  • Gabriele Bavota (2)
  • Massimiliano Di Penta (3)
  • Rocco Oliveto (4)
  • Michele Lanza (1)

  1. REVEAL @ Faculty of Informatics, Università della Svizzera italiana (USI), Lugano, Switzerland
  2. Faculty of Computer Science, Free University of Bozen-Bolzano, Bolzano, Italy
  3. Department of Engineering, University of Sannio, Benevento, Italy
  4. CSSC Lab - Department of Bioscience and Territory, University of Molise, Pesche (IS), Italy
