A Controlled Experiment on the Process Used by Developers During Internet-Scale Code Search

  • Chapter
Finding Source Code on the Web for Remix and Reuse

Abstract

It has become common practice for developers to search the Web for source code. In this paper, we report on our analysis of a laboratory experiment with 24 subjects. They were given a programming scenario and asked to find source code using five search engines: Google, Koders, Krugle, Google Code Search, and SourceForge. The scenarios varied in the size of the search target (block or subsystem) and the usage intention (as-is reuse or reference example). We looked at how these factors influenced three phases of the search process: query formulation, query revision, and judging relevance. One consistent trend was that searching for reference examples required more effort, as measured by average number of terms per query, average number of queries, clickthrough rate, and time spent. This additional effort paid off in a higher rate of precision for the first ten results.
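The effort and precision measures named in the abstract can be illustrated with a small sketch. It is not the authors' analysis code. The `SearchSession` record layout and all field names are hypothetical, and clickthrough rate is assumed here to mean clicked results divided by displayed results, which the abstract does not define. Precision for the first ten results is taken in the usual sense: the fraction of the first ten results judged relevant.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SearchSession:
    """One subject's session on one search engine (hypothetical layout)."""
    queries: List[str]          # queries in the order they were issued
    results_shown: int          # results displayed across all queries
    results_clicked: int        # results the subject clicked through to
    seconds_spent: float        # time spent on the session
    top10_relevant: List[bool]  # relevance judgments for the first ten results


def avg_terms_per_query(session: SearchSession) -> float:
    """Mean number of whitespace-separated terms per query."""
    if not session.queries:
        return 0.0
    return sum(len(q.split()) for q in session.queries) / len(session.queries)


def clickthrough_rate(session: SearchSession) -> float:
    """Assumed definition: clicked results divided by displayed results."""
    if session.results_shown == 0:
        return 0.0
    return session.results_clicked / session.results_shown


def precision_at_10(session: SearchSession) -> float:
    """Fraction of the first ten results that were judged relevant."""
    top = session.top10_relevant[:10]
    if not top:
        return 0.0
    return sum(top) / len(top)


# Made-up example data, for illustration only.
example = SearchSession(
    queries=["java xml parser", "sax parser example code"],
    results_shown=20,
    results_clicked=3,
    seconds_spent=240.0,
    top10_relevant=[True, False, True, True, False,
                    False, True, False, False, False],
)
print(len(example.queries), avg_terms_per_query(example),
      clickthrough_rate(example), precision_at_10(example))
```

Averaging these per-session values over subjects within each scenario type would give the kind of comparison the abstract describes, under the assumptions stated above.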

Acknowledgements

This material is based upon work supported by the NSF under Grant No. IIS-0846034. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

Author information

Corresponding author

Correspondence to Susan Elliott Sim.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Sim, S.E., Agarwala, M., Umarji, M. (2013). A Controlled Experiment on the Process Used by Developers During Internet-Scale Code Search. In: Sim, S.E., Gallardo-Valencia, R.E. (eds) Finding Source Code on the Web for Remix and Reuse. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6596-6_4

  • DOI: https://doi.org/10.1007/978-1-4614-6596-6_4

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-6595-9

  • Online ISBN: 978-1-4614-6596-6

  • eBook Packages: Computer Science (R0)
