Test-Driven Reuse: Key to Improving Precision of Search Engines for Software Reuse

  • Chapter
Finding Source Code on the Web for Remix and Reuse

Abstract

The applicability of software reuse approaches in practice has long suffered from a lack of reusable material, but this situation has changed virtually overnight: the rise of the open source movement has made millions of software artifacts available on the Internet. Suddenly, the existing (largely text-based) software search solutions no longer suffered from a lack of reusable material, but rather from a lack of precision, as a query might now return thousands of potential results. In a reuse context, however, precisely matching results are the key to integrating reusable material into a given environment with as little effort as possible. A better way of formulating and executing queries is therefore a core requirement for the broad application of software search and reuse. Inspired by the recent trend towards test-first software development approaches, we found test cases to be a practical vehicle for reuse-driven software retrieval and developed a test-driven code search system that uses simple unit tests as semantic descriptions of desired artifacts. In this chapter we describe our approach and present an evaluation that underlines its superior precision when it comes to retrieving reusable artifacts.


Notes

  1. For topological methods it is difficult to define or estimate recall and precision. See [26] for more.

  2. Result source (visited Dec. 14th, 2011): http://www.purpletech.com/xp/wake/src/Sheet.java

  3. Which is hosted on sourceforge.net and available at www.code-conjurer.org

  4. The only requirement is that the tests should be written according to best SE practices (e.g. the name of the test should reflect the class under test’s name).
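
The naming requirement in note 4 can be sketched as a small helper that derives the name of the class under test from the name of a test class. The exact convention is an assumption here: the note only says that the test name should reflect the class name, and the sketch assumes the common `Test`/`Tests` suffix or `Test` prefix. The function name `class_under_test` and the example name `SheetTest` are illustrative only.

```python
from typing import Optional


def class_under_test(test_class_name: str) -> Optional[str]:
    """Guess the class under test from a test class name.

    Assumed convention (not taken from the chapter): the test class is the
    target class name plus a 'Test'/'Tests' suffix, or a 'Test' prefix.
    Returns None when the name follows neither pattern.
    """
    # Check the longer suffix first so 'SheetTests' yields 'Sheet'.
    for suffix in ("Tests", "Test"):
        if test_class_name.endswith(suffix) and len(test_class_name) > len(suffix):
            return test_class_name[: -len(suffix)]
    prefix = "Test"
    if test_class_name.startswith(prefix) and len(test_class_name) > len(prefix):
        return test_class_name[len(prefix):]
    return None


print(class_under_test("SheetTest"))  # Sheet
```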

References

  1. Atkinson, C., Bayer, J., Bunse, C., Kamsties, E., Laitenberger, O., Laqua, R., Muthig, D., Paech, B., Wüst, J., Zettel, J.: Component-based Product Line Engineering with UML. Addison-Wesley (2002)

  2. Atkinson, C., Brenner, D., Hummel, O., Stoll, D.: A Trustable Brokerage Solution for Component and Service Markets. Proceedings of the International Conference on Software Reuse (2008)

  3. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley (1999)

  4. Beck, K.: Test-driven development: by example. Addison-Wesley (2003)

  5. Hummel, O., Atkinson, C., Schumacher, M.: Artifact Representation Techniques for Large-Scale Software Search Engines. In: Sim and Gallardo-Valencia (eds.): Finding Source Code on the Web for Remix and Reuse. Springer (2012)

  6. Crnkovic, I.: Component-based software engineering – new challenges in software development. Software Focus, Vol. 2, No. 4 (2001)

  7. Erl, T.: Service-Oriented Architecture: Concepts, Technology and Design. Pearson (2005)

  8. Frakes, W.B.: An empirical study of representation methods for reusable software components. IEEE Transactions on Software Engineering, Vol. 20, No. 8 (1994)

  9. Frakes, W.B., Terry, C.: Software Reuse: Metrics and Models. ACM Computing Surveys, Vol. 28, No. 2 (1996)

  10. Garcia, V.C., de Almeida, E.S., Lisboa, L.B., Martins, A.C., Meira, S.R.L., Lucredio, D., de M. Fortes, R.P.: Toward a Code Search Engine Based on the State-of-Art and Practice. Proceedings of the Asia Pacific Software Engineering Conference (2006)

  11. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design patterns: elements of reusable object-oriented software. Addison-Wesley (1995)

  12. Grechanik, M., Chen Fu, Qing Xie, McMillan, C., Poshyvanyk, D., Cumby, C.: A search engine for finding highly relevant applications. 32nd International Conference on Software Engineering (2010)

  13. Horowitz, B.: A fall sweep. Google Blog, http://googleblog.blogspot.com/2011/10/fall-sweep.html (2011), last retrieved Dec. 2011

  14. Hatcher, E., Gospodnetic, O., McCandless, M.: Lucene in Action (2nd edition). Manning (2010)

  15. Hummel, O., Atkinson, C.: Extreme Harvesting: Test Driven Discovery and Reuse of Software Components. Proceedings of the International Conference on Information Reuse and Integration (2004)

  16. Hummel, O., Atkinson, C.: Using the Web as a Reuse Repository. Proceedings of the International Conference on Software Reuse (2006)

  17. Hummel, O., Janjic, W., Atkinson, C.: Evaluating the efficiency of retrieval methods for component repositories. Proceedings of the International Conference on Software Engineering and Knowledge Engineering (2007)

  18. Hummel, O.: Semantic component retrieval in software engineering. PhD dissertation, University of Mannheim (2008)

  19. Hummel, O., Janjic, W., Atkinson, C.: Code conjurer: Pulling reusable software out of thin air. IEEE Software, Vol. 25, No. 5 (2008)

  20. Hummel, O.: Facilitating the comparison of software retrieval systems through a reference reuse collection. Proceedings of the ICSE Workshop on Search-driven Development: Users, Infrastructure, Tools and Evaluation (2010)

  21. Hummel, O., Atkinson, C.: Automated Creation and Assessment of Component Adapters with Test Cases. Symposium on Component-Based Software Engineering (2010)

  22. Inoue, K., Yokomori, R., Fujiwara, H., Yamamoto, T., Matsushita, M., Kusumoto, S.: Ranking Significance of Software Components Based on Use Relations. IEEE Transactions on Software Engineering, Vol. 31, No. 3 (2005)

  23. Jansen, B.J., Spink, A., Saracevic, T.: Real life, real users, and real needs: a study and analysis of user queries on the web. Information Processing and Management, Vol. 36, No. 2 (2000)

  24. Krueger, C.W.: Software reuse. ACM Computing Surveys, Vol. 24, No. 2 (1992)

  25. McIlroy, D.: Mass-Produced Software Components. Software Engineering: Report of a conference sponsored by the NATO Science Committee (1968)

  26. Mili, A., Mili, R., Mittermeir, R.: A Survey of Software Reuse Libraries. Annals of Software Engineering 5 (1998)

  27. Mili, A., Yacoub, S., Addy, E., Mili, H.: Toward an engineering discipline of software reuse. IEEE Software, Vol. 16, No. 5 (1999)

  28. Nezhad, H., Benatallah, B., Martens, A., Curbera, F., Casati, F.: Semi-automated adaptation of service interactions. Proceedings of the 16th International Conference on World Wide Web (2007)

  29. Page, L., Brin, S., Motwani, R., Winograd, T.: The Pagerank Algorithm: Bringing Order to the Web. Proceedings of the International Conference on the World Wide Web (1998)

  30. Podgurski, A., Pierce, L.: Retrieving reusable software by sampling behavior. ACM Transactions on Software Engineering and Methodology, Vol. 2, No. 3 (1993)

  31. Poulin, J.: Reuse: Been there. Done that. Communications of the ACM, Vol. 42, Iss. 5 (1999)

  32. Prieto-Diaz, R., Freeman, P.: Classifying Software for Reusability. IEEE Software, Vol. 4, No. 1 (1987)

  33. Reiss, S.P.: Semantics-based code search. Proceedings of the 31st International Conference on Software Engineering (2009)

  34. Sahavechaphan, N., Claypool, K.T.: X Snippet: Mining for Sample Code. OOPSLA (2006)

  35. Seacord, R.C.: Software Engineering Component Repositories. Proceedings of the International Workshop on Component-Based Software Engineering (1999)

  36. Seacord, R.C., Hissam, S.A., Wallnau, K.C.: AGORA: a search engine for software components. IEEE Internet Computing, Vol. 2, No. 6 (1998)

  37. Bajracharya, S., Ossher, J., Lopes, C.: Sourcerer: An internet-scale software repository. Proceedings of the ICSE Workshop on Search-Driven Development: Users, Infrastructure, Tools and Evaluation (2009)

  38. Thummalapenta, S., Xie, T.: Parseweb: a programmer assistant for reusing open source code on the web. Proceedings of the International Conference on Automated Software Engineering (2007)

  39. Lemos, O., Bajracharya, S., Ossher, J.: CodeGenie: a tool for test-driven source code search. Proceedings of the International Conference on Object-Oriented Programming (2007)

  40. Szyperski, C.: Component Software: Beyond Object-Oriented Programming (2nd ed.). Addison-Wesley (2002)

  41. Ye, Y., Fischer, G.: Supporting reuse by delivering task-relevant and personalized information. Proceedings of the International Conference on Software Engineering (2002)

  42. Zaremski, A.M., Wing, J.M.: Signature Matching: A Tool for Using Software Libraries. ACM Transactions on Software Engineering and Methodology, Vol. 4, No. 2 (1995)

  43. Zaremski, A.M., Wing, J.M.: Specification Matching of Software Components. ACM Transactions on Software Engineering and Methodology, Vol. 6, No. 4 (1997)

Acknowledgements

The authors would like to thank Colin Atkinson, Philipp Bostan, Daniel Brenner, Matthias Gutheil, Christian Ritter, Marcus Schumacher and Dietmar Stoll from the Software Engineering Group at the University of Mannheim for their contributions to developing the tools described in this chapter. Furthermore, we would like to express our gratitude for the helpful comments of the anonymous reviewers.

Author information

Corresponding author

Correspondence to Oliver Hummel.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Hummel, O., Janjic, W. (2013). Test-Driven Reuse: Key to Improving Precision of Search Engines for Software Reuse. In: Sim, S.E., Gallardo-Valencia, R.E. (eds) Finding Source Code on the Web for Remix and Reuse. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6596-6_12

  • DOI: https://doi.org/10.1007/978-1-4614-6596-6_12

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-6595-9

  • Online ISBN: 978-1-4614-6596-6

  • eBook Packages: Computer Science, Computer Science (R0)
