Test-Driven Reuse: Key to Improving Precision of Search Engines for Software Reuse

  • Oliver Hummel
  • Werner Janjic


The applicability of software reuse approaches in practice has long suffered from a lack of reusable material, but this situation has changed virtually overnight: the rise of the open source movement has made millions of software artifacts available on the Internet. Suddenly, the existing (largely text-based) software search solutions no longer suffered from a lack of reusable material, but rather from a lack of precision, since a query might now return thousands of potential results. In a reuse context, however, precisely matching results are the key to integrating reusable material into a given environment with as little effort as possible. Therefore, a better way of formulating and executing queries is a core requirement for a broad application of software search and reuse. Inspired by the recent trend towards test-first software development approaches, we found test cases to be a practical vehicle for reuse-driven software retrieval and developed a test-driven code search system that uses simple unit tests as semantic descriptions of desired artifacts. In this chapter we describe our approach and present an evaluation that underlines its superior precision when it comes to retrieving reusable artifacts.
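
To make the idea of a unit test as a semantic description concrete, the following is a minimal sketch and not the system described in this chapter. It assumes a tiny in-memory candidate pool in place of a real search index; the names `stack_test`, `search_by_test`, `ListStack`, `QueueLike` and `Counter` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, List


def stack_test(candidate: type) -> None:
    """The query: a simple unit test describing the desired LIFO behavior."""
    s = candidate()
    s.push(1)
    s.push(2)
    assert s.pop() == 2
    assert s.pop() == 1


# Hypothetical candidate pool standing in for harvested artifacts.
class ListStack:
    def __init__(self):
        self._items = []

    def push(self, x):
        self._items.append(x)

    def pop(self):
        return self._items.pop()


class QueueLike:
    """Matches the signature (push/pop) but behaves first-in, first-out."""

    def __init__(self):
        self._items = []

    def push(self, x):
        self._items.append(x)

    def pop(self):
        return self._items.pop(0)


class Counter:
    """Unrelated artifact that a purely text-based query might still return."""

    def __init__(self):
        self.n = 0

    def increment(self):
        self.n += 1


def search_by_test(test: Callable[[type], None],
                   candidates: Iterable[type]) -> List[type]:
    """Keep only the candidates on which the test runs without any failure."""
    matches = []
    for candidate in candidates:
        try:
            test(candidate)
        except Exception:
            # A failed assertion or a missing method both disqualify a candidate.
            continue
        matches.append(candidate)
    return matches


results = search_by_test(stack_test, [ListStack, QueueLike, Counter])
```

In this sketch `QueueLike` would pass a signature-only match but is rejected once the test is executed, which is the kind of precision gain the abstract refers to.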





The authors would like to thank Colin Atkinson, Philipp Bostan, Daniel Brenner, Matthias Gutheil, Christian Ritter, Marcus Schumacher and Dietmar Stoll from the Software Engineering Group at the University of Mannheim for their contributions to developing the tools described in this chapter. Furthermore, we would like to express our gratitude for the helpful comments of the anonymous reviewers.


Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Software Engineering Group, University of Mannheim, Mannheim, Germany
