
Recommending Library Methods: An Evaluation of the Vector Space Model (VSM) and Latent Semantic Indexing (LSI)

  • Conference paper
Reuse of Off-the-Shelf Components (ICSR 2006)

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 4039))

Abstract

The development and maintenance of a reuse repository requires significant investment, planning and managerial support. To minimise risk and ensure a healthy return on investment, reusable components should be accessible, reliable and of high quality. In this paper we concentrate on accessibility: we describe a technique that enables a developer to make effective and convenient use of large-scale libraries. Unlike most previous solutions to component retrieval, our tool, RASCAL, is a proactive component recommender.

RASCAL recommends a set of task-relevant reusable components to a developer. Recommendations are produced using Collaborative Filtering (CF). We compare and contrast CF effectiveness when using two information retrieval techniques, namely the Vector Space Model (VSM) and Latent Semantic Indexing (LSI). We validate our technique on real-world examples and find that the overall results are encouraging; notably, RASCAL can produce reasonably good recommendations when they are most valuable, i.e. at an early stage in code development.
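As a rough illustration of CF over a vector space model, the sketch below represents each class as a vector of library-method call counts, finds the classes most similar to the one being written using cosine similarity, and suggests methods those neighbours call that the active class does not yet call. This is a minimal sketch and not RASCAL's actual implementation: treating classes as "users" and methods as "items" is an assumption, and the function names, the `k` and `top_n` parameters, and the usage data are all hypothetical. An LSI variant would typically project these vectors into a lower-dimensional space via a truncated singular value decomposition before computing similarity; that step is omitted here.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(count * v.get(method, 0) for method, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(active, corpus, k=2, top_n=3):
    """Rank methods the active class does not yet call, weighting each
    candidate by the similarity of the k nearest neighbouring classes."""
    neighbours = sorted(
        ((cosine(active, usage), name) for name, usage in corpus.items()),
        reverse=True,
    )[:k]
    scores = {}
    for sim, name in neighbours:
        for method, count in corpus[name].items():
            if method not in active:
                scores[method] = scores.get(method, 0.0) + sim * count
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda m: (-scores[m], m))[:top_n]


# Hypothetical usage data: class name -> {library method: call count}.
corpus = {
    "LoginDialog": {
        "JButton.addActionListener": 2,
        "JFrame.setVisible": 1,
        "JTextField.getText": 2,
    },
    "SettingsPanel": {
        "JButton.addActionListener": 3,
        "JCheckBox.isSelected": 2,
    },
    "CsvReader": {
        "BufferedReader.readLine": 4,
        "String.split": 3,
    },
}

# A partially written class: only two library methods called so far.
active = {"JButton.addActionListener": 1, "JFrame.setVisible": 1}

recommendations = recommend(active, corpus)
print(recommendations)
```

With this toy data the two GUI classes are the nearest neighbours, so the suggestions come from them, while the unrelated `CsvReader` class has zero similarity and contributes nothing.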

Funding provided by the IRCSET under grant RS/2003/127.




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

McCarey, F., Cinnéide, M.Ó., Kushmerick, N. (2006). Recommending Library Methods: An Evaluation of the Vector Space Model (VSM) and Latent Semantic Indexing (LSI). In: Morisio, M. (eds) Reuse of Off-the-Shelf Components. ICSR 2006. Lecture Notes in Computer Science, vol 4039. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11763864_16

  • DOI: https://doi.org/10.1007/11763864_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34606-7

  • Online ISBN: 978-3-540-34607-4

  • eBook Packages: Computer Science, Computer Science (R0)
