Empirical Software Engineering, Volume 14, Issue 1, pp 57–92

Assessing IR-based traceability recovery tools through controlled experiments

  • Andrea De Lucia
  • Rocco Oliveto
  • Genoveffa Tortora


We report the results of a controlled experiment and a replication performed with different subjects, in which we assessed the usefulness of an Information Retrieval-based traceability recovery tool during the traceability link identification process. The main result achieved in the two experiments is that the use of a traceability recovery tool significantly reduces the time spent by the software engineer with respect to manual tracing. Replication with different subjects allowed us to investigate if subjects’ experience and ability play any role in the traceability link identification process. In particular, we made some observations concerning the retrieval accuracy achieved by the software engineers with and without the tool support and with different levels of experience and ability.


Keywords: Traceability recovery, Information retrieval, Latent semantic indexing, Singular value decomposition, Program comprehension, Impact analysis



We would like to thank the anonymous reviewers for their detailed, constructive, and thoughtful comments that helped us to improve the presentation of the results in this paper. We are very grateful to Dr. Massimiliano Di Penta of the University of Sannio, Italy, for his constructive comments that helped us to improve the presentation of the experimental results in this paper. Special thanks are also due to the students who were involved in the experiment as subjects. The work described in this paper is supported by the project METAMORPHOS (MEthods and Tools for migrAting software systeMs towards web and service Oriented aRchitectures: exPerimental evaluation, usability, and tecHnOlogy tranSfer), funded by MiUR (Ministero dell’Università e della Ricerca) under grant PRIN-2006-2006098097.



Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Andrea De Lucia (1)
  • Rocco Oliveto (1)
  • Genoveffa Tortora (1)
  1. Department of Mathematics and Informatics, University of Salerno, Fisciano (SA), Italy
