Empirical Software Engineering, Volume 13, Issue 1, pp 97–121

Evaluating guidelines for reporting empirical software engineering studies

  • Barbara Kitchenham
  • Hiyam Al-Khilidar
  • Muhammed Ali Babar
  • Mike Berry
  • Karl Cox
  • Jacky Keung
  • Felicia Kurniawati
  • Mark Staples
  • He Zhang
  • Liming Zhu

Abstract

Background

Several researchers have criticized the standards of performing and reporting empirical studies in software engineering. To address this problem, Jedlitschka and Pfahl have produced reporting guidelines for controlled experiments in software engineering. They pointed out that their guidelines needed evaluation, and we agree that guidelines need to be evaluated before they can be widely adopted.

Aim

The aim of this paper is to present the method we used to evaluate the guidelines and to report the results of our evaluation exercise. We suggest that our evaluation process may be of more general use if reporting guidelines are developed for other types of empirical study.

Method

We used a reading method inspired by perspective-based and checklist-based reviews to perform a theoretical evaluation of the guidelines. The perspectives used were: Researcher, Practitioner/Consultant, Meta-analyst, Replicator, Reviewer and Author. Apart from the Author perspective, the reviews were based on a set of questions derived by brainstorming. A separate review was performed for each perspective. The review using the Author perspective considered each section of the guidelines sequentially.
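
To make the mechanics of this method concrete, the sketch below models one way such a perspective-based, checklist-driven review could be organized. It is purely illustrative: the class names, the placeholder questions, and the sample finding are our own assumptions, not the instrument the authors actually used.

```python
from dataclasses import dataclass, field

# A minimal sketch of the review structure described above. The paper
# distinguishes "issues" (requests for amendment or clarification of the
# guidelines) from "defects"; that distinction is modeled with a flag.

@dataclass
class Finding:
    perspective: str
    question: str
    description: str
    is_defect: bool = False  # False = issue, True = defect

@dataclass
class PerspectiveReview:
    perspective: str
    questions: list
    findings: list = field(default_factory=list)

    def record(self, question, description, is_defect=False):
        """Log an issue or defect found while answering a checklist question."""
        self.findings.append(Finding(self.perspective, question, description, is_defect))

# The six perspectives named in the paper; each (except Author, which
# walked the guideline sections sequentially) used its own brainstormed
# question set and was reviewed separately.
PERSPECTIVES = ["Researcher", "Practitioner/Consultant", "Meta-analyst",
                "Replicator", "Reviewer", "Author"]

reviews = [PerspectiveReview(p, [f"(hypothetical checklist question for {p})"])
           for p in PERSPECTIVES]

# Hypothetical finding, included only to show the shape of the data.
reviews[0].record(reviews[0].questions[0],
                  "Guideline section duplicates context information", is_defect=False)

issues = [f for r in reviews for f in r.findings if not f.is_defect]
defects = [f for r in reviews for f in r.findings if f.is_defect]
print(f"{len(issues)} issue(s), {len(defects)} defect(s)")
```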

Results

The reviews detected 44 issues where the guidelines would benefit from amendment or clarification, together with 8 defects.

Conclusions

Reporting guidelines need to specify what information goes into which section and to avoid excessive duplication. The current guidelines need to be revised and then subjected to further theoretical and empirical validation. Perspective-based checklists are a useful validation method, but the practitioner/consultant perspective presents difficulties.

Categories and Subject Descriptors

K.6.3 [Management of Computing and Information Systems]: Software Management—Software process.

General Terms

Management, Experimentation.

Keywords

Controlled experiments · Software engineering · Guidelines · Perspective-based reading · Checklist-based reviews

References

  1. Abdelnabi Z, Cantone G, Ciolkowski M, Rombach D (2004) Comparing code reading techniques applied to object-oriented software frameworks with regard to effectiveness and defect detection rate. Proceedings ISESE 04
  2. Abrahao S, Poels G, Pastor O (2004) Assessing the reproducibility and accuracy of functional size measurement methods through experimentation. Proceedings ISESE 04
  3. Dybå T, Kampenes VB, Sjøberg DIK (2006) A systematic review of statistical power in software engineering experiments. Inf Softw Technol 48(8):745–755
  4. Harris P (2002) Designing and reporting experiments in psychology, 2nd edn. Open University Press
  5. Hartley J (2004) Current findings from research on structured abstracts. J Med Libr Assoc 92(3):368–371
  6. Jedlitschka A, Pfahl D (2005) Reporting guidelines for controlled experiments in software engineering. IESE-Report IESE-035.5/E
  7. Kitchenham B (2004) Procedures for performing systematic reviews. Joint Technical Report, Keele University TR/SE-0401 and NICTA 0400011T.1, July
  8. Kitchenham B, Pfleeger SL, Pickard L, Jones P, Hoaglin D, El Emam K, Rosenberg J (2002) Preliminary guidelines for empirical research in software engineering. IEEE Trans Softw Eng 28(8):721–734
  9. Kitchenham B, Al-Khilidar H, Ali Babar M, Berry M, Cox K, Keung J, Kurniawati F, Staples M, Zhang H, Zhu L (2006) Evaluating guidelines for empirical software engineering studies. Proceedings ISESE 06, Brazil
  10. Moher D, Schulz KF, Altman D (2001) The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. Lancet 357:1191–1194, April 14
  11. Pickard LM, Kitchenham BA, Jones P (1998) Combining empirical results in software engineering. Inf Softw Technol 40(14):811–821
  12. Schroeder PJ, Bolaki P, Gopu V (2004) Comparing the fault detection effectiveness of N-way and random test suites. Proceedings ISESE 04
  13. Sjøberg DIK, Hannay JE, Hansen O, Kampenes VB, Karahasanovic A, Liborg N-K, Rekdal AC (2005) A survey of controlled experiments in software engineering. IEEE Trans Softw Eng 31(9):733–753, September
  14. Shull F, Rus I, Basili V (2000) How perspective-based reading can improve requirements inspections. IEEE Computer 33(7):73–78, July
  15. Verelst J (2004) The influence of the level of abstraction on the evolvability of conceptual models of information systems. Proceedings ISESE 04
  16. Wohlin C, Runeson P, Höst M, Regnell B, Wesslén A (2000) Experimentation in software engineering: an introduction. Kluwer Academic Publishers
  17. Wohlin C, Petersson H, Aurum A (2003) Combining data from reading experiments in software inspections. In: Juristo N, Moreno A (eds) Lecture notes on empirical software engineering. World Scientific Publishing

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Barbara Kitchenham (4)
  • Hiyam Al-Khilidar (1, 3)
  • Muhammed Ali Babar (2)
  • Mike Berry (1, 3)
  • Karl Cox (1)
  • Jacky Keung (1, 3)
  • Felicia Kurniawati (1)
  • Mark Staples (1)
  • He Zhang (1, 3)
  • Liming Zhu (1)

  1. National ICT Australia Ltd, Sydney, Australia
  2. Lero, The Irish Software Engineering Research Centre, University of Limerick, Limerick, Ireland
  3. School of Computer Science & Engineering, University of New South Wales, Sydney, Australia
  4. School of Computing and Mathematics, Keele University, Staffordshire, UK
