Empirical Software Engineering, Volume 23, Issue 3, pp 1594–1663

Challenges and pitfalls on surveying evidence in the software engineering technical literature: an exploratory study with novices

  • Talita Vieira Ribeiro
  • Jobson Massollar
  • Guilherme Horta Travassos


The evidence-based software engineering approach advocates the use of evidence from empirical studies to support practitioners' decisions on the adoption of software technologies in the software industry. To this end, many guidelines have been proposed to support the execution and repeatability of literature reviews, and to strengthen confidence in their results, especially regarding systematic literature reviews (SLRs). Our goal was to investigate similarities and differences, and to characterize the challenges and pitfalls, in the planning and results of SLR research protocols addressing the same research question and performed by similar teams of novice researchers in the software engineering field. We qualitatively compared (using the Jaccard and Kappa coefficients) and evaluated (using the DARE criteria) same-goal SLR research protocols and outcomes produced by similar research teams. Seven similar SLR protocols on quality attributes for use cases, executed in 2010 and 2012, enabled us to observe unexpected differences in their planning and execution. Even when the participants reached some agreement in the planning, the outcomes differed. The research protocols and reports revealed six challenges contributing to the divergent results: researchers' inexperience in the topic, researchers' inexperience in the method, lack of clarity and completeness in the papers, lack of a common terminology for the problem domain, lack of research verification procedures, and lack of commitment to the SLR. According to our findings, it is not possible to rely on the results of SLRs performed by novices. Moreover, similarities at a starting or intermediate step of different SLR executions may not carry over to the next steps, since non-explicit information can introduce differences in the outcomes, hampering the repeatability of the SLR process and the confidence in its results.
Although we expect that the presence and follow-up of a senior researcher can contribute to increasing SLRs' repeatability, this conclusion can only be drawn from additional studies on this topic. Still, systematic planning, transparency of decisions, and verification procedures are key factors for guaranteeing the reliability of SLRs.
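The two agreement measures named in the abstract can be illustrated with a small, self-contained sketch. The paper sets and ratings below are hypothetical, not data from the study: the Jaccard coefficient compares the sets of papers two teams selected, while Cohen's kappa measures chance-corrected agreement between two raters' inclusion/exclusion decisions.

```python
# Jaccard coefficient: |A ∩ B| / |A ∪ B| for two teams' selected-paper sets.
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b)

# Cohen's kappa: (p_o - p_e) / (1 - p_e), i.e. observed agreement corrected
# for the agreement expected by chance given each rater's label frequencies.
def cohens_kappa(r1: list, r2: list) -> float:
    n = len(r1)
    p_o = sum(x == y for x, y in zip(r1, r2)) / n        # observed agreement
    labels = set(r1) | set(r2)
    p_e = sum((r1.count(l) / n) * (r2.count(l) / n)      # chance agreement
              for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical example: papers selected by two SLR teams.
team_a = {"P1", "P2", "P3", "P5"}
team_b = {"P2", "P3", "P4", "P5"}
print(jaccard(team_a, team_b))        # 3 shared / 5 total = 0.6

# Hypothetical inclusion (1) / exclusion (0) decisions on six candidates.
rater1 = [1, 1, 0, 1, 0, 0]
rater2 = [1, 0, 0, 1, 0, 1]
print(cohens_kappa(rater1, rater2))   # ≈ 0.33 (fair agreement)
```

A kappa near 0 would mean the two teams agreed no more often than chance, which is one way the study's protocol comparisons can expose divergence that raw percent agreement hides.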


Keywords: Novice researchers · Systematic literature review · Evidence-based software engineering · Exploratory study



Acknowledgements

We thank Daniela Cruzes, Marcela Genero, Martin Höst, Natalia Juristo, Nelly Condori-Fernandez, Oscar Dieste and Oscar Pastor for the initial discussions at ISERN 2009 that started this work; Vitor Faria Monteiro for his contribution to the original protocol planning; David Budgen for suggestions regarding an earlier version of this study report; all students for their engagement during the Experimental Software Engineering course in 2010 and 2012; and CNPq and CAPES for supporting this research. Prof. Travassos is a CNPq Researcher.


References

  1. Babar MA, Zhang H (2009) Systematic literature reviews in software engineering: preliminary results from interviews with researchers. Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement. Lake Buena Vista: IEEE
  2. Basili VR (1992) Software modeling and measurement: the goal/question/metric paradigm. Technical report, University of Maryland at College Park, College Park, MD, p 24
  3. Biolchini J et al (2005) Systematic review in software engineering. Federal University of Rio de Janeiro, Rio de Janeiro, p 31 (RT-ES 679/05). Accessed 17 Aug 2017
  4. Brereton P (2011) A study of computing undergraduates undertaking a systematic literature review. IEEE Trans Educ 54(4):558–563
  5. Carver JC et al (2013) Identifying barriers to the systematic literature review process. Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. Baltimore: IEEE, p 203–213
  6. Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46
  7. Condori-Fernandez N et al (2009) A systematic mapping study on empirical evaluation of software requirements specifications techniques. Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement. Lake Buena Vista: IEEE, p 502–505
  8. Corbin J, Strauss A (2007) Basics of qualitative research: techniques and procedures for developing grounded theory, 3rd edn. SAGE Publications, Thousand Oaks. ISBN 978-1412906449
  9. Dias Neto AC et al (2007) Characterization of model-based software testing approaches. PESC/COPPE/UFRJ, Rio de Janeiro (ES-713/07). Accessed 17 Aug 2017
  10. Dieste O, Grimán A, Juristo N (2009) Developing search strategies for detecting relevant experiments. Empir Softw Eng 14(5):513–539
  11. Dybå T, Kitchenham B, Jørgensen M (2005) Evidence-based software engineering for practitioners. IEEE Softw 22(1):58–65
  12. Fantechi A et al (2002) Application of linguistic techniques for use case analysis. Proceedings of the IEEE Joint International Conference on Requirements Engineering. Essen: IEEE, p 157–164
  13. Garousi V, Eskandar MM, Herkiloglu K (2016) Industry–academia collaborations in software testing: experience and success stories from Canada and Turkey. Softw Qual J, pp 1–53
  14. Hassler E et al (2014) Outcomes of a community workshop to identify and rank barriers to the systematic literature review process. Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering. London: ACM, article 31
  15. Jaccard P (1912) The distribution of the flora in the alpine zone. New Phytol 11(2):37–50
  16. Kasoju A, Petersen K, Mäntylä MV (2013) Analyzing an automotive testing process with evidence-based software engineering. Inf Softw Technol 55(7):1237–1259
  17. Kitchenham B, Charters S (2007) Guidelines for performing systematic literature reviews in software engineering. Keele University and University of Durham, Keele/Durham, p 65 (EBSE-2007-01)
  18. Kitchenham B et al (2011) Repeatability of systematic literature reviews. Proceedings of the 15th International Conference on Evaluation and Assessment in Software Engineering. Durham: IEEE, p 46–55
  19. Kitchenham B, Brereton P, Budgen D (2012) Mapping study completeness and reliability - a case study. Proceedings of the 16th International Conference on Evaluation and Assessment in Software Engineering. Ciudad Real: IET, p 126–135
  20. Kuhrmann M, Fernández DM, Daneva M (2017) On the pragmatic design of literature studies in software engineering: an experience-based guideline. Empir Softw Eng, pp 2852–2891
  21. Lavallée M, Robillard P-N, Mirsalari R (2014) Performing systematic literature reviews with novices: an iterative approach. IEEE Trans Educ 57(3):175–181
  22. López L, Costal D, Ayala CP, Franch X, Annosi MC, Glott R, Haaland K (2015) Adoption of OSS components: a goal-oriented approach. Data Knowl Eng 99:17–38
  23. Losavio F et al (2004) Designing quality architecture: incorporating ISO standards into the unified process. Inf Syst Manag 21(1):27–44
  24. MacDonell S et al (2010) How reliable are systematic reviews in empirical software engineering? IEEE Trans Softw Eng 36(5):676–687
  25. Munir H, Moayyed M, Petersen K (2014) Considering rigor and relevance when evaluating test driven development: a systematic review. Inf Softw Technol 56(4):375–394
  26. NHS Centre for Reviews and Dissemination, University of York (2002) The Database of Abstracts of Reviews of Effects (DARE). Effect Mat 6(2):1–4
  27. Oates BJ, Capper G (2009) Using systematic reviews and evidence-based software engineering with masters students. Proceedings of the 13th International Conference on Evaluation and Assessment in Software Engineering. Durham: British Computer Society, p 79–87
  28. Pai M et al (2004) Systematic reviews and meta-analyses: an illustrated, step-by-step guide. Natl Med J India 17(2):86–95
  29. Petersen K, Ali NB (2011) Identifying strategies for study selection in systematic reviews and maps. Proceedings of the 5th International Symposium on Empirical Software Engineering and Measurement. Banff: IEEE, p 351–354
  30. Petersen K et al (2008) Systematic mapping studies in software engineering. Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering. Bari: British Computer Society
  31. Petersen K, Vakkalanka S, Kuzniarz L (2015) Guidelines for conducting systematic mapping studies in software engineering: an update. Inf Softw Technol 64(1):1–18
  32. Phalp KT, Vincent J, Cox K (2007) Assessing the quality of use case descriptions. Softw Qual J 15(1):69–97
  33. Preiss O, Wegmann A, Wong J (2001) On quality attribute based software engineering. Proceedings of the 27th Euromicro Conference. Warsaw: IEEE, p 114–120
  34. Rago A, Marcos C, Diaz-Pace JA (2013) Uncovering quality-attribute concerns in use case specifications via early aspect mining. Requir Eng 18(1):67–84
  35. Rainer A, Hall T, Baddoo N (2006) A preliminary empirical investigation of the use of evidence based software engineering by undergraduate students. Proceedings of the 10th International Conference on Evaluation and Assessment in Software Engineering. Keele: British Computer Society, p 91–100
  36. Ramos R et al (2009) Quality improvement for use case model. Proceedings of the 23rd Brazilian Symposium on Software Engineering. Fortaleza: IEEE, p 187–195
  37. Riaz M et al (2010) Experiences conducting systematic reviews from novices' perspective. Proceedings of the 14th International Conference on Evaluation and Assessment in Software Engineering. Swindon: British Computer Society, p 44–53
  38. Shull F, Rus I, Basili V (2000) How perspective-based reading can improve requirements inspections. Computer 33(7):73–79
  39. Travassos GH et al (2008) An environment to support large scale experimentation in software engineering. Proceedings of the 13th IEEE International Conference on Engineering of Complex Computer Systems. Belfast: IEEE, p 193–202
  40. Ulziit B, Warraich ZA, Gencel C, Petersen K (2015) A conceptual framework of challenges and solutions for managing global software maintenance. Journal of Software: Evolution and Process 27(10):763–792
  41. Viera AJ, Garrett JM (2005) Understanding interobserver agreement: the kappa statistic. Fam Med 37(5):360–363
  42. Wohlin C (2014) Writing for synthesis of evidence in empirical software engineering. Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. Torino: ACM, article 46
  43. Wohlin C et al (2013) On the reliability of mapping studies in software engineering. J Syst Softw 86(10):2594–2610
  44. Zhang H, Babar MA (2010) On searching relevant studies in software engineering. Proceedings of the 14th International Conference on Evaluation and Assessment in Software Engineering. Swindon: ACM, p 111–120
  45. Zhang H, Babar MA, Tell P (2011) Identifying relevant studies in software engineering. Inf Softw Technol 53(6):625–637

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • Talita Vieira Ribeiro 1
  • Jobson Massollar 1
  • Guilherme Horta Travassos 1

  1. Systems Engineering and Computer Science Program (PESC/COPPE), Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
