Sampling in software engineering research: a critical review and guidelines

Empirical Software Engineering

Abstract

Representative sampling appears rare in empirical software engineering research. Not all studies need representative samples, but a general lack of representative sampling undermines a scientific field. This article therefore reports a critical review of the state of sampling in recent, high-quality software engineering research. The key findings are: (1) random sampling is rare; (2) sophisticated sampling strategies are very rare; (3) sampling, representativeness and randomness often appear misunderstood. These findings suggest that software engineering research has a generalizability crisis. To address these problems, this paper synthesizes existing knowledge of sampling into a succinct primer and proposes extensive guidelines for improving the conduct, presentation and evaluation of sampling in software engineering research. It is further recommended that while researchers should strive for more representative samples, disparaging non-probability sampling is generally capricious and particularly misguided for predominately qualitative research.

Data Availability

Supplementary materials, which have been archived on Zenodo (Baltes and Ralph 2020), include:

– An Excel spreadsheet containing the complete list of articles, all of the extracted data and all of our analyses;

– The scripts we used to retrieve the sampling frame and sample (see Note 9).

Notes

  1. https://github.com

  2. Diversity can be defined along many different axes, gender being one of them (Vasilescu et al. 2015).

  3. http://respondentdrivensampling.org/

  4. True random number generation is available from numerous sources, including https://www.random.org/

  5. https://www.core.edu.au/conference-portal

  6. https://github.com/sbaltes/dblp-retriever

  7. https://dblp.uni-trier.de/

  8. https://www.random.org/

  9. A more recent version of dblp-retriever may be available at https://github.com/sbaltes/dblp-retriever

References

  • Amir B, Ralph P (2018) There is no random sampling in software engineering research. In: Proceedings of the 40th international conference on software engineering: companion proceedings, pp 344–345

  • Arnett JJ (2008) The neglected 95%: why American psychology needs to become less American. Am Psychol 63(7):602

  • Baltes S, Diehl S (2016) Worse than spam: issues in sampling software developers. In: Genero M, Jedlitschka A, Jorgensen M (eds) 10th international symposium on Empirical Software Engineering and Measurement (ESEM 2016), ACM, Ciudad Real, Spain, pp 52:1–52:6. https://doi.org/10.1145/2961111.2962628

  • Baltes S, Ralph P (2020) Sampling in software engineering research supplementary material [data set]. https://doi.org/10.5281/zenodo.3666824

  • Baltes S, Dumani L, Treude C, Diehl S (2018) SOTorrent: reconstructing and analyzing the evolution of Stack Overflow posts. In: Zaidman A, Hill E, Kamei Y (eds) 15th international conference on Mining Software Repositories (MSR 2018), ACM, Gothenburg, Sweden, pp 319–330

  • Beecham S, Baddoo N, Hall T, Robinson H, Sharp H (2008) Motivation in software engineering: a systematic literature review. Inf Softw Technol 50(9):860–878. https://doi.org/10.1016/j.infsof.2007.09.004

  • Breckenridge J, Jones D (2009) Demystifying theoretical sampling in grounded theory research. Grounded Theory Rev 8(2):112–126

  • Caine K (2016) Local standards for sample size at CHI. In: Proceedings of the 2016 CHI conference on human factors in computing systems, ACM, New York, NY, USA, CHI ’16, pp 981–992

  • Charmaz K (2014) Constructing grounded theory. Sage, London

  • Checkland P, Holwell S (1998) Action research: its nature and validity. Syst Pract Action Res 11(1):9–21. https://doi.org/10.1023/A:1022908820784

  • Cochran WG (2007) Sampling techniques. John Wiley & Sons

  • Cohen J (1988) Statistical power analysis for the behavioral sciences. Lawrence Erlbaum Associates, Hillsdale

  • Cosentino V, Izquierdo JLC, Cabot J (2016) Findings from GitHub: methods, datasets and limitations. In: 2016 IEEE/ACM 13th working conference on Mining Software Repositories (MSR), IEEE, pp 137–141

  • Daniel J (2011) Sampling essentials: practical guidelines for making sampling choices. Sage Publications

  • de Mello RM, Travassos GH (2016) Surveys in software engineering: identifying representative samples. In: Proceedings of the 10th ACM/IEEE international symposium on empirical software engineering and measurement, pp 1–6

  • Dillman DA, Smyth JD, Christian LM (2014) Internet, phone, mail and mixed-mode surveys: the tailored design method, 4th edn. John Wiley & Sons, Hoboken

  • Draucker CB, Martsolf DS, Ross R, Rusk TB (2007) Theoretical sampling and category development in grounded theory. Qual Health Res 17(8):1137–1148

  • Duignan B (2019) Postmodernism. In: Encyclopedia Britannica, Encyclopedia Britannica, Inc. https://www.britannica.com/topic/postmodernism-philosophy

  • Easterbrook S, Singer J, Storey MA, Damian D (2008) Selecting empirical methods for software engineering research. In: Guide to advanced empirical software engineering, Springer, pp 285–311

  • Easton G (2010) One case study is enough. Lancaster University technical report. https://eprints.lancs.ac.uk/id/eprint/49016/

  • Falessi D, Juristo N, Wohlin C, Turhan B, Münch J, Jedlitschka A, Oivo M (2018) Empirical software engineering experts on the use of students and professionals in experiments. Empir Softw Eng 23(1):452–489

  • Faugier J, Sargeant M (1997) Sampling hard to reach populations. J Adv Nurs 26(4):790–797

  • Feldt R, Zimmermann T, Bergersen GR, Falessi D, Jedlitschka A, Juristo N, Münch J, Oivo M, Runeson P, Shepperd M, Sjøberg DIK, Turhan B (2018) Four commentaries on the use of students and professionals in empirical software engineering experiments. Empir Softw Eng 23(6):3801–3820

  • Fitts PM (1954) The information capacity of the human motor system in controlling the amplitude of movement. J Exp Psychol 47(6):381–391

  • Foster E (2014) Software engineering: a methodical approach. Apress, New York, USA

  • Gentles SJ, Charles C, Ploeg J, McKibbon KA (2015) Sampling in qualitative research: insights from an overview of the methods literature. Qual Rep 20(11):1772–1789

  • Glaser BG, Strauss AL (2017) Discovery of grounded theory: strategies for qualitative research. Routledge

  • Goel S, Salganik MJ (2010) Assessing respondent-driven sampling. Proceedings of the National Academy of Sciences 107(15):6743–6747. https://doi.org/10.1073/pnas.1000261107

  • Gousios G (2013) The GHTorrent dataset and tool suite. In: Zimmermann T, Di Penta M, Kim S (eds) 10th international working conference on Mining Software Repositories (MSR 2013), IEEE, San Francisco, CA, USA, pp 233–236

  • Guba EG, Lincoln YS (1982) Epistemological and methodological bases of naturalistic inquiry. Educ Commun Technol J 30(4):233–252

  • Heckathorn DD (1997) Respondent-driven sampling: a new approach to the study of hidden populations. Soc Probl 44(2):174–199

  • Henrich J, Heine SJ, Norenzayan A (2010) The weirdest people in the world? Behav Brain Sci 33(2-3):61–83

  • Henry GT (1990) Practical sampling. Sage

  • van Hoeven LR, Janssen MP, Roes KC, Koffijberg H (2015) Aiming for a representative sample: simulating random versus purposive strategies for hospital selection. BMC Med Res Methodol 15(1):90

  • Huang X, Zhang H, Zhou X, Babar MA, Yang S (2018) Synthesizing qualitative research in software engineering: a critical review. In: Proceedings of the 40th international conference on software engineering, pp 1207–1218

  • Humbatova N, Jahangirova G, Bavota G, Riccio V, Stocco A, Tonella P (2020) Taxonomy of real faults in deep learning systems. In: 2020 IEEE/ACM 42nd international conference on software engineering (ICSE), IEEE

  • Ingram C, Drachen A (2020) How software practitioners use informal local meetups to share software engineering knowledge. In: 2020 IEEE/ACM 42nd international conference on software engineering (ICSE), IEEE

  • Johnston LG, Sabin K (2010) Sampling hard-to-reach populations with respondent driven sampling. Methodological Innovations Online 5(2):38–48. https://doi.org/10.4256/mio.2010.0017

  • Kitchenham B, Charters S (2007) Guidelines for performing systematic literature reviews in software engineering. Tech. rep., Keele University and University of Durham

  • Kitchenham B, Pfleeger SL (2002) Principles of survey research: Part 5: populations and samples. ACM SIGSOFT Softw Eng Notes 27(5):17–20

  • Kitchenham BA, Pfleeger SL (2008) Personal opinion surveys. In: Guide to advanced empirical software engineering, Springer, pp 63–92

  • Kruskal W, Mosteller F (1979a) Representative sampling, I: non-scientific literature. Int Stat Rev 47(1):13–24

  • Kruskal W, Mosteller F (1979b) Representative sampling, III: the current statistical literature. Int Stat Rev 47(3):245–265. https://doi.org/10.2307/1402647

  • Landon EL Jr, Banks SK (1977) Relative efficiency and bias of Plus-One telephone sampling. J Mark Res 14(3):294. https://doi.org/10.2307/3150766

  • Lee AS, Baskerville RL (2003) Generalizing generalizability in information systems research. Inf Syst Res 14(3):221–243

  • Maalej W, Robillard MP (2013) Patterns of knowledge in API reference documentation. IEEE Trans Softw Eng 39(9):1264–1282

  • Malekinejad M, Johnston LG, Kendall C, Kerr LRFS, Rifkin MR, Rutherford GW (2008) Using respondent-driven sampling methodology for HIV biological and behavioral surveillance in international settings: a systematic review. AIDS and Behavior 12(1):105–130. https://doi.org/10.1007/s10461-008-9421-1

  • van Manen M (2016) Phenomenology of practice: meaning-giving methods in phenomenological research and writing. Routledge

  • McElreath R (2020) Statistical rethinking: a Bayesian course with examples in R and Stan. CRC press

  • de Mello RM, Travassos GH (2015) Characterizing sampling frames in software engineering surveys. In: Proceedings of the Ibero-American conference on software engineering (CIbSE)

  • de Mello RM, Da Silva PC, Travassos GH (2015) Investigating probabilistic sampling approaches for large-scale surveys in software engineering. J Softw Eng Res Dev 3(1):1–26

  • Miles MB, Huberman AM, Saldaña J (2014) Qualitative data analysis: a methods sourcebook, 4th edn. Sage, Thousand Oaks, California, USA

  • Mohanani R, Turhan B, Ralph P (2019) Requirements framing affects design creativity. IEEE Trans Softw Eng

  • Moher D, Liberati A, Tetzlaff J, Altman DG, The PRISMA Group (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 6(7):e1000097

  • Mullinix KJ, Leeper TJ, Druckman JN, Freese J (2015) The generalizability of survey experiments. J Exp Polit Sci 2(2):109–138

  • Nagappan M, Zimmermann T, Bird C (2013) Diversity in software engineering research. In: Proceedings of the 9th joint meeting on foundations of software engineering, ACM, pp 466–476

  • Patton MQ (2014) Qualitative research & evaluation methods: integrating theory and practice. Sage Publications

  • Paulson JW, Succi G, Eberlein A (2004) An empirical study of open-source and closed-source software products. IEEE Trans Softw Eng 30(4):246–256. https://doi.org/10.1109/TSE.2004.1274044

  • Ralph P (2019) Toward methodological guidelines for process theories and taxonomies in software engineering. IEEE Trans Softw Eng 45(7):712–735

  • Ralph P, Ali NB, Baltes S, Bianculli D, Diaz J, Dittrich Y, Ernst N, Felderer M, Feldt R, Filieri A et al (2020a) Empirical standards for software engineering research. arXiv:2010.03525

  • Ralph P, Baltes S, Adisaputri G, Torkar R, Kovalenko V, Kalinowski M, Novielli N, Yoo S, Devroey X, Tan X et al (2020b) Pandemic programming: how COVID-19 affects software developers and how their organizations can help. Empir Softw Eng. https://doi.org/10.1007/s10664-020-09875-y

  • Russo D, Stol K (in press) Gender differences in personality traits of software engineers. IEEE Trans Softw Eng. https://doi.org/10.1109/TSE.2020.3003413

  • Salleh N, Hoda R, Su MT, Kanij T, Grundy J (2018) Recruitment, engagement and feedback in empirical software engineering studies in industrial contexts. Inf Softw Technol 98:161–172

  • Sax LJ, Gilmartin SK, Bryant AN (2003) Assessing response rates and nonresponse bias in web and paper surveys. Res High Educ 44(4):409–432

  • Sedano T, Ralph P, Péraire C (2019) The product backlog. In: 2019 IEEE/ACM 41st international conference on software engineering (ICSE), IEEE, pp 200–211

  • Sjøberg D, Anda B, Arisholm E, Dyba T, Jørgensen M, Karahasanovic A, Koren EF, Vokac M (2002) Conducting realistic experiments in software engineering. In: 2002 international symposium on empirical software engineering, IEEE, Nara, Japan, pp 17–26. https://doi.org/10.1109/ISESE.2002.1166921

  • Stol KJ, Fitzgerald B (2018) The ABC of software engineering research. ACM Transactions on Software Engineering and Methodology (TOSEM) 27(3):11

  • Stol KJ, Ralph P, Fitzgerald B (2016) Grounded theory in software engineering research: a critical review and guidelines. In: Proceedings of the international conference on software engineering, IEEE, Austin, TX, USA, pp 120–131

  • Tempero E, Anslow C, Dietrich J, Han T, Li J, Lumpe M, Melton H, Noble J (2010) The Qualitas corpus: a curated collection of Java code for empirical studies. In: Proceedings of the 17th Asia Pacific software engineering conference, IEEE, Sydney, Australia, pp 336–345. https://doi.org/10.1109/APSEC.2010.46

  • Theisen C, Dunaiski M, Williams L, Visser W (2018) Software engineering research at the international conference on software engineering in 2016. ACM SIGSOFT Software Engineering Notes 42(4):1–7

  • Thomas G, Myers K (2015) The anatomy of the case study. Sage

  • Thompson SK (1990) Adaptive cluster sampling. J Am Stat Assoc 85(412):1050–1059

  • Toepoel V (2012) Effects of incentives in surveys. In: Gideon L (ed) Handbook of survey methodology for the social sciences, Springer, pp 209–223

  • Torchiano M, Fernández DM, Travassos GH, de Mello RM (2017) Lessons learnt in conducting survey research. In: 2017 IEEE/ACM 5th international workshop on Conducting Empirical Studies in Industry (CESI), IEEE, pp 33–39

  • Trochim WM, Donnelly JP (2001) Research methods knowledge base, vol 2. Atomic Dog Publishing, Cincinnati, OH, USA

  • Trost JE (1986) Statistically nonrepresentative stratified sampling: a sampling technique for qualitative studies. Qual Sociol 9(1):54–57

  • Turk P, Borkowski JJ (2005) A review of adaptive cluster sampling: 1990–2003. Environ Ecol Stat 12(1):55–94

  • Valliant R, Dever JA, Kreuter F (2018) Designing multistage samples. In: Practical tools for designing and weighting survey samples, Springer, pp 209–264

  • Vasilescu B, Posnett D, Ray B, van den Brand MG, Serebrenik A, Devanbu P, Filkov V (2015) Gender and tenure diversity in GitHub teams. In: Proceedings of the 33rd annual ACM conference on human factors in computing systems (CHI ’15), ACM Press, Seoul, Republic of Korea, pp 3789–3798. https://doi.org/10.1145/2702123.2702549

  • Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2012) Experimentation in software engineering. Springer Science & Business Media

  • Yin RK (2018) Case study research: Design and methods, 6th edn. Sage, Thousand Oaks, California

  • Zannier C, Melnik G, Maurer F (2006) On the success of empirical studies in the international conference on software engineering. In: Proceedings of the 28th international conference on software engineering, pp 341–350

  • Zhang H, Huang X, Zhou X, Huang H, Babar MA (2019) Ethnographic research in software engineering: a critical review and checklist. In: Proceedings of the 2019 27th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering, pp 659–670

Author information

Corresponding author

Correspondence to Sebastian Baltes.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Communicated by: Filippo Lanubile

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Baltes, S., Ralph, P. Sampling in software engineering research: a critical review and guidelines. Empir Software Eng 27, 94 (2022). https://doi.org/10.1007/s10664-021-10072-8

  • DOI: https://doi.org/10.1007/s10664-021-10072-8
