Data collection in global software engineering research: learning from past experience

Empirical Software Engineering

Abstract

Global Software Engineering has become a standard in today’s software industry. Research in distributed software development poses severe challenges due to the spatial and temporal distribution of the actors, as well as to language, intercultural and organizational aspects. These challenges come on top of the “traditional” challenges of large-scale software projects, such as coordination and communication issues, requirements volatility, and lack of domain knowledge. While several authors have reported empirical studies of global software development projects, the methodological difficulties and challenges of this type of study have not been sufficiently discussed. In this paper, we share our experiences of collecting and analysing qualitative data in the context of Global Software Engineering projects. We discuss strategies for gaining access to field sites, building trust and documenting distributed and complex work practices in the context of several research projects we have conducted in the past 9 years. The experiences described in this paper illustrate the need to deal with fundamental problems, such as understanding local languages and different cultures, observing synchronous interaction, or dealing with barriers imposed by political conflicts between the sites. Based on our findings, we discuss practical implications and strategies that other researchers can use, and provide recommendations for future research on methodological aspects of Global Software Engineering.

References

  • Aranda J, Venolia G (2009) The secret life of bugs: going past the errors and omissions in software repositories. Proceedings of the 2009 IEEE 31st International Conference on Software Engineering, IEEE Computer Society, pp 298–308

  • Argyris C, Putnam R, Smith DM (1985) Action science. San Francisco

  • Avram G (2007a) Of deadlocks and peopleware: collaborative work practices in global software development. In: ICGSE, Munich, Germany

  • Avram G (2007b) Knowledge work practices in global software development. The European Conference on Knowledge Management, Barcelona

  • Avram G, Wulf V (2011) Guest editorial: studying work practices in global software engineering. Inf Softw Technol 53(9):949–954. doi:10.1016/j.infsof.2011.01.010

  • Avram G, Bannon L, Bowers J, Sheehan A, Sullivan D (2009) Bridging, patching and keeping the work flowing: defect resolution in distributed software development. Comput Supported Coop Work 18(5–6):477–507 (special issue on ‘Software Engineering as Cooperative Work’, guest edited by Yvonne Dittrich, Dave W. Randall and Janice Singer). doi:10.1007/s10606-009-9099-6

  • Biolchini J, Mian PG, Natali ACC, Travassos GH (2005) Systematic review in software engineering. Technical Report TR—ES 679 / 05, COPPE/UFRJ

  • Boden A, Nett B, Wulf V (2007) Coordination practices in distributed software development of small enterprises. In: ICGSE, Munich, Germany, pp 235–244

  • Boden A, Nett B, Wulf V (2008) Articulation work in small-scale offshore software development projects. Proceedings of the 2008 international workshop on Cooperative and human aspects of Software Engineering. ACM, Leipzig, Germany, pp 21–24

  • Boden A, Avram G, Bannon L, Wulf V (2009) Knowledge management in distributed software development teams—does culture matter? Proceedings of the 2009 International Conference on Global Software Engineering (ICGSE), Limerick, Ireland, pp 18–27

  • Boden A, Avram G, Bannon L, Wulf V (2011) Knowledge sharing practices and the impact of cultural factors: reflections on two case studies of offshoring in SME. J Softw Maint Evol Res Pract

  • Brown B, Lundin J, Rost M, Lymer G, Holmquist L (2007) Seeing ethnographically: teaching ethnography as part of CSCW. ECSCW 2007:411–430

  • Carmel E (1999) Global software teams—collaborating across borders and time-zones. Prentice Hall, Upper Saddle River

  • Carmel E (2006) Building your information systems from the other side of the world: How Infosys manages time zone differences. MISQ 5(1):43–53

  • Carver J, Seaman C, Jeffery R (2004) Using qualitative methods in software engineering. In: International Advanced School of Empirical Software Engineering (IASESE04), Los Angeles, CA, USA

  • Creswell JW (2003) Research design: qualitative, quantitative, and mixed methods approaches. SAGE Publications, USA

  • Cross R, Parker A (2004) The hidden power of social networks: understanding how work really gets done in organizations. Harvard Business School Press, 304 pp

  • Curtis B, Krasner H et al (1988) A field study of the software design process for large systems. Commun ACM 31(11):1268–1287

  • Damian D, Moitra D (2006) Guest editors’ introduction: global software development: how far have we come? IEEE Softw 23(5):17–19

  • Dawson R, Bones P, Oates BJ, Brereton P, Azuma M, Jackson ML (2003) Empirical methodologies in software engineering. In: 11th Annual International Workshop on Software Technology and Engineering Practice, Washington, DC, pp 52–58

  • De Souza CRB, Redmiles DF (2011) The awareness network. To whom should I display my actions? And, whose actions should I monitor? IEEE T Software Eng 37(3):325–340

  • De Souza CRB, Hildenbrand T, Redmiles DF (2007) Towards visualization and analysis of traceability relationships in distributed and offshore software development projects. In: SEAFOOD, LNCS

  • de Souza CRB, Sharp H, Singer J, Cheng L-T, Venolia G (2009) Guest editors’ introduction: cooperative and human aspects of software engineering. IEEE Software 26(6):17–19

  • Dittrich Y, John M, Singer J, Tessem B (2007) For the special issue on qualitative software engineering research. Inf Softw Technol 49(6):531–539

  • Dourish P (2006) Re-space-ing place: “place” and “space” ten years on. In: Proceedings of the Conference on Computer Supported Cooperative Work

  • Dybå T, Prikladnicki R, Rönkkö K, Seaman CB, Sillito J (2011) Qualitative research in software engineering. Empir Softw Eng 16(4):425–429

  • Easterbrook SM, Singer J, Storey M, Damian D (2007) Selecting empirical methods for software engineering research. In: Shull F, Singer J (eds) Guide to advanced empirical software engineering. Springer

  • Espinosa JA, Nan N, Carmel E (2007) Do gradations of time zone separation make a difference in performance? A first laboratory study. In: ICGSE, Munich, Germany

  • Geertz C (1973) Thick description: towards an interpretive theory of culture. In: Geertz C (ed) The interpretation of cultures: selected essays. Basic Books, New York, pp 3–30

  • Gobo G (2008) Doing ethnography. Sage Publications, Los Angeles

  • Gonçalves MK, De Souza CRB, Gonzales VMG (2011) Collaboration, information seeking and communication: an observational study of software developers work practices. J Univers Comput Sci 17:1913–1930

  • Gupta A, Ferguson J (1997) Culture, power, place: explorations in critical anthropology. Duke University Press, Durham

  • Herbsleb JD (2005) Beyond computer science. 27th ICSE, 23–27

  • Herbsleb JD (2007) Global software engineering: the future of socio-technical coordination. 29th ICSE, 188–198

  • Herbsleb JD, Grinter RE (1999) Architectures, coordination, and distance: Conway’s Law and beyond. IEEE Software: 63–70

  • Herbsleb JD, Moitra D (2001) Guest editors’ introduction: global software development. IEEE Softw 18(2):16–20

  • Herbsleb JD, Mockus A, Finholt T, Grinter RE (2001) An empirical study of global software development: distance and speed. In 23rd ICSE, pp 81–90

  • Hine C (2000) Virtual ethnography. Sage, London

  • Jacobson I, Booch G, Rumbaugh J (1999) The unified software development process. Addison Wesley Longman, Inc., Reading

  • Jarvenpaa SL (1998) Communication and trust in global virtual teams. J Comput-Mediat Commun 3(4). http://jcmc.huji.ac.il

  • King WR, Torkzadeh G (2008) Information systems offshoring: research status and issues. MIS Q 32(2):205–225

  • Klein HK, Myers MD (1999) A set of principles for conducting and evaluating interpretive field studies in information systems. MIS Q 23(1):67–93

  • Ko AJ, DeLine R, Venolia G (2007) Information needs in collocated software development teams. International Conference on Software Engineering (ICSE), Minneapolis, MN, May 20–26, 344–353

  • Kraut RE, Streeter LA (1995) Coordination in software development. Commun ACM 38(3):69–81

  • Kraut R, Egido C, Galegher J (1990) Patterns of contact and communication in scientific research collaborations. In: Galegher J, Egido C, Kraut R (eds) Intellectual teamwork: social and technological foundations of cooperative work. Lawrence Erlbaum, pp 149–172

  • Lethbridge T, Sim S, Singer J (2005) Studying software engineers: data collection techniques for software field studies. Empir Softw Eng 10:311–341

  • Lutz B (2009) Linguistic challenges in global software development: lessons learned in an International SW Development Division. Global Software Engineering, International Conference on, Los Alamitos, CA, USA: IEEE Computer Society, pp 249–253

  • Marcus GE (1998) Ethnography through thick and thin. Princeton University Press, Princeton, New Jersey

  • Oates BJ (2006) Researching information systems and computing. Sage Publications, Thousand Oaks

  • Onwuegbuzie A, Leech N (2007) Validity and qualitative research: an oxymoron? Qual Quant 41(2):233–249

  • Paré G. Enhancing the rigor of qualitative research: application of a case methodology to build theories of IT implementation. The Qualitative Report, 7(4), retrieved from http://www.nova.edu/ssss/QR/QR7-4/pare.html, April of 2007

  • Parnas DL (2003) The limits of empirical studies of software engineering. In: Proceedings of the 2003 International Symposium on Empirical Software Engineering (September 30–October 01, 2003). International Symposium on Empirical Software Engineering. IEEE Computer Society, Washington, DC, 2

  • Patil S, Kobsa A, John A, Seligmann D (2011) Methodological reflections on a field study of a globally distributed software project. Inf Softw Technol 53(9):969–980. doi:10.1016/j.infsof.2011.01.013

  • Perry DE, Staudenmayer NA, Votta LG (1994) People, organizations, and process improvement. IEEE Softw 11(4):36–45

  • Plonka L, Sharp H, van der Linden J (2012) Disengagement in pair programming: does it matter? International Conference on Software Engineering, Zurich

  • Prikladnicki R, Audy JLN, Evaristo R (2003) Global software development in practice: lessons learned. J Softw Process Improv Pract 8(4):267–282

  • Prikladnicki R, Audy JLN, Evaristo R (2006) A reference model for GSE: findings from a case study. In: ICGSE 2006, Florianopolis, Brazil

  • Prikladnicki R, Audy JLN, Damian D, Oliveira TC (2007) Distributed software development: practices and challenges in different business strategies of offshoring and onshoring. In: ICGSE, Munich, Germany

  • Prikladnicki R, Evaristo R, Damian D, Audy JLN (2008) Conducting qualitative research in an international and distributed research team: challenges and lessons learned. In: HICSS 2008, Hawaii, USA

  • Randall D, Harper R, Rouncefield M (2007) Fieldwork for design: theory and practice. Springer

  • Richardson I, Avram G, Deshpande S, Casey V (2008) Having a foot on each shore—bridging global software development in the case of SMEs. In: Proceedings of the 3rd IEEE International Conference on Global Software Engineering, Bangalore, India, 19–21st August, 2008, IEEE Computer Society Washington, DC, USA, 13–22

  • Rigby PC, Storey MA (2011) Understanding broadcast based peer review on open source software projects. International Conference on Software Engineering, Hawaii

  • Rönkkö K (2000) Ethnography and distributed software development. In: ICSE Workshop Beg, Borrow or Steal: Using Multidisciplinary Approaches In Empirical Software Engineering Research, Limerick, Ireland

  • Seaman CB (1999) Qualitative methods in empirical studies of software engineering. IEEE Trans Softw Eng 25(4):557–572

  • Sengupta B, Chandra S, Sinha V (2006) A research agenda for distributed software development. In: 28th ICSE, pp 731–740, Shanghai, China

  • Sharp H, Robinson H (2004) An ethnographic study of XP practice. Empir Softw Eng 9(4):353–375

  • Spanoudakis G, Zisman A (2004) Software traceability: a roadmap. Handbook of Software Engineering and Knowledge Engineering. In: Chang SK (ed). World Scientific Publishing Co

  • Strauss A, Corbin J (1998) Basics of qualitative research: techniques and procedures for developing grounded theory, 2nd edn. Sage Publications, USA

  • Suchman L (1987) Plans and situated actions: the problem of human-machine communication. Cambridge University Press, New York

  • Taylor SJ, Bogdan R (1984) Introduction to qualitative research methods. Wiley, New York

  • Tosun A, Turhan B, Bener A (2008) Ensemble of software defect predictors: a case study. Proceedings of the 2nd Symposium on Empirical Software Engineering and Measurement

  • Wittel A (2000) Ethnography on the move: from field to net to internet. Forum Qualitative Sozialforschung, vol 1; http://www.qualitative-research.net/fqs-texte/1-00/1-00wittel-e.htm

  • Wohlin C, Höst M, Henningsson K (2003) Empirical research methods in software engineering. In: Conradi R, Wang AI (eds) ESERNET 2001-2003, LNCS 2765. Springer, pp 7–23

  • Wulf V, Rohde M, Pipek V, Stevens G (2011) Engaging with practices: design case studies as a research framework in CSCW. In: Proceedings of the 2011 Conference on Computer Supported Cooperative Work, pp 505–512

Acknowledgments

The studies reported in this paper were partially supported by: the Research Group on Distributed Software Development of the PDTI program, financed by Dell Computers of Brazil Ltd. (Law 8.248/91); CAPES (project BEX 426006-6); CAPES (project BEX 1312/99-5); FAPERGS; CNPq (project 479206/2006-6, project 550130/2011-06, project 560037/2010-4, project 483125/2010-5); Science Foundation Ireland and Lero, the Irish Software Engineering Research Centre, under PI grant 03/IN3/1408C; and the German Research Foundation (DFG) within the project ARTOS (Articulation Work in Offshoring Projects of Small and Medium-sized Enterprises of the Software Branch).

Author information

Corresponding author

Correspondence to Rafael Prikladnicki.

Additional information

Communicated by: Premkumar Thomas Devanbu

Appendix A—The Experiences of the Authors

This appendix provides details on the aims and methodologies of the various case studies and ethnographies which formed the basis of our experiences. Table 3 also shows the details of each of the experiences.

Table 3 Details of the experiences of the authors

Study 1

In Avram’s first study (Avram 2007a, b; Avram et al. 2009), the field site was an Irish subsidiary of a Fortune 500 multinational company involved in software development. Following a first contact with R&D managers, a request to follow a software development team over an extended period of time was presented to a number of development managers, and one manager volunteered, with the agreement of his team. The study began in January 2006 and the research team adopted an ethnographically informed approach. The first 3 months allowed the two researchers to familiarize themselves with the context and the work being done. In the following 15 months, they spent more than 70 full days in the field, observing the activity of the team in its own work environment, participating in meetings and group activities, and occupying desks in the team’s open-plan area. Twenty-six interviews were conducted with members of the team based in Ireland, Germany and the US, including the team lead, the two software architects, the lead of the QA team, software developers and quality engineers. Some of the team members were interviewed more than once. The research team was granted access to the company intranet, to the project’s document repository and to the team’s mailing list. They were also allowed to use the company’s own instant messaging system, which served both as an awareness mechanism and as a communication channel with the members of the observed team. Observation and interaction continued online when the researchers were not present on site. Participation in teleconferences allowed them to observe directly the team members’ interactions with people in various other locations (US, Germany, India). A good working relationship developed between the research team and the software development team, and the researchers found many opportunities for conducting both formal interviews and informal discussions on various topics without disrupting the actual work. The researchers kept diaries, taking detailed notes on every day spent in the field. Remote collaborators of the observed team were also interviewed, either via instant messaging/phone or face-to-face. One of the researchers travelled to one of the company’s sites in Germany in November 2006 and interviewed five people with different roles in the collaboration between the two sites.

The data collected from the field was discussed and analyzed weekly by the extended research team, in order to identify topics, trends and problems and to compare the findings with those from other sites where fellow research team members were observing similar processes and activities. The findings were shared with the members of the software development team, their managers and remote collaborators in conversations, in specially designed workshops, and in the form of draft reports and papers.

Study 2

In Avram’s second study (Richardson et al. 2008; Boden et al. 2009, 2011), one of the cases presented is that of an Irish company with a development division in Romania. The method chosen for this study was a case study. One of the researchers, of Romanian origin, found the company website on the Internet. She contacted the Irish manager and organized an interview at the company headquarters in September 2007. Two months later, she had the opportunity to travel to Romania, visit the Romanian branch and interview the Romanian manager (and co-owner) and 3 of the developers. At the time, the company employed 3 project managers based in Ireland and Romania and 19 developers, all located in Romania. Besides interviews, a limited amount of observation was undertaken in the two locations and a few documents were collected. Both the interim report and the finished paper were shared with the two managers and received positive feedback.

Study 3

In Boden’s study (Boden et al. 2007, 2008), the researchers conducted an ethnographically informed case study in two German SMEs engaging in offshore software development with partners in Russia. The goal was to understand how software developers in distributed teams organize their development work in terms of Articulation Work, and how organizational learning can be organized in distributed settings.

The researchers started with an exploratory analysis of the literature on offshoring, covering discourses of various communities of practitioners and scholars. Based on these findings, relevant research questions were identified, focusing on informal and situated coordination practices of developers. On this basis, the researchers conducted 15 semi-structured interviews with managers and developers of different German SMEs. From this sample, two companies (named Alpha and Beta for the purpose of this study), which had offshored part of their software development to subsidiaries in Eastern Europe, were selected for closer investigation in the form of case studies.

For the case studies, the researchers triangulated several ethnographic research methods, comprising further semi-structured interviews, participant observation, and artifact analysis. For the participant observation, each of the German SMEs was visited for a period of ten to twenty working days. A third participant observation period was conducted at the Russian partner company of company Alpha in Tomsk, Siberia. Beta’s partner company could not be visited because of an abrupt end of the cooperation, but the researchers were able to cover the perspective of the Russian developers by conducting interviews via Skype with the Russian senior developer. During their time spent at the companies, the researchers had ample opportunities to observe local and distributed articulation processes, in the context of meetings, individual work situations and cooperative tasks. Several informal interviews were conducted and the researchers were allowed to analyze artifacts such as e-mails, chat protocols, internal work papers and whiteboard sketches.

The findings were documented by means of field notes and photos that were taken during the research. For validation, the findings were correlated with the literature focusing on articulation processes in software development. Furthermore, the identified concepts and topics were discussed with the participants of the study during a workshop.

Study 4

In Prikladnicki’s first study (Prikladnicki et al. 2007, 2008), the authors conducted an exploratory case study of distributed projects in five multinational companies. The data collection methods included interviews and document reviews. The documents reviewed were project plans, lessons learned, and documents describing the software development process. A total of 20 individual interviews (lasting 1 hour each) included technical leaders, project managers, IT managers and directors. Interviews were conducted face-to-face in Brazil and Canada, and over the telephone with informants in the U.S. In Brazil, access to the companies was facilitated by the researcher’s previous contacts. In Canada, contact was facilitated by a professor living in that country, who had previous contact with the two companies. The professor was also part of the research team. In the U.S., contact was made through the Brazilian subsidiary director, who made the interview possible. The respondents were selected by convenience. Informants were interviewed from three management levels: project management, information technology and portfolio management, and organization management. Among them, there were six site directors, five information technology managers, seven project managers, and two technical leaders. The unit of analysis was the subsidiary. Findings were shared in the form of technical reports and papers, and were also included in a book (the first book published in Portuguese on this theme).

Study 5

In Prikladnicki’s second study (Prikladnicki et al. 2003, 2006), a case study was conducted in two software development subsidiaries, each one owned by a multinational organization with units spread worldwide. The first organization worked mainly in consulting, software development projects and training, and had external clients. At the time when the research was conducted, it had nine software development subsidiaries located in Brazil. The organization also had offices located in Brazil and other countries in Latin America, as well as in the U.S. and Europe. Its headquarters were located in Brazil. The second organization supported and manufactured computers. It had three software development subsidiaries located in two continents that were responsible for internal client demand worldwide. The headquarters were located in the U.S. The data collection included primary and secondary sources. We conducted 22 individual interviews (11 in each organization), covering four projects, two in each organization. All interviews were conducted in Brazil, facilitated by previous connections between the companies and the research team. Secondary sources were also used: document reviews, meeting minutes, and software development process descriptions, together with public information available on the homepage of each organization. For data analysis, a content analysis protocol was defined and applied.

The respondents were selected according to the unit of analysis and the purpose of the study. Among the interviewees, there were project team members, development managers, quality assurance team members, those responsible for software process improvement, and individuals representing the organizations’ strategic level. Two questionnaires were developed, each considering a specific dimension to be explored: “organizational,” containing information about the organization as a whole, and “project,” with information on the four projects included in this study. One development manager was interviewed in each organization, whereas five interviews were conducted for each project. The convenience sample was not probabilistic, although the research team tried to get a good representation of all groups involved. The data collected was evaluated by practitioners in both companies, with positive feedback. Findings were used as input for a training program on Global Software Engineering within one of the companies under study.

Study 6

In de Souza’s study (De Souza and Redmiles 2011), fieldwork was conducted in a large software development company that will be called LAR for the purpose of this article. LAR was one of the largest software development companies in the United States, with products ranging from operating systems to software development tools, including e-business and tailored applications. The project studied was responsible for developing a mobile application that had not yet been released. The project staff was divided into three major groups: user interface (UI) designers, software developers, and the quality assurance (QA) team. The staff was distributed over five different sites, spread across three different countries: North Carolina, US; Massachusetts, US; Beijing, China; Shanghai, China; and Taipei, Taiwan. To be more specific, user interface design and evaluation was performed by six professionals in North Carolina, and the implementation was performed in all other sites, distributed as follows: nine developers in Massachusetts, five in Shanghai, five in Beijing, and four in Taipei. The quality assurance team was divided between U.S. and Chinese sites: three engineers were located in Massachusetts and six engineers in Beijing. The main coordination of the project and the project manager were located in Massachusetts, where all the data were collected.

Data were collected through document analysis and semi-structured interviews. Among other documentation, artifacts, emails and instant messages exchanged among the software engineers were collected. The research team was also granted access to shared discussion databases used by the software engineers. All of this information was used in addition to the notes generated by the interviews. We conducted 17 semi-structured interviews with members of all teams from the different sites: some interviews were conducted face to face, others by telephone, and one via instant messaging. The interview questions were designed to encourage the participants to talk about their everyday work, including work processes, problems, tools, communication, collaboration, and coordination efforts with their collocated and distributed colleagues. The interviews also aimed to explore the relationship between software dependencies and the coordination of software development projects, or, to be more specific, the potential usage of dependency information to facilitate collaborative software development. Interviews lasted between 20 and 70 min. The interviews with some of the Chinese team members were conducted while they were visiting one of the US sites.

About this article

Cite this article

Prikladnicki, R., Boden, A., Avram, G. et al. Data collection in global software engineering research: learning from past experience. Empir Software Eng 19, 822–856 (2014). https://doi.org/10.1007/s10664-012-9240-x

  • DOI: https://doi.org/10.1007/s10664-012-9240-x
