Empirical Software Engineering, Volume 21, Issue 3, pp 1002–1034

Assessing the impact of real-time machine translation on multilingual meetings in global software projects

  • Fabio Calefato
  • Filippo Lanubile
  • Tayana Conte
  • Rafael Prikladnicki


Communication in global software development is hindered by language differences in countries that lack English-speaking professionals. Machine translation is a technology that uses software to translate from one natural language to another, and its progress has been steady over the last decade. At present, machine translation is particularly appealing because it might be used, in the form of cross-language chat services, in countries that are entering global software projects. However, despite the recent progress of the technology, we still lack a thorough understanding of how real-time machine translation affects communication. In this paper, we present a set of empirical studies aimed at assessing to what extent real-time machine translation can be used in distributed, multilingual requirements meetings instead of English. The results suggest that, despite being far from 100% accurate, real-time machine translation does not disrupt the conversation flow and is therefore viewed favorably by participants. However, stronger effects can be expected to emerge when language barriers are more critical. Our findings add to the evidence about the recent advances in machine translation technology and provide some guidance to global software engineering practitioners regarding the losses and gains of using English as a lingua franca in multilingual group communication, as in the case of computer-mediated requirements meetings.


Global software development; Machine translation; Distributed meetings; Computer-mediated communication; Controlled experiment



This research is partially funded by the Rio Grande do Sul State funding agency (FAPERGS) under projects 11/2022-3 and 002061-2551/13, and by CNPq (483125/2010-5, and 309000/2012-2). We would like to thank all the participants who took part in the experiment and Pasquale Minervini for the support he provided with the management of the Apertium service.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Fabio Calefato (1)
  • Filippo Lanubile (1)
  • Tayana Conte (2)
  • Rafael Prikladnicki (3)

  1. Dipartimento di Informatica, University of Bari, Bari, Italy
  2. Instituto de Computação, Universidade Federal do Amazonas, Manaus, Brazil
  3. Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, Brazil
