1 Background

Although Chinese researchers had reported their findings on the H5N1 avian influenza as early as January 2004, the World Health Organization only found out about them after an international symposium several months later, simply because the original paper was written in Chinese [21]. Would it look different today, with the ready availability of machine translation services such as Google Translate, Microsoft Translator, or DeepL at the fingertips of most Internet users? Launched in April 2006, the first of these services offers quick translations in 104 languages and is now reportedly used by more than 500 million people daily. What began as statistical machine translation switched in 2016 to neural machine translation using deep learning algorithms [15].

The new technology of machine translation has likewise been spreading among smartphone users, helping them order food, ask for directions, talk about the weather, understand classroom content, and engage in meaningful interaction with the teacher. At the same time, widely available instructional videos are accompanied by jokes about the quality of translations [1], ridiculing cases of over-reliance on and uncritical use of technology.

For a long time, machine translation had limited utility where language served more far-reaching functions than a simple exchange and involved important nuances, such as discussing personal values and ethical concerns or resolving conflicts. However, with Google’s research on and development of the translation engine since 2016, which has been covered even in the popular media, users have been able to see for themselves an improvement in the quality of the translations. The current opinion of many users from different business environments is that Google Translate provides by and large highly effective translations.

Here we investigate whether machine translation can make research more widely known and help researchers become part of the international academic community.

2 Literature on Google Translate

Deep learning improved Google Translate through the use of artificial neural networks [7]. Current literature on the tool's applications includes, for instance, a contrastive analysis of machine translation (MT) of Arabic verb forms and aspect into English [2], a comparison of the output quality of Google and Bing translators in Chinese-English translation [5], an evaluation of machine translation in more academic fields [14], a survey carried out among medical researchers admitting to using Google Translate to retrieve key study characteristics from the global literature [20], and a meta-analysis of the prevalence and incidence of traumatic tooth injuries which examined articles published in languages other than English [11].

There has also been a reflection on the correctness of translations from the perspective of preventing discrimination [22], and the reliance on Google Translate by students and universities [18]. Assertions have also been made that deployment of Google Translate may help promote a more inclusive international student environment [16], although some researchers express surprise at participants’ readiness to use the tool to submit their assignments and responses [19].

To add to the picture, we check how Google Translate works in situations demanding a high level of communication, i.e., in scientific communication – a field dominated by the English language, but one in which most stakeholders are not native speakers of English and not all of them can communicate in it. The language barrier is not the key obstacle, but respondents in quite recent studies often mention it, and it may undermine the intention even to submit an application to a conference and to participate in the international exchange of thoughts [4].

3 Objectives

We check whether universally available free translation technology can help achieve goals in an effective and constructive way under conditions close to natural ones. We do not intend to understand or assess the R&D underlying MT; our approach is user-centered. We are interested in the end-user and the extent to which MT satisfies their need to effectively communicate an important issue requiring expertise.

Although roughly 500 million people use Google Translate for private purposes, we decided to tighten the acceptable translation standards. We used this tool to translate abstracts for submission to a conference in the field of social sciences focused on research conducted in Central and Eastern Europe.

4 Method

We carried out a natural experiment, without a control group in the design. Although designs with a control group are the norm in the psychological sciences, a control group is not necessary in a user-centered feasibility and applicability test of a technology.

We chose a prestigious international conference at which the submitted abstracts undergo a double-blind review procedure. The assessment of abstracts is carried out by English-speaking experts in the relevant field.

Over the past four years, the conference had received upwards of 1,900 submissions annually, at an acceptance rate ranging from 34% to 50%. The event has an established reputation; the scientific society organizing it was founded over 50 years earlier and today has over 7 thousand members. Its mission is to understand people for the general good of humanity. The conference is not only a central event of the society, but is also considered by the organizers to be the most important international event in the scientific discipline concerned. Annually, the conference is attended by more than 3.5 thousand participants from various academic and practice-oriented sectors.

4.1 Choice of Academic Discipline

We chose the world of science, as its mission is to produce and develop knowledge. For scientists, language is an indispensable tool. What is more, according to the standards adopted in science, international circulation of one’s output is usually a precondition for gaining tenure and recognition in a given field.

The world of science is conventionally divided into several fields, often imagined as distributed along a continuum. On the axis between strictly technical sciences and humanities one can distinguish social sciences, often treated as an intermediate category. This is the category chosen in our study. It typically relies more on verbal skills than on mathematical notation and requires a higher level of linguistic nuance.

4.2 Selection of Abstracts

We used two sources of research material, external and internal. The external source (other conference submissions) served only to determine how many abstracts would be suitable for submission under conditions close to natural ones. From it we obtained 57 abstracts, all of them approved presentations at the largest conference organized in 2018 by a local Eastern European association in the relevant social science field. However, the limit of 1,200 characters with spaces, imposed by the organizer of the international conference we aimed at, reduced this database to just eleven abstracts. After machine translation from Polish into English by Google Translate, two of these exceeded the limit, which left 9 abstracts that could potentially be submitted. We decided not to go forward with them, however, for practical and ethical reasons: obtaining permission from each individual author to use their abstract would have been complicated and time-consuming, which made the source inadequate.

Instead, we collected the final research material from an internal source: our colleagues, researchers whom we knew personally. We compiled a database of their Polish-language abstracts from the field of social sciences which in previous years had been accepted for presentation at other (local) conferences, and which met the upper limit of 1,200 characters after translation into English with Google Translate (n = 18). From this database we randomly selected 9 abstracts, which we used as research material.
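
The selection procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' actual tooling: the function names `within_limit` and `select_abstracts`, the fixed seed, and the placeholder texts are assumptions introduced here, and the translation step itself is presumed to have been carried out beforehand.

```python
import random

CHAR_LIMIT = 1200  # conference limit in characters, spaces included


def within_limit(text, limit=CHAR_LIMIT):
    """Return True if a translated abstract fits the character limit."""
    return len(text) <= limit


def select_abstracts(translated, k=9, seed=None):
    """Filter translated abstracts by length, then draw k without replacement."""
    eligible = [t for t in translated if within_limit(t)]
    if len(eligible) < k:
        raise ValueError(f"only {len(eligible)} abstracts fit the limit, need {k}")
    rng = random.Random(seed)  # the seed is an assumption, used for repeatability
    return rng.sample(eligible, k)


# Placeholder pool (hypothetical texts): 18 that fit the limit and 2 that do not.
pool = [f"abstract {i} " + "x" * 900 for i in range(18)]
pool += ["y" * 1250, "z" * 1300]

chosen = select_abstracts(pool, k=9, seed=2018)
print(len(chosen))  # 9
```

Checking the limit on the translated text rather than on the Polish original matters here, because translation changes the length of a text.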

These automatically translated texts contained grammatical errors, which we intentionally left uncorrected. The materials were supposed to reflect the situation of a researcher from a non-English-speaking country who intends to share their research with the international scientific community, uses only automatic translation, and does not correct the translation.

The following is an example of such automatic translation (1 of 9) from the original Polish text into English, together with the title:

“Do viewers remember what irritates them?”

“Remembering advertising messages is one of the important parameters for assessing the effectiveness of advertising. The level of memorization of individual elements of the message is different and depends on many factors, such as the order in which they are presented, the total number of stimuli in the message, the personal importance of information, the clarity of information and their attractiveness to the viewer. In the study, which will be presented, respondents (n = 1000) in three independent, independent groups watched a properly prepared fragment of a real television program in which advertising messages (auto-promotional) differing in the way of assembly. The obtained results indicate the occurrence of sequence effects, the lack of influence of the number of stimuli, the positive effect of vividness of the message and the correlation between remembering and liking the message. The test results give guidelines for the design of these forms of promotion in the future.”

As mentioned above, we used our own resources. The above conference abstract has been translated and reproduced from an unpublished source with the permission of the copyright holder, who is one of the co-authors of the current paper.

4.3 Abstract Submission Procedure

The abstracts were submitted to the conference via e-mail accounts on a globally recognized portal. The accounts were registered to fictitious people with names matching different cultures, such as Zenon Kowalski, Felicia Williams, Carrie Cholmondeley, and Mary Surren. The applications were formally sent by academic teachers who had obtained their doctorates before 1 January 2016. The application form also required the submission of affiliations. We used the Academic Ranking of World Universities, from which we randomly selected nine universities in the middle of the ranking, located between the 400th and 500th places (e.g., East China Normal University, Bangor University, Federal University of Minas Gerais). The proportion of male and female names was 4:5. The nationality of the abstract submitter was determined by the country in which the university was located. From Wikipedia, we selected the most typical names for a given region of the world.

The research described in all abstracts was empirical, and the abstracts were submitted as original research intended for oral presentations. However, we expressed our willingness to participate in a poster session in case an abstract was not approved by the reviewers for an oral presentation.

4.4 Ethics and Research Integrity

This leads us to the research integrity issue: whether the program chairs had been asked in advance for their agreement to this experiment, and whether the conference organizers had been informed about the activities that would take place within it. They were not, as this would have invalidated the entire study. Our testing method is simple, has some originality, and was designed so as not to affect the review process and to have only a small chance of being discovered by the organizers. For the ecological validity of natural experiments, covert observation is justified as long as it is not harmful. As has recently been noted, research integrity committees do not consider the consent of participants necessary in such scenarios, as long as we focus on the positive consequences of the research: if the social benefits of the research outweigh the cost, deception is acceptable [13]. Moreover, covert research is acceptable in some contexts, on condition that the researcher constantly questions the ethicality of their actions and research, and its consequences [13]. In many very valuable interdisciplinary studies the researchers did not ask the organization for permission, as the gatekeepers would otherwise likely have made the research difficult [13] and the behaviors to be observed would not have been visible to an overt observer [13]. To sum up: to uncover the reality of institutions, one does not ask them for consent. This is exactly the approach which we adopted in this paper.

We also analyzed the American Psychological Association’s (APA) Ethical Principles of Psychologists and Code of Conduct, used globally as a framework of reference [3]. The APA Ethics Code clearly states in its Section 8 (Research and Publication, chapter 8.05 on “Dispensing with Informed Consent for Research”) that “Psychologists may dispense with informed consent only (1) where research would not reasonably be assumed to create distress or harm [AND] involves (a) the study of normal educational practices, curricula, or classroom management methods conducted in educational settings; (b) only anonymous questionnaires, naturalistic observations, or archival research for which disclosure of responses would not place participants at risk of criminal or civil liability or damage their financial standing, employability, or reputation, and confidentiality is protected; [or] (c) the study of factors related to job or organization effectiveness conducted in organizational settings for which there is no risk to participants’ employability, and confidentiality is protected [OR] (2) where otherwise permitted by law or federal or institutional regulations” [3].

In our study we fulfill the above-mentioned APA Ethics Code criteria. (Ad 1) The research could not reasonably be assumed to create distress or harm. (Ad b) The research involved only naturalistic observations (which we described above as a natural experiment with no control group). The disclosure of responses would not place participants at risk of criminal or civil liability or damage their financial standing, employability, or reputation, and confidentiality is fully protected. (Ad c) The study concerned factors related to academic job and academic organization effectiveness, conducted in academic organizational settings, for which there is no risk to participants’ employability, and confidentiality is fully protected.

We did not wish to flood the conference organizers with dozens or hundreds of submissions, because a few texts sufficed to carry out a feasibility test of Google Translate. In the submission process we abided by ethical principles. First of all, we did not add much work for the reviewers, given that the abstracts never exceeded 1,200 characters. We also did not troll reviewers and organizers by sending control “lorem ipsum” or otherwise meaningless texts. These would not only have imposed unnecessary work and frustration on the scientific committee, but would also have failed to help address the research question at hand.

The abstracts were genuine and relevant texts that had already been reviewed or accepted for print in the local scientific community. We collected only texts of already proven scientific quality in the field of social sciences, and merely tested their potential for internationalization: we checked whether, when machine translated, they would be seen as useful to the international scientific community. We did not misrepresent scientific claims. The only distortion concerned the identity of the authors. However, we did not violate the rights of the authors, because we collected only our own texts and those of our colleagues.

We did not disrupt the logistics of the conference, because we only submitted a handful of abstracts. Therefore, this did not upset the evaluation of around 2,000 other applications. The principle of the conference is to consider the lack of post-decision reaction on the part of the author of the accepted work as a withdrawal of participation (e.g., for random reasons). This minimized the need for contacting and debriefing the organizers.

As a result, a limitation of this study is the lack of a control condition. As one of our reviewers notes, it would be feasible to generate stimuli that sound meaningful and are linguistically and grammatically correct, but that make no sense as scientific content; doing so, however, would be ethically questionable. We leave it to the next generations of authors to decide whether designing such a study would be appropriate from the research integrity viewpoint.

4.5 Results

Out of the nine abstracts submitted, in September 2018 eight were accepted for the conference – but only for poster sessions. One abstract was rejected. Table 1 presents an excerpt from the positive and negative evaluations sent out by the conference organizer in response to the submissions. The positive answer concerns the acceptance of the abstract only for a poster session. Table 2 shows the titles of the abstracts together with the decision whether or not to accept the submission in question.

Table 1. Positive and negative response of the conference organizer to the submitted abstract
Table 2. Abstract titles and status of the proposed abstract

4.6 Discussion of the Results

Almost all the machine-translated conference submissions were accepted, but only as posters.
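
To put this outcome next to the conference's reported acceptance rate of 34% to 50%, the arithmetic can be sketched as below. This is a rough illustration and not an analysis carried out in the study: it assumes that the nine submissions were evaluated independently, that the reported range applies to each of them, and that poster acceptances count as acceptances. The function name `binomial_tail` is introduced here for illustration only.

```python
from math import comb


def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


submitted, accepted = 9, 8
rate = accepted / submitted  # observed share of accepted abstracts

print(f"observed acceptance: {rate:.1%}")

# Baseline rates are the bounds of the range reported for the conference.
for baseline in (0.34, 0.50):
    tail = binomial_tail(submitted, accepted, baseline)
    print(f"baseline {baseline:.2f}: P(at least {accepted} of {submitted}) = {tail:.4f}")
```

With a baseline of 0.50, the tail probability equals 10/512, or about 0.02. Because the reported rate may not be directly comparable to poster-only acceptance, such figures should be read with caution.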

We assume that if we had sent empty, randomly generated or “lorem ipsum” texts, the organizers would not have accepted them (we did not check this assumption for ethical reasons). The fact that none of the authors were invited to present a paper means that either the topic, its presentation, or the linguistic competence of the submitter was considered to be inadequate. In such a situation the poster format is a way for the organizers to nonetheless ensure the presence of many different ideas.

One abstract, however, did not go through the selection process. Why did this one not pass? A possible reason is that it was less consistent with the conference topic: it concerned cognitive rather than social processes, referring to an experiment using eye tracking and the analysis of oculographic indices such as fixation length, saccade length, and pupil dilation, and was thus better suited to a cognitive science venue than to a social science one.

One reviewer points out the lack of information on the reviewers’ language backgrounds and level of proficiency in English. While this is the case, we do not think it detracts from the implications of this study or constitutes a limitation: in most scientific conferences, the identities of the reviewers are known only to the event chairs, and international events typically recruit their scientific committee members and ad hoc reviewers from an international pool.

5 Implications

Since Google Translate is able to translate abstracts at a level acceptable to the scientific community in the area of conference submissions, a message translated in this way, including relatively complex ideas, concepts or problems, is understandable to the receiver. Thus, this technology can already be used in a wider formal context. Below, we identify possible applications of Google Translate and other analogous resources in formal settings.

5.1 Transfer of Scientific Ideas and Research Results

It is already possible to transfer knowledge between scientists from all over the world, including researchers without foreign-language skills. These researchers are not necessarily weaker in their narrow disciplines than researchers who speak English or another dominant language, and one should remember that the vast majority of today’s scientific (and other) publications in English are penned by non-native users of this language [8, 10, 11]. The implementation of Google Translate into conference reporting systems could make the world’s science more readily accessible and transparent, extend its outreach, and balance the distribution of scientific thought centers in specific fields.

5.2 Communication of Officials with Citizens and Non-citizens

Communication with offices and institutions could be made more efficient. Economic migrants, refugees, and other foreigners could participate in automatically translated interviews with officials, for instance via chat rooms, and automatic translation of official forms into the language of the applicant could be facilitated in a similar way as the translation of the content of websites, at least in the preliminary, provisional stages. Residence and work permits, social security, or opening a bank account are the basic formalities one needs to deal with during the first and most difficult period of one’s stay in a new country. At a later stage it could also be possible to browse and search for job offers, prepare a CV and contact a prospective employer.

5.3 Wellbeing and Social Inclusion

We expect that in the long run, thanks to speech recognition and the intensification of communication in foreign languages in real time, translation technology can provide people with numerous personal benefits. We propound that the ability to express complex content can be important for one’s image and foster success by enabling communication with diminished risk of face loss. Increasing communication is also important from the perspective of social inclusion, especially for vulnerable groups.

Recently the potential of voice assistant technology as a proxy for the production of silver content (i.e., creative, productive, wise and autonomous activity of older adults in new media) has been noted, together with a number of new challenges it generates [17]. We note the interlingual potential of such technology when used with the translators mentioned in this article. Interactive and intelligent technology will be a substitute for social actors, preventing interlingual and international exclusion and disengagement. Voice assistants can currently substitute for social interaction only in a very restricted manner, but with the development of this technology, these interactions can become more meaningful and may become a way to provide aging people with the opportunity to maintain social interactions at the level necessary to stay active, even invisibly from a technological viewpoint [6], and on the international level, should it be their choice.

5.4 Potential Threats

Among the consequences that we consider to be potentially socially negative, we discern a reduction in the motivation to learn foreign languages [7], but also an easier implantation of ideologies among young people, against which society will remain helpless. The disappearance of the language barrier opens up new spaces for propaganda, recruitment and attitude shaping that may undermine the social or legal order.

As is often the case, it seems that global and international science exemplifies the directions of other possible global social processes. While a facilitated exchange of scientific ideas and a removal of communication barriers may be of great benefit to the whole of humanity, and to the scientific community in particular, it is also worthwhile to carry out an analysis of opportunities and risks associated with machine translation.

According to our reviewers and other readers whom we consulted, even the current paper could make the reader suspect that it was translated using Google Translate, and therefore would place them in the role of participants in yet another study. But, as a reviewer writes: “On the other hand, isn’t that really the point? Who can certify that this review has not been translated by Google Translate? And is it wrong?”