1 Introduction

The notion of context transgresses many disciplinary borders: it is used as a research perspective for studying various phenomena in many disciplines, and it is ascribed distinct social factors. The common thread running through the various discipline-specific definitions of context is that it covers collective, external background phenomena which determine the perception and/or operation of specific actions, whether these belong, for example, to the realm of psychology (e.g. [58: 50–70]) or sociology ([55: 257–287, 59: 501, 504, 63: 1–19]).

Along with the development of sociolinguistic studies, the notion of context has earned its own discipline-specific definition, yet even here it can be interpreted functionally on a few levels of analysis [29: 81, 54: 13–34] and with reference to different analytic traditions [25: 2]. The main figures in this field are Dell Hymes [16: 293, 60: 199] and John Gumperz [1: 827, 25: 2], who are said to have inspired much research into the sociolinguistic dimensions of context. Hymes [30, 31] noticed the complexity of context in linguistic research and identified the conceptual categories covering the discursively relevant factors. Gumperz [27: 229–252] proposed the concepts of “situated interpretation” and “contextualisation cues”, which emphasise the significance of inferences in meaning construal; these inferences consist in negotiating social values for the final interpretation through the application of interpretative schemas.

Acknowledging the phenomenon of situated discourse as an interesting field of study, linguistic research has been conducted within different analytic traditions, thus various referential ranges have been assigned to the notion of context. The most common distinction consists in limiting context to factors that can be recovered from the discourse itself [13: 63, 22: 815], as is the case with conversation analysis, or treating context as “independently specifiable causes of behaviour” [22: 815], as a wider range of social factors which are important “for understanding of what is occurring in the conversation”. The latter is the stance adopted by critical discourse analysis and interactional sociolinguistics [22: 815] and this study fits in here as well. This perspective on research into context has been associated with the concept of thick description coined by Geertz [24: 14] with the intention to emphasise that a thorough and reliable account of the contextual background is a prerequisite of effective communication. The need for thick description in this sense is recognised also in reference to professional communication [53: 1]. This study specifies the application of thick context description even further and focuses on specifically situated corporate discourse in English/Polish interlingual communication in the domain of commercial law. It addresses the problem of the social dimensions of language focusing on the distinctiveness of corporate discourse for company registration proceedings in Poland. The author aims at identifying the structural components of the contextual background that are potentially discursively relevant for the communicative situation covered by the analysis, with the purpose of constructing a context model accounting for the heterogeneous character of some parameters, if that proves to be the case. 
The emerging framework is to serve as a starting point for investigating the validity of the context-conditioned distinctions for the interlingual discursive performance in the said communicative situation.

As regards the theoretical background, the study is grounded in the conceptual framework developed by sociocultural linguistics with respect to the social factors affecting discursive variantivity in general [19, 28]. The analysis also fits in the strand of research investigating variation in the context of translation performance [36, 37]. It also draws on the theory of context as proposed by Hymes [30, 31] and incorporates later approaches, including those treating culture as part of the context [32: 181], those advocating contextualisation of professional genre analysis [4: 166, 5: 3–13, 6: 4, 12: 151, 35: 443–461, 68: 26–40] and those investigating the role of context in translation [47]. Finally, as regards the object of the analysis, the study relates to the existing research in professional discourse [14], legal discourse, and specifically corporate discourse, as communication that is “generated by business organisations” [18: 237–253, 21: 236, 23: 93, 33: 1–5, 44: 240], and it is aimed at complementing the existing state of research with findings on specifically situated corporate discourse (Footnote 1).

The aim of the empirical part of the project is to investigate the contextual profile of the texts with regard to the contextual category of setting and to verify the discursive relevance of the potential source text contextual distinctions cross-linguistically. For the purpose of this study, a specific context category is assumed to be discursively relevant in the cross-linguistic perspective if it affects the translation performance of the parallel texts. Verification of the said discursive relevance is operationalised by examining whether the margin of the cross-linguistic quantitative distinctions differs depending on the source text contextual variants in the parallel corpus. A set of seven discrete units operates as tertium comparationis here.

The present study aims to answer the following questions: (1) How is the company registration discourse contextually varied with regard to the parameter of setting? Specifically, what is the distribution of the values assigned to the variable of country of origin of the source text and its year of publication? (2) Does the distribution scheme for the specific linguistic features change across the parallel corpus depending on the two source text variants?

It is hypothesised that legal discourse in English, even if it belongs to the same communicative setting, is not contextually homogeneous, including from the diatopic and diachronic perspectives [65], which may have further implications for cross-linguistic communication in this domain. The hypothesis assumes that source text variantivity affects translation performance. Specifically, two scenarios are hypothesised: (1) the cross-linguistic contrasts attributable largely to the systemic distinctions between languages are less marked in translations than in the source texts, which is deemed to be indicative of the operation of convergence (Footnote 2), understood as flattening of the structural variety of the source texts, and (2) alternatively, the cross-linguistic contrasts are more pronounced in the translated texts than in the source texts, which is diagnosed as divergence, denoting a larger structural hiatus between the translation variants (Footnote 3) than between the source language variants.

2 Methodology

The operationalisation of the research task involves quantitative processing of the context-related data extracted from the custom-designed, parallel corpus and examining the source text variety for its relevance for translation performance. The notion of parallel corpus is employed here in the sense ascribed to it by, for example, Lewandowska-Tomaszczyk [40: 51], Matulewska [43: 170], McEnery and Hardie [45: 18–20] and Olohan [48: 24–34]. The texts making up the corpus constitute a sample of written corporate discourse, situated in company registration court proceedings in Poland. Specifically, they include documents having the status of attachments to court pleadings (English language texts and their translations), as found on the court files at 2 out of 21 divisions of the National Court Register, Register of Entrepreneurs. The corpus was compiled in a systematic way. The search criterion was defined by spatiotemporal factors, that is (1) the year of entry in the register and (2) the category of registration proceedings, which is registration of a branch of a foreign company.

The methodological framework of the study, specifically with respect to the selection of the data to be processed, draws on the theory of context established by Hymes [30, 31] and developed by his successors, who advocated analysing a communicative situation in reference to a set of context parameters providing a conceptual framework for describing distinct aspects of a communicative event in the sociolinguistic perspective. The author examines the context parameter of setting since, in general, it is claimed to be among the most universal, “necessary and typical” ones [61: 204] and thus may be said to be generally of high discursive relevance [16: 293–294] (Footnote 4). Additionally, the parameter of setting proves to be functionally salient in the interlingual communicative situation under analysis [66].

The corpus material was annotated for specific features defining two situation-specific context categories which fall under the scope of the Hymesian parameter of setting (Footnote 5). The context categories selected for quantitative analysis include (1) the country of origin of the source text and (2) the year of publication of the source text. The selection of the context categories that are analysed quantitatively was determined by the findings gathered in related studies and by pragmatic factors. Firstly, the geopolitical provenience of the source texts and their year of publication are claimed to be of high social relevance and they are discussed in terms of their direct discursive relevance in general (e.g. [61: 204]) and/or indirectly causal functional salience for the variable structure of the discourse in this specific communicative context. The latter emerges from the findings gathered in the exploratory study [66] and it is to be inferred from the broad understanding of the notion of context, specifically on the grounds of legilinguistic studies [32: 181, 42: 102].

For the purpose of statistical processing aimed at presenting some tendencies in the distribution of specific values, the notions developed on the grounds of sociolinguistic studies have been transposed into the concepts related to the framework of statistical description. Hence, the categories related to Hymes’ parameter of setting are referred to as variables, and features discerned within specific context categories are assigned the label of values or variable indicators. The quantitative analysis for the two context variables was performed with the use of the Statistical Package for Social Sciences programme. The hypothesis of discursive relevance of the contextual variation for translation performance was verified by statistical computing performed with the R tool for the source texts and translations. The lexical data were initially processed in the Sketch Engine software. They were first normalised and, for calculating the tendencies with regard to the variables of year and country of origin, the metadata were standardised to the following broader categories: (1) country: US, UK, EU, AS, CA, corresponding to American, British, European, Asian and Canadian texts respectively, and (2) year: < 2000, 2001–2010, > 2010. The comparative analysis of the distribution schemes in the source texts and translations involved (1) identifying the quantitative cross-linguistic distinctions with regard to the distribution of the seven discrete units (Footnote 6) (noun, adjective, preposition, adverb, pronoun, subordinate conjunction and coordinate conjunction) for the specific context categories and (2) determining in which cases the proportion of the cross-linguistic quantitative distinctions varied against the variable of the two source text variants. This involved identifying potential shifts in the proportions that capture the difference between the two source text variants and the corresponding target text variants; that is, translations from the diachronically and diatopically varied source texts. The context categories were assumed to be discursively relevant for the distribution of specific discrete units in the event of incongruous results in the statistical significance test for the source text variants and for the relevant translation sub-corpora. The statistical significance test was applied here outside its primary role and it was used as an indicator of potential deviations in the relevant proportions between the two samples compared, as distinguished by the source text variants. Incongruous results in the statistical significance test are deemed to be an indication of variation in translation performance due to the source text variantivity. It needs to be noted here that, according to the operationalisation scheme adopted, the quantitative distinctions in point are in principle assumed to originate from language-systemic differences. It is only by identifying distinctions in the quantitative distribution of the set of discrete units in the translation performance depending on the source text variants, confirmed by the statistical significance tests, that the relevance of the context factors is ascertained and attributed to the contextual variation (Footnote 7) of the source texts.
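The comparison of proportions described above can be illustrated with a minimal sketch. The study does not name the significance test that was run in R, so the Pearson chi-square test on a 2×2 table (one degree of freedom, no continuity correction) used below is an assumption, as are the function name `chi2_p_value` and all counts, which are invented for illustration only.

```python
import math


def chi2_p_value(count_a, total_a, count_b, total_b):
    """p-value of a Pearson chi-square test (1 df) comparing the share of a
    discrete unit (e.g. prepositions) in two sub-corpora.

    Assumed test; the source does not specify which test was applied.
    """
    table = [
        [count_a, total_a - count_a],  # variant A: unit vs. all other tokens
        [count_b, total_b - count_b],  # variant B: unit vs. all other tokens
    ]
    row_sums = [total_a, total_b]
    col_sums = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    grand_total = total_a + total_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / grand_total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts (not taken from the corpus): the same unit in two
# source text variants and in the two corresponding translation sub-corpora.
p_source = chi2_p_value(300, 10_000, 450, 10_000)
p_translation = chi2_p_value(210, 9_000, 215, 9_000)
print(f"source variants: p = {p_source:.4g}")
print(f"translation variants: p = {p_translation:.4g}")
```

Running the same test once for the pair of source text variants and once for the pair of translation sub-corpora yields the two results whose agreement or disagreement the operationalisation scheme relies on.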

The reliability of the task operationalisation when it comes to ascribing the quantitatively salient areas to translation-related phenomena is assumed to be secured by the homogeneity of the texts making up the corpus, which are closely related to each other in terms of the production situation, content, length, authorship and institutional background. The raw frequencies were normalised per million.
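Normalisation per million can be sketched as follows. The function name `per_million` and the figures in the usage line are hypothetical; the sizes of the sub-corpora are not stated in this section, so no corpus value is reproduced here.

```python
def per_million(raw_count, corpus_size):
    """Relative frequency of a unit per one million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size


# Hypothetical example: 12,500 occurrences in a 500,000-token sub-corpus.
print(per_million(12_500, 500_000))  # 25000.0
```

Normalised frequencies make sub-corpora of different sizes comparable, which is why raw and per-million figures may rank two sub-corpora differently.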

The operationalisation of the research task involved employment of the paradigmatic methodology, which consists in quantification of discrete units [39: 3]. The choice of these tertium comparationis units is justified by the organising principle of the project methodology (including the perspective of the ensuing stages of the project), which is to take account of those qualitative distinctions that are based on quantitative over- and/or underrepresentation in the context of interlingual communication. The paradigmatic approach is said to fail to capture the “higher dimension of text organisation” [39: 165], including the specification of the functional roles of specific descriptors, and thus it is said to have a solely ancillary function to the syntagmatic methodology involving closer qualitative analysis of linguistic patterns, but it does bring informative results in linguistic studies. Statistically overrepresented features are thought of as linguistic singularities, which are peculiar to an author or a text, and as such they enable us to set empirically confirmed directions for searching for further textual distinctions in a supervised way. The choice of the set of discrete units for quantification was conditioned by their frequency-related salience as components of pragmatically motivated syntactic structures [26: 67–102] and by their distinctiveness in cross-linguistic contrasts in interlingual legal communication [11].

The qualitative methods employed at this stage are limited and they are aimed at conducting comparative characterisation of the sociolinguistic background with regard to the relevance of the context categories for the linguistic performance cross-linguistically and with regard to the variation capacity of the linguistic features examined.

The presentation of the results and discussion consists in discussing the frequency data related to the distribution of values within the context categories covered by the analysis and problematising their potential discursive relevance. This part of the discussion includes explication of the situation-specific complexity and significance of the context parameters in the specific communicative situation. Whenever possible, reference is made to the justification of the factors triggering the specific contextual profile of the source texts and to the studies conducted in other, related discourse environments. The discussion goes on to examine the potential discursive relevance of the source text variantivity in the diatopic and diachronic perspectives.

3 Results and Discussion

3.1 Insights into the Contextual Profile of the Company Registration Discourse

The context category of the country of origin fits into the context parameter of setting. The analysis was conducted on a sample of 1430 observations. No cases of lack of data were noted. The study revealed a diffused picture as regards the origin of the source texts in the corpus. Figure 1 evidences the distribution pattern of the values obtained in the quantitative data analysis.

Fig. 1 The context category of source text provenience in the company registration discourse

Firstly, the distribution pattern here is markedly unequal, and we can distinguish two frequency bands: a higher one, including the UK, the USA and the Netherlands with scores of 47.1%, 14.0% and 10.5% respectively, and a lower one (with less than a 10% share of the total volume for each variable indicator). Secondly, the data evidence English to be the language of communication not only among the countries belonging to the Commonwealth tradition. Although the UK and the USA constitute the main share of the document output, marginal representatives of the Commonwealth tradition are also to be found in the lower band, and these include Cyprus with a score of 3.4% and Canada, lower in the ranking, with a share of 0.7%. The English documents processed in the company registration proceedings in many cases originate from the non-Commonwealth tradition and, except for the Netherlands, which scores high, they are ranked below the 10% threshold. With the exception of Northern Ireland and Scotland, which are politically part of the UK but were recorded separately with the aim of identifying potential intra-UK stylistic distinctions, the non-Commonwealth agents make up the lower band, which includes, in descending numerical order, Hong Kong, Belgium, Portugal, Sweden, Estonia, Luxembourg, Hungary, Korea, Denmark, Poland, France, Germany, Greece, Slovakia, Switzerland, Finland, Romania, Bulgaria, China, Spain and the Czech Republic.

The frequency data may be said to validate the findings related to the distribution of English use worldwide. The common reference here is the tripartite model according to which English is spoken (1) as the primary language in the English-speaking countries, (2) in non-native settings, where it fulfils the role of second language in multilingual environments and (3) as an international language [17: 34–37] also in the function of lingua franca [46: 6–16]. The corpus data fit into the category of English language use as the primary and international language, the latter covering cases where English is used for communication in professional settings (Footnote 8). The data emerging from the corpus study make it necessary to address two questions of socio-pragmatic value. What is the legal ground for using English in official communication in non-native settings, which causes court communication with foreign entities outside the English-speaking area to be processed via texts translated from English? What are the cross-cultural implications for the geo-political model as identified in the context of interlingual corporate discourse in court settings?

The first question is to be answered by reference to the relevant legal provisions regulating translation and interpretation services in official contexts in Poland. The rule is that the working language for official communicative situations is to be the language that is declared by the party to be known to him or her, as set forth in the following:

Jeżeli osoba biorąca udział w czynnościach: 1) nie zna języka polskiego i do czynności nie jest dołączony przekład na inny znany tej osobie język, notariusz powinien przetłumaczyć akt lub inny dokument osobiście albo przy pomocy tłumacza. (If a person taking part in the legal activities: 1) does not know Polish language and no translation into the language that is known to him or her is submitted, the public notary should translate this act or other document himself or have it translated by a translator—translation mine) (Art. 87, § 1. Act of 14th February 1991 Public Notary Act [Prawo o Notariacie z dnia 14 lutego 1991 r.])

This regulation does not provide for any specific language to be used by the citizens of any particular country, and neither does it introduce any restrictions as regards the use of a specific language in official contexts. The law says that the language has to be known by a party, as confirmed by their declaration, which means that documents affiliated by citizenship links to distinct, not necessarily Anglophone, cultures are drawn up in English and subsequently translated, and as such processed in court settings, including the company registration context. As regards judicial documents, the rule is that documents need to satisfy one of the following criteria related to the language in which they are served in order to be deemed to have been properly served: (1) serving the document in a language that is known to a party, (2) serving the document in the language that is the official language of the place of service of the document, or (3) serving a translation into one of these languages (Art. 8, Regulation (EC) no. 1393/2007 of the European Parliament and of the Council on the Service in the Member States of Judicial and Extrajudicial Documents in Civil or Commercial Matters (Service of Documents), and Repealing Council Regulation (EC) no. 1348/2000). The interpretation of the corpus data in the light of the legal provisions invoked here accounts for and confirms the status of English as the lingua franca in court settings, both in international judicial and extrajudicial communication.

The second question voiced above touches upon the obvious implication of such geo-politically varied background for the said communicative situation, and it is the issue of the legal culture in the comparative perspective as a factor affecting legal performance. Firstly, it needs to be emphasised that culture, understood as a country-specific legal system, is considered as an important element of context, which makes it relevant to this discussion. We know from the corpus data that the element of the source culture in the present English/Polish communicative situation is variable and the English texts also prove to be affiliated to the non-Commonwealth legal cultures, which makes the situation even more unclear. Secondly, we need to point to the potential implications of this status quo which results from operating between distinct legal cultures in the case of interlingual communicative events.

Referring to the first point, culture has a definition on the grounds of sociology [55: 257–287, 59: 501, 504] and it is also defined specifically on the grounds of legal linguistics. Culture as a system of values influenced by a specific legal system is said to be one of the elements of context [32: 188], which legitimises its inclusion in the contextual considerations of the discussed variety of discourse. Two assumptions seem to recur in the literature of legilinguistics. Firstly, cultural competence is considered to be necessary in effective interlingual communication [42: 102, 49: 24–44]. Secondly, legal culture is a considerably multidimensional phenomenon. Cotterrell [15] claims that:

[…] legal culture appears not as a unitary concept but indicates an immense, multi-textured overlay of level and regions of cultures, varying in content, scope and influence and in their relation to the institutions, practices and knowledge of legal systems (cited in Bhatia and Aditi [7: 481]).

Diversity of legal culture sensu largo is manifested in theoretical categorisations. Galdia [23: 354–356] distinguishes between legal traditions and legal systems. The concept of legal tradition accounts for the normative arrangement that stands in opposition to the Western type which dominates the world. In turn, the idea of legal systems as a concept embracing the European types of traditions involves the distinction into civil law and common law systems.

Such complexity of the communicative situations requires a context-sensitive approach which takes account of the cultural distinctions. One of the main tenets of the comparative law perspective is that legal concepts have a culture-specific referential range. For example, Engberg [20: 28] claims that “Concepts differ according to where and when they are situated: They change over time and even closely related concepts will, as a rule, be at least partially different in different national legal systems”. Galdia [23: 272] and Mattila [41: 5–21] concur with this line of thinking and claim that the idiosyncratic character of legal culture in general is to be taken into account in transcoding concepts into the target language. This approach is also invoked specifically with regard to the domain of corporate law [33: 4, cited in 23: 93]. Language is said to be in a feedback loop with the operation of comparative law. It is in the individual communicative situation and, in our case, in construing the texts in translation that the cross-cultural “conceptual landscape” is established. It can be assumed that interactions of the distinct conceptual systems of various legal cultures featuring comparative law are consequential for the discursive structure of the target texts.

The year of publication of the source texts was selected as another category to be examined with regard to the context profile of the documents processed in company registration proceedings in Poland. The calculations were performed on 1351 observations. 79 cases of lack of data were registered. Figure 2 below illustrates the relevant data in point, standardised to the three categories.

Fig. 2 The context category of year of source text publication in the company registration discourse

The three standardised bands corresponding to the time spans in which the documents were drafted cover relatively short time periods, which is conditioned by the nature and extent of the legal communication in the said domain. To explain, a majority of the companies recorded as operating in 2017 in the two National Court Register divisions where the corpus was compiled are young companies with a short record of company operations, and thus, in line with the normative regulations prescribing on-going registration of any changes in legal status, the documents cover a short time span. They are drafted in English and directly translated into Polish in order to be registered at National Court Register divisions, there being no older source language documents exceeding the borders of ongoing English/Polish communications on court files.

The fact that, as stated above, the companies are young is conditioned by significant dynamics in the legal trade as regards commercial law companies in general. Among the branches of foreign companies operating in Poland, companies with a life-span exceeding three decades are few and far between. They are often dissolved, liquidated, transformed or merged. Furthermore, the high proportion of young companies is determined by the increase in legal trade after Poland’s accession to the EU, which enacted the concepts of open markets and free economy. The companies established after 2004 make up a large proportion of all the commercial law entities registered as operating in 2017. These economic and political conditionings, as well as the specificity of secondary genres, as is the case of the corpus documents [3: 1–7], may, by virtue of their nature being involved in exceptionally dynamic legal trade and exposed to a variety of multi-cultural influences, create an environment which is significantly prone to short diachronic variation. The quantitative data visualised in Fig. 2 are as follows. The source documents drafted before 2000 account for 3.0% of the total amount. The second band, covering documents drafted immediately before Poland’s accession to the EU up to the year 2010 inclusive, accounts for 38.7%. Finally, the time band covering the years 2011–2017 constitutes the largest section of the source-text sub-corpus with a percentage of 58.3%.
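The standardisation of publication years into three bands and the calculation of their percentage shares can be sketched as below. The band labels follow the categories named in the methodology; the assignment of the year 2000 itself to the middle band is an assumption, since the stated band limits leave that year unassigned. The helper names and the sample list of years are hypothetical, and observations with missing data are excluded from the total, as in the analysis reported above.

```python
from collections import Counter


def year_band(year):
    """Map a publication year to one of the three standardised bands."""
    if year < 2000:
        return "<2000"
    if year <= 2010:
        # Assumption: the year 2000 is placed in the middle band.
        return "2001-2010"
    return ">2010"


def band_shares(years):
    """Percentage share of each band; missing years (None) are skipped."""
    counts = Counter(year_band(y) for y in years if y is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {band: round(100 * n / total, 1) for band, n in counts.items()}


# Hypothetical metadata, not the corpus records.
sample_years = [1998, 2005, 2009, 2012, 2015, 2016, 2017, None, 2011, 2003, 2014]
print(band_shares(sample_years))
```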

The findings and assumptions voiced in the literature of the subject with regard to the evolution of legal language seem to justify the potential discursive relevance of the time factor on the grounds of legal communication. The relevance of the time factor seems to be doubtless with regard to the evolution of legal language in the long-term perspective, which is not necessarily the case with the short-time perspective. Admittedly, legal language is known to be conservative in the sense of being resistant to changes [26: 42–44] and characterised by “voluminousness, indistinctness, and unintelligibility” [2: 332, cited in 26: 42] but attempts to introduce changes to legal communication are being made, plain English movement being an example here [62: 168–176].

The short-diachronic perspective brings less spectacular results when it comes to the products of linguistic shifts, but we can mention a few aspects as to what specifically changes. Studies bring to the fore short-diachronic changes in terminology [51]. Scholars emphasise the development of new generic profiles by conscious and controlled mixing of conventions and professional practices when drafting, which is referred to as interdiscursiveness [6: 34–54]. Furthermore, the time axis serves as a reference point for noting the development of hybrid legal language understood as “lexical items which are produced through translation into a target language” [57]. Finally, scholars note that term formation in the EU exemplifies language change resulting from the operation of supranational institutional legal language within the EU and the need to approximate domestic legal systems, to make it possible to enforce supranational legal instruments [10: 49–50, 50, 51]. Short-diachronic language change in the domain of law is noted in translated language which, as mentioned above, is characterised by hybridity and interference resulting from the operation of clearly evident translation universals [56], which, in turn, affected the communication style of the national, non-translated legal language [10, 57].

3.2 Testing the Implications of Source Text Variantivity on Translation Performance

3.2.1 Diatopic Perspective

In order to examine the relevance of the contextual variation for translation performance with regard to the context category of source text country, the distribution of the seven candidate discrete units was compared between the source text sub-corpus and its parallel Polish sub-corpus, for the two distinct source text varieties: EU source texts and their translations vs. US source texts and their translations. The individual values related to the individual countries (Fig. 1) were fitted in the standardised categories to capture some general tendencies. The standardised categories are: American source texts, British source texts, European source texts, Asian source texts and Canadian source texts. Asia and Canada did not qualify here for discussion on the grounds of the imbalanced sample principle. An additional argument for selecting the European and US source text varieties for comparative analysis was that the distinctiveness in the translation output between the two source text varieties was expected on the grounds of opposition between the Commonwealth culture and non-Commonwealth cultures [8: 10].

The first stage of the analysis involved extracting the discrete units by lexical data processing, which was performed using Sketch Engine software and doing parallel statistical computing for the whole corpus and with reference to the two source text variants. The holistic parallel data show a consistent tendency for the salience of adjectives, nouns and prepositions [67]. Specifically, the quantitatively distinctive areas include overrepresentation of adjectives in Polish, and overrepresentation of nouns and prepositions in the source language.

The second stage of the analysis consisted in identifying whether the scale of quantitative overrepresentation or underrepresentation of a given discrete unit is at the same level for the two source text variants. Figure 3 below shows two graphical representations with the Polish and English frequency distribution patterns (EU English on the left and US English on the right) placed on top of one another. The salient edges evidence the contrasts between the languages.

Fig. 3 Quantitative variation in the diatopic perspective

Adjectives score 26,342 (26,816.44 per million) in the Polish sub-corpus compared to the EU English score of 12,437 (11,062.94 per million). With regard to nouns, the contrast between the two data sets in question is 74,297 (75,635.4 per million) to 81,554 (72,543.77 per million) in favour of the EU English version. Finally, prepositions score 33,836 (30,097.74 per million) in the EU English sub-corpus, which contrasts with the score of 21,582 (21,970.71 per million) in the Polish sub-corpus.
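The per-million values quoted above are normalised frequencies, that is, raw counts scaled by the size of the sub-corpus. The sub-corpus sizes themselves are not given in this section, so the minimal Python sketch below only illustrates the arithmetic: it normalises a count and, conversely, back-calculates the sub-corpus size implied by each reported pair for the Polish sub-corpus. The function names are illustrative assumptions, and the implied sizes are derived values, not figures reported by the study.

```python
def per_million(raw_count: int, corpus_size: float) -> float:
    """Normalised frequency: occurrences per million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count / corpus_size * 1_000_000


def implied_corpus_size(raw_count: int, per_million_value: float) -> float:
    """Invert the normalisation to get the sub-corpus size a reported pair implies."""
    if per_million_value <= 0:
        raise ValueError("per_million_value must be positive")
    return raw_count / per_million_value * 1_000_000


# (raw count, per-million value) pairs for the Polish sub-corpus,
# copied from the paragraph above.
polish_eu = {
    "adjectives": (26_342, 26_816.44),
    "nouns": (74_297, 75_635.4),
    "prepositions": (21_582, 21_970.71),
}

for unit, (raw, pm) in polish_eu.items():
    size = implied_corpus_size(raw, pm)
    print(f"{unit}: implied sub-corpus size of about {size:,.0f} tokens")
```

If the three implied sizes agree closely, the reported raw counts and normalised values share one normalisation base, which serves as a quick consistency check on figures of this kind.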

The tendency is analogous if we compare the US English and Polish data sets. The contrasts are as follows: for adjectives, 6322 (6623.53 per million) in US English compared to 12,045 (13,278.93 per million) in the corresponding Polish sub-corpus. The quantities for nouns and prepositions are higher in US English and amount to 36,724 (32,666.67 per million) and 16,672 (14,829.16 per million) respectively, as compared to 35,262 (35,894.35 per million) and 10,340 (10,525.43 per million) respectively in the Polish sub-corpus.

A visual comparison of the two radar plots in Fig. 3 does not allow the areas of divergence in the contextually conditioned cross-linguistic distinctions to be diagnosed precisely. Therefore, as provided for in the operationalisation scheme, the congruity or incongruity of the statistical test results for the two source text variants and their translation output served as the indicator here. The statistical significance test was used to identify whether the level of cross-linguistic distinctions in the quantitative distribution of nouns and prepositions is comparable. In both cases the quantitative salience noted in favour of the source texts is registered in comparable proportions in the translated texts. For nouns, the test yields p < 0.001 both for the source text variants and for the two contextually varied translation types. For prepositions, the test likewise gives a congruous result for the source text variants and the related translation output; that is, the proportions of the source text distinctions stay the same irrespective of the source text variable in question. Here both results are statistically significant, with p = 0.007 for the source text variants and p = 0.046 for the corresponding, contextually varied translations.

The situation is different for adjectives. According to the operationalisation scheme adopted for the study, the cross-linguistic quantitative distinction is more pronounced for the texts translated from the EU source texts, and thus we may assume the source text variability factor to be operative here. Technically, the finding rests on the incongruous result of the statistical significance test: the quantitative distinction in adjectives between the two source text variants proves less pronounced than the one between the corresponding sets of translated texts, that is, the texts translated from the EU and the US source texts respectively. The distinction between the source text variants proves to be statistically insignificant, unlike the Polish language result, which is significant with p = 0.027.
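The section does not name the significance test behind the reported p values. Purely as an illustration, and assuming a Pearson chi-square test on a 2×2 contingency table (hits for one discrete unit vs. all remaining tokens, in two sub-corpora), the congruity check could be sketched in Python as follows. The function name and the counts are hypothetical and are not taken from the study.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square tail probability reduces to erfc.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# HYPOTHETICAL counts, for illustration only (not data from the study):
# hits for one discrete unit and the total size of two sub-corpora.
hits_a, size_a = 300, 10_000
hits_b, size_b = 200, 10_000

stat, p = chi_square_2x2(hits_a, size_a - hits_a, hits_b, size_b - hits_b)
print(f"chi2 = {stat:.2f}, p = {p:.6f}, significant at 0.05: {p < 0.05}")
```

Under this assumption, the test would be run once for the pair of source text variants and once for the pair of corresponding translation sets; the two outcomes are congruous when both are significant or both are insignificant, and incongruous otherwise.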

The distinctive distribution pattern of adjectives in the contextually varied translation performance appears to depend on the source text variety and may be accounted for in two ways. It may point to stylistic distinctions between the generically parallel texts in the two source text variants, which might encourage the use of distinct lexico-grammatical structures without equivalents employing adjectival structures in translation. Alternatively, the variation may be due to other factors, for example translators' institutional practices and/or competence; establishing this will require further analysis with additional variables, for example the variable of translation type, to compare the translation practices of in-house translators with those of sworn translators.

3.2.2 Diachronic Perspective

In order to verify the hypothesis about the relevance of the source text variantivity noted in the short-diachronic perspective for the translation performance, the author followed the same procedure. The discrete units extracted from the corpus were processed statistically to compare the relevant quantitative distribution schemes cross-linguistically for the three time slots. This involved a comparison of the holistic parallel corpus data and, in order to examine the relevance of the context category of the year of publication of the source documents for the translation performance, verification of the potential divergences in cross-linguistic distinctiveness between the contextually varied sets of translations and their corresponding source text variants, which were put into the standardised categories (Fig. 2). Out of the three standardised categories presented in Fig. 2, two were selected as sample material to be discussed here in detail, and these cover the years 2001–2010 and the years 2010–2017. It is assumed that changes in cross-linguistic communication, if any, might be expected in this short diachronic period, since this was a period of changes in the global economy, which triggered intensive legal trade.

The holistic comparison shows consistent quantitative salience of adjectives, nouns and prepositions, with overrepresentation of adjectives and underrepresentation of the two latter discrete units in translations. The diachronic perspective presented in Fig. 4 provides data visualising the cross-linguistic distribution schemes separately for the two source text variants.

Fig. 4 Quantitative variation in the diachronic perspective

The frequency distribution scheme generated according to the year of publication (of the English texts) in principle confirms the findings gathered for the diatopically varied source texts in that the corresponding Polish texts are characterised by a higher number of adjectives and a lower number of nouns and prepositions. In the time band covering the documents published before 2000, the compared distribution schemes put English in the leading position, with its scores for nouns and prepositions at the level of 13,022 (11,583.31 per million) and 7143 (6353.33 per million) respectively, as compared to the Polish scores of 9658 (9831.2 per million) and 2933 (2985.6 per million). The dominance of adjectives on the side of the Polish language is confirmed by the score of 4049 (4121.61 per million), against a frequency of 2412 (2145.52 per million) in the corresponding source texts.

An analogous tendency is observed for the time band covering documents published after 2010. Here, the datasets contrasted for the purpose of this analysis evidence the quantitative salience of adjectives in the Polish language: 46,151 (46,978.62 per million) compared to 20,163 (17,935.36 per million). The scores for nouns and prepositions consistently point to the dominance on the English language side: 138,093 (122,836.25 per million) and 55,399 (49,278.42 per million) respectively, juxtaposed with the Polish data showing the corresponding values of 130,027 (132,358.76 per million) and 39,019 (39,718.72 per million).

The statistical significance test here is not indicative of any variation in the proportion of cross-linguistic distinctions in the quantitative distribution of the discrete units with regard to the two diachronically profiled source text variants (see Footnote 9). If we consider the translations from the source documents corresponding to the two time bands separately, we see that the proportion of the cross-linguistic divergence remains the same. This is confirmed by the results of the statistical significance test, which proved to be positive in all three cases; what is more, the p value is invariably p < 0.001 for all the intra-linguistic quantitative distinctions.

The hypothesis about the relevance of the time factor for cross-linguistic communication with regard to our standardised time bands, type of corpus and linguistic features was not confirmed. The findings do not point to any correlation between the variable of year of publication of the source documents and the linguistic performance in translation. In English the data are fairly balanced in the short-diachronic perspective, and consequently the related frequency distribution pattern for the quantitatively salient discrete units remains consistent irrespective of the time factor attributable to the source texts. The related distribution patterns cannot be said to either converge or diverge in translation.

4 Conclusions

The descriptive and exploratory study, which employed methods of corpus-based quantitative analysis, provides data on the contextual background of written corporate English/Polish interlingual communication used for the purpose of company registration proceedings in Poland. The results showed that the communicative situation covered by the analysis is not heterogeneous, but can be described in terms of a finite set of values. The quantitative analysis of the data related to the context, featuring variables such as source culture or year of publication of the source texts, points to the existence of specific tendencies as regards the distribution of the values, and this may constitute the foundation of a context model for the specifically situated communicative situation. Hence, the hypothesis anticipating contextual variation on the side of the source texts was positively verified.

The second hypothesis, related to the discursive relevance of the source text variantivity, was confirmed in part. In general, the pattern of the cross-linguistic contrasts remains consistent throughout the corpus. Translations demonstrate a higher percentage of adjectives and a lower percentage of nouns and prepositions. Other discrete units prove to be variation-resistant in the cross-linguistic perspective. If we examine the cross-linguistic quantitative distinctiveness ratio separately for the diatopic and diachronic source text variants, we find that only the category of country of origin of the source text affects the translation performance, and only with respect to adjectival structures. In the short-diachronic perspective no variations were noted in the translation performance.

The findings show that the study needs to be extended to include more context categories and a greater variety of lexico-grammatical material, possibly going beyond the distribution of discrete units. Also, a more in-depth qualitative analysis based on examining parallel material would allow for the identification of the degree to which the cross-linguistic distinctions are due to systemic differences between the languages, and it would single out the distinctions that might result from the operation of translation universals. Closer parallel analysis of the material related to the cross-linguistically distinctive areas would also make it possible to relate the differences in the distribution of discrete units precisely to the cross-linguistic contrasts in the expression of specific discourse functions.

The added value of this study lies in presenting a context model for interlingual communication situated in company registration proceedings in the local perspective. Such a design of the research project follows the general, contemporary trend in discourse studies characterised by increasing specialisation in specialised communication, and specifically in legal linguistics [23: 102]. This involves focusing on functionally, instrumentally and content-wise distinct aspects of communication within the domain of professional communication, and it is referred to as contextualisation of discourse/genre analysis (e.g. [4: 166]). This study indirectly addresses some issues that have been widely discussed in the context of professional discourse, such as variation in translation and parallel corpus studies, yet no study presenting the national perspective on these aspects in this specifically situated discourse has been recorded so far. The discussion encourages multi-perspective and multidimensional linguistic research involving concepts from sociocultural and discourse studies discussed as values emerging from quantitative corpus analysis.

The pedagogical application of the contextual framework, whether tested for its discursive relevance or not, would be of significant value, in line with the approach advocating the significant role of thematic competence in the training of legal translators and interpreters [9, 42, 52]. The findings can provide a foundation for the development of a variety of multilingual resources for legal translation (e.g. dictionaries, databases).