1 Introduction

Supplementary pension plans are increasingly important in old-age security because public pensions play a decreasing role in providing individuals with adequate income in old age. Moreover, additional pension schemes, both collective and individual, enable people to tailor pension security to their individual needs and make it more comprehensive. As a necessary element of adequate old-age pension security, they are also intensively advertised by financial institutions and usually offered with tax incentives to reach higher coverage. But supplementary pension products may also pose a risk for individual savers and for the whole system of supplementary savings when their characteristics do not fit the financial skills and competences of individual savers. Firstly, the products (plans) may use financial tools that are too sophisticated for individuals. Secondly, the form and language of information and contracts addressed to individuals may be highly unreadable for them.Footnote 1

Old-age pension systems are regularly studied in research publications and reports of international organisations [22, 25, 26, 27]. But the analyses are usually limited to the architecture, coverage levels and assets gathered. The existing studies usually show supplementary pension plans from the point of view of the regulator or the supervisor but not the pension system participant or individual saver. That is also a consequence of scarce micro data on participation in supplementary pension plans and of deficits in information policy. The official statistics do not provide any data on the appropriateness of retirement plans for individual savers. That is hard to understand when costly tax incentives are offered with supplementary pension plans.

The problem may be further intensified by the incomprehensible language used in these agreements and by their complicated structure. All this may increase the risk of error, i.e. the purchase by individuals of unsuitable products that do not provide adequate income in old age (mis-selling). Hence, further research on the clarity and readability of supplementary pension plan contracts is clearly needed.

The aim of our study is to determine the level of comprehensibility (readability) and clarity of the individual pension product agreements offered in PolandFootnote 2 in the form of institutional individual pension accounts (IKE) and individual pension security accounts (IKZE) as of the beginning of 2017. We analyse the great majority of supplementary pension products offered in the market (77 out of 97) and verify the following hypotheses:

H1

The language of agreements on individual pension products is too difficult for most Poles, because understanding it requires at least higher education.

H2

The incomprehensibility of agreements on individual pension products is due to the level of complexity of the structure of documents, as well as the difficulty of the language used.

H3

Psycholinguistic methods provide more accurate information about pension product contracts comprehensibility for a specific audience than analytical methods.

To achieve the aim of the paper we primarily used automatic readability assessment tools to measure text difficulty, combined with the psycholinguistic cloze test method for one of the texts. The purpose of this procedure was to verify the results of the analytical methods. Both of these methods have distinct advantages: analytical methods provide quick information about the readability of a text for the average reader, while the cloze test reports in more detail on the difficulties a specific group of respondents has in comprehending a text. To our knowledge, this is the first such study on an international scale. Thus, our analysis fills an important research gap.

2 Literature Review

There is a broad literature on the readability of non-literary texts. Interest in the impact of certain linguistic features on text comprehensibility emerged in the nineteenth century. Sherman [37], for example, noted that the length of sentences in literature had decreased since Shakespeare's time, which had a positive effect on the reception of nineteenth century literature. At the end of the nineteenth century the German researcher Kaeding [16] began to link greater text comprehensibility with the occurrence of frequent words in the text. However, the first readability formulas were developed in the 1920s. At that time the American psychologist Thorndike [41] compiled a list of the 10,000 most commonly used words. This list was used to assess the comprehensibility of texts intended for students at different stages of education. In 1935, Gray and Leary [12] distinguished five statistical and stylistic indicators which influenced the understanding of texts by adults. In 1948, Flesch [10] proposed a fundamental readability formula for English [17], which combined two simple indicators: word length and sentence length. Another formula, using the same indicators, was the Flesch-Kincaid test, which presented a score in terms of a grade level in US schools [23], making it easier for teachers and others to judge the readability level of various books and texts. Another popular formula, the FOG index, was established by Gunning [14]. FOG index results are presented as the number of years of education required to understand a text. This formula is still in use today, including in other countries. However, for Polish texts, the percentage of difficult words used in the FOG index had to be modified to count words that have more than three syllables. The FOG index has frequently been used to calculate the readability of Polish official texts about European Funds in comparison with the Polish press [6] or of Polish official texts on the Internet [43].

Hundreds of other readability formulas have been devised since Flesch’s time, including for languages other than English. Björnsson [4] created the so-called LIX formula for Swedish. For Polish texts, there is the formula proposed by Pisarek [30], based on American formulas using the percentage of difficult words and average sentence length. In 2015, the ‘Jasnopis’ computer application was developed based on the Pisarek formula and on the results of both linguistic and psycholinguistic tests conducted by Polish researchers [13].

The readability of non-literary texts has been examined extensively in current and recent linguistic studies. But this research did not develop any methods or indices for the examination of text transparency. Moreover, texts that were analysed in studies on readability usually included materials from the press, popular science or legal acts.

Previous linguistic studies conducted in Poland concerned the readability of non-literary texts in general (e.g. [2, 6, 30, 33]). The first attempt at assessing the comprehensibility of texts addressed to individual customers in the market of pension products was rather fragmentary and used only an automated tool for assessing the readability of texts [9].

The great majority of the literature on old-age pension provision focuses on the architecture of pension systems, coverage and economic aspects of pension fund operation [22, 25, 26, 27]. Research publications rarely consider the linguistic appropriateness of pension product contracts and their understandability for individuals. This research gap exists on both the Polish and the international market.

The first attempts at assessing the comprehensibility of the content of pension agreements have been made on the US market [8, 19, 42]. Based on the analysis of the level of complexity and comprehensiveness of information on the system of charges applied to financial and pension products, it has been found that more expensive products are characterised by more complex language used to inform about the fees charged. None of the published analyses, however, involves a comprehensive examination of documents affecting the content of agreements on individual pension products, i.e. general terms and conditions concerning products, rules of the offer and the content of agreements with customers, despite the fact that there are clear legal regulations stating that the content of agreements on financial services addressed to individual customers should be formulated in a clear and understandable way.

The most recent studies on readability, clarity and efficiency of individual pension products in Poland [28, 35, 36] proved that the difficulty level of individual pension contracts is high and the documents are unclear irrespective of the type of financial provider. However, these studies did not examine the linguistic complexity directly and used only the automatic tools of comprehensibility assessment. Moreover, they generally aimed at indicating the relations between the linguistic characteristics, efficiency and costliness of the contracts, not at the deep analysis of readability and clarity of the contracts.

3 Research Methods and Data Sources

There are many terms in the literature for text comprehensibility, used to refer to various text features. The terms comprehensibility [3] and readability [5] are often used synonymously, with difficulty as their opposite, while the term clarity is often used to mean the quality of being clear and easy to understand, see or hear. In the study, we use the following terms related to the topic:

  • We understand readability as a characteristic based on text features such as word and sentence length,

  • We do not distinguish between readability and comprehensibility (or understandability) and use those terms interchangeably,

  • We use clarity for the quality of being clear or transparent to the eye. Clarity is based on typographical features, but is combined with some clear expressions (the imperative mood in the second person singular or plural in Polish).

The study of comprehensibility (readability) and clarity has been divided into two parts. The first part is devoted to the readability of texts, i.e. the group of purely linguistic features that allow an individual to understand a text. The study therefore focused on the level of difficulty of texts. It should be mentioned, however, that the concepts of comprehensibility and difficulty of a text are differentiated in the literature [13:9]. Difficulty is defined as an objective feature of a text, independent of the reader’s skills, while comprehensibility comes down to the individual capabilities of the reader of the text, mainly the level of education. A text can be objectively difficult, e.g. because of complicated syntax or specialist terminology, and still be well understood by readers familiar with the subject. It should be noted, however, that the terms “difficult” or “difficulty” are sometimes used interchangeably with “unreadable” and the like in this study for stylistic reasons, so as not to overuse the latter. To assess the comprehensibility of contracts we used the Jasnopis application and the cloze test. The second part of the analysis focuses on the clarity of agreements, for which we applied our original scale of clarity based on a few key features of texts.

3.1 Readability Measurement Methods

3.1.1 Jasnopis Application

The primary tool that has been used to study the readability of texts is the Jasnopis application, which was created in 2015 [13]. It helps in the calculation of many text parameters, such as the average sentence length, the number of long words or percentage of nouns and verbs, which are used to determine the level of difficulty of a text. In the development of the application, psycholinguistic tests were used, which made it possible to establish which texts are easy for particular readers based on the level of their education. This allows the assumption that Jasnopis determines readability of texts for a specific group of readers.

Its primary function is to determine the text difficulty class in the range from 1 to 7, which refers to the indicative stages of education (as organised prior to the introduction of the education reform in Poland in 2017). The text readability scale is presented in Table 1.

Table 1 Text readability level determined by Jasnopis application

In addition to the text readability class, Jasnopis measures its vagueness, i.e. the FOG Index, which indicates the number of years of education required to understand a text [14]. The vagueness index is calculated according to the formula:

$$T = 0.4 \times (T_{w} + T_{s}),$$
(1)

where Tw is the average number of words in a sentence and Ts is the percentage of difficult words (i.e. longer than the average in a given language).

For texts in Polish, it is assumed that difficult words have four or more syllables. Jasnopis calculates three variants of the vagueness index (FOG index) for: base forms,Footnote 3 textual formsFootnote 4 and rare base forms.Footnote 5
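Equation (1) can be illustrated with a minimal Python sketch. The function names and the sample counts are hypothetical and are not taken from Jasnopis; the sketch only assumes that the numbers of words, sentences and difficult words (four or more syllables) have already been counted.

```python
def fog_index(avg_words_per_sentence: float, pct_difficult_words: float) -> float:
    """Equation (1): T = 0.4 * (Tw + Ts)."""
    return 0.4 * (avg_words_per_sentence + pct_difficult_words)


def fog_from_counts(n_words: int, n_sentences: int, n_difficult_words: int) -> float:
    """Derive Tw and Ts from raw counts, then apply Equation (1).

    Assumption: n_difficult_words already counts words of four or more
    syllables, as described for Polish texts.
    """
    tw = n_words / n_sentences                # average sentence length in words
    ts = 100.0 * n_difficult_words / n_words  # share of difficult words in percent
    return fog_index(tw, ts)


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(round(fog_from_counts(n_words=200, n_sentences=10, n_difficult_words=30), 2))
```

The result is read as the approximate number of years of education needed to understand the text.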

Another factor given by Jasnopis are Pisarek indices, or the text difficulty factors according to the original, non-linear (NL) Pisarek formula [13]:

$$\text{NL-Pisarek} = \frac{\sqrt{T_{s}^{2} + T_{w}^{2}}}{2}$$
(2)

where Tw—the average number of words in a sentence; Ts—the share of four-syllable or longer words, and the text difficulty indices according to the linear (L) Pisarek formula [13]:

$$\text{L-Pisarek} = \frac{1}{3}T_{s} + \frac{1}{3}T_{w} + 1$$
(3)

Each of these indices, both non-linear (NL-Pisarek) and linear (L-Pisarek), is calculated in three variants for: base forms, textual forms and rare base forms.
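The two Pisarek indices in Equations (2) and (3) can be sketched in the same way. This is an illustrative sketch with hypothetical function names, not the Jasnopis implementation; Tw and Ts are assumed to be supplied by the caller in the units defined above.

```python
import math


def pisarek_nonlinear(tw: float, ts: float) -> float:
    """Equation (2): NL-Pisarek = sqrt(Ts^2 + Tw^2) / 2."""
    return math.sqrt(ts ** 2 + tw ** 2) / 2


def pisarek_linear(tw: float, ts: float) -> float:
    """Equation (3): L-Pisarek = Ts/3 + Tw/3 + 1."""
    return ts / 3 + tw / 3 + 1


if __name__ == "__main__":
    # Invented values: 12 words per sentence, share of long words equal to 16.
    print(pisarek_nonlinear(tw=12, ts=16))
    print(pisarek_linear(tw=12, ts=16))
```

Both functions take the same two inputs, so the three variants (base forms, textual forms, rare base forms) differ only in how Ts is counted before the call.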

Moreover, Jasnopis provides the following quantitative information about a text:

  • Number of paragraphs. This number may be somewhat overstated if the text includes lists. For Jasnopis, a paragraph is any piece of text between the paragraph marks (in MS Word the sign is “Enter”).

  • Number of sentences. This is calculated on the basis of punctuation and paragraph marks. Punctuation marks indicating the end of a sentence include a full stop (except for a full stop with abbreviations or numbers), question mark and exclamation point.

  • Number of words. For Jasnopis, a word is any string of letters or numbers, not separated by a space (gap) or punctuation mark.

  • Number of difficult words. Difficult words are words whose base (primary) forms have four or more syllables and which are not generally known, i.e. they belong neither to the group of the five thousand most common words in Polish texts nor to the words with so-called subjective probability, that is words which, although statistically not very frequent, are well known [15].

  • The average word length (textual word) calculated by the number of syllables.

  • The average length of sentences, expressed as the number of words.

  • The average length of paragraphs, calculated as the number of words.

The average sentence length depends mainly on the type of text [13]. For example, in journalistic texts sentences should consist of 10–15 words [20, 32] and of 20–30 words in scientific texts [11, 20].

  • As part of a comprehensive analysis of the linguistic parameters of the studied text, the program also shows the following percentages: nouns, difficult (i.e. long) nouns, verbs, adjectives, difficult (i.e. long) adjectives and determines the so-called nominality index [31] which is the ratio of nouns to verbs.Footnote 6

All the above-mentioned indices and parameters can be used in the analysis of texts, although it should be remembered that when calculating the text readability class—beyond the statistical values of a text—Jasnopis uses the results of psycholinguistic studies of literacy. Hence, the starting point of the analysis is the text readability class calculated in principle for the pdf version of the text. The text readability class has been compared with the vagueness index (FOG index) according to base forms, giving the range of values for texts in a given readability class, and with the percentage of nouns and verbs. In addition, we present in the Appendix the values of all the indices calculated by Jasnopis, obtained for the entire group of documents analysed.

3.1.2 Cloze Test

In addition to the analytical methods described above, we also used the cloze test to measure text readability. It is a method developed by Taylor [39, 40] that draws on psychological research. The cloze test measures the reader's ability to fill in the gaps in a text so as to restore the correct whole. Test takers are instructed to fill in each gap (usually 50) in the text using only the one word which they think has been deleted. The classic version of the test contains only blanks, with no response options, while the modified version offers the reader three or more answers (multiple-choice test). Regardless of the version, the first and last sentences of a text usually do not contain gaps. It has been stated that for greater reliability, texts should contain at least 50 gaps [5].

The test score is calculated as the percentage of gaps that are correctly filled in. The higher the number of correctly filled gaps, the more readable the text is to the reader. The level of comprehension of a text is indicated by the following standards that have been developed for the cloze test [5, 31]:

  • 60% and more correct answers suggest an "independent" reading level: the person does not need help while reading to fully understand the text;

  • 40–59% correct answers suggest an "instructional" reading level: the person reading needs guidance to fully understand the text;

  • less than 40% correct answers suggest a "frustration" level: the text is too difficult to be understood by the reader.
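The scoring rule and the three thresholds listed above can be sketched as follows. The function names are hypothetical, and the treatment of non-integer scores between 59% and 60% (counted here as instructional) is an assumption, since the standards quote whole-number ranges.

```python
def cloze_score(n_correct: int, n_gaps: int) -> float:
    """Percentage of gaps that were filled in correctly."""
    return 100.0 * n_correct / n_gaps


def reading_level(pct_correct: float) -> str:
    """Map a cloze score to the reading levels listed above.

    Assumption: any score below 60 and at least 40 counts as instructional.
    """
    if pct_correct >= 60:
        return "independent"
    if pct_correct >= 40:
        return "instructional"
    return "frustration"


if __name__ == "__main__":
    # Invented example: 27 of 50 gaps restored correctly.
    score = cloze_score(27, 50)
    print(score, reading_level(score))
```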

In a traditional cloze test, all words regardless of grammatical class and type (including proper names and numerical data) can be replaced by a gap. The test designer therefore does not choose which words are removed from the text: exactly every nth word is deleted. However, there is also another approach where only words relevant to the understanding of the whole text are deleted.
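The traditional every-nth-word deletion can be sketched in a few lines. This is a simplified illustration, not the procedure used to prepare the test in this study: the function name, the default n and the blank marker are hypothetical, words are split on whitespace only (so attached punctuation is deleted together with the word), and the convention of leaving the first and last sentences intact is not implemented.

```python
def make_cloze(text: str, n: int = 5, blank: str = "_____"):
    """Replace every n-th word with a blank and return (gapped_text, answers)."""
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):  # positions n, 2n, 3n, ... (1-based)
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers


if __name__ == "__main__":
    gapped, key = make_cloze("one two three four five six seven eight nine ten", n=5)
    print(gapped)
    print(key)
```

The returned answer key can then be compared with respondents' entries to obtain the percentage of correctly filled gaps.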

The advantage of the cloze procedure is that it seems to take into account all possible factors affecting text readability, e.g. prior knowledge, vocabulary and linguistic ability [1, 30, 31] as well as non-textual and non-linguistic factors, such as the recipient's interest in the topic or familiarity with the subject matter of the text. It should not be forgotten, however, that cloze tests cannot replace analytical formulas. Analytical methods are definitely less time-consuming, as they do not require research with readers.

We conducted the cloze test with a document concerning an insurance pension product. For testing we selected an excerpt from the general terms of insurance (pl. ogólne warunki ubezpieczenia, OWU) of an individual retirement account offered by Nationale-Nederlanden TUnŻ S.A. (NN IKE). The whole text of this document represents Jasnopis readability class 6 and has a FOG index of 18.92. Class 6 is interpreted as a text that is difficult for the average Pole to read, requiring a university education (master's degree). The FOG index of almost 19 means that the text is even more difficult because it requires about 19 years of education, i.e., doctoral study. The numerical values of both texts (the full document and the excerpt) in terms of percentage of verbs are identical.

The excerpt we selected for the cloze test is the introduction part (NN INTRO IKE) that uses easier and less specialized vocabulary. It belongs to Jasnopis class 5, has FOG-index 8.19 (text forms) and 9 percent of verbs. The Jasnopis class 5 is interpreted as a more difficult text, comprehensible to educated people and defines the approximate level of education needed as an undergraduate (or engineering) degree. This interpretation is somewhat at odds with the FOG index, which indicates a required 8 years of schooling (high school level). The high percentage of verbs in the text confirms that the text is easy to read. The figures and text readability rates are shown in Table 2.

Table 2 Numerical data of the whole text (NN IKE) and the introduction part (NN INTRO IKE)

The difference in the assessment of the difficulty of the whole text and the examined excerpt is due to the linguistic heterogeneity of the document. The introduction uses less formal language, as it is intended for the lay reader, while the second part is a standard version of the general conditions of insurance. The introduction presents the benefits of having an IKE account, the basic information about the contract and the rules of collecting funds in an IKE. It is structured in the form of questions and answers, with questions often formulated in the first person singular and answers in the second person singular. It also uses less formal vocabulary (e.g. "money").

It is worth mentioning that the text is graphically clear and received the highest score in the transparency assessment in the survey of the whole set of insurance documents. The advantages of the contract text were the reader-friendly font size (9°–13°) and font type (sans serif Arial) in a two-column format, and illustrations relating to the content. For this reason, both texts received the maximum number of points in the subjective assessment of clarity (5 out of 5).

The cloze test used a passage from the introduction to the contract, on the subject of contract payments and was conducted by e-mail. The text had 50 gaps. Two test groups of 20 persons each were created. The first research group consisted of people with a non-economics university degree, the second group consisted of people with an economics degree. Gender parity was applied in both groups. The characteristics of the tested group are presented below (Table 3).

Table 3 Characteristics of the cloze-test group

3.2 Clarity Measurement Methods

The clarity of a text is understood in this study as the graphic, metatextual and pragmatic features affecting its perception. Graphic features include primarily font size, width, spacing, headings and subheadings, illustrations and diagrams. In turn, metatextual elements include components that guide the reader through the text, for example a table of contents, introduction and summary. They play a particularly useful role in longer texts composed of several or several dozen pages, which are the subject of our study. In contrast, the pragmatic elements affecting the perception of a text include the author directly addressing the reader of the text, or verbs, usually in the second person singular. The latter are, admittedly, measurable, and as verb forms they are included in the Jasnopis calculation, but because of the potentially low number of such forms (at least in economic and legal texts), they do not significantly affect the result of the calculation. However, they play a special role in the reception of the text, facilitating communication by bringing written language closer to everyday language [24]. We can say that they give concrete form to activities, indicating who performs them. This particular function of direct forms of address to the reader is used more and more frequently, e.g. in instructions on websites [13], or in information leaflets and advertising materials. As an aside, it should be noted that the use of the direct form “you” in the second person singular in the studied regulations and statutes does not violate the existing language standard of addressing people in official situations in the third person singular/plural (“Mr.”, “Ms.”, “Sir or Madam”), as the recipient of these texts is collective, not individual.

The elements of the analysis of clarity cannot be counted automatically, since there is currently no suitable tool. For this reason, we performed the analysis of clarity based on five criteria: font, texture, graphics, metatextual elements and direct address forms. One point was granted for one criterion fulfilled.

In studies on the clarity of texts, there is a distinction between documents intended to be read on paper [44] and on the computer screen [3]. For reading on a computer screen, sans serif typefaces (e.g. Arial) are recommended, while for publication on paper serif fonts, which include ornaments at the ends of each letter, are recommended. Additionally, according to the WCAG 2.0 standardsFootnote 7 that define the conditions of accessibility of websites for people with disabilities, it is considered that sans serif fonts are preferred by the majority of visually impaired people.Footnote 8 It should be noted, however, that it is difficult to determine whether the text of an agreement on an individual pension product will be read on paper or on the screen of some electronic device. The analysis therefore did not take either of these two typeface styles into account.

In the literature, it is emphasised that a font size below 10° hinders perception. Therefore, font size has been taken into account in the assessment. However, certain exceptions are allowed: in a narrow column of a two-column text the eye has to take in a shorter line, which makes it easier for the reader to focus on smaller details, so in two-column texts fonts as small as 9° are assessed positively.Footnote 9

Line spacing is also important for the readability of a text. Experts [13:40] [34] recommend a minimum line spacing of 1.15 typographic point. In the documents in the analysed set, fonts and line spacing of different sizes are sometimes used in different parts of the text. In this category, one point was granted for a font size of at least 10° (9° for two-column text) and a minimum spacing of 1.15 typographic point, at least for the greater part of the document.

Segmentation of texts into paragraphs, chapters, sections and points facilitates reading, as long as the individual segments are justified in terms of content, and their number is not too large [13]. It is believed that a frequent defect in official texts is paragraphs that are too long or vaguely demarcated. This observation also applies to texts affecting the content of retirement account agreements, although they are not strictly official texts. Regulations, agreements, and general terms of insurance were formulated by financial institutions according to the existing generic models. The segmentation of texts of regulations, agreements and general terms of insurance usually covers two or three levels (chapters, articles, sections), but the individual segments are not always clearly separated from each other. For this criterion, one point was granted for the proper segmentation of a text, i.e. individual segments (chapters, paragraphs, articles, points) are not too long and are clearly separated from each other, whereby the maximum recommended length of a segment is 15 lines.

Apart from segmentation, an important role in the structure of a text is played by headings, subheadings, and graphics (pictures, images, charts, etc.) [44]. For headings and subheadings, what matters is not so much their presence (they are usually numerous in the analysed set) as their visibility. They should be visible but not exaggerated, e.g. through the use of multiple colours or typefaces. However, there is a certain degree of subjectivity in the evaluation of graphics. In conclusion, in this criterion one point was granted for well highlighted subheadings and/or for placing other graphical elements in the text.

The structure of a text also includes metatextual elements in the form of tables of contents, separate introductions, summaries, etc., which are guides to the text. These types of elements, of course, make sense in the case of longer texts, which include the documents analysed. A characteristic element of regulations and general terms of agreements are glossaries of terms used in the texts placed at the beginning of the text. In this category, one point was granted if a text contained at least one of these elements, except for glossaries of terms used in documents which—despite their functional metatextual value—are a mandatory component of the generic model of agreements, statutes and regulations.

The last element of the clarity assessment was direct forms of address to the reader, which include verbs in the second person singular (e.g. “You can sign an agreement”), imperative forms (e.g. “Sign an agreement”) or derivatives (e.g. “Your agreement”). These types of forms are rarely used in official documents relating to retirement account products, but there are exceptions. In this category, one point was granted if a text contained at least one such form.

In general, in the clarity analysis documents could therefore receive from zero to five points, which indicate the level of text clarity from very low to very high (Table 4).

Table 4 Text clarity levels
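The five-criterion scoring described in this section can be sketched as below. This is an illustration under stated assumptions, not the authors' tool: the criterion keys and function names are hypothetical, four of the five criteria are assumed to be judged manually and passed in as booleans, and the font criterion is assumed to require both the font-size and the line-spacing condition. The mapping of scores to the clarity levels in Table 4 is not reproduced here.

```python
# Hypothetical keys for the five criteria named in the text.
CRITERIA = ("font", "texture", "graphics", "metatext", "direct_address")


def font_criterion_met(font_size: float, line_spacing: float, two_column: bool) -> bool:
    """Font size of at least 10 (9 for two-column text) and spacing of at least 1.15.

    Assumption: both conditions must hold for the point to be granted.
    """
    min_size = 9 if two_column else 10
    return font_size >= min_size and line_spacing >= 1.15


def clarity_score(flags: dict) -> int:
    """One point per fulfilled criterion, giving a score from 0 to 5."""
    return sum(1 for name in CRITERIA if flags.get(name, False))


if __name__ == "__main__":
    # Invented document: two-column layout, 9-degree font, 1.15 spacing.
    flags = {
        "font": font_criterion_met(9, 1.15, two_column=True),
        "texture": True,
        "graphics": True,
        "metatext": False,
        "direct_address": False,
    }
    print(clarity_score(flags))
```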

3.3 Documents Covered by the Analysis

Supplementary individual pension schemes (IKE and IKZE) in Poland allow individuals to save in five different types of pension products: unit-linked life insurance, investment funds, voluntary pension funds, bank savings accounts, and securities accounts at brokerage houses. These are managed by life insurers, asset management companies, general pension societies, banks and brokerage houses. Our analysis includes both types of individual pension schemes (IKE and IKZE) in operation in the first half of 2017. In total, the study covered contract documents of 77 out of 97 individual pension products offered by life insurance companies (15), investment fund companies (31), banks (9), brokerage houses (10) and general pension societies (12). A complete list of products included in the analysis is shown in Table 10 in the Appendix.

The corpus consisted of 77 texts affecting the content of 78Footnote 10 agreements on individual pension products, which represented several genres or subgenres of specialised texts in the field of finance: agreements, regulations, general terms of insurance, terms and conditions of participation in the program and product cards. The analysis covers 90% of individual retirement product contracts offered on the market. The documents were obtained in the first half of 2017 from the websites of financial institutions. Table 5 lists the number of documents of each subgenre broken down by the financial institutions which prepared the documents.

Table 5 Number of documents of the individual genres analysed

The analysis of both comprehensibility and clarity was carried out by type of financial institution. The purpose of the breakdown of documents by institution is to determine which of these entities prepare clearer texts that contribute to a better understanding of the financial products they offer. The purpose of the analysis was not to determine whether the individual texts representing different subgenres are more understandable than others (e.g. product cards vs. agreements), and therefore the analysis abandoned the breakdown of texts into subgenres.

4 Results and Discussion

4.1 Readability

The analysed documents are quite similar in terms of the values detected by Jasnopis, particularly in terms of readability classes. Most of the analysed texts represent readability class 6 or 7, and one pair of texts (IKE and IKZE contracts of the same financial institution) represents class 5. The number of texts in a group of institutions in each readability class is presented in Table 6. It also indicates the range of the FOG index and the percentage of verbs in the agreement for documents assigned to specific readability classes. The full detailed results of the quantitative analysis conducted with the Jasnopis application are provided in the Appendix (Table 11).

Table 6 Number of documents affecting the content of agreements on individual pension products assigned to the readability class by type of financial institution

The documents in the entire set, including IKE and IKZE contracts, are quite similar in content and form, hence the results obtained using Jasnopis and the FOG index are very similar. Only life insurance companies prepared documents in readability class 5, namely two (10% of the analysed documents). Other documents in this group belong to classes 6 and 7. Investment fund companies prepared 27 documents (90%) in class 6, while their other texts belong to class 7. All documents prepared by brokerage houses belong to readability class 6. Class 6 also includes eight (89%) banking documents and ten documents (80%) of general pension funds.

It can therefore be concluded (following the Jasnopis interpretation) that the texts analysed are difficult to understand for the average Pole and are understandable for university graduates (class 6), or are even very complicated, professional texts whose understanding may require specialist knowledge or a doctorate (class 7). Only the one pair of texts in class 5 can be considered understandable to people with an undergraduate education.

The numerical characteristics of the text, including the percentage of verbs and the ratio of nouns to verbs, proved to be more useful than the results of the vagueness indices. The noun-to-verb ratio correlates with the text readability class (a higher ratio of nouns to verbs usually indicates a higher readability class).

An attempt to verify the results using the vagueness index (the FOG index calculated on base forms) showed that its results were incompatible with those obtained from Jasnopis, which are considered more reliable in this study. These assumptions about text difficulty levels had to be confirmed or challenged using a psycholinguistic method, for which we used the cloze test.

To verify the Jasnopis comprehension results, we conducted a cloze test on an excerpt (the introduction) from the Nationale-Nederlanden general terms of insurance. The results of the first stage of this survey are presented in Table 7.

Table 7 Results of the first stage of the cloze-test (percent of correct answers)

None of the test groups (neither those with higher economic education nor those with higher non-economic education) exceeded the instructional level of reading comprehension according to the mentioned scale. The average score in the economist group (55%) was, however, significantly higher than the average score in the non-economist group (45.5%). The spread of results among economists (42 p.p.) and non-economists (38 p.p.) was similar. In the group of economists, men achieved worse results (48%) than women (61%), who reached the independent level of reading comprehension, while in the group of non-economists the results achieved by both genders were almost identical.

It should be added that a follow-up study was also conducted in which gaps in the text were filled in by a pension fund expert. The expert achieved a score of 47/50 demonstrating reading comprehension at the independent level.
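The mapping from a cloze score to a comprehension level can be sketched in a few lines of Python. This is only an illustration: the 40% and 60% cut-offs are an assumption (they are commonly cited for cloze tests and are consistent with the levels reported here, but the scale itself is defined elsewhere in the paper), and the function name `cloze_level` is hypothetical.

```python
def cloze_level(correct: int, total: int,
                instructional_min: float = 40.0,
                independent_min: float = 60.0) -> str:
    """Map a cloze result to a reading-comprehension level.

    The default cut-offs (40% and 60%) are assumed for illustration;
    they are not taken from the study itself.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    percent = 100.0 * correct / total
    if percent > independent_min:
        return "independent"
    if percent >= instructional_min:
        return "instructional"
    return "frustration"


# The expert's result reported above: 47 correct gaps out of 50.
print(cloze_level(47, 50))    # independent
# A group average of 55% expressed as 55 out of 100 gaps.
print(cloze_level(55, 100))   # instructional
```

Under these assumed cut-offs, the expert's 47/50 (94%) falls at the independent level, while the first-stage group averages of 55% and 45.5% both fall at the instructional level.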

The text under study is a specialised text, but, as mentioned, it contains some colloquial lexical units, which contributed to the use of synonyms from a different linguistic register (e.g. the specialised word “funds” (pl. środki) was often used instead of the original colloquial word “money”, especially by economists). In addition, lexical units that are fully interchangeable in Polish (e.g. konto and rachunek, both referring to “account”) were often used. In general, economists used correct synonyms more often (140/1000 gaps) than non-economists (110/1000 gaps), who often used incorrect synonyms.

For the reasons mentioned above, it was decided to carry out a supplementary stage of the survey that took the synonyms used into account. The results of this stage of the analysis are presented in Table 8.

Table 8 Results of the second stage of the cloze-test (percent of correct answers)

Both test groups improved their average results by ca. 10 p.p., which allowed the economists to achieve an independent level of reading comprehension (68%), while the non-economists stayed at the instructional level. After the synonyms were included in the calculation of the cloze test results, the spread of scores stayed the same for economists (42 p.p.) but increased significantly for non-economists (54 p.p.). This suggests that the second stage of the cloze test better revealed how the comprehensibility of the analysed text depends on the knowledge and experience of the individuals tested.

The results achieved by men were slightly worse than those achieved by women in both groups (65% vs. 71% among economists and 55% vs. 61% among non-economists).
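The difference between the two scoring stages can be illustrated with a short Python sketch. It is not the procedure used in the study: the function name `cloze_percent`, the word lists and the respondent's answers are hypothetical, and the Polish original of “money” is assumed to be pieniądze. Only the synonym pairs money/funds and konto/rachunek come from the text.

```python
from typing import Dict, Iterable, List, Optional


def cloze_percent(answers: List[str], originals: List[str],
                  synonyms: Optional[Dict[str, Iterable[str]]] = None) -> float:
    """Return the percentage of gaps filled correctly.

    Without `synonyms`, only the original word counts (first stage).
    With `synonyms`, listed 1:1 equivalents also count (second stage).
    """
    if not originals or len(answers) != len(originals):
        raise ValueError("answers and originals must be non-empty and equal in length")
    synonyms = synonyms or {}
    hits = 0
    for given, original in zip(answers, originals):
        accepted = {original.lower()}
        accepted.update(s.lower() for s in synonyms.get(original, ()))
        if given.strip().lower() in accepted:
            hits += 1
    return 100.0 * hits / len(originals)


# Hypothetical four-gap excerpt and one respondent's answers.
originals = ["pieniądze", "konto", "umowa", "składka"]
answers = ["środki", "rachunek", "umowa", "prowizja"]
accepted_synonyms = {"pieniądze": ["środki"], "konto": ["rachunek"]}

stage_one = cloze_percent(answers, originals)                     # exact matches only
stage_two = cloze_percent(answers, originals, accepted_synonyms)  # synonyms accepted
print(stage_one, stage_two)  # 25.0 75.0
```

In this made-up example the respondent moves from 25% to 75% once the two interchangeable words are accepted, which mirrors, in exaggerated form, the improvement both groups showed in the second stage.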

The numerical data of the studied text (NN INTRO IKE), namely Jasnopis class 5, the FOG index and the percentage of verbs, indicate that the text is easy to read, but the two measures differ in the level of education they indicate as required to understand it: higher education (a master's degree) in the case of the Jasnopis class and high school in the case of the FOG index. In our view, this indicates the weakness of the FOG index, which is less accurate in determining text difficulty. In our opinion, the main difficulty of the text lies in its specialised vocabulary and phraseology, as confirmed by the cloze test. According to the cloze test in the classic version, both the group of specialists and the group of non-specialists reached an instructional level of reading comprehension. Only when the 1:1 synonyms were counted as valid answers was it possible to conclude that the group of specialists did not need help understanding the text. Thus, it should be concluded that the large number of synonyms in the Polish specialised language of economics is the main reason for the relatively poor score in the cloze test. The result of the cloze test also indicates that both study groups recognise the specialised variety of language well. Where colloquial expressions (“money”) were used in the original text, most of the respondents intuitively used the default equivalent from the specialised language (“funds”).
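One possible reason why the FOG index misses this kind of difficulty is visible in its formula, which relies only on sentence length and the share of long words. The Python sketch below implements the classic Gunning formula under stated assumptions: the four-syllable threshold for “hard” words, the vowel-group syllable heuristic and the function names are illustrative choices, not the Jasnopis implementation, and no reduction of words to base forms is performed.

```python
import re

# Polish vowels; a run of adjacent vowels is counted as one syllable.
# This is a rough heuristic and miscounts some words.
VOWELS = "aąeęioóuy"


def count_syllables(word: str) -> int:
    """Approximate the syllable count as the number of vowel groups."""
    groups = re.findall(f"[{VOWELS}]+", word.lower())
    return max(1, len(groups))


def fog_index(text: str, hard_syllables: int = 4) -> float:
    """Gunning FOG: 0.4 * (words per sentence + percent of hard words).

    `hard_syllables` (assumed to be 4 here) is the minimum number of
    syllables that makes a word count as hard.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters, Unicode-aware
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    hard = sum(1 for w in words if count_syllables(w) >= hard_syllables)
    return 0.4 * (len(words) / len(sentences) + 100.0 * hard / len(words))


# Two short sentences with no long words give a very low index.
print(round(fog_index("Ala ma kota. Kot ma Ale."), 2))  # 1.2
```

A short specialised term such as konto or środki does not raise the index at all, because the formula has no notion of register or domain vocabulary; this is in line with the observation that the FOG result understates the difficulty of the text.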

As stated earlier, it is difficult to dispense with analytical methods in determining text comprehensibility, because they give a quick indication of the difficulty of the text. The above study indicates that the Jasnopis result is more trustworthy than the FOG index. At the same time, it is important to note the advantages of the cloze test, which clearly indicates the readability of the text and, further, which expressions in the original text are more difficult for readers to understand. One basic conclusion in this regard may be to refrain from using too many synonyms and to use the same phrases as consistently as possible.

4.2 Clarity

The analysis of clarity showed that the analysed texts are more diverse in this respect than in the level of readability of the language used (Table 9). The detailed evaluation of the documents in terms of clarity by type of financial institution is presented in Table 12 in the Appendix (Fig. 1).

Table 9 Number of documents affecting the content of agreements on individual pension products by type of financial provider and by level of clarity

Fig. 1 Decomposition of clarity scores by type of financial provider. *Product card (pl. karta produktu). **Regulations (pl. regulamin). Source: authors’ work

Based on the average scoring for texts in each group, a ranking of groups of documents from the most to the least clear can be presented:

  1. Life insurance companies (the average number of points for the text: 2.3/5).

  2. Voluntary pension funds (the average number of points for the text: 1.8/5).

  3. Brokerage houses (the average number of points for the text: 1.6/5).

  4. Investment fund companies (the average number of points for the text: 1.3/5).

  5. Banks (the average number of points for the text: 0.6/5).
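The ranking above, and the share of the five-point maximum that each group reached, can be reproduced from the reported averages. The Python sketch below is illustrative only; the variable names are hypothetical, while the scores are those listed above.

```python
MAX_POINTS = 5.0

# Average clarity score per group of institutions, as reported above.
average_scores = {
    "Banks": 0.6,
    "Brokerage houses": 1.6,
    "Investment fund companies": 1.3,
    "Life insurance companies": 2.3,
    "Voluntary pension funds": 1.8,
}

# Order the groups from the clearest to the least clear documents.
ranking = sorted(average_scores.items(), key=lambda item: item[1], reverse=True)

# Express each average as a percentage of the maximum score.
shares = {name: round(100.0 * score / MAX_POINTS, 1)
          for name, score in average_scores.items()}

for position, (name, score) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {score}/5 ({shares[name]}% of the maximum)")
```

Expressed this way, the best group (life insurance companies) reaches 46% of the maximum and the weakest (banks) 12%, so no group average reaches half of the available points.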

In almost all groups of financial institutions the same drawbacks in texts occur:

  • too small font and/or too narrow spacing;

  • too long or poorly highlighted paragraphs;

  • lack of metatextual elements;

  • lack of direct address to customers.

The top-rated criterion in all groups of texts was the highlighting of subheadings and graphics (an average of 0.8/1 point); the criterion with the lowest scores was the direct form of address to the reader (an average of 0.1/1 point). The other criteria are fulfilled at a roughly even level (0.3/1 point).

Individual groups of financial institutions received an average of 0.6 to 2.3 points for their texts. None of the groups received 50% of all possible points, which allows (in conjunction with the analysis of comprehensibility) positive verification of hypothesis H2, i.e. that the incomprehensibility of agreements on individual pension products is also due to the complex structure of the documents. In the analysis of the clarity of texts, the best results were obtained by life insurance companies (an average of 2.3 points per text), and the worst by banks, with an average of 0.6 points per text. This suggests that banks do not strive to prepare more accessible documents and treat their activity in the IKE/IKZE sector as incidental.

Only two texts (general terms of insurance for IKE and IKZE accounts by Nationale-Nederlanden), representing the group of life insurance companies, fully met the expectations of the expert. Judging by the quality of these documents, one can assume that they were prepared by professional editors and graphic designers. In many other cases, by contrast, it is difficult to resist the impression that the documents were prepared by non-professional graphic designers who experimented with the aesthetics of the text. This is evident above all in the authors' disregard for basic graphic design principles concerning font size and line spacing. As for the other three groups of institutions, we treat the ranking only as an outline of a trend, due to the small differences in the average scores and the uneven sizes of the document sets. Brokerage houses, voluntary pension funds and investment fund companies produce documents of average clarity, and it can be said that these institutions still have much to do in terms of the clarity of the text.

5 Conclusions

The level of readability of documents affecting the studied pension agreements turned out to be low. The vast majority (80%) of the documents affecting the content of agreements on IKE/IKZE accounts that existed in the market in late 2016/early 2017 belong to difficulty class 6 according to Jasnopis, which means that they are understandable to people with a Master’s degree or equivalent, highly specialised knowledge. Approximately 19% of texts are even more difficult, i.e. so complicated that they are understandable only to people with an even higher level of education (a doctorate) or to experts in the field. In contrast, only 1% of texts (Footnote 11) are somewhat simpler, i.e. their understanding requires education at the undergraduate level. Even this difficulty level is too high, given that approximately 56% (Footnote 12) of adult Poles have secondary or lower education. Hence, hypothesis H1, according to which the language of agreements on individual pension products is too difficult for most Poles because understanding it requires at least higher education, has been positively verified. To assess the readability of the texts, we used both the automated tool Jasnopis and the cloze test.

In the psycholinguistic cloze test that we conducted for the Nationale-Nederlanden IKE agreement (NN INTRO IKE), the text shows the lowest reading comprehension score, whereas in the clarity test it performs the best. The clarity test, however, refers mainly to the graphic features of the text and is also more subjective, as it is based on the assessment of one expert. Of the analytical methods, Jasnopis gives a result closer to the cloze test than the FOG index does; the FOG index in turn indicates an excessive difference in the education needed to understand the NN IKE and NN INTRO IKE texts (higher education versus primary education). Thus, we confirmed hypothesis H3, which states that psycholinguistic methods provide more accurate information about the comprehensibility of pension product contracts for a specific audience than analytical methods, while also highlighting the advantages of the Jasnopis tool over the FOG index.

The lack of clarity of the documents does not make them more accessible. On the adopted five-point scale, only approximately 1% of texts were assessed positively, i.e. received the highest score, and on average approximately 40% of the maximum number of points was granted. Such low scores result mainly from the use of print that is too small (below 10 pt), spacing that is too narrow, and the rare use of means of guiding the reader through the text and of direct address to the reader. Only a few financial institutions are aware of the important function of text clarity and apply the appropriate measures, possibly hiring professional text editors. Taking into account the general results of the analysis of comprehensibility and clarity, hypothesis H2, according to which the incomprehensibility of agreements on individual pension products is due to the complexity of the structure of the documents as well as the difficulty of the language used, has also been positively verified.

The clearest texts can be found in the documents prepared by life insurance companies. Banking products are characterised by the lowest clarity. Clarity is not correlated with the level of comprehensibility of the language used. All documents analysed are very difficult to read, which suggests a general lack of attention to clarity of the language used on the part of financial institutions. Since all agreements are almost equally incomprehensible, there is no internal market pressure for the use of easier language and improving the clarity of documents.

The study showed that it is necessary to introduce and enforce legal requirements for the comprehensibility of documents provided to purchasers of individual pension products, both at the stage of concluding an agreement and during the implementation of its provisions. Even the most extensive and complete information is of little use to the individual customer if its form is unclear and the language used is incomprehensible.