Double Dutch Finally Fixed? A Large-Scale Investigation into the Readability of Mandatory Financial Product Information

With the introduction of short-form disclosure documents, financial regulation in the EU emphasizes the use of plain language to facilitate comprehensibility. We evaluate whether these documents and the accompanying plain language guidelines improve the readability of mandatory product information addressed to mutual fund investors. Applying advanced text mining algorithms, we benchmark the readability of product information by means of objective and readily replicable methods. While mutual fund information on average does not come in plain language, we find that readability improved significantly following the introduction of Key Investor Information Documents (KIIDs). Improvements are driven by simpler syntax and writing style. By contrast, we find that the use of jargon remains pervasive and report noncompliance with mandatory design requirements. We discuss our results and propose potential disclosure improvements.


Introduction
In this study, we investigate the readability of mandatory product information addressed to retail investors. The regulatory goal behind imposing the disclosure of financial product information is to help investors arrive at informed decisions (Garrison et al. 2012). This is of particular importance in situations which are likely characterized by a substantial information asymmetry between product providers and investors. To achieve this objective, financial regulation has traditionally mandated that an overwhelming quantity of information be provided to investors in order to support their financial decision-making (Ben-Shahar and Schneider 2011; Choi and Pritchard 2003). Clearly, however, mandatory disclosure may not serve its intended purpose if investors are not able to process the wealth of information they are provided with (Chater et al. 2010; De Pascalis 2014; Oehler et al. 2014). Prior research documents multiple instances of "information overload" in retail finance, i.e., a situation in which too much information is provided to consumers and, as a result, reduces their ability to arrive at well-informed decisions (Agnew and Szykman 2005; Eppler and Mengis 2004; Malhotra 1984; Oehler and Wendt 2017). First, as has been shown by Huhmann and Bhattacharyya (2005), investment companies regularly provide excessively detailed information which goes well beyond what is required by law, and some even conjecture that mandatory information disclosure is abused to confuse investors rather than inform them (Warren 2008). 1 Second, the language in which essential information is provided to consumers tends to be "(…) focused on compliance with law, not on consumer comprehension" (Garrison et al. 2012, p. 206).
Assessing the usefulness of disclosure is a complex task and recent research using consumer testing has identified multiple factors which improve the ease of understanding: Text written in plain language, a simple and clear structure supported by visual elements (such as graphs, colours, etc.) and contextual information, all support the readability of mandatory disclosures (Hillenbrand and Schmelzer 2017; Hogarth and Merry 2011). In light of the questionable merits of traditional financial product information disclosure and the evidence in consumer testing which bridges the gap between legally obliged transparency and ultimate consumer usefulness, regulatory authorities have shifted their focus away from pure information provision. 2 Recent disclosure requirements pay heed to long-standing evidence with respect to how humans process information as documented articulately in Kozup et al. (2008, p. 38): "[…] in order for […] information to have a positive impact on the consumer decision-making process, it must be easily accessible and presented in a clear and understandable format." The respective regulatory measures have focused on standardization to increase the comparability of information (e.g., financial fact boxes as discussed in Kozup and Hogarth (2008)) and language guidelines to improve the readability of information (Cox 2007; De Pascalis 2014; Loughran and McDonald 2014a; Rennekamp 2012).
1 On a lighter note, Allan Sloan, the former senior editor at Fortune magazine, provides a witty distinction between disclosure and transparency stating that "(…) disclosure is when you bury information in widely separated places in a 400-pages document filled with small type. Transparency is when you tell people what they need to know in simple terms in readable type on the cover of a document or within the first few pages. (…) Disclosure is a legal obligation. Transparency is an ethical obligation." (Sloan 2011).
2 See the "Mutual Fund Product Information: Regulatory Background" section for details on the regulatory background of mutual fund product information in the EU. In a broader context, the Organisation for Economic Co-operation and Development (OECD) released a consumer policy toolkit in 2010, in which they argue that "(…) consumers cannot make well informed decisions when they are presented with information that is incomplete, misleading, overly complex or too voluminous. Behavioral economics has demonstrated that, among other things, the manner in which information is presented and the way that choices are framed can significantly influence marketplace choices" (OECD 2010, p. 3).
Specifically, the European Union (EU) has mandated Key Investor Information Documents (KIIDs) for all mutual funds in an effort to improve the readability of mutual fund product information. As of 2012, KIIDs have superseded simplified prospectuses (SPs) and are supposed to provide a comprehensible description of a fund's essential product features. In this study, we examine whether these documents live up to their purpose. We focus on investor information disclosed by mutual funds, since holdings in this asset class represent by far the largest fraction of aggregate household investments: In the EU, for example, total assets under management in mutual funds reached an all-time high of 17.2 trillion euros by the end of 2019, with the large majority being actively managed (EFAMA 2020). Moreover, information disclosure requirements are pervasive for fund companies, and the market is a prime candidate for unintended consequences of mandatory disclosure such as information overload: Investors face a dizzying number of product options (e.g., as many as 62,511 mutual funds were domiciled in the EU by 2019), and each product carries a host of decision-relevant characteristics.
The introduction of KIIDs marks a paradigm shift in information disclosure in that these documents are legally required to be written in plain language. 3 In order to specify this requirement, the Committee of European Securities Regulators (CESR) has issued the "CESR guide to clear language and layout for the key investor information document" which provides fund companies with guidance and best practices on how to write in plain language (CESR 2010). 4 Given that KIIDs are the blueprint for the upcoming Key Information Documents for Packaged Retail Investment and Insurance Products (PRIIPs) which extend the concept of harmonized information provision to virtually all consumer finance products, a comprehensive evaluation of these documents is a matter of particular importance.
In fact, the design of KIIDs and PRIIPs was accompanied by extensive consumer testing on behalf of the European Commission (EC) (EC 2009, 2015). However, the tests were confined to the visualizations of, e.g., product-specific investment risk, and lacked an analysis of how subjects experience the readability of these documents. This is particularly surprising given that the need to use plain and clear language, besides being academic consensus (Godwin and Ramsay 2016), is explicitly highlighted in the executive summary of the final consumer testing report. This study closes this gap.
To assess the readability of KIIDs, we apply state-of-the-art textual analysis methods. To this end, we retrieve original full-length product information documents pertaining to a large-scale sample of 9,363 funds domiciled in 11 different countries and issued by 205 different mutual fund companies. For each of these documents, we compute an individual Flesch Reading Ease (FRE) score using a fully automated text mining routine which builds on the natural language toolkit (NLTK) package in Python. Moreover, we are able to extract information on textual characteristics such as font size and style.
3 Commission Regulation (EU) No. 583/2010, Section II, Article 5: "A key investor information document shall be: a) presented and laid out in a way that is easy to read, using characters of readable size; b) clearly expressed and written in language that communicates in a way that facilitates the investor's understanding of the information being communicated, in particular where: (i) the language used is clear, succinct and comprehensible […]."
4 Note that, as of January 1, 2011, the CESR has been superseded by the European Securities and Markets Authority (ESMA).
Our results suggest that key mutual fund investor information on average does not come in plain language. Compared to mandatory patient information leaflets or national newspapers, product information for mutual funds remains significantly more challenging to read and understand for the average retail investor. At the same time, however, several of our findings document that the introduction of KIIDs and the accompanying CESR's guidelines have improved the comprehensibility of mutual fund product information. On average, KIIDs are significantly easier to read than the preceding SPs. While the use of jargon has not improved much, we find that the CESR guidelines regarding syntax and style indeed lead to enhanced readability of mandatory mutual fund product information. Additional metadata analysis reveals that only roughly half of the product information documents under review comply with CESR design requirements (i.e., font type and size). Finally, we find fund companies domiciled in countries whose official language is the same as the document language comply more accurately with the plain language regulations specified for KIIDs. This ties in with the notion that composing a plain text is particularly challenging if done in a foreign language.
We make three contributions to prior research on KIIDs. First, we add to the ongoing debate about the usefulness of mandatory short-form disclosure. In an early response to the finding that information representation matters for retail investors, the Securities and Exchange Commission (SEC) introduced the SEC Plain English Rule in 1998 which Loughran and McDonald (2014b) indeed found to be useful. In a related study, Godwin and Ramsay (2016) evaluate the effectiveness of short-form disclosures in five other economies and find that the Canadian fact sheet launched in 2007, which puts special emphasis on intelligibility, ranks highest in a student online experiment. By contrast, the Australian fact sheet which has remained largely unregulated in terms of comprehensibility comes in last. Moreover, they find that the complexity of language ranks as the most relevant determinant of investor attention to product information. Recently, Gilbert and Scott (2017) analyse investors' perception of product information in New Zealand before and after the introduction of a short-form disclosure document and report mixed results with respect to its usefulness. While they observe an improvement in finance-related terminology, they find that general language has become more complex and harder to understand in the short-form disclosure. In earlier research,  show that summary data affects investor assessment of different mutual funds. By contrast, Beshears et al. (2011) find that summary prospectuses lead to a welfare gain in that they reduce the time being spent on an investment decision but do not alter investment choices.
Second, we analyse comprehensibility and legal compliance of KIIDs by applying objective and thus readily reproducible methods of advanced text mining. This methodology differs from approaches taken in most prior research in the field. Habschick et al. (2012) review a micro-sample of 160 short-form disclosures in Germany (among them 41 mutual fund KIIDs) and conclude that only roughly half of the fact sheets under review complied with the regulatory requirements on completeness of contents, intelligibility, and comparability. Oehler et al. (2014) ask students to compare the usefulness of real-world KIIDs with (mock-) fact sheets designed by the authors themselves and show a preference for the latter. In a related study, Walther (2015) conducts an online experiment in which 137 subjects make hypothetical investment decisions using KIIDs and fund prospectuses and rate their respective usefulness for the task. By contrast, our methodological approach allows for a significantly higher generalizability of results. Moreover, by focusing exclusively on KIIDs of mutual funds, i.e., the most important investment vehicle for retail investors in terms of asset volume, we ensure that our findings are not confounded by variation in intelligibility and regulatory compliance across different asset classes.
Third, we contribute to the recent discussion on textual analysis in the accounting and finance domain (see McDonald (2016, 2020) for a review) by demonstrating the applicability of readability formulas in the context of financial product information. For example, Bonsall et al. (2017) only recently provide evidence supporting the value of quantitative methods in assessing the readability of accounting disclosures. In fact, the use of readability formulas in order to assess complexity in financial communication has been advocated by, e.g., the former SEC chairman, Christopher Cox (Cox 2007).

Mutual Fund Product Information: Regulatory Background
Mandatory disclosure of mutual fund product information in the EU dates back to the EU Directive 85/611/EEC (UCITS I) effective as of 1985 which marked the first step towards a single European financial market. Specifically, investment companies were mandated to disclose a sales prospectus as well as (semi-)annual financial reports for all open-ended funds distributed to individual investors in the EU. However, because of inconsistent national law as to the marketing of mutual funds, the level of harmonization proved insufficient for the standard setter's ultimate goal of establishing cross-border authorization and supervision.
As of 2001, Directive 2001/107/EC (UCITS III) required management companies to publish an SP, intended to summarize the usually lengthy sales prospectus, and to make this document available to their investors prior to the purchase of a mutual fund. However, the SP did not live up to its purpose: "The Simplified Prospectus has failed as a consumer communication tool because the rules have led to long documents […], written in technical or legalistic language […] poorly structured and designed" (CESR 2010, p. 4).
Hence, a new short-form document, the Key Investor Information Document (KIID), has replaced the SP in all EU Member States in a "[…] radical attempt […] to produce a document that is readily understandable by the average retail investor" (CESR 2010, p. 4). As of 2012, KIIDs have been a mandatory part of pre-contractual product information for retail mutual funds distributed in the EU (cf. Directive 2009/65/EC; UCITS IV). Again, KIIDs are supposed to provide investors with information on the key features of a UCITS in order to correctly assess scope and risks of the product. However, as a response to the shortcomings observed under the SP regime, KIIDs are limited to a maximum of two pages, have a predefined structure and content, and, most importantly, should be easy to read and understand for an average retail investor. 5 Specifically, EU Regulation 583/2010 requires KIIDs be "(…) written in language that communicates in a way that facilitates the investor's understanding of the information being communicated" (Section II, Article 5, 1(b)). In order to provide fund companies with guidance on how to prepare KIIDs, the CESR (2010) has published guidelines on clear language and layout of the document.
We list these guidelines in Table 1 and examine whether they are adhered to in the subsequent sections of this study. Notably, highlighting the importance of plain language as well as standardized format and content marks a paradigm shift in the disclosure strategy of European regulatory authorities.

Sampling and Summary Statistics
We collect product information documents for all funds marketed to retail investors in Germany from FWW Fundservices GmbH, who maintain a unique database of product disclosure documents for financial products. To control for potential time effects, we retrieve four different documents from the database: two SPs (earliest and most recent version available) and two KIIDs (first and most recent version available) for each fund in our sample. We access all documents in November 2018. The earliest document (SP) in our sample was disclosed in 2004, whereas the most recent document (KIID) dates as of October 2018. For a given fund to be admitted to the sample, we require all four documents be available. 6 Our final sample consists of 9,363 funds for which we obtain 37,452 product information documents in PDF format. Sampled funds are domiciled in 11 different countries and issued by 205 different mutual fund companies and asset managers.
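The admission rule described above (a fund enters the sample only if all four documents are available) can be sketched as a simple set filter. This is a minimal illustration, not the authors' routine; the document labels and fund identifiers are hypothetical.

```python
# Hypothetical labels for the four document versions required per fund.
REQUIRED_DOCS = {"earliest_sp", "latest_sp", "first_kiid", "latest_kiid"}


def admitted_funds(available):
    """Return the fund IDs for which all four required documents exist.

    `available` maps a fund ID to the set of document labels found for it.
    """
    return sorted(fund for fund, docs in available.items() if REQUIRED_DOCS <= docs)


available = {
    "fund_a": {"earliest_sp", "latest_sp", "first_kiid", "latest_kiid"},
    "fund_b": {"latest_sp", "first_kiid", "latest_kiid"},  # earliest SP missing
}
sample = admitted_funds(available)
documents_in_sample = len(sample) * len(REQUIRED_DOCS)
```

With the reported 9,363 admitted funds, four documents per fund give 9,363 × 4 = 37,452 documents, which matches the stated sample size.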
Table 1 (excerpt): "Use a typeface that is easy to read, such as Arial (or similar sans serif) or Times New Roman" (Part 3-Typeface); "Aim for at least 11pt for serif fonts and 10pt for sans serif fonts" (Part 3-Type size). Notes: This table summarizes the main CESR guidelines. Table 5 in the appendix provides detailed variable descriptions.
To obtain textual characteristics from each document, we need to translate PDF files into machine-readable TXT files. We use the Python PDF2TXT module to extract plain text from the PDF files and supplement this module by a self-coded script, which enables us to obtain textual characteristics on various syntax measures (e.g., sentences, words, syllables) from the product information files in order to automate information processing. Here, we apply state-of-the-art textual analysis methods included in Python's NLTK and adapted to the investigation of text written in German. Finally, we merge the textual information with a host of fund characteristics we obtain from Morningstar Direct in order to explore differences with regard to the readability and format of product information depending on fund characteristics. Following the conventional procedure, we convert all variables at the share class level to fund-level aggregates weighted with their respective contribution to the fund's total net assets (e.g., Doshi et al. 2015). Table 2 reports summary statistics of our fund sample. 7 The average fund has been run for 12.5 years since inception, holds 721 million EUR in assets, yields an annual gross return of 4.98%, turns over 71% of its assets in a given year, features a three-star Morningstar rating, and charges its investors 1.59% annually. 8 Table 3 reports summary statistics on the sampled SPs and KIIDs. Our sample comprises SPs from 2004 to 2012 (panels A and B), i.e., the entire period in which these documents were required, and KIIDs from their introduction in 2012 through to 2018 (panels C and D), which marks the end of our observation period. While SPs feature an average number of roughly 500 sentences and 13,000 words, respectively (see means reported in panels A and B), the short-form disclosure KIIDs, unsurprisingly, make do with only about 56 sentences (1,000 words) on average (see means reported in panels C and D). However, the share of unique words in a given document is almost twice as high for KIIDs as compared to SPs, suggesting that the page limit of KIIDs forced management companies to eliminate redundancies in their product information. Interestingly, we do not observe a measurable impact of the shift from SP to KIID with respect to the usage of complex words. 9
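The conversion of share-class variables to fund-level aggregates weighted by total net assets (TNA), as described above, can be sketched as follows. This is a minimal sketch under stated assumptions: the record layout and the keys `fund_id`, `tna` and `fee` are hypothetical and do not reflect the Morningstar Direct data format.

```python
from collections import defaultdict


def fund_level_average(share_classes, field):
    """TNA-weighted mean of `field` per fund.

    `share_classes` is an iterable of dicts with the hypothetical keys
    'fund_id', 'tna' (total net assets of the share class) and `field`.
    """
    weighted_sum = defaultdict(float)
    total_tna = defaultdict(float)
    for share_class in share_classes:
        fund = share_class["fund_id"]
        weighted_sum[fund] += share_class["tna"] * share_class[field]
        total_tna[fund] += share_class["tna"]
    return {fund: weighted_sum[fund] / total_tna[fund] for fund in total_tna}


# Illustrative data only: two share classes of fund A, one of fund B.
classes = [
    {"fund_id": "A", "tna": 300.0, "fee": 1.0},
    {"fund_id": "A", "tna": 100.0, "fee": 2.0},
    {"fund_id": "B", "tna": 50.0, "fee": 1.5},
]
fees = fund_level_average(classes, "fee")
```

Larger share classes dominate the fund-level value, so a small, expensive share class moves the aggregate fee only slightly.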

Measuring Textual Readability
To assess the comprehensibility of a given fund's product information documents, we apply well-established measures of textual readability. DuBay (2004) argues that traditional readability formulas are the best predictors of text difficulty as measured by comprehension tests. Moreover, he shows that readability formulas are highly positively correlated with the reader's comprehension of a given text, which qualifies them as valid proxies for comprehensibility. The vast majority of extant research relies on two different survey methods to test the validity of readability formulas. Subjects are provided with text passages of different difficulty as measured by a given readability formula and either answer a questionnaire eliciting the extent to which they have understood the text or fill in words that are omitted from the text (a cloze test). Using the number of correct answers to the questionnaire or the number of correct words in the cloze test, validity is then measured as the correlation of subjects' comprehension of the respective text with its readability score provided by the formula under review. 10
Average sentence length (ASL) and word complexity vary greatly across different languages, and readability measures are designed to capture the comprehensibility of a given text in a specific language. While there is a range of different readability measures available for the English language, only a handful of measures have been developed for the German language. In our main analyses, we use the German version of the FRE as introduced by Amstad (1978), i.e., the most prominently applied German measure, which is formalized as follows: 11
$$\mathrm{FRE}_{i,d} = 180 - \mathrm{ASL}_{i,d} - 58.5 \times \mathrm{ASW}_{i,d}$$
where readability measured by the FRE score pertaining to document d of fund i is a function of the ASL and the average syllables per word (ASW). ASL denotes the average number of words in a sentence of document d, and ASW counts the average number of syllables used in a given word of document d.
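To make the measure concrete, the following minimal sketch computes a German FRE score from raw text, assuming the commonly cited form of Amstad's formula, FRE = 180 - ASL - 58.5 × ASW. The regex-based sentence splitter and the vowel-group syllable counter are crude stand-ins chosen to keep the example self-contained; they are assumptions and not the NLTK-based routine used in the study.

```python
import re

# Assumption: one syllable per run of consecutive vowels (rough heuristic).
VOWEL_GROUPS = re.compile(r"[aeiouyäöü]+", re.IGNORECASE)


def count_syllables(word):
    """Approximate the syllable count of a German word."""
    return max(1, len(VOWEL_GROUPS.findall(word)))


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(sentence):
    """Extract alphabetic tokens, including German umlauts and eszett."""
    return re.findall(r"[A-Za-zÄÖÜäöüß]+", sentence)


def amstad_fre(asl, asw):
    """German Flesch Reading Ease from average sentence and word length."""
    return 180.0 - asl - 58.5 * asw


def german_fre(text):
    """Compute ASL and ASW for `text` and return the resulting FRE score."""
    sentences = split_sentences(text)
    words = [w for s in sentences for w in split_words(s)]
    asl = len(words) / len(sentences)
    asw = sum(count_syllables(w) for w in words) / len(words)
    return amstad_fre(asl, asw)


# Illustrative two-sentence text, not taken from any sampled document.
score = german_fre("Der Fonds kauft Aktien. Er ist riskant.")
```

Because the syllable heuristic undercounts some words (it treats the "ie" in "Aktien" as a single vowel group), scores from this sketch are indicative only.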
We calculate the FRE for each document in our sample using a self-coded text analytics programme which builds on Python's NLTK package. Specifically, we analyse the lexical characteristics of each processed text document by tokenizing sentences, counting the number of words in each sentence, and, if applicable, deconstructing all words into their respective syllables. The FRE takes values between 0 and 100, where a higher score indicates better readability. Amstad (1978) classifies FRE scores as follows: Texts scoring 60 to 70 (higher than 70) are referred to as easy (very easy) to read, average readability is defined by a score between 40 and 60, FRE scores below 40 indicate hard-to-read documents, and texts featuring scores below 30 rate as difficult, i.e., requiring at least university education in order to fully grasp their content.
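The classification scheme above can be written as a small lookup function. This is a sketch only; how scores that fall exactly on a boundary (30, 40, 60, 70) are assigned is an assumption, since the text does not specify it.

```python
def classify_fre(score):
    """Map an FRE score to the readability classes described above.

    Boundary handling (e.g., whether exactly 60 counts as 'easy') is an
    assumption made for this sketch.
    """
    if score > 70:
        return "very easy"
    if score >= 60:
        return "easy"
    if score >= 40:
        return "average"
    if score >= 30:
        return "hard to read"
    return "difficult"


# Scores taken from figures reported elsewhere in the text.
labels = {score: classify_fre(score) for score in (16.1, 29.2, 47.6, 61.3)}
```

Applied to values reported in the paper, both the average most recent SP (16.1) and the average first KIID (29.2) fall into the "difficult" class, while the patient information leaflets (47.6) rate as "average".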
Moreover, we include a number of additional determinants of textual readability suggested in the CESR guidelines on the design of KIIDs but not captured via the FRE measure. Specifically, we analyse whether important style components (active voice, personal style, and superfluous words), the use of jargon (financial terminology), and textual design (font type and font size) comply with the CESR guidelines summarized in Table 1. To this end, we apply a bag-of-words technique (Loughran and McDonald 2015) using several established dictionaries and conduct a metadata analysis of the PDF files under review.
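A bag-of-words measure of this kind reduces to counting how many tokens of a document appear in a given dictionary. The sketch below illustrates the idea; the three-term jargon list is hypothetical and merely stands in for the established dictionaries mentioned above, and the tokenizer is a simple regular expression.

```python
import re
from collections import Counter


def dictionary_share(text, dictionary):
    """Share of tokens in `text` that appear in `dictionary` (lowercase terms)."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    counts = Counter(tokens)
    hits = sum(counts[term] for term in dictionary)
    return hits / len(tokens) if tokens else 0.0


# Hypothetical mini-dictionary of financial jargon, for illustration only.
JARGON = {"volatilität", "benchmark", "derivate"}

sample_text = "Der Fonds nutzt Derivate und folgt einer Benchmark."
jargon_share = dictionary_share(sample_text, JARGON)
```

The same function can be reused for other word lists (for example, personal pronouns or superfluous words) by swapping the dictionary.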
Table 2 notes: This table reports descriptive statistics of our sample of N = 9,363 mutual funds. Table 5 in the appendix provides detailed variable descriptions.
To investigate the effectiveness of KIIDs and the accompanying CESR guidelines on plain language, we test for statistical significance of differences in readability and writing style measured for SPs and KIIDs, respectively. Here, we focus on changes between the most recent SP and the earliest, i.e., first-ever, KIID for a given fund in our sample, since the date of the regulatory intervention can be pinned down to lie in the time period between the issuance of the two documents. Yet, to uncover potential time trends, we also include the earliest SP as well as the most recent KIID, i.e., analyse a total of four documents per fund.
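The text does not state which significance test is applied. Because each fund contributes both a most recent SP and a first KIID, a paired t-test is one natural choice; the sketch below computes its test statistic under that assumption. The function name and the sample scores are illustrative, not taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """t-statistic for the mean within-fund change (after minus before).

    Assumes both lists are ordered by fund, so that position k in `before`
    and `after` refers to the same fund.
    """
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Illustrative FRE scores for four hypothetical funds.
fre_latest_sp = [15.0, 17.0, 16.0, 18.0]
fre_first_kiid = [28.0, 31.0, 27.0, 32.0]
t_stat = paired_t_statistic(fre_latest_sp, fre_first_kiid)
```

The statistic would then be compared against a t-distribution with n - 1 degrees of freedom; with a sample of several thousand funds, the normal approximation is practically equivalent.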

Results
Has the Introduction of KIIDs Improved Readability?

Main Results
Figure 1 plots average FRE scores pertaining to the four documents under investigation. As can be inferred, all documents score below 30, indicating very low levels of readability. According to the Flesch interpretation scheme adapted by Amstad (1978), mutual fund product information provided in both SPs and KIIDs is very difficult to read for its addressees and requires at least university education. These values square well with the results reported in a study by Henselmann et al. (2009), which, to the best of our knowledge, has thus far been the only investigation into the readability of financial information in German language. For a micro-sample of 19 IPO prospectuses, they obtain FRE scores ranging from 20 to 35.
Table 3 notes: This table reports descriptive statistics of our sample of N = 9,363 mandatory disclosure statements for mutual funds registered for sale in Germany. Panel A (B) reports text statistics for mutual funds' simplified prospectus (SP) in the earliest (most recent) version available. Panel C (D) reports statistics for product information disclosure in the first (most recent) available Key Investor Information Documents (KIID) as of October 2018. Unique words represent the share of unique words (vocabulary) on total words. Complex words are defined as consisting of three or more syllables (Gunning 1952). Table 5 in the appendix provides detailed variable descriptions.
To put these scores into perspective, we compare them with extant readability analyses of patient information leaflets, i.e., a related instance of mandatory information disclosure to consumers which has been researched quite extensively. Pires et al. (2015) survey a total of 22 studies on the readability of medicine package leaflets published between 2008 and 2013. Overall, those studies manifest a low level of readability of the package leaflets and the need to simplify the texts (Pinero-Lopes et al. 2011; Roskos et al. 2008; Weiss and Smith-Simone 2010). Product information is found to be too complex, with some package leaflets requiring university education. 12 This is in stark contrast to the five years of education recommended by the Food and Drug Administration (FDA). Yet, featuring an average FRE score of 29, the comprehensibility of mutual fund product information is well below that of German patient information leaflets, which feature average readability (FRE score of 47.6) reported by Merges and Fathi (2011). Hence, while dealing with an arguably equally complex topic, mandatory product information for drugs appears to be substantially more readable. Interestingly however, Pinero-Lopes et al. (2016) show that the readability of patient information leaflets has not improved ever since the EC introduced its guideline on the readability of package leaflets, a finding which is corroborated by Segura-Bedmar and Martinez (2017).
Moreover, Kercher (2010) analyses the readability of national newspapers in Germany around the federal elections in 2009. He finds that the news outlets under review vary greatly in readability (Bild Zeitung: 61.3; Süddeutsche Zeitung: 52.7; Spiegel: 54.0); however, none of them rates as hard-to-read. In a related study, Kercher and Brettschneider (2011) analyse the comprehensibility of speeches delivered by German politicians. For example, the FRE score computed for the 2007 Christmas speech of the former federal president Horst Köhler was as high as 65 and thus qualified as easy to understand, while an online podcast issued by Chancellor Angela Merkel still featured a FRE score of 47, i.e., average readability. Thus, financial product information in Germany also seems to be significantly harder to understand than newspapers or political speeches.
12 Pires et al. (2015) review readability studies pertaining to drug leaflets in English, Portuguese, Italian, French, or Spanish language, thus applying slightly different readability formulas than those applicable for the German language.
As can be inferred from Fig. 1, we document a statistically significant and meaningful increase in readability along with the introduction of KIIDs in 2012. The improvement is not only large in magnitude but has also been persistent ever since SPs have been replaced by KIIDs. Specifically, the average FRE score increases by 13.1 points, i.e., a relative improvement in readability of a given fund's KIID of as much as 81% when compared to the FRE score of that fund's most recent SP. 13 Figure 2 plots the distribution of readability scores for the sampled SPs and KIIDs and implies moderate variation within the respective document types. While product information provided in SPs is very difficult to read in virtually all cases, at least some of the KIIDs are in the range of 30 and above, indicating hard-to-read, but not difficult, text. Taken together, these empirical results point to a significant improvement in comprehensibility of mandatory mutual fund product information, as proxied by the FRE score, in recent years. Yet, readability of financial product disclosures lags behind that of mandatory consumer disclosures in other domains.
Better readability as measured by FRE may result from either (i) avoiding complex words or (ii) using shorter sentences. 14 In addition to these two characteristics, several other factors have been shown to affect text difficulty (CESR 2010) but do not enter the FRE formula. To account for a potential influence of these additional characteristics, we supplement our analysis by examining (iii) writing style, (iv) use of financial jargon, and (v) text design. Figure 3 illustrates the share of complex words in the sampled texts (LHS) as well as their average word complexity captured by the number of syllables (RHS). We do not find a significant change in the usage of complex words: For both SPs and KIIDs, we obtain a share of complex words between 36% and 38%. Moreover, the decrease in the average number of syllables per word from 2.35 in case of SPs down to 2.2 for the sampled KIIDs is negligible and only borderline significant, hence confirming the pattern. 15 This evidence suggests that the vocabulary used to describe the key characteristics of a mutual fund has not changed much.

Sentence Structure
Next, Fig. 4 plots the percentage of long sentences in the documents under review (LHS) as well as their average sentence length (RHS). We find that KIIDs have a significantly lower share of long sentences than SPs; in fact, the use of hard-to-understand syntax has almost halved since SPs were replaced by KIIDs. Moreover, KIIDs contain an average of 18 words per sentence, while the average SP in our sample comes to as much as 25 words, i.e., features about 50% longer sentences. Note that our results on sentence length of KIIDs tie in with prior evidence in the medical domain. Investigating patient information leaflets, Fuchs et al. (2006) report an ASL of 15.7 words, a number which is corroborated by Merges and Fathi (2011), who find an ASL of 15 to 17 words for the sampled leaflets. Despite the improvements in readability brought by the introduction of KIIDs, the average sentence in the mutual fund product information under review still features almost three times as many words as the average text passage in national German newspapers, making this information substantially more difficult to digest. 16

13 Average FRE of earliest KIIDs: 29.2 and average FRE of most recent SPs: 16.1: (29.2 - 16.1) / 16.1 = 81.37%.

14 The CESR guidelines call for avoiding sentences featuring more than 25 words.

15 To put our results pertaining to word complexity into perspective: Kercher (2010), in his study on the text structure of national newspapers in Germany, documents an average word length ranging from 1.83 syllables (Bild Zeitung) to 1.91 syllables (Süddeutsche Zeitung).

Fig. 2 Readability box plots (earliest SP, most recent SP, first KIID, most recent KIID). This figure shows box plots of the product information documents' readability as measured by their (German) Flesch Reading Ease (following Amstad 1978). We provide a detailed description of the applied variables in our appendix Table 5.

Fig. 3 Complex words and syllables per word. This figure shows the usage of complex words and the average syllables per word for the four product information documents under investigation. Complex words are defined as those words containing three or more syllables (Loughran and McDonald 2014a). Bar comparison lines describe the relative difference in complex words (avg. syllables per word) between the earliest and most recent document of a fund in our sample (outer comparison) as well as for the change in the relevant product information document from SP to KIID (inner comparison). ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. We provide a detailed description of the applied variables in our appendix Table 5.
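The two sentence-structure measures discussed in this section, the share of long sentences and the average sentence length, can be sketched as follows. The 25-word threshold is taken from the CESR guide; the regular-expression sentence splitter and the function name are assumptions made for illustration and will miscount sentences containing abbreviations or decimal numbers.

```python
import re

LONG_SENTENCE_THRESHOLD = 25  # words; "long" means strictly more than this


def sentence_structure(text):
    """Return (share of long sentences, average sentence length in words)."""
    # Assumption: sentences end at '.', '!' or '?'. A trained sentence
    # tokenizer would handle abbreviations and numbers more reliably.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(re.findall(r"\w+", s)) for s in sentences]
    long_share = sum(1 for n in lengths if n > LONG_SENTENCE_THRESHOLD) / len(lengths)
    average_length = sum(lengths) / len(lengths)
    return long_share, average_length


short_sentence = "Der Fonds investiert."
long_sentence = " ".join(["Wort"] * 26) + "."
long_share, average_length = sentence_structure(short_sentence + " " + long_sentence)
print(long_share, average_length)
```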

Writing Style
In our first supplementary analysis beyond the FRE determinants, we analyse the writing style of the different documents in our sample. Writing style is relevant, since "[…] formal, passive and impersonal style [… can] be unengaging to the reader" (CESR 2010, p.7). Thus, we first infer the active tone of each document type by computing the number of verbs in passive form relative to the total number of sentences. Second, we construct a variable capturing the number of personal pronouns in a given text relative to the total number of words in order to approximate the personal style of the documents under review. Third and finally, we count the number of superfluous words as defined in the German edition of the Technical Writers' Companion (2016). Figure 5 reports our results pertaining to the occurrence of the relevant style components in SPs and KIIDs, respectively. KIIDs differ from SPs in that they feature a more active tone, a higher share of personal pronouns, and a lower share of superfluous words. Note that these differences in writing style are highly statistically significant along all dimensions under review, suggesting that fund companies live up to the style requirements set down in the CESR guidelines.

16 Kercher (2010) documents an average sentence length of 5.6 words (Bild Zeitung) to 7.1 words (Süddeutsche Zeitung).

Fig. 4 Long sentences and average sentence length. This figure shows the usage of long sentences and the average sentence length (measured in number of words) for the four product information documents under investigation. Long sentences are defined as those sentences consisting of more than 25 words according to the section "Short Sentences" of the CESR's guide to clear language and layout for the Key Investor Information document. Bar comparison lines describe the relative difference in long sentences (sentence length) for the change in relevant product information document type from SP to KIID. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. We provide a detailed description of the applied variables in our appendix Table 5.

Financial Jargon

Moreover, the use of jargon, i.e., special words and phrases which are exclusively or primarily used by a particular group of people (Cambridge Dictionary 2020), compromises textual readability considerably. In fact, jargon is pervasive in the mutual fund industry, and therefore, beyond word complexity and writing style, many text passages have been shown to be difficult to understand for retail investors simply because they are not fluent in "fund jargon" (De Pascalis 2014). Consequently, the CESR guidelines call for management companies to avoid jargon whenever possible.
In the vein of Loughran and McDonald (2015), we capture the extent to which mandatory mutual fund product information includes jargon by applying a dictionary approach referred to as the "bag-of-words method." Specifically, we quantify the overlap of words used in the documents under review with two widespread glossaries which aim at explaining financial jargon to German retail investors, namely the ING Börsenlexikon (ING 2020) and the FAZ Börsenglossar (Frankfurter Allgemeine Zeitung (FAZ) 2020). To this end, we define all entries in the two glossaries to represent financial jargon and count each word's unique occurrence in the text under review as well as the total number of jargon words per document. We draw on the ING Börsenlexikon because ING is one of the largest retail banks in Germany. Likewise, we select the FAZ Börsenglossar because the FAZ is Germany's third-largest national newspaper and features the most reputable business news (IVW 2020). Figure 6 plots the corresponding results. Counterintuitively, we document a strong and significant increase in the percentage of financial jargon used in KIIDs as compared to SPs. This increase holds for both the total number of jargon words used, which is up by 25% (55%) when using the ING glossary (FAZ glossary), and the share of unique jargon words in a given document, which is up by 20% (22%). Moreover, when we compare the first KIIDs to the most recent ones, we observe that the use of jargon has further increased in recent years. The increase is robust to the financial glossary we employ and runs counter to the improvements in comprehensibility documented in the "Complex Words" and "Sentence Structure" sections.

Fig. 5 Writing style: passive voice, personal pronouns, and superfluous words. Notes: This figure illustrates the "writing style" used in the four product information documents under investigation. Passive voice captures verbs used in the "passive" form. Personal pronouns proxy for the "personality" writing style and comprise all relevant pronouns (e.g., "I", "You", "We", "Our", "Your", etc.). Superfluous words are obtained from the German edition of the Technical Writers' Companion (2016). Bar comparison lines describe the relative difference in passive voice (personal pronouns; superfluous words) for the change in relevant product information document type from SP to KIID. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. We provide a detailed description of the applied variables in our appendix Table 5.

Text Design

Finally, the CESR guidelines pay special attention to certain design elements which make for a document that appeals to the reader and thus increase the likelihood of it being read. 17 Hence, we investigate whether or not the mandatory mutual fund product information documents heed the CESR advice as to font size and typeface. The CESR recommends easy-to-read fonts such as Arial or Times New Roman and a font size of at least 11 point for serif fonts (e.g., Times New Roman) and 10 point for sans serif fonts (e.g., Arial). To read out the typeface and font size used in the sampled documents, we implement a text algorithm based on the Python module PDFMiner which enables us to automatically analyse the PDF metadata of each document under review. Figure 7 plots the distribution of the primary font size used in the sampled documents, which turns out to be approximately normal with a mean font size of 10 point. 18 However, the fat tail at the lower end of the distribution indicates that a sizeable number of SPs and KIIDs under review do not heed the CESR design guidelines. Similarly, Figure 8 illustrates the share of documents written in the recommended typeface (i.e., Arial or Times New Roman) as well as the share of documents which meet the minimum font size recommendation.
To spell out the differences, Figure 11 in the appendix provides examples of a KIID in conformity with the CESR design guidelines (LHS; typeface Arial, font size 12 point) as well as a KIID which does not meet the respective recommendations (RHS; typeface Times New Roman, font size 8 point).
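The font check described above can be sketched as follows. This is not the algorithm used in this study but a minimal illustration under stated assumptions: font information is read per character with pdfminer.six (whose `LTChar.size` only approximates the nominal point size), the 85% rule for the primary font size is applied to characters rather than words, and the typeface is inferred from substrings of the embedded font name. The function names and the name-matching rule are hypothetical.

```python
from collections import Counter

# Minimum sizes recommended by the CESR guide, by font family.
MIN_SIZE = {"serif": 11.0, "sans": 10.0}
# Assumption: the recommended typefaces can be recognized by these substrings.
RECOMMENDED = {"arial": "sans", "times": "serif"}


def extract_char_fonts(pdf_path):
    """Collect (font name, size) per character; requires pdfminer.six."""
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTChar, LTTextContainer, LTTextLine

    fonts = []
    for page in extract_pages(pdf_path):
        for element in page:
            if not isinstance(element, LTTextContainer):
                continue
            for line in element:
                if not isinstance(line, LTTextLine):
                    continue
                for obj in line:
                    if isinstance(obj, LTChar):
                        fonts.append((obj.fontname, round(obj.size, 1)))
    return fonts


def classify_typeface(fontname):
    """Map an embedded font name such as 'ABCDEF+ArialMT' to a family."""
    name = fontname.split("+")[-1].lower()  # drop the subset prefix
    for key, family in RECOMMENDED.items():
        if key in name:
            return family
    return None


def check_compliance(char_fonts, min_share=0.85):
    """Return compliance flags, or None if no size reaches min_share."""
    sizes = Counter(size for _, size in char_fonts)
    size, count = sizes.most_common(1)[0]
    if count / len(char_fonts) < min_share:
        return None  # no primary font size; such documents are dropped
    families = Counter(classify_typeface(n) for n, s in char_fonts if s == size)
    family = families.most_common(1)[0][0]
    return {
        "primary_size": size,
        "typeface_ok": family is not None,
        "size_ok": family is not None and size >= MIN_SIZE[family],
    }


body = [("ABCDEF+ArialMT", 10.0)] * 90 + [("ABCDEF+Arial-BoldMT", 14.0)] * 10
print(check_compliance(body))
```

In practice, `check_compliance(extract_char_fonts("kiid.pdf"))` would be applied to each document, where the file name is a placeholder.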
The left-hand side of Fig. 8 shows that only about half of all sampled documents use either Arial or Times New Roman. Moreover, our results pertaining to font size show a virtually linear downward-sloping trend. Whereas about 85% of SPs applied a font size of 10 point or greater (depending on typeface), only 53% of the most recent KIIDs meet the recommendations regarding minimum font size. The substantial decline could be owing to the strict page limits of KIIDs, which in turn might lead fund companies to cram these documents with information. Importantly, the font-size effect transfers to online settings: Rello et al. (2016) show that text comprehension and focus of subjects (measured via eye-tracking) are significantly affected by font size. Clearly, font size may be adjusted easily for product information documents distributed via digital channels rather than in print. Still, font size does affect processing fluency (Yang et al. 2018). A reduction in size has generally been shown to discourage consumers from reading, and readers' attention tends to be distracted when faced with small fonts leading to a high information density (Adams and Edworthy 1995; Bernardini et al. 2001; Luna et al. 2019). By contrast, larger font sizes positively affect people's self-reported ability to remember the content they have read (Mueller et al. 2014).
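Returning to the dictionary approach of the "Financial Jargon" section, the two jargon measures (total and unique overlap with a glossary) can be sketched as below. The glossary passed in here is a tiny made-up stand-in for the ING and FAZ glossaries, the tokenizer is a simple regular expression, and multi-word glossary entries are ignored; all of these are assumptions for illustration rather than the procedure used in this study.

```python
import re


def jargon_shares(text, glossary):
    """Return (total jargon share, unique jargon share) for a document.

    total share  = jargon tokens / all tokens
    unique share = distinct jargon words / distinct words
    """
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    terms = {entry.lower() for entry in glossary}  # single-word entries only
    total_hits = sum(1 for t in tokens if t in terms)
    vocabulary = set(tokens)
    unique_hits = len(vocabulary & terms)
    return total_hits / len(tokens), unique_hits / len(vocabulary)


# Hypothetical mini-glossary; the real glossaries are far larger.
glossary = {"Fonds", "Aktien", "Dividende", "Rendite"}
text = "Der Fonds investiert in Aktien und der Fonds zahlt Dividende"
total_share, unique_share = jargon_shares(text, glossary)
print(total_share, unique_share)
```

Reporting both shares separates repeated use of a few technical terms from a broad jargon vocabulary.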

Fund-Specific Differences in Document Readability
It is conceivable that the readability of mutual fund product information also relates to fund characteristics other than the textual characteristics of the mandatory disclosure documents. Specifically, note that the CESR guidelines make the implicit assumption that investment companies are capable of preparing easy-to-read product information. Yet, this need not necessarily be the case. Even if fund managers comply with the new regulations related to content and structure of KIIDs, they might not be familiar with the "art of plain language writing" (Aboulian 2011). Following this rationale, we would expect fund companies domiciled in countries where German is among the official languages (i.e., Germany, Austria, Liechtenstein, and Switzerland) to cope more easily with the plain language regulations specified for KIIDs. This is because writing in a foreign language is generally associated with specific difficulties (Reichelt et al. 2012) and composing a text in plain language is particularly challenging for non-natives (EC 2011). Figure 9 depicts the absolute improvement (LHS) and relative improvement (RHS) in the readability of the sampled product information documents by comparing the most recent (last) SP of a given fund with its first-ever KIID. Solid bars illustrate FRE scores of funds issued by investment companies domiciled in countries where German is among the official languages, whereas hashed bars represent funds domiciled in the remaining domicile countries. Indeed, we observe a 47% difference in document readability between the two groups: while readability improves by an average of 15.7 FRE points among funds in the former subsample, it improves by only 12.1 FRE points for funds domiciled in the latter group of countries.

17 See Part 3: Designing a KIID in the CESR guidelines.

18 We define the primary font size of a document as the font size used for at least 85% of the words in that document. We drop observations in which multiple font sizes occur in the text body (excluding headings and legal disclaimers).

Fig. 6 […] Total 'Jargon' is defined as the number of words in a document that are contained in the respective glossary divided by the total number of words in this document (Total 'Jargon'/Total Words). Unique 'Jargon' refers to the vocabulary's unique overlap with the respective glossary (Unique 'Jargon'/Unique Words). Bar comparison lines describe the percentage difference in financial jargon usage for the change in relevant product information document type from SP to KIID. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. We provide a detailed description of the applied variables in our appendix Table 5.
Next, we investigate whether certain types of mutual funds vary in their response to the change in policy. The goal of this supplementary analysis is to inform standard setters about whether or not a "one size fits all" approach is likely to lead to success. To test for heterogeneous effects depending on product and provider characteristics, respectively, we apply a fixed-effects panel regression. Hence, for each fund in our sample, we include the last SP and first-ever KIID in this panel and estimate the effect differences using interaction terms. In the regression equation, PostPolicy_t represents a time indicator variable separating the pre-policy product information (i.e., PostPolicy_t = 0) of a given fund i from its post-policy product information (PostPolicy_t = 1). The respective fund characteristics of interest enter our regression model as dichotomized indicator variables. We include four different fund-level indicator variables: German speaking_i, Affiliated_i, ETF_i, and Equity_i. First, following the notion that writing a text in plain language is especially difficult for non-native speakers, German speaking_i indicates whether a fund's management company is domiciled in a country where German is among the official languages. Second, three out of the five largest fund companies in Germany are asset management divisions of banks (the products they distribute are referred to as "bank-affiliated funds" in the literature), with Deka (central asset manager of all German savings banks), Union Investment (central asset manager of all German cooperative banks), and DWS (Deutsche Bank's asset manager) leading the way (Ferreira et al. 2018; Heyden et al. 2021). 19 Retail clients of banks with an affiliated investment management company are typically recommended a limited selection of funds issued by this particular provider (Florentsen et al. 2020; Heyden et al. 2021). Given the relevance of these funds in Germany, we examine whether there are differences in the readability of the mandatory fund information of affiliated versus unaffiliated funds. Thus, we include the indicator variable Affiliated_i, which takes a value of one if fund i is a bank-affiliated fund and zero otherwise.

Fig. 8 Font typeface and font size compliance. This figure illustrates the percentage share of product information documents written in accordance with the CESR's guideline regarding typeface (font) and type size as specified in "Part 3: Designing a KIID". Bar comparison lines exhibit the relative difference of documents written in accordance with those requirements for the change in product information document type from SP to KIID (inner comparison). ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. We provide a detailed description of the applied variables in our appendix Table 5.
Third, there is some heterogeneity as to the product complexity of the sampled funds. Active portfolio management strategies, for example, are arguably more difficult to explain to the investor than an investment which simply tracks a given benchmark index. This is likely to affect readability levels and, supporting this conjecture, Fig. 12 in the appendix reports higher FRE scores for exchange-traded funds (ETFs). Hence, we include the binary variable ETF_i, denoting whether fund i is an index tracker, in the regression model to test whether the policy has a different effect on active versus passive investment products.
Fourth, mutual funds cover various asset classes (e.g., pure equity, fixed income, or balanced funds), each characterized by a very different risk and return profile. Thus, we include the dummy variable Equity_i in the model above to see whether the impact of the policy change differs with a fund's investment focus.
Fig. 9 Readability: German versus non-German speaking domiciled investment management companies. This figure shows the absolute (left) and relative (right) improvement in product information documents' readability from the latest SP to the first KIID of a fund in our sample for funds of investment management companies that are (not) domiciled in a primarily German-speaking country (Germany, Austria, Liechtenstein, Switzerland). Readability is measured by the (German) Flesch Reading Ease (following Amstad 1978). Bar comparison lines indicate the relative difference in document readability between funds domiciled in German vs. non-German speaking countries. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. We provide a detailed description of the applied variables in our appendix Table 5.

Finally, we control for time-variant fund characteristics (captured in the vector γ′c_{i,t} and described in Table 5 in the appendix) as well as time-invariant characteristics by including fund fixed effects. Table 4 reports the corresponding results organized by fund characteristic. For instance, β1 in the first row reports the effect of the simplification policy on readability (measured by FRE) for the subgroup of funds domiciled in non-German-speaking countries (i.e., German speaking_i = 0). β1 + β3 denotes the effect of the policy change on the subsample of funds domiciled in German-speaking countries, and β3 shows the difference in the effects of the policy change between the two subgroups. Analogously, the third (fourth, fifth) row in Table 4 reports betas for bank affiliation (ETF, Equity) and the respective difference of the policy change across the respective subsamples.
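For orientation, one specification consistent with the verbal description above can be written as follows. This is a reconstruction from the surrounding definitions, not the equation as printed in the original: the generic placeholder Characteristic_i (standing for German speaking_i, Affiliated_i, ETF_i, or Equity_i), the fund fixed effect α_i, the error term ε_{i,t}, and the labelling of β2 are assumptions.

```latex
\mathit{FRE}_{i,t} = \alpha_i
  + \beta_1 \,\mathit{PostPolicy}_t
  + \beta_2 \,\mathit{Characteristic}_i
  + \beta_3 \,\bigl(\mathit{PostPolicy}_t \times \mathit{Characteristic}_i\bigr)
  + \gamma' c_{i,t}
  + \varepsilon_{i,t}
```

Under this reading, the policy effect is β1 for funds with Characteristic_i = 0 and β1 + β3 for funds with Characteristic_i = 1, so that β3 is the difference between the two subgroups. Note that a time-invariant main effect such as β2 would be absorbed by the fund fixed effects.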
Two results are worth highlighting. First, we find β1 and β1 + β3 to be highly significant for all product characteristics under review, i.e., there seems to be a baseline improvement in readability independent of the fund characteristics under review. Second, as can be inferred from the coefficients pertaining to β3, the improvement in readability is more pronounced among funds domiciled in German-speaking countries as well as funds with a bank affiliation and, to a lesser extent, actively managed funds as well as balanced and fixed-income funds. This difference in the effect of the policy change is not only statistically significant but also large in terms of magnitude. Specifically, we observe an additional improvement in readability of nearly 26% (19%) for the product information of funds domiciled in German-speaking countries (affiliated funds). 20

Alternative Measures of Readability
Lastly, we check whether our main findings are robust to the choice of the readability measure and replicate our analyses documented in the "Has the Introduction of KIIDs Improved Readability?" and "Further Analyses" sections with alternative readability metrics available for texts in German language. Figure 10 reports the results of this supplementary analysis. Our alternative measures of readability include the Wiener Sachtextformel (WST) and the German version of the SMOG index (gSMOG). Note that these metrics quantify readability by means of years of education necessary to understand a given text (Bamberger and Vanecek 1984;McLaughlin 1969), and lower levels therefore indicate better readability.
Corroborating our main finding obtained using the FRE, we document a statistically significant improvement in readability regardless of which alternative metric we employ. Using the WST, for instance, we find that the education necessary to understand the average KIID amounts to 13 years, down from 15 years for the average SP. Taken together, our main results prove robust to alternative proxies of readability.
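As an illustration of the two alternative metrics, the sketch below implements the first variant of the Wiener Sachtextformel and the German SMOG index in the forms in which they are commonly cited. Which WST variant underlies the results above is not stated, so the choice of the first variant is an assumption, as are the function and parameter names. Both functions return an approximate number of school years, so lower values indicate better readability.

```python
import math


def wiener_sachtextformel_1(ms, sl, iw, es):
    """First Wiener Sachtextformel, as commonly cited.

    ms: percentage of words with three or more syllables
    sl: average sentence length in words
    iw: percentage of words with more than six letters
    es: percentage of one-syllable words
    """
    return 0.1935 * ms + 0.1672 * sl + 0.1297 * iw - 0.0327 * es - 0.875


def gsmog(polysyllabic_words, sentences):
    """German SMOG: polysyllabic words scaled to 30 sentences, minus 2."""
    return math.sqrt(polysyllabic_words * 30 / sentences) - 2


wst_example = wiener_sachtextformel_1(ms=30, sl=5, iw=40, es=30)
gsmog_example = gsmog(polysyllabic_words=12, sentences=10)
print(round(wst_example, 3), gsmog_example)
```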

Discussion and Concluding Remarks
In this paper, we investigate whether the introduction of KIIDs facilitates the readability of pre-contractual mutual fund product information addressed to retail investors. As of 2012, KIIDs (short-form disclosure documents limited to two pages in length) have replaced the SP. SPs had been found to be too long, poorly designed, and written in overly technical language. Thus, the EC requires that KIIDs be written concisely and in non-technical language. The introduction of KIIDs was accompanied by a guide on clear language and layout for the new disclosure document developed by the CESR. Applying a fully automated textual analysis approach to a large-scale sample of mutual fund product information, we are the first to benchmark and assess this policy change with objective and replicable methods. Specifically, this study (i) quantitatively benchmarks the readability of mutual fund product information, (ii) evaluates the effectiveness of the change in product information documents, and (iii) assesses whether KIIDs live up to the CESR guidelines.

20 The mean FRE of the SP of funds domiciled in German-speaking countries (affiliated funds) prior to the policy change is 14.9 (15.3). Hence, we measure a 3.57/14.9 = 25.97% (2.87/15.3 = 18.75%) difference.
We find that, while mutual fund product information documents are generally difficult to read, comprehensibility as captured by the FRE score has significantly improved since the introduction of KIIDs. The improvement in readability is primarily driven by shorter sentences, while we do not observe a reduction in word complexity per se. In addition, a more active and personal writing style and the omission of superfluous words contribute to a better understandability of KIIDs versus SPs. However, fund companies tend to use financial jargon more frequently and significantly smaller font sizes in KIIDs, which compromises these efforts. While the CESR guidelines with respect to plain language have a positive effect on the overall presentation of information to retail investors, other parts of the guide (e.g., font size and financial jargon) do not seem to be broadly recognized in practice. Thus, we present mixed evidence with respect to the impact of the policy change on the readability of mutual fund product information. While KIIDs provide investors with significantly more readable information and heed several of the recommendations made by the CESR, some of the respective guidelines are shown to be neglected. Compared to related evidence on the readability of package leaflets, we document room for improvement not only pertaining to the overall readability of the product information under review but also with respect to font type and size.

Table 4 notes: […] where the dependent variable is fund i's readability as measured by FRE (for the latest SP and the first KIID in our sample). PostPolicy is a time indicator variable separating the pre-policy (PostPolicy = 0) from the post-policy product information (PostPolicy = 1). German speaking is a binary variable and indicates whether a fund's management company is domiciled in a German-speaking country. Affiliated is a binary variable and denotes "bank-affiliated" mutual funds. ETF is a binary variable and indicates whether a fund qualifies as an exchange-traded fund (passive investment). Equity is a binary variable indicating whether a fund is primarily invested in equities. We include fund and firm fixed effects to control for any other time-invariant characteristics and additionally control for time-variant fund-level characteristics using the vector γ′c_{i,t}, which includes a large set of variables specified in Table 5. We report robust standard errors in parentheses. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively.
In light of the empirical evidence documented in this study, we suggest that the European standard setter provide stakeholders in charge of preparing financial product information with more detailed guidelines (potentially even accompanied by training) on how to write in plain language, e.g., including examples and best practices. Moreover, we advocate rigorous consumer testing when designing disclosure rules (Garrison et al. 2012). Specifically, this entails including readability checks in future consumer tests of financial product information. This might be operationalized by providing subjects with various representations of a given information item featuring different degrees of reading difficulty as captured by, e.g., the FRE score. 21 Finally, we reiterate a claim originally made by the former SEC chairman Christopher Cox and propose that regulatory authorities take advantage of automated textual analysis when checking the compliance of product information documents (Cox 2007). In fact, given the novel directive for PRIIPs, which extends the universe of investment products coming under the ambit of KIID requirements quite dramatically, efficiency gains will likely become more important for the fulfilment of regulatory tasks. In the USA, the states of Florida and Massachusetts have already incorporated the FRE score as a readability metric in the authorization process of insurance policies, requiring the associated product information documents to score no lower than 50 and 45, respectively. 22 To borrow from Robert Gunning, a pioneer in textual readability analysis, fund companies preparing product information for their investors should try and heed a simple rule of written communication: "Don't write up. Don't write down. Write to." (Gunning 1952, p. 10).

[…] textual readability. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. We provide a detailed description of the applied variables in our appendix Table 5.

Table 5 Variable descriptions

Variable name: Description

Textual characteristics

Simplified prospectus (SP): The simplified prospectus (German: "Vereinfachter Verkaufsprospekt") is a legal document filed with securities regulators and disclosed to investors prior to the purchase of securities. It should provide investors with important information about the mutual fund. In Germany, the simplified prospectus was mandatory from 2004 to 2012 and was replaced by the Key Investor Information Document (German: "Produktinformationsblatt").

Key Investor Information Document (KIID): Becoming effective with the EU directive UCITS IV, the Key Investor Information Document (KIID), a standardized product information fact sheet with a maximum length of two pages, replaces the SP in an attempt to provide investors with better structured and more readable product information.

Disclosure date: Date on which a product information document i has been disclosed.

# words: Total number of words making up a specific fund product information document i. Variable truncated at the 2.5% and 97.5% percentile, respectively.

# sentences: Total number of sentences contained in a specific fund product information document i. Variable truncated at the 2.5% and 97.5% percentile, respectively.

Complex words: Percentage share of words consisting of three or more syllables (e.g., "Di-vi-den-de") in the total words contained in a specific product information document i. Variable truncated at the 2.5% and 97.5% percentile, respectively.

Financial jargon: Percentage share of words in a specific fund product information document i that are also included in a financial glossary/stock exchange lexicon in the total words contained in the product information document. We apply the "ING-Börsen Lexikon" as well as the "FAZ-Börsen Glossar." Variable truncated at the 2.5% and 97.5% percentile, respectively.

Passive voice: Percentage share of verbs in the total words in a specific fund product information document i that are used in their "passive" form (e.g., "was issued," "is being prepared"). Variable truncated at the 2.5% and 97.5% percentile, respectively.

Superfluous words (in %): Share of "superfluous" words, i.e., words that do not alter the meaning of a sentence or phrase and therefore could be omitted, in the total words contained in a specific fund product information document i. Source: Technical Writers' Companion (2016), German edition. Variable truncated at the 2.5% and 97.5% percentile, respectively.

Unique words (in %): Percentage share of unique words (or vocabulary) in the total words contained in a specific fund product information document i. Variable truncated at the 2.5% and 97.5% percentile, respectively.

Fund characteristics

German speaking: Indicator variable equals one if a fund's investment management company is domiciled in a primarily German-speaking country (Germany, Austria, Liechtenstein, Switzerland), zero otherwise.

Affiliated: Indicator variable equals one if a fund's investment management company is affiliated to a commercial bank, zero otherwise. Variable constructed in the vein of Ferreira et al. (2018).

ETF: Indicator variable equals one if a fund qualifies as an entirely passively managed or exchange-traded fund (ETF), zero otherwise. We screen the funds' legal names for the following words/phrases: "ETF," "Exchange traded," "index," "tracker," and "passive."

Equity: Indicator variable equals one if a fund is primarily invested in equities, zero otherwise. We use Morningstar Direct's fund classification.

Fund age (years): The fund age in number of years computed from the date of a fund's inception.

Fund size (in EUR million): Total net assets under management of a fund as of December 2018. We use the logarithm of the variable in regression analysis. Variable truncated at the 2.5% and 97.5% percentile, respectively.

Avg. annual return (in %): Denotes the average yearly raw return of a fund in the period from 2012 to 2018. Variable truncated at the 2.5% and 97.5% percentile, respectively.

Avg. SD annual return (in %): Denotes the average yearly raw return standard deviation in the period from 2012 to 2018. Variable truncated at the 2.5% and 97.5% percentile, respectively.

Turnover ratio (in %): A fund's average yearly turnover ratio, i.e., the percentage share of net assets of a fund invested/disinvested in a given year, in the period from 2012 to 2018. Variable truncated at the 2.5% and 97.5% percentile, respectively.

Morningstar rating (1-5 stars): Morningstar rates mutual funds and ETFs from 1 to 5 stars based on how well they have performed (after adjusting for risk and accounting for sales charges) in comparison to similar funds and ETFs. Within each Morningstar Category, the top 10% of funds and ETFs receive 5 stars and the bottom 10% receive 1 star. The variable denotes the Morningstar rating as of December 2018.

Ongoing charge (in %): Represents the average yearly costs as a percentage of total net assets an investor can reasonably expect to pay from one year to the next, under normal circumstances. It encompasses the fund's professional fees, management fees, audit fees, and custody fees but neglects incurred performance fees. Variable truncated at the 2.5% and 97.5% percentile, respectively.
Notes: This table defines the variables used in the empirical analysis.

Fig. 12 Readability: passive (ETF) versus active mutual funds. This figure shows the differences in product information documents' readability as measured by their (German) Flesch Reading Ease (following Amstad 1978) for passively vs. actively managed funds. We consider funds as being passive if they classify as Exchange Traded Funds (ETF) according to Morningstar Direct's fund database. Crosshatched bars represent actively managed funds. Comparison lines indicate the relative difference in document readability between active funds and ETFs. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. We provide a detailed description of the applied variables in our appendix Table 5.