This section presents the sub-corpora and the linguistic features associated with reduced comprehensibility, such as sentence length, noun phrases and passive voice. Table 1 provides a quantitative overview, which is then followed by the qualitative analysis and discussion of genre characteristics of court forms and the impact of the syntactic structure of sentences and incomplete sentences (phrases or subordinate clauses occurring independently) on comprehensibility and access to information.
The high degree of deviation across the different categories points to differences among the court forms. The deviation is generally lower in sub-corpus 2, which contains child-related forms, because these forms are managed by the HMCTS teams on private and public family proceedings and thus share the micro institutional discourse within HMCTS and similar legal domains. Since sub-corpus 1 covers a broader spectrum of court forms, the higher deviation shows more variety among the individual legal areas and the broader spectrum of institutional discursive cultures within HMCTS.
Overall, the average length of syntactic units in both sub-corpora is 15 words per sentence and around six words per phrase or clause functioning as an incomplete sentence, but there is considerable variation not only between the sub-corpora and among individual forms but also within the forms. For instance, for some forms the average sentence length is as high as 27.53 words (‘A101: Consent to the placement of my child for adoption with identified prospective adopters’), with a deviation as high as 10.61. Interestingly, even form A101, which has the highest average sentence length, does not reach the average sentence length estimated for legal discourse; the length of sentences in legislative documents has been estimated as follows: 55.11 words per sentence in Courts Act 1971 [20], 45.05 words in a sample of statutes [23] or 37.06 words in the statutes collected in 1990 [25]. Legislative documents form a clearly defined genre with specific drafting rules to conform to [9], which generally require longer sentences. There are two reasons for the lower average sentence length in the analysed court forms.
Firstly, the software methods used here were only able to identify sentences which were presented in a linear way; the graphically separated sentences with several options to choose from were treated as syntactic segments (e.g. Example 3). This potentially skews the overall picture of such quantifiable characteristics as the overall length of sentences (the average length of sentences would, in practice, be higher) or the ratio between sentences and syntactic segments (Table 1 shows that sentences amount to 55.5% and 43% of syntactic constructions in the analysed sub-corpora, but in practice the ratio of sentences would be higher). Due to the amount of the text analysed and the frequent use of non-linear sentences in legal discourse [22], it was not possible to manually amend tagging of individual sections for their syntactic characteristics.
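The effect of this limitation on the figures in Table 1 can be illustrated with a minimal sketch. It is not the software used in this study: the splitting heuristic, the function names (`split_units`, `is_sentence`, `summarise`) and the sample text are hypothetical and serve only to show how a sentence that is laid out graphically across several lines ends up counted as separate short segments, which lowers both the share of sentences and the average sentence length.

```python
import re
from statistics import mean, pstdev


def split_units(text):
    """Split text into syntactic units (hypothetical heuristic).

    Every line break starts a new unit, and a line is split further
    after '.', '?' or '!' followed by whitespace. Abbreviations such
    as 'Sect.' would be split wrongly; a real tagger is more careful.
    """
    units = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            units.extend(p for p in re.split(r"(?<=[.?!])\s+", line) if p)
    return units


def is_sentence(unit):
    # Assumption: a unit is a sentence only if it starts with a capital
    # letter and ends with terminal punctuation; anything else is
    # treated as a syntactic segment (phrase or independent clause).
    return bool(re.match(r"^[A-Z].*[.?!]$", unit))


def summarise(text):
    """Return the sentence share and length statistics, in words."""
    units = split_units(text)
    sentences = [len(u.split()) for u in units if is_sentence(u)]
    segments = [len(u.split()) for u in units if not is_sentence(u)]
    return {
        "sentence_share": len(sentences) / len(units) if units else 0.0,
        "mean_sentence_len": mean(sentences) if sentences else 0.0,
        "sd_sentence_len": pstdev(sentences) if sentences else 0.0,
        "mean_segment_len": mean(segments) if segments else 0.0,
    }


if __name__ == "__main__":
    # Invented form extract: the third and fourth lines are one sentence
    # presented non-linearly, so they are counted as two segments.
    sample = (
        "Please give your full name.\n"
        "Date of birth\n"
        "I am applying for:\n"
        "a child arrangements order\n"
        "You must attach a draft order."
    )
    print(summarise(sample))
```

Under these assumptions, the two lines ‘I am applying for:’ and ‘a child arrangements order’ are counted as segments rather than as one eight-word sentence, which is the direction of skew described above.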
Secondly, court forms are a heterogeneous genre: they have several diverse communicative aims (elicit information, define legal scope and provide guidance on how to complete questions) and include diverse syntactic constructions and functions (interrogatives, imperatives, declaratives, independent phrases, independent subordinate clauses). The explanatory sections intertwine with information-eliciting sections, which leads to discontinuity in information flow and breaks in elicitation strategies. The forms thus combine different genres and incorporate guidance with information elicitation: instructions on what to include, explanations of legal concepts and multiple intertextual links often followed/preceded by individual questions [51]. The sentence average as presented in Table 1 thus does not reflect the syntactic complexity within individual forms (e.g. information-eliciting sections vs. guidance sections); but due to the quantity of data and the close integration of the guidance and information-eliciting sections, it was not possible to tag sentences for their functions.
To address the above-mentioned challenges and to explore the syntactic complexity of forms in more depth, we qualitatively examined (1) the top 50 longest and shortest sentences and fragments across the sub-corpora and (2) all syntactic constructions in one court form shared between the sub-corpora (‘C100: Apply for a court order to make arrangements for a child or resolve a dispute about their upbringing’) in order to contextualise the length and function of sentences (see a more detailed analysis of the C100 form in [51]). As discussed elsewhere [51], shorter sentences and syntactic segments are mostly part of sections eliciting everyday information (e.g. personal details) or specific legal information (e.g. which court order the applicant is applying for), or provide information on procedures (e.g. confidentiality of information) and thus tend to refer to concepts which are familiar from workplace and general administration domains. The comprehension challenges they cause are linked to the use of legal concepts or legal procedures [49]. Longer sentences include more propositional content expressed through complex lexico-grammatical constructions [51]. Although, as shown in Table 1, passive voice constructions do not play a significant role in court forms (1.2% and 1.6% of verb phrases are in passive voice), noun phrases occupy a much more prominent position (17.9% and 18.3%) and tend to be the longest constructions within sentences (e.g. Example 1 below).
Because longer sentences mostly appear in explanatory sections which present crucial procedural and legal information for filling in court forms, such as defining legal terms or legal scope (see Example 1–3 below), they impact the overall comprehension of court forms. The following examples illustrate the challenges long syntactic constructions present; their analysis is followed by a reflection on the rationale for their length and potential for enhancing their comprehensibility. For instance, the longest sentence in the most frequent forms sub-corpus appears in the guidance document “Notes to help you fill in form C1 Confirmation Inventory and form C5 (2006), HM Revenue & Customs Return”, designed to support the completion of the C1 and C5 forms. The sentence in Example 1 is the only guidance provided for the question category which says “Net qualifying value of estate”:
Example 1
(82 words): ‘To work out the amount of spouse or civil partner, or charity exemption for the purposes of the excepted estates regulations, where there are people entitled to claim legitim, you will have to work out the amount of the legitim fund and then adjust the amount which would be payable to the spouse or civil partner or charity if the legitim fund were claimed in full after taking account of any legitim claimed or renounced before the application for Confirmation is made.’
The sentence relates to working out the net qualifying value of an estate for inheritance tax purposes. Aside from referring to specialist legal and financial terms (legitim fund) and legal processes (claimed or renounced), the sentence identifies legal scope by including the range of possible participants (spouse, civil partner, or charity) and establishing possible conditions and procedures (where there are people entitled to claim legitim, if the legitim fund were claimed in full). The identification of legal scope and the intention to provide an accurate description break the linear flow of information (to work out the amount…, you will have to work out the amount of the legitim fund and then adjust the amount…). For instance, the adverbial clause of reason (where there are people entitled to claim legitim) incorporates explicit specification of the identity of the referent [15] and separates the infinitive clause from the main clause. Further coherence challenges are introduced through the co-ordination within the main clause (you will have to work out … and then adjust…), the conditional sentence used for hypothetical situations (would be payable … if the legitim fund were claimed), ambiguous language (adjust the amount), and unclear time reference due to the syntactic position of the embedded clauses. The adverbial clause after taking account of any legitim claimed or renounced may relate either to the second main clause (adjust the amount which would be payable) or to the conditional clause (if the legitim fund were claimed); the adverbial clause before the application for Confirmation is made may equally relate either to the second main clause (adjust the amount which would be payable) or to the adverbial clause after taking account of any legitim claimed or renounced. The lexico-grammatical characteristics of the sentence are further complicated by the fact that it belongs to two highly specialised domains, law and finance, embedded within the domain of estate law.
Given the conceptual complexity embedded within the sentence, a common question raised by legal professionals is the extent to which it could be simplified linguistically [2]. Expressing the idea in several shorter sentences could help address the syntactic complexity and the broken information flow. But the lexical complexity would require framing through exemplification and a clear definition [51]. The role of examples in aiding the addressees’ comprehension has been widely recognised in educational and other professional contexts [55], but they have not been fully utilised by the legal profession. This is partly due to the fact that prototypical situations are not easily achievable for legal purposes, but also due to the resistance from the courts and judiciary against overstepping the boundaries of guidance provision and being seen as offering unsolicited legal advice [36]. Similarly, defining legal terms is a notoriously difficult task, complicated by the fact that law constantly develops and the meaning of concepts may shift due to updates to legislation or developments in case law [21]. The implicit connections between legal concepts within a system of cognate legal principles and rules, as discussed above, further complicate the comprehension of legal texts among the wider public [2: 378]. This transcends the lexico-grammatical level and formal linguistics into the cognitive and discursive domains, abounding in inherent connections. In this respect, it can be argued that the term legitim is more straightforward to explain than, for instance, domestic violence, which is loaded with cultural, historical, linguistic and legal connotations [6, 37], and the definition of which varies even among professional service providers, such as law enforcement agencies, healthcare professionals, lawyers or policy-makers [4]. The following example incorporates intertextual links, which potentially make it possible for inherent links to be expressed more explicitly.
Example 2 comes from the guidance notes for the FP2 form (Application notice Part 18 of the Family Procedure Rules 2010, ‘Note 3: The order you are asking the court to make’) used for adoption cases:
Example 2
(59 words in the first sentence and 79 words in the second one; 54 and 69 words without references to legislation):
• ‘if you are making an application under Sect. 26(3)(f) of the Adoption and Children Act 2002 (seeking an order giving permission to apply for contact with a child who an adoption agency has placed for adoption or is authorised to place for adoption), you must also attach a draft of your application form for a contact order (Form A53);
• if you are making an application under Sect. 42 (6) of the Adoption and Children Act 2002 (seeking an order giving permission to apply for an adoption order before the child you are intending to adopt has lived with you for the period required under the Act) you must also attach to your application notice an additional sheet giving the details required in paragraph 3.3 of the Practice Direction 18A supplementing Part 18 of the Family Procedure Rules 2010.’
The extract refers to gap 4 in the following section of the form: ‘I (We) [gap 1] and [gap 2] of [gap 3] intend to apply for an order (a draft of which is attached) which: [gap 4] because [gap 5].’ To provide the information, the applicant needs to know which order they are applying for and the only support provided is the extract in Example 2. Instead of building a coherence thread with the form and explaining the options, the guidance note focuses on the additional evidence to be provided for the two types of orders, which are presented through intertextual links. What partially supports coherence is the double identification of the orders through intertextual links to legislation and present participle clauses describing the orders (seeking an order giving permission to apply). A more gradual approach, first outlining the orders and only then expanding on supplemental evidence, would reduce information density and support the information flow [17]. Example 2 thus illustrates that court forms incorporate several micro genres, such as information provision micro genre resembling legislation, and elicitation micro genre, combining elements of bureaucratic and examination styles; the constant variation among the genres results in increased cognitive load and interrupts the coherence threads.
A further communicative barrier is created by the vague information in the note for gap 5: ‘Briefly set out why you are seeking the order. Include material facts on which you rely, identifying any rule or statutory provision.’ The absence of any indication of which specific rules or statutory provisions to follow leaves lay court users without specific information on relevant guidance. As mentioned above, it is the technical nature of legal rules and the systematicity within law that hinder the comprehensibility of legal texts and legal proceedings more generally [56]. Although the gap-fill sentence in the FP2 form looks like a simple sentence, the guidance provided creates a cognitive barrier due to the increased information density (for gap 4) and lack of information (for gap 5). This results in the presentation of midinformation [53], i.e. information that is provided only partially or in an incomprehensible way. Informational justice (access to information about the legal process and procedures) is perceived as an important part of procedural justice [41] and any gaps in guidance materials create challenges for access to justice [16].
Overall, the exploration of long sentences has illustrated that guidance sections present crucial information, but often in an overly complex manner. Although lexico-grammatical complexity is an inherent part of legal discourse and enables it to achieve precision, unambiguity and generalisability [12], some features (e.g. complexity within embedded constructions) create unnecessary barriers. These challenges could be avoided by introducing information step by step (if necessary, according to the chronological order of procedures), dividing longer sentences into shorter information units, explaining and illustrating legal terms, and simplifying the structure of embedded clauses and phrases. One crucial aspect to explore further is noun phrases, as their structural complexity is constructed through several levels of embedded phrases/clauses as part of the post-modification of head nouns, which makes it difficult to unpack the propositional content (see Example 1 and the following section).