A significant number of adolescents and adults worldwide are only able to read at low proficiency levels, even in economically developed countries. The consequences of low reading proficiency levels can be harmful in many ways for both the individuals concerned and their communities in terms of health, political, social and economic outcomes (OECD 2012, 2013).

This issue is being addressed in the fourth of the United Nations Sustainable Development Goals (SDG 4), in particular in SDG target 4.6:

By 2030, ensure that all youth and a substantial proportion of adults, both men and women, achieve literacy and numeracy (UN 2016).

A profound understanding of the phenomenology and components of low reading proficiency levels across different large-scale assessments is relevant to the efforts to achieve SDG target 4.6 in various ways. For example, in order to improve educational programmes and policies, policymakers should know to what extent adolescent students or adults identified as “low-literate” are comparable across studies. Moreover, it would be useful for practitioners such as teachers (not only those engaged in further education) to obtain guidelines for person-oriented interventions for formative assessment. Finally, test developers may need to construct or evaluate reading tasks based on theoretically justifiable criteria for low reading proficiency levels, for example for the purpose of describing and defining proficiency levels.

Although research on reading comprehension of adolescents and adults with low literacy proficiency levels has been conducted for decades, there is still no precise, consistent and generally accepted definition of what constitutes low reading proficiency (for reviews, see Eme 2011; Vágvölgyi et al. 2016). In public debate, the description of low reading proficiency is mainly derived from the description of the lowest proficiency level(s) in international comparative large-scale assessments such as the Programme for the International Assessment of Adult Competencies (PIAAC), where low refers to “Level 1” and “below Level 1” (OECD 2013) [Footnote 1], the Programme for International Student Assessment (PISA), where low refers to “Level 1” (OECD 2012) [Footnote 2], or the German Level-One Survey (LEO), where low refers to “Alpha Level 3” (Grotlüschen and Riekmann 2012) [Footnote 3]. All these assessments divide the continuous latent scale [Footnote 4], which measures a person’s ability, into meaningful levels and describe each of them using proficiency level descriptors (PLDs), which represent specific requirements necessary to solve a task at a given proficiency level.

However, to obtain a precise understanding of low reading proficiency, further perspectives should be taken into account – for two reasons. First, different large-scale assessments implement distinct design parameters, for example regarding the population of interest (national, cross-national, adolescent, adult etc.) as well as in terms of literacy frameworks (focus on low proficiency or entire ability spectrum, focus on writing and reading or reading only, paper-based or digital reading etc.) (e.g. Gehrer et al. 2013; Hartig and Riekmann 2012; Kirsch 2001; OECD 2012, 2013; Olsen and Nilsen 2017), which in turn might affect the description of low literacy in the respective large-scale assessment.

Second, large-scale assessments provide only limited information about the factors that underlie difficulties in solving a particular reading task. A better understanding of the cognitive processes and reading-related skills that lead to such difficulties would considerably enhance our comprehension of low reading proficiency in large-scale assessments. Although there is a great deal of research on this subject (e.g. Barth et al. 2015; Gernsbacher et al. 1990; Kintsch 1998; Long et al. 1999; Tighe and Schatschneider 2016), findings from these studies are largely unrelated to the perspective of large-scale assessments.

To close this gap, our article introduces a new process model [Footnote 5] that explains low reading proficiency from an integrative perspective to obtain a comprehensive understanding of the reader-related, text-related and task-related factors along different stages of the reading process that can cause reading difficulties. Our article is structured into three parts:

In the first part, we outline the process model, including research on low reading proficiency from (1) large-scale assessments with a focus on task-text-reader characteristics; (2) cognitive psychology explaining proficiency differences according to underlying cognitive structures and processes relevant for text processing; and (3) research explaining proficiency differences according to developmental precursors of reading comprehension.

In the second part, based on this process model, we outline core difficulty-generating factors, in particular task and text characteristics that are relevant in evaluating the difficulty of a reading task and thus in determining whether low-literate readers can solve it.

In the third part, we illustrate the benefit of the process model for assessing low reading proficiency by incorporating the model into standard-setting practice. For this purpose, we outline how the process model provided the framework for developing PLDs used for standard setting in a German large-scale assessment, the National Educational Panel Study (NEPS, Blossfeld and Roßbach 2019), to differentiate between low-literate and functionally literate adolescents and adults. [Footnote 6]

A process model for explaining difficulties in the accomplishment of reading tasks among low-literate readers

Reading comprehension involves more than the ability to read. It includes a variety of reading activities that rely on several developmental precursors of reading comprehension (e.g. word recognition, comprehension, working memory) as well as on cognitive and metacognitive [Footnote 7] strategies that together help a reader to understand, process, restructure, evaluate and monitor information to accomplish all kinds of reading requirements (e.g. Artelt et al. 2005; Eme 2011; Tighe and Schatschneider 2016). In task-oriented reading situations, such as in test situations of large-scale assessments, those activities differ depending on the stage of the reading process, the specific requirements of the task and the characteristics of the written material readers are confronted with (Cerdán et al. 2011; Rouet 2006).

Roughly speaking, the reading process comprises three stages: In the preparation stage, the reader builds a task model in which he or she determines the purpose of the task. In the subsequent execution stage, the reader needs to find relevant information in the text and, if necessary, he or she must relate several pieces of information to each other. Within this process, the reader has to evaluate the relevance of each piece of information and then decide whether to maintain or reject it. This verification and integration process may require several cycles. The reader may also have to revise his or her task model. Finally, in the production stage, the reader needs to verify once again whether the selected information suffices to answer the question and if so, he or she provides a response (see Figure 1).

Fig. 1 Proposed process model for explaining difficulties in the accomplishment of reading tasks among low-literate readers

Difficulties in the accomplishment of reading tasks among low-literate readers can be traced along these stages of the task-oriented reading process (Kirsch and Mosenthal 1990; Rouet 2006). Therefore, in the next three sections, we sketch the challenges that low-literate readers might face during (1) the construction of a task model, (2) the process of searching for and selecting information and (3) the process of integrating information along with task-related, text-related and readers’ characteristics.

Difficulties among low-literate readers in task model construction

Reading tasks in large-scale assessments often aim to reflect realistic reading requirements that occur in a variety of reading contexts. Such tasks might require a reader to locate a single piece of information or to draw simple inferences. Other tasks might tap requirements like locating several pieces of information, integrating widely distributed information and reflecting on the intention, content or form of the text (e.g. Gehrer et al. 2013; Kirsch 2001; OECD 2012, 2013). In this context, research has identified several difficulty-generating factors that relate to the wording of the reading task (question and given options), the task format or the reading requirements. In this article, we focus on the wording and the reading requirements because of their particular importance for people with low reading proficiency.

In terms of wording, to understand what should be done in the execution stage (the second stage of the reading process), a reader must first extract the given and requested information from the reading task. This requires the coordination of multiple basic reading skills, ranging from word retrieval and syntactic parsing [Footnote 8] to comprehension of the meaning of the information (Perfetti and Hart 2002; Perfetti and Stafura 2014). Low-literate readers are often portrayed as having lower basic reading skills than more proficient readers (e.g. Eme 2011; Landi 2010; McKoon and Ratcliff 2018; Tighe and Schatschneider 2016). Therefore, differences in goal formation (task model construction) can be traced to the wording of the task. In particular, it can be assumed that low-literate readers are more likely to struggle with understanding the reading requirements when tasks contain more low-frequency words and are propositionally denser [Footnote 9] (e.g. Embretson and Wetzel 1987; Hartig and Frey 2012; Hartig and Riekmann 2012; Kintsch and van Dijk 1978; Ozuru et al. 2008; Sonnenleitner 2008; Zimmermann 2016).

Regarding reading requirements, existing studies stress that each type of reading question evokes different reading activities that readers will execute to solve a reading task (Cerdán et al. 2019; Rouet et al. 2001). According to Jean-François Rouet and colleagues (2001, p. 175), reading questions that require a reader to locate single pieces of information promote “locate-and-memorize” reading strategies, whereas questions that require integrating information or building up a situation model [Footnote 10] promote “review-and-integrate” reading strategies. Therefore, another potential explanation for proficiency differences is that low-literate readers are less likely to construct an appropriate task model than proficient readers. This may prevent them from setting appropriate reading goals regarding which information to search for and what type of answer to provide in order to progress through the text and complete the task.

This assumption is supported by a number of experimental studies which demonstrated that readers with lower reading abilities are less likely than proficient readers to construct an appropriate task model that helps them to adapt their reading process (e.g. Cerdán et al. 2011, 2019; de Milliano et al. 2016). For example, Raquel Cerdán and colleagues (2019) designed an experiment in which they manipulated the reading question given to participants. Their aim was to test whether low-literate and literate readers differed in their reading comprehension depending on whether or not the reading question was paraphrased to highlight the cognitive processes required to answer it. For example, the reading question

Miguel works in the ACOL Company and in the 17th of May week will be on a business trip. Justify if Miguel should contact Raquel.

was paraphrased into

Explain if Miguel should get in touch with Raquel to get the vaccine and why (Cerdán et al. 2019, p. 2117).

Cerdán and colleagues (2019) found that only low-literate readers, but not proficient readers, benefited from such cues. Similar results were found with respect to the wording of test instructions for students with special educational needs in Germany (e.g. Nusser and Weinert 2017). This indicates both a poorer situation model of the reading task among low-literate readers and the dependence of a task’s difficulty on the concreteness of the question concerning the required cognitive processes.

Difficulties among low-literate readers in searching for and selecting information

Once readers have constructed a task model, they have to search for, select and process relevant parts of the text according to the task requirements, irrespective of whether only one piece of information needs to be located or several pieces of information need to be integrated (Cerdán et al. 2011; Rouet 2006). How easily and successfully this can be accomplished depends heavily on the extent of a reader’s cognitive effort to select the relevant information. The cognitive effort, in turn, seems to depend particularly on semantic cues at the micro level and text-signalling devices at the macro level [Footnote 11] that help to direct a reader’s attention to where the relevant information can be found (Kintsch 1998; White 2012).

At the micro level, it is well established that the cognitive complexity of searching for and selecting relevant information is influenced by the type of semantic match between the information in the text and in the reading task (e.g. Kintsch 1998; Lumley et al. 2012; OECD 2012, 2013; Todaro et al. 2010; White 2012; Zimmermann 2016). In its simplest form, the match is facilitated by a literal overlap that activates the same semantic unit, called a semantic proposition (Kintsch 1998). In practical terms, this means, for example, that reading tasks where key words appear in both the reading question or options and in the headline of the text are easier than tasks where a reader has to infer the relationship. To give an example (Yang et al. 2005, p. 235):

After being dropped from the plane, the bomb hit the ground and exploded. The explosion was quickly reported to the commander [literal match].

compared to

After being dropped from the plane, the bomb hit the ground. The explosion was quickly reported to the commander [inference].

It has been shown that reading tasks at the lowest proficiency levels in large-scale assessments require a reader, more frequently than tasks at higher proficiency levels, to retrieve information from the text that is for the most part literally identical to the information in the reading task. By contrast, tasks at the second-lowest proficiency level more often require paraphrasing or inferring (Kirsch 2001; OECD 2012, 2013).

One possible explanation is that low-literate readers require more cognitive resources to comprehend the meaning of the information in the text because the reading processes at their disposal are less automated in terms of word retrieval and syntactic parsing (McKoon and Ratcliff 2018; Yang et al. 2005). This lack of automated processes, in turn, leaves less processing capacity for other types of matching, such as drawing inferences (Perfetti and Stafura 2014). However, studies have also demonstrated that a literal match can be misleading and can increase difficulty, namely when it is the irrelevant information (also termed distracting information), rather than the relevant information, that matches the information in the task. For example, Cerdán and colleagues (2011) showed in an experiment that low-literate readers tended to select information in the text that matched information in the task (question):

Which adverse reactions can the vaccine provoke? Those students who would be falsely seduced by superficial cues would tend to visit the following distracting location of the text: Ask your doctor if you are under medical treatment or have had adverse reactions to the Flu vaccine. If these are very intense it may be dangerous to the Fetus in case of pregnancy (Cerdán et al. 2011, pp. 203–204).

Because of this strategy, low-literate students were more often misled by distracting information. Corresponding results from large-scale assessments have shown the difficulty-generating effect of the amount of distracting information in the text (Kirsch 2001; Lumley et al. 2012; Ozuru et al. 2008; Zimmermann 2016). The tasks at the lowest proficiency levels hardly contained any distracting information, whereas the amount of distracting information increased at the second-lowest proficiency level (Hartig and Frey 2012; Kirsch 2001; OECD 2012, 2013).
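The way a purely literal match can point a reader to distracting information can be made concrete with a minimal sketch. It is not the procedure used in any of the studies cited; the stop-word list, the overlap measure and the "relevant" sentence are assumptions introduced here for illustration only. Only the question and the distracting passage are taken from the example quoted from Cerdán et al. (2011).

```python
import re

# Small illustrative stop-word list (an assumption, not taken from the cited studies).
STOP_WORDS = {
    "which", "can", "the", "a", "an", "to", "if", "you", "are", "or",
    "have", "had", "your", "of", "in", "it", "be", "may", "is", "after", "some",
}

def content_words(text):
    """Lower-case the text and return its set of non-stop-word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {token for token in tokens if token not in STOP_WORDS}

def literal_overlap(question, sentence):
    """Share of the question's content words that reappear verbatim in the sentence."""
    question_words = content_words(question)
    if not question_words:
        return 0.0
    return len(question_words & content_words(sentence)) / len(question_words)

question = "Which adverse reactions can the vaccine provoke?"

# Distracting passage as quoted from Cerdán et al. (2011).
distractor = ("Ask your doctor if you are under medical treatment or have had "
              "adverse reactions to the Flu vaccine.")

# Hypothetical relevant passage, invented here; it answers the question
# without repeating any of its key words.
relevant = "Some people notice mild fever or soreness after the injection."

overlap_distractor = literal_overlap(question, distractor)
overlap_relevant = literal_overlap(question, relevant)
```

Under these assumptions, the distracting passage shares most of the question's key words while the hypothetical relevant passage shares none, so a reader who relies on surface matching alone would be drawn to the wrong location.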

Low-literate readers further differ from more proficient readers in terms of the efficiency with which they use adequate reading strategies that help them in searching for and selecting task-relevant information (Cataldo and Oakhill 2000; Cerdán et al. 2009; Hahnel et al. 2018). For example, Giulia Maria Cataldo and Jane Oakhill (2000) showed that low-literate secondary students more often used an inefficient and undirected reading strategy, such as rereading the entire text when they were required to locate specific pieces of information, compared to skilled readers who were more capable of locating the relevant passage. Therefore, a second explanation for proficiency differences refers to the presence or absence of cues at the macro level with respect to text-signalling devices. Studies have shown that low-literate readers use headlines or bullet points rather than larger, more sustained text segments to retrieve information (DelVecchio et al. 2019; Weeks 2001). Moreover, research in the field of health communication has demonstrated that texts with additional visual information (Braich et al. 2011; Dowse and Ehlers 2005; van Beusekom et al. 2016) or with typographical devices, such as coloured font, capital letters or underlining (Bass et al. 2016), helped low-literate readers to find and recall information better than textual information alone. Corresponding results from large-scale assessments also found that low-literate readers were more capable of locating and selecting relevant information when it was prominently placed, such as in the heading or at the beginning of the text, or stood out due to other features such as the presence of a number in the text (OECD 2012, 2013).

Difficulties among low-literate readers with integrating information

In many reading tasks, readers must integrate several pieces of information (e.g. for comparing statements or understanding the author’s intention). There is ample research showing that low-literate readers struggle with integrating information (e.g. Barth et al. 2015; Gernsbacher et al. 1990; Kirsch 2001; Long and Chong 2001; Magliano and Millis 2003; OECD 2012, 2013). Some explanations have already been provided (see the two earlier sections, Difficulties among low-literate readers in task model construction and Difficulties among low-literate readers in searching for and selecting information).

Another explanation relates to the number of and distance between pieces of information relevant to the reading task. Results from large-scale assessments indicate that whereas more proficient readers are more likely to integrate two or more pieces of information distributed across the text, low-literate readers are more likely to locate single pieces of information and integrate more than one piece of information located in neighbouring sentences (OECD 2012, 2013; Kirsch 2001). Similarly, studies from cognitive psychology suggest that low-literate readers have greater difficulty in drawing inferences using distributed pieces of information (global inference) than from information found in neighbouring sentences (local inference) (Barth et al. 2015; Long and Chong 2001; Gernsbacher et al. 1990; Magliano and Millis 2003). This seems to be especially due to differences in working memory capacities (e.g. Abadzi 2008; Carretti et al. 2009). Studies have shown that low-literate readers can only keep a small number of pieces of information active in their short-term memory and reactivate it when they read; both of these activities are important prerequisites for integrating pieces of information (Carretti et al. 2009; Chiappe et al. 2000; De Beni et al. 1998; Mellard and Fall 2012). In particular, low-literate adults might struggle to integrate information provided across larger text segments because the relevant corresponding information is no longer available in their short-term memory but must be retrieved from their long-term memory. This is coupled with the finding that they have more difficulties in suppressing irrelevant information (Carretti et al. 2009; Daneman and Carpenter 1980; Gernsbacher et al. 1990; Long et al. 1999).

In addition, difficulties can be traced to readers’ metacognitive and cognitive strategies. One important metacognitive skill is to monitor the comprehension process itself, since the detection of inconsistencies helps a reader to adapt his or her reading behaviour to restore coherence (Helder et al. 2016). Thus, a successful integration of information depends on how well the reader succeeds in monitoring the coherence of new pieces of information with those that are already activated. In this context, studies have shown that low-literate readers are less sensitive to detecting coherence breaks than literate readers (Barth et al. 2015; Long and Chong 2001; Todaro et al. 2010). Investigations into how much the distance between relevant pieces of information mattered found that low-literate readers were more likely to detect local coherence breaks, but less likely to detect inconsistencies when contradictory pieces of information were separated by intervening sentences (Barth et al. 2015; Long and Chong 2001). Regarding cognitive reading strategies, studies emphasise that, unlike proficient readers, low-literate readers use fewer efficient reading strategies that would allow them to retain pieces of information more easily and then relate them to each other. For example, compared to more proficient readers, low-literate readers less often used strategies such as underlining, note-taking and summarising key information that would have helped them to capture, retrieve and restructure information (Artelt et al. 2001; Cromley and Azevedo 2007; de Milliano et al. 2016).

Establishing difficulty-generating factors

In this section, we demonstrate how the process model outlined earlier in this article makes it possible to identify the main difficulty-generating factors that underlie proficiency differences between low-literate and literate readers. The process model (Figure 1) highlights that the difficulty of solving a reading task depends on the extent to which the task and stimulus text demand different reading-related skills.

First, whether or not low-literate readers adequately progress through the text depends on the readability of the reading task (e.g. in the form of high-frequency words) and the concreteness of the wording in terms of how clearly the processing steps required to answer the reading question are communicated (e.g. whether the question highlights or merely vaguely formulates the key information to be sought).

Second, the model emphasises that the success of low-literate readers in locating the relevant information depends on both the degree to which the required information matches the information given in the text, and the degree of clarity with which the text draws attention to the relevant passage in which the information can be found. However, in the absence of such cues, low-literate readers need to apply more cognitive resources and strategies to locate the information and to draw inferences about its relevance. Therefore, both the degree of semantic match and the presentation of information are further difficulty-generating factors.

Third, the number of and distance between relevant pieces of information affect whether or not low-literate readers can solve the reading task. The more pieces of information there are, and the larger the gap between the relevant pieces of information, the more demanding the task becomes in terms of basic reading (e.g. larger amount of syntactic parsing), comprehension (e.g. understanding and evaluating whether the piece of information is relevant or not) and processing (e.g. using appropriate reading strategies to integrate multiple pieces of information, thus suppressing irrelevant information).

Fourth, the number of and distance between distracting pieces of information can cause reading difficulties. Low-literate readers are more likely to erroneously choose answers that overlap with the distracting information. This seems to be particularly problematic when the distracting information appears before or near the relevant information.

Fifth, not only the readability of the reading task (the question and options) but also the readability of the text can explain why low-literate readers have difficulties in solving the task. For example, if the task requires a reader to locate a single piece of information and the text provides neither semantic cues nor text-signalling devices, then it is likely to be much more demanding to progress through a more complex text (e.g. in terms of propositional density) than through other texts with a lower complexity.

Standard setting in the German National Educational Panel Study (NEPS)

In this section, we describe how this process model and the identified difficulty-generating factors were applied for the purpose of standard setting in a German large-scale assessment, the National Educational Panel Study (NEPS; Blossfeld and Roßbach 2019). The aim was to establish a proficiency level for differentiating between low-literate and functionally literate adolescents and adults. We begin with a brief overview of NEPS and the assessment of reading proficiency before presenting the steps of the standard-setting procedure.

Overview of NEPS

NEPS provides rich longitudinal data on the development of competencies (reading, mathematics, scientific literacy, and information and communication literacy) following a representative multi-cohort sequence design [Footnote 12] with six cohorts ranging from early childhood to adulthood. Thus, it provides a rich data source that enables scholars and practitioners to understand how learning environments shape competence development (e.g. among low-literate readers) and how competencies are related to educational decisions and returns to education in formal, non-formal and informal contexts throughout the lifespan (Blossfeld and Roßbach 2019). [Footnote 13]

Briefly, the six NEPS starting cohorts (SCs) [Footnote 14] are:

SC 1: Starting Cohort Newborns

SC 2: Starting Cohort Kindergarten

SC 3: Starting Cohort Grade 5

SC 4: Starting Cohort Grade 9

SC 5: Starting Cohort First-Year University Students

SC 6: Starting Cohort Adults

In this article, we focus on two starting cohorts, Grade 9 adolescents (SC 4) and adults (SC 6), to convey a general understanding of the usability of the process model. [Footnote 15] The ninth graders (SC 4, n = 13,897) attended regular schools (lower secondary school, intermediate secondary school, upper academic school track, comprehensive school and multi-track schools) in Germany and were, on average, 14.74 years old (SD = 0.73). Among the SC 4 sample, 49.14 per cent are female and 12.37 per cent have a non-German background. The students were first tested for their reading competence in 2011/12. Students with special educational needs are not included in the sample. The age of the adult sample (SC 6, n = 8,480) ranged from 24 to 69 years. They were, on average, 47.82 years old (SD = 11.68). Among the SC 6 sample, 48.62 per cent are female and 12.68 per cent have a non-German background. The participants of SC 6 were first tested for their reading competence in wave 3, 2010/11, or wave 5, 2012/13, by applying identical assessments (Haberkorn et al. 2012; Koller et al. 2014).

NEPS reading tests

Being a longitudinal study, NEPS aims to track the development in reading comprehension among other competencies over the lifespan. Therefore, the reading tests for all cohorts are based on the same framework, but they are adapted for age-appropriate texts and text topics (Gehrer et al. 2013). The NEPS reading framework considers three main dimensions, namely text functions, cognitive requirements and task formats. Each participant in the two cohorts under consideration here received five texts, each representing a different text function, including only continuous texts [Footnote 16]: (1) an information text; (2) a commenting text or an argumentative text; (3) a literary text; (4) an instructional text; and (5) an advertising text. Cognitive requirements refer to the process participants must engage in to solve the task and are classified into finding information in the text, drawing text-related conclusions and reflecting and assessing (situation model). In terms of task formats, most of the items are presented in a multiple-choice format. Further task formats are decision-making items and matching items. Participants were provided with a standardised instruction and sample items to ensure that they understood the item formats (Gehrer et al. 2013; Haberkorn et al. 2012; Hardt et al. 2013; Koller et al. 2014).

The NEPS reading tests which we used for differentiating between a low reading proficiency level and a functional reading proficiency level consist of 31 items (SC 4) and 30 items (SC 6) respectively. To estimate reading competence, the NEPS data were scaled using item response theory (IRT) [Footnote 17], showing a good reliability in SC 4 (WLE reliability = 0.749) [Footnote 18] (Haberkorn et al. 2012) and in SC 6 (WLE reliability = 0.717/0.743 in wave 3/5) (Hardt et al. 2013; Koller et al. 2014). Differential item functioning (DIF) was tested for several population indicators (e.g. gender, school certificate, migration background) to evaluate the test fairness across several subgroups. The analyses showed that the tests were fair for the examined subgroups, indicating that differences in reading competences and thus the probability of being assigned to the low-literate group are not due to differences in the item difficulties between subgroups such as native and non-native speakers (Haberkorn et al. 2012; Hardt et al. 2013; Koller et al. 2014). Participants were tested by trained interviewers either at their schools (SC 4) or in their homes (SC 6) via a paper-and-pencil test with a maximum test duration of 28 minutes.
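To illustrate what IRT scaling means for the relation between a person's ability and an item's difficulty, the following minimal sketch uses a one-parameter logistic (Rasch) model for dichotomous items. This choice is an assumption made here for illustration; the text above states only that IRT was used, and the exact scaling model is documented in the technical reports cited. All numerical values are hypothetical.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under a Rasch (1PL) model.

    Assumption: dichotomous items, with ability and difficulty on the same
    logit scale. This is an illustrative model, not the documented NEPS scaling.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical values on the logit scale, for illustration only.
p_matched = rasch_probability(ability=0.0, difficulty=0.0)
p_easy_item = rasch_probability(ability=0.0, difficulty=-1.5)
p_hard_item = rasch_probability(ability=0.0, difficulty=1.5)
```

In this sketch, a person whose ability equals the item difficulty solves the item with a probability of one half, and the probability rises for easier items and falls for harder ones. Placing persons and items on a common scale in this way is what allows a threshold on the ability scale to be interpreted in terms of the items a person is likely to solve.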

Determining a threshold for low reading proficiency

Reports on (cross-sectional) international large-scale assessments regularly offer “cut-offs” (thresholds) for proficiency levels across the entire ability spectrum, including a more or less theoretically motivated definition of low reading proficiency at the lower levels (e.g. Grotlüschen and Riekmann 2012; OECD 2012, 2013; Olsen and Nilsen 2017). Because of the multi-cohort sequence design implemented in NEPS, however, the definition of proficiency levels for repeated assessments within and across cohorts is far more ambitious, particularly with respect to linkage across time and cohorts.

All parties involved in the project agreed that the standard-setting procedure for the definition of low reading proficiency within NEPS needed to follow an a priori theory-driven approach (from the outset); first because of the multi-cohort sequence design, and second because the NEPS reading literacy framework has conceptual overlaps with those of other large-scale assessments such as PIAAC and PISA (OECD 2012, 2013), but also differences. Overlaps include, for example, considering the entire reading ability spectrum, cognitive requirements and function of the texts. Conceptual and methodological differences include, for example, the breadth of texts such as non-continuous texts and digital texts, and differences in terms of scaling procedures.

The project team decided that the Bookmark method (Mitzel et al. 2012), due to its logical appeal and practicality (Karantonis and Sireci 2006), would be best suited to determining a threshold. It is one of the most frequently used standard-setting methods, and its approach involves subjective judgments to determine proficiency levels. In this method, the items are presented according to their item difficulty in an ordered item booklet (OIB), beginning with the easiest item. It is then the task of the panellists to place a cut-score “bookmark” between those items which, in their view, define the boundary between two proficiency levels (e.g. Cizek et al. 2005; Karantonis and Sireci 2006; Mitzel et al. 2012). Once the method had been chosen, the following two steps were carried out:

Step 1: Translation of the difficulty-generating factors into proficiency level descriptors (PLDs)

Based on the process model, we translated the identified difficulty-generating factors into PLDs that served as guidance for the Bookmark method (see Figure 2).

Fig. 2 Steps for developing proficiency level descriptors (PLDs)

As shown in Table 1, the PLDs differentiate among the dimensions of the difficulty-generating factors. Since we expected the reading abilities of participants who were classified as literate readers to cover a wide range of achievements, we decided not to compare what individuals at the lowest proficiency level and those at the next-highest proficiency level could do, but instead to contrast what low-literate readers can probably do with what they probably cannot do.

Table 1 Proficiency level descriptors (PLDs) of low-literate adolescents and adults within NEPS

Step 2: Determining a proficiency level with the Bookmark method

First, we ordered the reading items in an OIB according to their item difficulty in the IRT models (Haberkorn et al. 2012; Hardt et al. 2013; Koller et al. 2014), beginning with the easiest item. Next, we briefed the panellists on how to set the cut score by explaining the Bookmark method and discussing the PLDs. The panel consisted of experts, including test developers and professionals, who were working with large-scale assessments and reading comprehension tests in Germany. After this training, the Bookmark method was applied in three rounds (Cizek et al. 2005; Mitzel et al. 2012). In each round, the panellists repeatedly compared the reading items with the PLDs and then set a cut-score “bookmark” between those items which, in their view, defined the boundary between a low reading proficiency level and a literate reading proficiency level. The final cut score was determined in the third round. Based on this standard-setting procedure, 4.24 per cent of adolescents (SC 4, weighted) and 14.25 per cent of adults (SC 6, weighted) were assigned to the low-literacy group (for further results, see Wicht et al. forthcoming).
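To make the logic of this step concrete, the following Python sketch illustrates how a bookmark placement can be turned into a cut score and a weighted share of low-literate readers. It is not the NEPS implementation. It assumes a Rasch model, a response probability criterion of 0.67 (a convention often used with the Bookmark method), and the median as the rule for aggregating panellists' bookmarks; none of these details is specified above. All function names, item identifiers and numbers are hypothetical.

```python
import math
from statistics import median


def build_oib(item_difficulties):
    """Return item ids ordered from easiest to hardest (the OIB)."""
    return sorted(item_difficulties, key=item_difficulties.get)


def bookmark_to_theta(oib, item_difficulties, bookmark, rp=0.67):
    """Translate one bookmark into a point on the ability scale.

    `bookmark` is the 1-based position of the last item that a reader at
    the boundary is still expected to solve (assumed convention).
    Under a Rasch model, the ability at which an item with difficulty b
    is solved with probability rp is b + ln(rp / (1 - rp)).
    """
    if not 1 <= bookmark <= len(oib):
        raise ValueError("bookmark must lie within the booklet")
    b = item_difficulties[oib[bookmark - 1]]
    return b + math.log(rp / (1 - rp))


def panel_cut_score(oib, item_difficulties, bookmarks, rp=0.67):
    """Aggregate the panellists' bookmarks (assumed rule: median)."""
    return median(
        bookmark_to_theta(oib, item_difficulties, bm, rp) for bm in bookmarks
    )


def weighted_share_below(abilities, weights, cut_score):
    """Weighted proportion of persons whose ability is below the cut score."""
    total = sum(weights)
    below = sum(w for a, w in zip(abilities, weights) if a < cut_score)
    return below / total


if __name__ == "__main__":
    # Hypothetical item difficulties on the logit scale.
    difficulties = {"item_a": 0.5, "item_b": -1.0, "item_c": 1.2, "item_d": -0.2}
    oib = build_oib(difficulties)

    # Hypothetical final-round bookmarks of three panellists.
    cut = panel_cut_score(oib, difficulties, bookmarks=[1, 2, 2])

    # Hypothetical ability estimates and survey weights.
    share = weighted_share_below(
        abilities=[-1.5, 0.0, 0.8, -0.5],
        weights=[2.0, 1.0, 1.0, 1.0],
        cut_score=cut,
    )
    print(f"cut score: {cut:.2f}, weighted share below: {share:.1%}")
```

In a multi-round procedure, the same aggregation would typically be repeated after each round of discussion, with only the final round determining the reported cut score.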


This article presents an integrative process model we developed drawing on different research traditions – large-scale assessments, research on underlying cognitive processes and developmental precursors of reading (e.g. Barth et al. 2015; Gernsbacher et al. 1990; Hartig and Riekmann 2012; Kintsch 1998; Kirsch 2001; Long et al. 1999; OECD 2012, 2013). Our purpose in developing it was to obtain a comprehensive understanding of the reader-related, text-related and task-related factors that explain reading difficulties along different stages of the reading process. Hence, our process model bridges the gap between factors proven to be related to low reading proficiency in large-scale assessments and factors proven to explain reading proficiency differences based on underlying processes and developmental precursors of reading comprehension. The combination of these perspectives offers a comprehensive explanatory model for low reading proficiency that can be applied to different educational contexts.

For example, as we demonstrated in the translation of the process model into difficulty-generating factors and subsequent application of the results to the standard-setting process within NEPS (Blossfeld and Roßbach 2019), the advantage of the chosen a priori approach is that the description of the proficiency levels has a stronger theoretical basis than with a purely post hoc (retrospective) interpretation of the item characteristics used in many comparative large-scale assessments (OECD 2012, 2013; Olsen and Nilsen 2017). The connection between the a priori developed PLDs and the item difficulties, determined independently of each other, can be regarded as a construct validation (Egan et al. 2012).Footnote 19 Furthermore, the difficulty-generating factors we identified using the process model are useful for the comparison of large-scale assessments. For example, it emerges that there is a high similarity in the descriptions of the low reading proficiency levels of NEPS, PISA (OECD 2012), PIAAC (OECD 2013) and in parts of the Progress in International Reading Literacy Study (PIRLS)Footnote 20 (Bos et al. 2012). However, further research is required to validate comparability.

At the same time, the process model also provides a template for questioning how valid and reliable the assignment of test participants to the low-literacy group is. For example, we have argued that the readers’ skills, the task requirements and factors related to the text are all relevant to understanding reading difficulties among low-literate readers. To follow up on this interdependence, it is worth investigating the extent to which cognitive and language-related deficits explain assignment to the low-literacy group, especially if the readers are non-native speakers, in order to validate that a person is assigned to the low-literacy group because of difficulties in reading only (e.g. Abadzi 2008; Vágvölgyi et al. 2016).

Moreover, our process model provides some hints regarding a process-oriented assessment of a person’s reading difficulties that can guide teaching measures. Following the model, the assessment could start with an examination of a person’s reading profile to determine whether the weaknesses lie, for example, primarily in basic reading skills or rather in processing skills (e.g. working memory or strategic skills). Once there is a clear picture of the individual’s needs, the teacher could then assess which text-related and task-related factors might cause additional difficulties; for example, he or she might probe whether the difficulties result from the degree of concreteness of the reading task or from the text structure. To ensure a comprehensive assessment, it might be helpful to probe the reading difficulties with different task formats, text forms (e.g. in terms of text types or text-signalling devices) and task requirements.