1 Introduction

Spelling errors are an important source of evidence for theories of linguistic processing, especially when it comes to morphology (see e.g. Kuperman et al., 2009), as well as regarding the relationship between writing and cognition more generally (see e.g. Kuperman et al., 2009). In particular, spelling errors can help us assess the role of morphological structure in language users’ mental representations of words (see e.g. Schmitz et al., 2018; Bar-On & Kuperman, 2019; Surkyn et al., 2020).

This paper builds on this strand of research with a corpus-based study of spelling errors in handwritten German school-exit exams (K. Berg et al., 2021). Specifically, we focus on the phenomenon of letter omissions. Take, for example, the following spelling errors that are usually viewed as “slips of the pen” (e.g. Ellis, 1979), i.e., errors that depend on general cognitive factors like concentration, attention etc. In other words, the writer knows the orthographically correct form of a written word, but is – for some reason or other – not able to produce it (angle-shaped brackets indicate that we are referring to the written form).

  (1) a. <Lebewese> (instead of <Lebewesen> ‘living being’)
      b. <Gesellschaf> (instead of <Gesellschaft> ‘society’)
      c. <ermöglich> (instead of <ermöglicht> ‘enabled’)

In these cases, the attested form contains less material than what is expected given the codified standard spelling. All forms in (1) are attested in the corpus, and we would expect the respective writers to be able to rectify them with enough time and concentration. However, the forms differ regarding one crucial aspect: In (1a), it is the stem that suffers the erosion; in (1b), it is a derivational suffix; in (1c), it is an inflectional suffix. Accordingly, the question this paper addresses is: Does the type of morphological category have an effect on the probability of letter omissions? To tackle this question, we will first give a brief overview of research on spelling errors, particularly focusing on studies that take the role of morphological boundaries into account. Then we present our case study on omission errors in handwritten German school-exit exams. We will show that word-final letter omissions are particularly likely to affect inflectional morphemes.

2 State of the art

A number of studies investigating different languages have already provided evidence that morphology affects the distribution of spelling errors. Three studies are particularly relevant here, one on Hebrew, one on English, and one on Dutch:

Bar-On and Kuperman (2019) investigate the insertion of the Hebrew letter Y (י) representing the vowel /i/, which goes against Hebrew spelling norms, in data from a corpus of unedited blog texts, e.g. MYRPST <מירפסת> instead of conventional MRPST <מרפסת> /miʁpeset/ ‘balcony’. They show that Y-insertion occurs in ca. 25% of nouns with an appropriate phonological environment, especially in lower-frequency words. Importantly, the occurrence of this error was much less likely if it would disrupt a morphological boundary.

Gahl and Plag (2019) show that the strength of morphological boundaries affects misspellings of derivationally complex words in English. Drawing on Twitter data, they investigate cases in which the target suffix is replaced by a similarly-sounding suffix, e.g. <segmentible>, <bettermint>. Their concept of morphological boundary strength is taken from Hay & Baayen (e.g. 2005), who argue that the strength of morphological boundaries is influenced by multiple factors including morphological productivity, semantic transparency, and the frequencies both of the base and of the derived word. Consider, for example, the allomorphs -able and -ible: The boundary strength of the former is considered to be higher as it attaches to a wider range of bases and is highly productive (Gahl & Plag, 2019: 8). Their results provide tentative evidence that low segmentability is associated with increasing spelling difficulty, or conversely, that stronger morphological boundaries facilitate standard spellings.

Schmitz et al. (2018) single out analogy as one influential factor for the distribution of spelling errors in Dutch Twitter data: For example, they find that the third person singular form <beantwoordt> ‘answers’ is often misspelled <beantwoord>, which can be explained as an analogical extension of -<d> as found in other members of the paradigm, e.g. <beantwoord> (1SG), <beantwoorden> (infinitive), or <beantwoordend> (present participle). This is an important result for understanding how morphologically complex words are produced in (spontaneous) writing, as it provides further evidence for the idea corroborated by the results of Ernestus and Mak (2005), who showed that “people prefer analogy to the inflectional paradigm.” (Schmitz et al., 2018: 120)

These three studies indicate that morphology seems to play a major role in the distribution of spelling errors, but that it is also closely intertwined with other, usage-based factors. Firstly, spelling errors seem to “respect” morphological boundaries, which indicates that writers are aware of these boundaries. Apart from the boundaries themselves, the strengths of the boundaries have an effect on the occurrence of spelling errors. But apart from the morphological make-up of words, factors like frequency and analogy have an effect on the distribution of errors. In sum, these results suggest that some morphological units are more prone to spelling errors than others, partly depending on the degree to which they are perceived as separate units (rather than as parts of the words in which they occur).

Further evidence for the role of morphological boundaries in written language production comes from experimental studies that investigate typewriting in German: Weingarten et al. (2004) found prolonged keystroke latencies at combined morpheme and syllable boundaries compared to the same letter combinations in other positions. For example, typists needed on average 110 ms longer to type the digraph <ls> in <Roll-schuh> ‘roller skate’ or <Schaukel-stuhl> ‘rocking chair’ as compared to e.g. <Hal-stuch> ‘scarf’ or <fal-sch> ‘wrong’. While pure syllable boundaries also led to increased latencies, they were considerably lower than the ones for combined morpheme and syllable boundaries (ca. 70 ms). They also found that “[t]he intervals were longer when the digraph spanned the boundary between two stem morphemes (e.g. <n-e> in <Korn-ernte> ‘corn harvest’) than between two derivational morphemes (e.g. in <an-erkennen> ‘acknowledge’)” (Weingarten et al., 2004: 540). Interestingly, they found similar results when analyzing the number and duration of pen lift-offs in handwriting (Weingarten et al., 2004: 540). They also took word and base frequency effects into account, showing that word frequency affects the duration of interkey intervals at combined syllable/morpheme boundaries independently of the respective base frequency (Weingarten et al., 2004: 542). This leads them to the conclusion that “complex words are not composed during typing” (Weingarten et al., 2004: 542).

Another strand of evidence comes from dysgraphic patients. Badecker et al. (1996) present data from a dysgraphic individual who makes (phonologically plausible) mistakes on irregularly spelled affixes but not on irregularly spelled stems (e.g. surfed as <sourfed>). This strongly indicates that different morpheme types differ in their cognitive representation (Rapp & Fischer-Baum, 2014). Rapp and Fischer-Baum (2014) therefore assume that together with letters and digraphs, morphemes constitute an important unit of graphemic representation. Rapp et al. (2015) propose what they call the “morpho-orthography hypothesis”. According to this hypothesis, “in addition to morpho-phonological processes, there are morpho-orthographic ones that operate over morphologically complex orthographic representations.” (Rapp & Damian, 2018: 412) Research on neurotypical individuals also underlines the role of morphology in written language production, with morphological structure even affecting the material realization of written words: For instance, a handwriting study by Kandel et al. (2012) shows that letter stroke duration and inter-letter intervals are longer for (French) suffixed words such as pruneau ‘prune’ than for pseudo-suffixed words such as pinceau ‘brush’.

Both the three corpus-based studies reviewed above and experimental approaches like the one pursued by Weingarten et al. (2004) or Kandel et al. (2012) fill an important gap in previous research as they focus on spelling: Rapp and Fischer-Baum (2014: 345) point out that “[t]here have been hundreds of studies examining morphological decomposition of written words in reading. In contrast, there has been only a handful examining this issue in spelling.” Nevertheless, research on reading has also yielded a number of findings that are highly relevant for the research questions addressed in the present paper. For example, Bredel et al. (2013) show that reading experience determines morphological processing in German. Stems of German words are usually spelled in a uniform way: <schaffen> ‘to manage’ is spelled with <ff> for phonological reasons, but the doubled consonant is carried over to monosyllabic forms within the inflectional paradigm (e.g., <schafft> ‘manages’). Bredel et al. (2013) demonstrate that less skilled readers do not detect morphological spelling errors. In their reaction time experiment, less skilled readers’ reaction times to sentences with misspelled words like <SCHAFT MAN DAS?> ‘Can one manage this?’ did not differ significantly from those to the morphologically uniformly spelled version <SCHAFFT MAN DAS?>, while for more experienced readers, the morphologically deviant version led to an increased reaction time. This suggests that language users’ experience with written language affects their sensitivity to regularities in word spelling that are related to morphology (on the so-called morphological principle in spelling, see e.g. Sandra, 2019: 155–157).

Note that many of the studies reviewed above focus on errors in typewritten language – i.e. on what T. Berg (2002) calls “slips of the typewriter key”. Results that have been obtained on typewritten language cannot necessarily be transferred to handwriting in a simple way. T. Berg (2002: 186) points out that typing is a discrete activity at the level of the keystroke, while handwriting is more continuous; as such, it is “not entirely obvious how similar the mental processes underlying typing and writing are.” (T. Berg, 2002: 186) This may also explain one of the key findings of his corpus-based study: Contextual slips, i.e. errors that can be explained as anticipation or repetition of letters from the surrounding context (e.g. <rebember> instead of <remember>), outnumber noncontextual ones in typing, handwriting, and speaking, but typing shows the highest rate of noncontextual errors (see T. Berg, 2002: 189).

These considerations suggest that while typewriting – especially spontaneous typewriting, for which the blog and Twitter data used in previous studies offer a good proxy – can provide an important window to insights about the role of morphology in writing, taking a closer look at handwritten language can offer important additional insights. Even spontaneously typewritten text is usually produced with a considerable amount of conscious planning, and even when a Tweet or a blog post is written in a very fast manner, skilled writers can correct many spelling errors they may have made before publishing it. In handwritten texts, by contrast, errors usually have to be corrected by e.g. striking out words. Thus, handwritten texts arguably provide a more direct window to cognition and an ideal testing ground for research on “morpho-graphemics”. The study discussed in the present paper represents a first step towards filling this research gap.

3 Material and methods

We draw on the GraphVar corpus (K. Berg et al., 2021), a corpus of handwritten German A-level exams. It comprises 1,667 exams, covering the time span from 1923 to 2018. For the present, synchronically-oriented study, we use the 667 texts from 1990 onwards. The data have been transcribed, POS-tagged employing the STTS tagset (Schiller et al., 1995), lemmatized, and annotated with “target hypotheses” by trained annotators. The target hypothesis layer represents the orthographically correct form of each token. Note that the official, codified norm is variable; it changed in 1996 and again in 2006. The target hypotheses are always based on the norms at the time when the exam in question was written.

For each combination of actual spelling and orthographically correct spelling, we semi-automatically filtered out the ones containing omissions, i.e. cases where the actual spelling contains fewer letters than the orthographically correct spelling. We further categorized these cases according to the position of the omission, and the constituent type containing it. We used a very broad classification into stems, inflectional and derivational affixes. We focus on nouns as the largest word category in the corpus (type-wise).

We employed the annotation layers “IST” (actual appearance of the token in the exam) and “IST_Ziel” (representing the target hypothesis) to automatically extract all tokens deviating from the expected (correct) form, yielding more than 50,000 errors of different types. We discarded the largest group of errors, punctuation mistakes, as irrelevant for the given research question. We then limited the set of errors to nouns excluding proper nouns by filtering the remainder of observations for the STTS tag “NN”, turning up 14,200 instances of misspelled nouns. We discarded all capitalization errors and hyphenation errors in compounds and incorrectly separated detachable verb particles from this set. In a final filter step, we limited our analysis to tokens on the “IST” level containing fewer characters than expected as in the “NORMAL” annotation layer, thereby excluding the error types of insertion and replacement. The remaining set consisted of 919 nouns with initial, medial, and final character omissions. To keep the manual annotation work feasible, our manual analysis, reported on in more detail below, focused particularly on word-final omissions of individual letters and letter pairs. Two words were omitted from the analysis as they alternate between two different stem forms: Friede/Frieden ‘peace’ and Glaube/Glauben ‘faith’.
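The final filter step and the positional coding can be illustrated with a minimal Python sketch. It is not the extraction pipeline actually used for GraphVar; the function name `classify_omission` and the tie-breaking rule (ambiguous cases are counted as word-final first) are assumptions made for illustration only. The sketch handles a single contiguous omission and labels everything else as "other".

```python
from typing import Optional


def classify_omission(actual: str, target: str) -> Optional[str]:
    """Classify a contiguous letter omission by its position in the word.

    `actual` is the attested spelling, `target` the orthographically
    correct form. Returns None if `actual` is not shorter than `target`
    (i.e. the token is not an omission candidate).
    """
    k = len(target) - len(actual)  # number of missing characters
    if k <= 0:
        return None
    # Assumption: when several analyses are possible, prefer "final".
    if target.startswith(actual):
        return "final"
    if target.endswith(actual):
        return "initial"
    # Try deleting k adjacent characters at every word-internal position.
    for i in range(1, len(actual)):
        if target[:i] + target[i + k:] == actual:
            return "medial"
    # Shorter than the target, but not a single contiguous omission
    # (e.g. an omission combined with a replacement).
    return "other"


if __name__ == "__main__":
    for actual, target in [("Lebewese", "Lebewesen"),
                           ("Geselschaft", "Gesellschaft")]:
        print(actual, target, classify_omission(actual, target))
```

Tokens labelled "other" would still need manual inspection, which is one reason why a semi-automatic rather than fully automatic procedure is plausible here.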

Drawing on these data, we took a closer look at the following aspects, which we will elaborate on in more detail one by one in Sect. 4: First, taking all one-letter omission errors into account, we checked the positions of spelling errors in each word (Sect. 4.1). Second, we focused on word-final one-letter and two-letter omissions. Apart from assessing the frequency of word-final one-letter and two-letter omissions, we also took a closer look at the morphological category – stem, inflectional suffix or derivational suffix – of the segment to which the omitted letter(s) belong(s), comparing the distribution to the distribution of morphological categories among the correctly spelled words (Sect. 4.2). Zooming in on omissions of letters that belong to inflectional suffixes, we also checked the type of inflectional category (case or number marker) to which the omitted letter(s) belonged, comparing the distribution to the distribution of case and number markers in present-day German as attested in the morphologically annotated Tüba/DZ corpus (Telljohann et al., 2005) (Sect. 4.3). In a final exploratory step, we compared the synchronic data to earlier (pre-1970) data from the GraphVar corpus, as there is some evidence in the data that the earlier exams were more carefully planned and redacted by the writers than the later ones, which offers an interesting additional point of comparison.

4 Results

4.1 Distribution of omissions

For the present study, we will largely focus on word-final omissions. One reason for this decision is that an exploratory analysis of our data revealed that one-letter omissions (which are the most frequent type of omission) tend to occur at the right boundaries of words. Figure 1 shows the overall frequency of omissions for words with 3, 4,…, 22 letters, taking all instances of words into account that differ in one single letter from their target hypothesis. While this can of course only give a coarse-grained overview of the data because it neglects differences in type and token frequency, the skew to the right-hand side becomes relatively clear. Following an anonymous reviewer’s suggestion, we have also coded the data (919 types) for the morphological category of the segment the omitted letter is part of. Unsurprisingly, word-final omissions are usually (parts of) inflectional morphemes, e.g. <Akzent> instead of <Akzents> ‘accent (genitive)’ or <Ängst> instead of <Ängste> ‘fears’. Thus, the skew to the right-hand side in the distribution of omission errors seems to be largely due to errors affecting the written representation of inflectional morphemes.

Fig. 1 Frequency of single-letter omissions in nouns with n letters (from n = 3 to n = 22) at different positions in the GraphVar corpus, with the linguistic category of the segment to which the omitted letter belongs

In all words (save for the four longest words in Fig. 1, see last row), the word-final position is the most frequent position for omissions. For words of up to nine letters, omissions at the word-final position are more frequent than omissions at all other positions combined. Word-initial omissions, on the other hand, are not attested at all in our data, and word-medial omissions show no clear distributional pattern when we compare words of different length. This is an interesting difference from the results of Wing and Baddeley (2009), who investigated English data and showed that errors usually occur in the middle of the word. The distribution of morphological categories in Fig. 1 points to an explanation for this difference: English has notoriously little inflectional morphology, and in our German data, it is mostly inflectional morphemes that boost the frequency of omission errors at the right-hand side of words.

Because they are the most frequent pattern, we will focus on word-final omissions in the remainder of this paper. More specifically, as mentioned above, we will only take omissions of one or two letters into account at this point, i.e. we will focus exclusively on the omission of word-final unigrams and bigrams.

4.2 Word-final omissions

There are 393 omissions of single word-final letters in our data, and 156 omissions of word-final bigrams. To determine whether any distribution we find is actually meaningful, we need more information about the general distribution of word endings: How often is the last letter of a random noun part of the stem, an inflectional or a derivational suffix? Or, put differently: How likely is it for a random noun to end with a stem? We gauge this baseline distribution by collecting all nouns in the corpus (from all texts written after 1990) and manually classifying them according to the morphological type of their last letter. Table 1 shows the distribution of the single final letter omissions across the three morphological categories, as well as the general distribution of these categories.

Table 1 Number of omissions of single final letters in nouns and distribution of these final letters in nouns as a baseline, both according to morphological type. Data base: Texts from the GraphVar corpus >1990

85% of missing final letters are part of inflectional suffixes, but only 30% of nouns (as running words in texts) end with inflectional suffixes.

We fit a mixed-effects zero-inflated negative binomial model to the data, using the number of errors as the response variable. Zero-inflated negative binomial regression (see e.g. Hilbe, 2011) was chosen because we are dealing with count data with excess zero counts (after all, many more words are written correctly than incorrectly, so most lexemes in our dataset are never misspelled). Morphological category (stem, inflectional suffix, derivational suffix) and word form type frequency as gauged from the DWDS Core Corpus of the 20th Century were added as predictor variables. We also added an interaction between category and frequency, as it seems plausible to assume that category and frequency might work in tandem in determining the presence or absence of spelling errors. Finally, we used random intercepts for the individual documents in which each word occurred in order to account for potential writer-specific patterns, as well as for the individual word form types to account for lemma-specific effects. The total frequency of the word form type in the document was added as an exposure variable. For fitting the model, the glmmTMB package (Brooks et al., 2017) for R (R Core Team, 2023) was used.
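For readers less familiar with this model family, one common way to write down a model of the kind described above is sketched below. Several details are assumptions that the text does not specify: a constant zero-inflation probability \pi, the quadratic-variance negative binomial parameterization with dispersion \theta, and the frequency predictor f_w entering on whatever scale was used for fitting. Here d indexes documents, w word form types, Y_{dw} is the number of omission errors, and n_{dw} is the exposure (the frequency of w in d).

```latex
\begin{align*}
P(Y_{dw} = 0) &= \pi + (1-\pi)\left(\frac{\theta}{\theta+\mu_{dw}}\right)^{\theta},\\
P(Y_{dw} = y) &= (1-\pi)\,\frac{\Gamma(y+\theta)}{\Gamma(\theta)\,y!}
  \left(\frac{\theta}{\theta+\mu_{dw}}\right)^{\theta}
  \left(\frac{\mu_{dw}}{\theta+\mu_{dw}}\right)^{y},
  \qquad y = 1, 2, \dots\\
\log \mu_{dw} &= \beta_0 + \beta_{\mathrm{cat}(w)} + \beta_{f}\, f_w
  + \beta_{\mathrm{cat}(w)\times f}\, f_w + u_d + v_w + \log n_{dw},\\
u_d &\sim \mathcal{N}(0, \sigma_u^2), \qquad v_w \sim \mathcal{N}(0, \sigma_v^2)
\end{align*}
```

The term \log n_{dw} enters with a fixed coefficient of 1, so that the remaining terms model the error rate per occurrence rather than the raw error count.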

As Table 2 as well as the effects plot in Fig. 2 show, the morphological category of inflection as well as the lemma frequency of the word in question emerge as highly significant predictors. The interaction of category and frequency is not identified as a significant predictor, and the model with the interaction does not perform significantly better than a model without the interaction. The former observation is in line with our hypothesis: Missing final letters in German nouns are much more likely on inflectional suffixes as opposed to stems or derivational suffixes. Frequency effects can also be expected from a usage-based perspective: it makes intuitive sense that lexemes that occur less frequently are more error-prone than lexemes that language users encounter on a more regular basis.

Fig. 2 Effects plot for the two fixed effects in our model, “Category” and “Frequency”. The effects plot was created using the package effects (Fox & Hong, 2009)

Table 2 Coefficients of the fixed effects in the regression model

Turning to word-final bigrams, their omission is less frequent, but the pattern is even more striking, as Table 3 shows.

Table 3 Omissions of final letter bigrams in nouns. Data base: GraphVar corpus (exams after 1990)

The ratio of omissions on inflectional suffixes is even higher than in the case of single final letters. And what is more important, we find no instance of an omission crossing a morphological boundary. If final bigrams were randomly left out, we would expect to find at least some cases like <Zustan> instead of <Zustands> (‘state-GEN’), where both the stem final <d> and the inflectional <s> are missing. Instead, omitted bigrams seem to respect morphological boundaries.
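The boundary criterion can be made explicit with a small Python sketch. It assumes that each target form comes with a manual morphological segmentation (the list representation and the function name `crosses_boundary` are hypothetical and not part of the GraphVar annotation): a word-final omission crosses a boundary if it removes more letters than the last segment contains.

```python
from typing import Sequence


def crosses_boundary(segments: Sequence[str], n_omitted: int) -> bool:
    """Return True if omitting the last `n_omitted` letters of the word
    removes material from more than one morphological segment.

    `segments` is a (hypothetical, manually provided) segmentation of the
    target form, e.g. ["Zustand", "s"] for <Zustands>.
    """
    target = "".join(segments)
    if not 0 < n_omitted < len(target):
        raise ValueError("n_omitted must be between 1 and len(target) - 1")
    # The omission stays inside the final segment as long as it is
    # no longer than that segment.
    return n_omitted > len(segments[-1])


if __name__ == "__main__":
    # <Zustan> for <Zustands>: stem-final <d> and inflectional <s> are missing.
    print(crosses_boundary(["Zustand", "s"], 2))
    # <Beziehung> for <Beziehungen>: only the suffix is missing.
    print(crosses_boundary(["Beziehung", "en"], 2))
```

Under this definition, the finding reported above corresponds to `crosses_boundary` returning `False` for every attested bigram omission.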

4.3 Morphological categories

We also coded the spelling errors for the kinds of suffixes that the missing final letters are part of: What kind of suffixes are they, and what morpho-syntactic categories do they encode?

German nouns are inflected for case and number. There are two numbers, singular and plural, and four cases, Nominative, Accusative, Dative and Genitive. There are widespread syncretisms within the paradigms, depending on grammatical gender and inflectional class, which entails that in many inflectional paradigms, the same form occurs in multiple cells, cf. e.g. Beziehungen ‘relationships’, which can be both nominative and dative plural. But in such cases, the information about case is usually encoded elsewhere, e.g. in the determiner: die Beziehungen (nominative) vs. den Beziehungen (dative). Hence, we would categorize -en as a number suffix, rather than a case suffix.

Three different letters are affected on nominal inflectional suffixes, -e, -s, and -n. Consider the following examples (with the missing letter underlined):

  (2) [examples (2a–f): figure c]

The suffix -e is usually a plural marker (2a), with the exception of dative -e as in dem Hunde ‘the dog’ (largely obsolete in present-day German; dem Hund would be the default option), but the letters -s and -n can be part of a plural marker (2c and e) as well as of a case marker (2b, d and f). For -s, the standard case in our data is clearly case marking (2b); there are only two instances of plural marking as in (2c). For -n, the majority of instances also mark case, but one third of them mark number (Table 4).

Table 4 Single final letter omissions of -e, -s and -n as part of inflectional suffixes, categorized according to the feature type that is marked (number vs. case)

Overall, case marking is dominant in our data on final letter omissions. Omissions of the final letter of a case marking suffix are about four times as frequent as omissions of the final letter of a number marking suffix.

Of course, the absolute numbers can be misleading without a comparison to non-omitted forms. As a baseline, we use morphosyntactically annotated data from the Tüba/DZ corpus (Telljohann et al., 2005). A CoNLL-formatted treebank based on German newspaper texts from 1989 to 1999, Tüba/DZ comprises roughly 1.8 million tokens in about 3,600 newspaper texts on different topics (see Buchholz & Marsi, 2006 for details about the CoNLL annotation standard). In the absence of a morphological layer in GraphVar at its current state of gold-standard annotation, Tüba/DZ appears to be a reasonably comparable corpus with respect to domain and register to obtain distributions of case and number markings. Comparing the distributions of open word class part-of-speech tags further supports this claim of structural resemblance. Table 5 shows that even in terms of word class distribution, Tüba/DZ and GraphVar are comparable to a considerable extent.

Table 5 Relative word class frequencies in Tüba/DZ and GraphVar

To assess how many words that end with <-s> and <-n> there are (where this letter is not part of the lemma), and whether this letter is part of a case or a number marking, we make use of the morphological annotation layer in Tüba/DZ (Table 6).

Table 6 Absolute and relative numbers of case and number marking of German nominal suffixes -s and -n. Database: Tüba/DZ corpus

The distribution is very similar; number marking is the dominant function of both letters. That in turn means that the distribution of omissions we find in Table 4 is not the result of a random process: Case marking is comparatively rare in general, but seems to be the major category as far as omissions are concerned. As a tentative interim conclusion, we can therefore state that in our data, case marking is more prone to omissions compared to number marking.

4.4 Level of attention

Older exams until well into the 1970s show special characteristics on the formal side. The script is often very regular, and there are few deletions, insertions, or crossed-out words. Older texts were regularly composed on a separate sheet (which contains these modifications), but were then carefully copied to the final set of sheets. The legibility and aesthetic appeal of the script were explicitly and regularly commented upon by the grading teacher; these comments are attested in the GraphVar data until the 1980s.

These older texts (we use 1970 as a relatively arbitrary cut-off) are thus very interesting for our investigation: Carefully copying an existing text means reading it, and then re-writing it. That means that potential errors that the writers were aware of could be caught. What does that mean for word-final omissions? If attention (i.e. re-reading the texts) helps to catch these errors, we would expect fewer errors before 1970 than after 1990 (a second arbitrary cut-off to keep the groups distinct). While we cannot be sure how attentive the writers were in each case, what we do know is that they had to re-write every word. For later texts, this does not hold; the first version of a word is also the last in most cases (save for self-corrections). Thus, we end up with an (albeit somewhat indirect) measure of attention. Table 7 shows the distribution.

Table 7 Number of single letter word-final omissions, nouns, and amount of nouns with omissions in texts before 1970 and after 1990. Data base: GraphVar corpus

The ratio of nouns with final omission is 38% higher in the later texts (of which we can expect that they have been written less carefully). Attention may indeed play a role in catching some of the errors. However, the overall numbers are relatively low, so we have to be careful in interpreting the difference in Table 7.

With respect to the word-final omissions, the distribution of morphological categories is remarkably similar between older and newer texts, as Table 8 shows.

Table 8 Number of single final letter omissions in texts before 1970 and after 1990, according to morphological category. Data base: GraphVar corpus

In both data sets, word-final omissions are most frequent among inflectional suffixes. That indicates that the pattern is stable and independent of the level of attention.

5 Discussion

In sum, our results show that omissions in handwritten German texts exhibit a number of interesting patterns:

  • They are much more frequent on inflectional suffixes when compared to stems or derivational suffixes; this holds for single final letters, and even more so for bigrams

  • Omitted bigrams respect morpheme boundaries

  • Among the inflectional suffixes that had a missing final letter, case is disproportionally more affected by omission compared to number.

  • More carefully redacted texts show fewer omissions, but the relative proportions of morphological types do not change.

How can we interpret these observations? Apparently, missing final letters are not randomly distributed. They depend on the type of their morphological unit. That in turn means that until the very last stages of planning, this morphological information must be represented in the mind of the writers. Morphological information must be present in a buffer similar to the “articulatory buffer” in which Levelt (1989) assumes that the pieces of the “phonetic plan” are stored.

The overrepresentation of dedicated case affixes as opposed to number affixes lines up with a number of different approaches. From a structuralist perspective, Eisenberg (2020: 162) observes that number is the dominant and hierarchically higher category when compared with case. From a cognitive perspective, our results are also in line with accounts of the diachronic change of German inflection classes adopting Bybee’s (1985) relevance hierarchy. For example, Kürschner (2008: 25) posits that in the case of nouns, number is more “relevant” than case, i.e. leads to a stronger modification of the corresponding semantic concept. Taking the word Bär ‘bear’ as an example, it makes a huge difference whether there are one or more of them; if we use the plural form die Bären, this means that we are talking about more than one animal. By contrast, if we use, say, the dative form dem Bären ‘the bear-DAT’, the concept itself does not change; we are still talking about the same animal. From this point of view, it is not surprising that case markers should be affected by omissions to a higher degree than number markers. The distinction between case and number is also mirrored in Booij’s (1995) morphological-theoretical distinction between inherent and contextual inflection. Contextual inflection is determined by syntactic necessities; examples are case for nominal elements (dictated by, e.g., a verbal head) or person/number for verbs. Inherent inflection like number, on the other hand, is determined solely by semantic requirements: We want to talk about Bären ‘bears’ and not about a singular Bär.

Booij (1995) argues further that inherent inflection is similar to derivation in a number of ways. Our data support his view: With regard to omissions, number (as an inherent category for nouns) behaves much like derivational suffixes, and unlike case (as a contextual category).

As shown in our review of previous research, another strand of evidence comes from dysgraphic patients: Badecker et al. (1996) present data from a dysgraphic individual who can correctly spell stems, but not affixes. That means both types of morphological units must be stored and/or processed separately, at least to a certain extent.

An open question is whether attention or self-monitoring is subject to the same asymmetry between morphological categories. It could theoretically be the case that final omissions on stems and inflectional suffixes are equally distributed, but that we simply catch more stem-final omissions over the course of re-reading our texts. That would lead to the same distribution described above, with omissions on inflectional suffixes dominating among the errors that remain uncorrected. Tackling this question is rather straightforward – for example in an experimental set-up where individuals are presented with a text containing errors. Does the asymmetry also show when they are correcting texts?

It can be argued that the self-monitoring of written texts is inhibited by top-down processes – after all, writers know what they intended to write, and this knowledge may prevent them from actually perceiving the errors. It would accordingly be interesting to test how long this effect lasts. If we give writers their own texts back for proofreading a day, a week, a month, or even a year after initial composition – how many errors will they catch? This is closely connected to another issue that we have neglected for the purposes of our study: There is a broad consensus in research on written language that not all spelling errors are alike. Instead, we should at least distinguish between “competence errors” and “performance errors” (Ellis, 1979) – i.e. errors that a writer makes because they do not know the correct form on the one hand, and true “slips of the pen” on the other. This distinction is of course fairly hard to operationalize: for high-frequency words, it would make sense to check whether the correct spelling of a misspelled word is attested somewhere else in the same exam, which would be a strong indicator that we are dealing with a performance error, while multiple attestations of a misspelled form in the same exam could be seen as strong evidence in favor of a competence error. But in many cases, misspelled lexemes are only attested once in an exam, which makes it impossible to use this proxy to distinguish between the two types of error.
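The proxy described above can be sketched in a few lines of Python. This is a purely illustrative sketch and not a procedure applied in the present study: the function name `classify_error`, the three labels, the case-insensitive matching, and the decision to treat conflicting evidence as undetermined are all assumptions made for the sake of the example.

```python
from collections import Counter


def classify_error(misspelled: str, target: str, exam_tokens: list[str]) -> str:
    """Label one misspelling within a single exam (hypothetical proxy).

    exam_tokens: all word tokens of the exam, including the misspelled one.
    Matching is case-insensitive, which is a simplifying assumption.
    """
    counts = Counter(token.lower() for token in exam_tokens)
    correct_elsewhere = counts[target.lower()] > 0
    repeated_misspelling = counts[misspelled.lower()] > 1

    # Correct form attested in the same exam, misspelling only once:
    # indicator of a slip of the pen.
    if correct_elsewhere and not repeated_misspelling:
        return "performance"
    # Misspelling attested several times, correct form never:
    # indicator that the writer does not know the correct form.
    if repeated_misspelling and not correct_elsewhere:
        return "competence"
    # Single attestation or conflicting evidence (assumption): no decision.
    return "undetermined"
```

For an exam containing both <Gesellschaft> and a single <Gesellschaf>, the sketch returns `"performance"`; for a lexeme that occurs only once in the exam, it returns `"undetermined"`, which is exactly the limitation noted above.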

Another aspect that we have not taken into account so far is the position of a misspelled word within the sentence (phrase- or sentence-initial, -medial or -final). This too could have an effect, since the omissions could be explained by ‘overwriting’ the buffer with new information, as e.g. in Kandel’s (2023) APOMI model. In this model, information flows continuously and in a top-down manner from lexical processing to the motor program. Words, morphemes, and graphemes are passed on to the motor program simultaneously. Writers start writing before the end of the respective word is passed on to motor control. We assume that information about the next word is accordingly activated before the preceding word is spelled out completely, which in turn could result in information about the last letter being ‘overwritten’ with information from the next word. It could be the case that the proclivity to do so is higher when we are busy planning the rest of the sentence, eager not to forget the sequence of words that came to our mind. In that case, we would expect more final omissions on sentence-internal words as opposed to sentence- (or even paragraph-) final words.

Another important open question is of course to what extent the omission patterns in our data are specifically graphemic in nature. There might be good reasons to assume that many of them are not modality-specific, and follow-up studies comparing spoken and written language would be necessary to tell apart “slips of the pen” from “slips of the tongue”. But for testing hypotheses about the representation of morphological knowledge, this aspect is of minor importance.

Summing up, then, our study provides further evidence for the key role of morphological units in the distribution of spelling errors, which in turn has consequences for theories of morphological storage and processing. While the pilot study presented here is certainly far from answering all pertinent questions, we hope that it provides a valuable starting point for further in-depth corpus-based research on the interface between morphology and writing.