1 Younger Readers

Research with young children is of interest in the present connection. First, as novice readers, young children may be disproportionately sensitive to variations in typographic variables such as the presence or absence of serifs. Second, the preparation of educational material in either serif typefaces or sans serif typefaces might affect the facility with which children acquire the ability to read using such material.

Children’s books were generally printed in serif typefaces until the 1930s (Walker & Reynolds, 2003). A notable exception was the educational material published by Nellie (Ellen) Dale in collaboration with the artist Walter Crane in the United Kingdom in the 1890s and early 1900s (Dale, 1902b, 1903; for discussion, see Brockington, 2012). Young children were introduced to individual letters and combinations of letters printed in a sans serif typeface and carried out various exercises that involved writing on blackboards or slates using coloured chalk as well as reading letters aloud (Dale, 1903, pp. 15–18). They were next introduced to a series of readers or “primers” containing groups of new words of increasing phonological complexity that were once again presented in a sans serif typeface. Each group of new words was then used in a short narrative that was printed in a serif typeface and accompanied by a relevant illustration. Examples are shown in Fig. 7.1 (see also Dale, 1902a; Walker, 2013, pp. 88–89). Dale’s work was mentioned by Burt (1959, p. 8; 1960) in reviews of the literature, but her work now seems to have been almost completely forgotten.

Fig. 7.1

Two pages from one of Dale’s Readers. From The Dale readers: Infant reader (new ed.), by N. Dale. 1902. George Philip & Son. In the original, the illustrations and some letters in the sans serif headings were rendered in colour. Reproduced by kind permission of the Philip’s Division of Octopus Publishing Group and Sue Walker from http://www.bookdata.kidstype.org/database/database/getImage?id=1018

Subsequently, many educational publishers produced their own typefaces consisting of “infant characters” modified for novice readers. For instance, the loops in characters such as a and g are only partially circular in sans serif typefaces (see the right-hand panel of Fig. 1.1 in Sect. 1.2) but are often more completely circular when rendered as infant characters. The assumption appears to have been that children should learn the simpler shapes of letters printed in a sans serif typeface in developing their handwriting before learning more complex shapes printed in a serif typeface (Coghill, 1980; Walker, 2013, pp. 31–35 et passim; Watts & Nisbet, 1974, p. 33). (This is, of course, exactly the pedagogical approach that had been promoted by Dale.) As a result, teachers nowadays tend to prefer sans serif typefaces over serif typefaces (Raban, 1984; Walker & Reynolds, 2003), and many books for younger readers are printed in sans serif typefaces (Bluhm, 1991; Walker, 2013, pp. 33–35).

2 Burt and Kerr’s Research

One study often cited in support of the idea that serif typefaces are more legible than sans serif typefaces (e.g., Gallagher & Jacobson, 1993; Schriver, 1997, p. 274) was described by a psychologist, Cyril Burt, who collaborated with a physician, James Kerr. The research involved ten groups of children, each consisting of 15 boys and 15 girls aged 10–11 years. It was apparently carried out between 1913 and 1917, but the findings were not reported until after Burt’s retirement in the 1950s (Burt, 1959; Burt et al., 1955). The final experiments used ten different serif typefaces. Burt et al. (1955) mentioned that they had carried out initial experiments with four other typefaces, including a sans serif typeface, but that these had proved “much inferior” to those that had been selected (p. 32). Burt (1959) explained that “in our own early experiments Dr Kerr and I [7] found almost at once that, for word recognition, a sans serif type face was the worst of all” (p. 9).

The reference in square brackets was to Kerr’s (1926) account of his own findings. Kerr mentioned the possibility of using letters in a sans serif typeface, but he simply asserted, without argument or evidence, that “owing to irradiation they are not as legible as letters with thicker ends” (p. 552). (On the topic of irradiation, see Sect. 1.2.) No further information was provided regarding experimental comparisons between serif and sans serif typefaces in either Burt’s account or Kerr’s. Moreover, Hearnshaw (1979, pp. 227–261) concluded that there were serious doubts about the validity and authenticity of the data presented in Burt’s publications, while Hartley and Rooum (1983) argued that this was true in particular of his work on typography. In fact, Burt et al. (1955, p. 32) claimed that the sans serif typeface which they had used was Gill Sans (see also Burt, 1959, p. 9), but this was not made available until 1928 (Kinross, 1992, pp. 62–63), which was more than a decade after Burt and Kerr had supposedly carried out their research.

3 Zachrisson’s Research

The first systematic analysis of the legibility of serif and sans serif typefaces among children was carried out by Zachrisson (1965, pp. 97–108). Groups of 36 boys in Grade 1 (aged 7–8 years) were drawn from two Swedish schools. At one school, reading instruction was based on material printed in a serif typeface; at the other school, it was based on material printed in a sans serif typeface. The boys were tested individually and were asked to read aloud two pages of text. They were randomly assigned to receive text that was printed in one of two serif typefaces (Bembo or Nordisk Antikva) or one of two sans serif typefaces (Gill Sans or Mager Konsul). Zachrisson analysed the number of errors made when reading the text. There was a significant variation among the four typefaces, but the overall difference between the serif typefaces and the sans serif typefaces was not significant, and the difference between the two schools (and hence the typefaces used for reading instruction) was not significant.

In a subsequent experiment (pp. 109–115), Zachrisson drew a sample of 24 boys and 24 girls from Grade 4 (aged 10–11 years) of the school where instructional material was printed in a sans serif typeface. They were asked to read silently two different passages, but their reading was interrupted from time to time by comprehension questions. Once again, the passages were printed in one of two serif typefaces (Bembo or Fairfield) or one of two sans serif typefaces (Fin Grotesk or Gill Sans). Each participant read one passage in a serif typeface and the other in a sans serif typeface. There was no overall difference in the time taken to read each passage between the serif typefaces and the sans serif typefaces, and there was no significant variation among the four typefaces.

For his next experiment (pp. 115–121), Zachrisson presented individual words in isolation by means of a tachistoscope; alternate words were presented in a serif typeface (Mediaeval) and in a sans serif typeface (Mager Futura). The sample consisted of 12 boys in Grade 1 at each of the schools involved in the first experiment, together with six boys and six girls in Grade 4 at the school involved in the second experiment. The words were presented for 40 ms to the younger children and for 20 ms to the older children. Zachrisson analysed the number of words correctly reported, with partial credit for words that were incorrectly reported. There was no significant difference between the two typefaces for either the younger children or the older children, and no significant difference between the two schools.

Zachrisson used the same research design and materials in a further study that employed Weiss’s (1917) focal variator (pp. 121–124), which was described in Sect. 1.2. In this case, he analysed the threshold at which the blurred image of a word was correctly recognised. The difference between the two typefaces was not significant for either the younger children or the older children. The difference between the two schools for the children in Grade 1 was highly significant, but Zachrisson did not report the direction of the difference. Zachrisson also adopted the same research design using a perimeter, which is a device for presenting visual material in peripheral vision (pp. 124–128). This study failed to yield any significant results, and he inferred that this was not a useful way to measure the legibility of typefaces.

Zachrisson had also asked the children who participated in his first two experiments to rank order the four typefaces in which the passages had been presented (pp. 131–132). They were instructed as follows: “The point is to say which of these you find most legible—inviting, pleasant, easy, to read, and in which order you want to put them according to their legibility. Which one do you like best, next best, third and least?” (p. 131). There was no significant difference in the ranks assigned to the four typefaces, no significant difference between the children in Grade 1 and the children in Grade 4, and no significant difference between children in Grade 1 at the two schools (and hence between the typefaces used for reading instruction). Zachrisson concluded on the basis of all his findings that “there is no significant difference, in objectively measured legibility, or subjective opinion regarding ease of reading between the OF [serif] and SS [sans serif] type faces” (p. 132, italics in original).

4 Other Research with Children

Weiss (1978, 1982) asked 145 boys and girls in Grades 3 and 6 (aged 8–9 and 11–12) at two public schools in New York City to express their preferences among printed material that differed in page size, type size, and typeface. Each example was printed as a two-page spread of text but consisted simply of familiar words in a random sequence together with an arbitrary illustration presented in the top, middle, or bottom of the right-hand page. Weiss chose three typefaces that were easily discriminable: a sans serif typeface (Futura) and two serif typefaces (Paladium and Parinesy). The children were divided into three ability groups based on their scores in an achievement test and were interviewed individually about their perceptions and preferences regarding the different page sizes, the different typefaces, and the different positions of the illustrations.

Of the three factors, the typeface was regarded as important by children in Grade 3 but not by children in Grade 6, by boys but not by girls, and by children of medium ability but not by children of low or high ability. Children of low and medium ability tended to prefer the Futura sans serif typeface, whereas children of high ability tended to prefer the Paladium serif typeface. There was no significant difference in preferences between boys and girls or between the children in different grades. When asked to give reasons for their preferences, the children mainly referred to the legibility and the attractiveness of the printed material. As Weiss noted, this pattern is consistent with the results obtained by Tinker and Paterson (1942) in the case of adult readers.

Coghill (1980) carried out an informal study in which 38 children aged about 5 years who were being taught to read using materials printed in a sans serif typeface (Gill Sans) were asked to read aloud sentences printed in that typeface or in a number of serif typefaces: “In almost every case the children found little difficulty in reading alternative typefaces” (p. 257), and any reading errors tended to be repeated across the different typefaces.

Sassoon (1993) described a study in which 50 children aged 8 years were shown a short passage in four different typefaces and four different settings and were asked to choose the typeface that they liked best and found easiest to read. There were two serif typefaces, Times Roman and Times Italic, and two sans serif typefaces, Helvetica and a slanting sans serif typeface that Sassoon herself had developed. The children’s preferences were fairly evenly distributed, and Sassoon argued that children were able to assimilate different typefaces as a result of their exposure to television graphics and other kinds of advertising. These findings led her to promote her own typeface, Sassoon Primary, in material for young readers (Bluhm, 1991). However, using two different procedures, Wilkins et al. (2009) found that children read words in this typeface less quickly (both aloud and silently) than they read words of the same x-height in the sans serif typeface Verdana. Wilkins et al. suggested that this was because, with Sassoon, neighbouring letters tended to use strokes that were more similar in shape.

De Lange et al. (1993) asked 160 schoolchildren to read two pages of text and to mark all occurrences of a particular word. Half received two pages in the same serif typeface (Times Roman), but the other half received the first page in Times Roman and the second page in a sans serif typeface (Helvetica). Each group contained equal numbers of children from each of four schools, equal numbers of children from Years 4 and 6, and equal numbers of children with high and low academic performance. De Lange et al. calculated the scanning speed on each page by dividing the number of marked targets by the scanning time and then calculated the gain in scanning speed from the first page to the second. There was no sign of any significant difference between the two conditions in the gain scores obtained by the children in either year. De Lange et al. concluded that there was no significant difference between the legibility of Times Roman and Helvetica as measured by a scanning process.
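The gain-score measure described above can be illustrated with a minimal sketch. It assumes that the gain was computed as a simple difference between the two scanning speeds (the account above does not say whether a difference or a ratio was used); the function names and the example figures are hypothetical and are not taken from De Lange et al.

```python
def scanning_speed(targets_marked: int, scanning_time_s: float) -> float:
    """Scanning speed on one page: marked targets divided by scanning time."""
    if scanning_time_s <= 0:
        raise ValueError("scanning time must be positive")
    return targets_marked / scanning_time_s


def gain_score(first_page: tuple, second_page: tuple) -> float:
    """Gain in scanning speed from the first page to the second.

    Each page is a (targets_marked, scanning_time_s) pair.
    Assumption: the gain is the second-page speed minus the first-page speed.
    """
    return scanning_speed(*second_page) - scanning_speed(*first_page)


# Hypothetical child: 10 targets in 40 s on page 1, 15 targets in 30 s on page 2.
example_gain = gain_score((10, 40.0), (15, 30.0))
print(example_gain)
```

On this reading, a typeface effect would appear as a difference in mean gain between the group that changed typeface on the second page and the group that did not, because any general practice effect is present in both groups.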

Walker and Reynolds (2003) presented excerpts from a children’s reading book to 24 children aged 5–7 years in either a serif typeface (Century) or a sans serif typeface (Gill Sans) with or without infant characters. The typefaces were balanced across the four excerpts in different children. Walker and Reynolds measured the time taken to read each excerpt aloud as well as the errors that the children made in doing so. There was no significant variation in the time taken or in the number of errors made across the four typefaces. Walker and Reynolds took these results to confirm Coghill’s (1980) view that children do not find non-infant characters problematic. The children were also asked to express their preferences among the typefaces: eight expressed no preference, but eight expressed a preference for Gill Sans without infant characters, and five expressed a preference for Gill Sans with infant characters.

Ripoli (2015) noted that in many Spanish-speaking countries children are taught to read using material printed in a cursive typeface before moving on to serif and sans serif typefaces. He tested 115 children who had been taught to read using a cursive typeface in the final year of preschool. They were asked to read aloud six short texts taken from a children’s book in Spanish, each in one of six typefaces: a cursive typeface (Escolar 1), two serif typefaces (Sylfaen and Times New Roman), and three sans serif typefaces (Arial, Lexia Readable, and Comic Sans). Assignment of the six typefaces to the six texts was counterbalanced across different children, and the x-height of the lowercase letters in each typeface was matched to that of Arial 14-point type. There was significant variation across the six typefaces in the number of incorrect line breaks made by the children, but no significant variation in the number of words read correctly per minute or in the number of errors that they made. Ripoli observed that, despite having been taught to read using a cursive typeface, the children had no difficulty reading using non-cursive typefaces, even using Lexia Readable (which they had not previously encountered). There was also no difference in their performance on the serif typefaces and on the sans serif typefaces.

Griffiths (2020) devised the Comparative Rate of Reading Speed Test. This involved two displays, each consisting of 13 lines with 60 characters in each line. The characters were random groups of between one and seven lowercase letters. The first display was printed in black in the serif typeface Times, and the second display was printed in teal in the sans serif typeface Gill Sans. A total of 92 children aged 11–12 years were asked to read the characters aloud and were timed on their reading of the fifth line in each display. The mean times were 40.53 s for the Times display and 34.81 s for the Gill Sans display. The difference between these mean times was highly significant. Griffiths commented that the use of teal for the Gill Sans display had been “a concession to light sensitive subjects” (p. 11). He suggested that the result was due to binocular deficiency (i.e., unstable co-ordination of the eyes), but this only affects 15% of the general population (Hargreaves, 2008). He acknowledged that the effect of typeface had been confounded with that of colour, and he argued that the latter was the more important variable, possibly due to a reduction in contrast. However, since all of the children saw the displays in the same order, the result might simply have represented a practice effect.

5 Letter Reversals

It has been known for more than a century that children who are learning to read tend to confuse pairs of lowercase letters that are mirror images of each other (e.g., b and d; p and q) (Mach, 1897, p. 50). This appears to be true in all cultures that use the Western alphabet in reading and writing (Goikoetxea, 2006). It is mainly apparent in the final year of kindergarten and the early years of compulsory education, and it is apparent in a variety of tasks extending beyond reading aloud and writing to dictation (Thompson, 2009). Such errors are found in normal readers as well as in children with learning disabilities and children who are dyslexic; nevertheless, they become less common in older children as their reading develops (Cossu et al., 1995; Davidson, 1936; Kennedy, 1954). (They are sometimes seen in older neurological patients and occasionally in healthy older adults: Balfour et al., 2007.)

In the regular forms of most sans serif typefaces, the relevant letter pairs are exact mirror images of one another (see Fig. 1.1 in Sect. 1.2). Some authors have argued that the addition of serifs enhances legibility by making individual letters more discriminable from one another (e.g., Legros, 1922, p. 11; McLean, 1980, pp. 42–44). Yule (1988) and Wiebelt (2004) argued in particular that serifs help to differentiate between confusable letter pairs because they are no longer exact mirror images of each other. (For instance, the left-hand panel of Fig. 1.1 shows that, in both the letters b and d, the serifs are on the left-hand side of the ascenders.) However, other authors have argued that, even with the addition of serifs, the relevant pairs of lowercase letters do not differ appreciably from each other (e.g., Potter, 1949, p. 11). Indeed, Lockhead and Crist (1980) found that more explicit cues were needed to enable children to discriminate between such letter pairs.

Evaluating the two positions is difficult, because most researchers who have described letter reversals in young children have not specified the typeface used in their reading tests. It is therefore impossible to say whether the children had been asked to read letters printed in serif or sans serif typefaces. Popp (1964) did specify the use of a serif typeface (Century) to present lowercase letters in a two-alternative forced-choice experiment where children had to match letters presented on a touch-sensitive projection screen. Their error rate was highest on the pairs b–d and p–q, thus confirming that mirror reversals still occur with letters that are presented in a serif typeface.

In the study mentioned in Sect. 7.4, Ripoli (2015) examined the number of letter reversals made by 115 children when reading six texts printed in six different typefaces. The number of letter reversals was least for texts printed in the cursive typeface Escolar 1 and the sans serif typeface Lexia Readable, where there are additional cues that serve to differentiate the critical pairs of lowercase letters. However, there was no significant variation in the number of letter reversals for texts printed in the other four typefaces: two serif typefaces (Sylfaen and Times New Roman) and two sans serif typefaces (Arial and Comic Sans). These results imply that, in the absence of additional cues, the presence of serifs does not in itself help children to discriminate between pairs of letters that are otherwise mirror images of each other. In general, mirror reversals in children’s reading and writing relate to structural properties of the relevant alphabet and not to the particular typeface used to render that alphabet (Treiman et al., 2014).

6 Older Readers

Vanderplas and Vanderplas (1980) suggested on the basis of interviews carried out with older people that many did not read as much as they would have liked because of difficulties with illegible type, although some publishers do produce large-type versions of newspapers and books intended for older readers. The researchers asked 28 volunteers aged between 60 and 83 to read 30 passages of 30–33 lines taken from Samuel Butler’s novel, Erewhon. The 30 passages were presented in one of five type sizes from 10 to 18 points and in one of six typefaces. Three were serif typefaces (Century Schoolbook, Times Roman, and Bodoni), and three were sans serif typefaces (Helvetica, Spartan, and Trade Gothic). The type sizes and the typefaces were counterbalanced, but the order of the passages reflected the narrative structure of the novel. After reading each passage, the participants were given a short test of their comprehension to ensure that they had actually read the material, and they then rated the passage on six aspects of its presentation using a 7-point scale.

The participants’ reading speed was significantly faster for passages with serif typefaces than for passages with sans serif typefaces, but there was also significant variation in reading speed across the six typefaces: Century Schoolbook yielded the fastest reading speed, and Spartan yielded the slowest. Their reading speed generally tended to increase with the type size. The participants also rated passages with serif typefaces more positively than passages with sans serif typefaces on their apparent size, how easily they could be read, and how easily they could be understood. They also rated 12-point typefaces as being the easiest to read.

One situation in which older readers encounter problems is in reading labels on their medication, regardless of whether the labels are prepared using dot-matrix printers (Zuccollo & Liddell, 1985) or more advanced laser printers (Watanabe et al., 1994). Smither and Braun (1994) asked 19 younger adults (mean age 25.48 years) and 20 older adults (mean age 71.05 years) to read the labels on 18 prescription bottles printed in a serif typeface with proportional spacing (Century Schoolbook), a monospaced slab serif typeface (Courier), or a sans serif typeface with proportional spacing (Helvetica). The participants read more slowly and made more errors on the labels printed in Courier than on the labels printed in either Century Schoolbook or Helvetica. The older adults also read more slowly and made more errors than the young adults when reading labels printed in Courier.

Smither and Braun suggested that the participants might have had problems because of the curvature of medication bottles. They repeated their experiment with new participants and with the 18 labels placed on a flat surface. These participants once again read the labels printed in Courier more slowly, but they did not make more errors than on the labels printed in Century Schoolbook or Helvetica. The older adults read more slowly than the young adults across the board, but they did not make more errors. Smither and Braun inferred that reading medication labels was more effective if they were printed with proportional spacing (Century Schoolbook or Helvetica) than if they were printed with monospacing (Courier). However, the presence or absence of serifs seemed to have little or no effect on the legibility of medication labels.

7 Conclusions

As novice readers, young children may be disproportionately affected by different typefaces. The use of different typefaces may also affect how readily children acquire the ability to read. Research by Burt (1959), Burt et al. (1955) and Kerr (1926) is often cited in support of the idea that serif typefaces are more legible. However, their accounts are inadequate and contain many contradictions. Zachrisson (1965) provided a more thorough account of the role of typographic variables in reading among children of different ages using various research methods and found no evidence for any difference in legibility between serif and sans serif typefaces. Subsequent research by other investigators has tended to confirm Zachrisson’s conclusions. It has been known for more than 100 years that children tend to confuse letters that are mirror images of each other (such as p and q). This phenomenon occurs with both sans serif letters (which are true mirror images) and serif letters (which are not). Older readers tend to suffer from visual problems which may depend on typographical factors. This is of practical importance, as in the design of labels for medication containers. Nevertheless, there are no consistent differences in the reading capability of older readers when presented with material printed in serif and sans serif typefaces.