Materials
When performing a content analysis on children’s books, the selection of books is key. In previous research, books were often selected based on their use in schools (e.g., Eisenberg 2002; Hughes-Hassell et al. 2009), the awards they had received (e.g., Kaltenbach 2005; Koss et al. 2017; Lee 2017), library records (e.g., Poarch and Monk-Turner 2001) or best-seller lists (e.g., Hamilton et al. 2006). Combining these sources of information gives the most complete overview of the books to which children are most likely exposed.
The first step of selecting the books (see Fig. 1), therefore, was to gather all children’s books published in the Netherlands that, from 2009 until 2018, had (1) been sold most, (2) been borrowed most, and (3) received an award, without any requirements as to (the diversity of) the content. Lists of the books that were sold and borrowed most often were available from the CPNB (Stichting Collectieve Propaganda van het Nederlandse Boek n.a.). Awards included in the selection process were Prentenboek van het jaar, Kinderboekwinkelprijs, Wouter Pieterse prijs, Griffels en Penselen, IBBY Honour List, Boekensleutel, Hotze de Roos Prijs, Jenny Smelik-IBBY-Prijs and Prijs van de Kinderjury. The second step was to delete all duplicates. The third step was to select books that were aimed at children of 6 years old and younger, as indicated by libraries in Amsterdam, Rotterdam and The Hague (three large cities in the Netherlands). Books labelled as AP (A Peuter, books aimed at toddlers) or AK (A Kleuter, books aimed at preschool children) were selected, as well as books from additional categories (e.g., informative books or poetry books) described as being aimed at children up to 6 years of age. If libraries used different labels, books were selected if the majority of libraries in the three cities labelled the book as described above. The fourth step was to remove books that did not contain any human figures. After coding, but prior to the analyses, six books were excluded: one large-size book was excluded because it contained more human characters (n = 3343) than all other books together (n = 2128), and five books were excluded because all characters in these books were coded as having an unclear ethnic appearance.
Lastly, five books from the same series were treated as one book (i.e., scores in terms of representation of characters were added up), as were two books from another series, resulting in a total number of 64 analyzed books (for a list of titles, see Online Appendix A).
Procedure
Coding of the selected books started with general information on the book (publisher, number of pictures, and labels of the book; for an overview of coded variables, see Online Appendix B). The label of the book could refer to a topic or genre and was based on the categorization by the Dutch libraries from which the books were borrowed (all in The Hague). Labels included were: (1) being different, (2) animals, (3) everyday life, (4) emotions, (5) family, (6) holidays/parties, (7) behavior, (8) learning, (9) body, (10) music, (11) nature, (12) on the road, (13) fairytales, (14) TV-characters, (15) friends, (16) poetry, and (17) informative. Informative books received an additional topic categorization. If the libraries in The Hague used different labels, the label that was used most often was selected. Next, information on the authors and illustrators was coded (ethnicity, gender), based on information that was available about them online (using a combination of information from photographs, personal webpages or Wikipedia pages). A male–female dichotomy was applied, as we did not come across any information indicating other gender identities.
Finally, all human characters displayed in the book were coded, independent of whether they were discussed in the text. Animals and other objects with human characteristics were not coded. The following was coded for each human character: name, ethnic appearance, gender, age group, role in the story, representation based on number of pictures, and representation on the cover. A White ethnic appearance was defined as not only a White skin color, but also a European ethnic appearance (e.g., Asian appearing characters could be drawn with a White skin color, but were not coded as White). Ethnic appearance was coded as of color if the ethnic appearance of a character was not (fully) perceived as White or European (e.g., had a darker skin tone and/or ethnic characteristics such as hair structure or eye shape that were not perceived as European), and could therefore be applied to characters with a wide variety of ethnic appearances. Unclear ethnic appearance was coded if the characters were drawn transparently (not using colors to distinguish the characters from the background), if the characters had an unnatural skin color (e.g., green), or if characters were drawn in such a position that ethnic features such as skin color, face, and hair could not be seen. The protagonist was defined as the most important character in the story. Multiple characters could be coded as a protagonist, for instance when the book was about a duo. Secondary characters were defined as characters who were also mentioned in the text and contributed to the story line, but not as much as the protagonist. A background character was defined as a character who was visible in the pictures, but was not mentioned in the text and did not contribute to the story. Human characters who were not drawn in enough detail to code two out of three main characteristics (ethnic appearance, gender, and age) were not included in the dataset.
Relative representation in terms of pictures was calculated by dividing the number of pictures on which the character was presented by the total number of pictures in the book.
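As a minimal illustration of this ratio, the sketch below computes relative representation for a single character. The function name and the input checks are hypothetical and are not taken from the coding scheme in Online Appendix B.

```python
def relative_representation(pictures_with_character: int, total_pictures: int) -> float:
    """Share of a book's pictures in which a given character appears.

    Hypothetical helper: the name and the input validation are assumptions
    made for illustration only.
    """
    if total_pictures <= 0:
        raise ValueError("total_pictures must be positive")
    if not 0 <= pictures_with_character <= total_pictures:
        raise ValueError("pictures_with_character must lie between 0 and total_pictures")
    return pictures_with_character / total_pictures
```

For example, a character shown on 6 of the 24 pictures in a book would receive a relative representation of 0.25.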
To assess the reliability of the coding system, two independent coders coded ten random books from the set, including 98 human characters. Interrater reliability was good for most variables (> 0.91 Cronbach’s α for numeric variables and > 0.88 Cohen’s κ for nominal variables), and lower for two character variables: 0.69 Cohen’s κ for ethnic appearance and 0.78 Cohen’s κ for representation on the cover. For this last variable, differences appeared because the coding book did not specify whether the back of the book should also be taken into account. This was therefore specified afterwards. For ethnic appearance, only 14 out of 98 characters were coded as of color, and 6 were coded as unclear. The differences in coding occurred mostly because one of the coders thought the ethnicity of the characters was not clear enough because they were drawn in the shadow. Without these ‘unclear’ characters, interrater reliability was 0.90 Cohen’s κ. The two independent coders reached consensus scores for the ones that diverged, and clarified coding rules where necessary, after which one of the coders proceeded to code the rest of the books. Reliability between that coder’s original coding and the consensus reliability set was above 0.99 (Cronbach’s α) for numeric variables and above 0.82 (Cohen’s κ) for nominal variables. Books that contained a large number of characters in the background, and books with ambiguous or unclear pictures (n = 25) were discussed with the other coder of the reliability set. If the two coders did not reach a consensus, the books (n = 4) were discussed in a bigger, ethnically diverse, research team that focuses on interethnic bias and prejudice in children to reach a final coding.
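For readers unfamiliar with Cohen’s κ, the following sketch shows how agreement between two coders on a nominal variable could be computed. It uses the standard unweighted definition, κ = (p_o − p_e) / (1 − p_e); the function name and the example labels are hypothetical and do not reproduce the reliability data reported above.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Unweighted Cohen's kappa for two equally long lists of nominal codes."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)

    # Observed agreement: share of items that received the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: product of the coders' marginal proportions per category.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when both coders use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical codes for six characters (not data from this study).
codes_a = ["white", "white", "color", "unclear", "white", "color"]
codes_b = ["white", "white", "color", "white", "white", "color"]
kappa = cohens_kappa(codes_a, codes_b)  # approximately 0.70
```

In this made-up example the single disagreement involves an "unclear" code, which mirrors how a small number of unclear characters can lower κ noticeably when one category is rare.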
Population Statistics
Target Audience
Demographics of the target audience are based on statistics about children aged 0–5 years old in the Netherlands in 2019 (CBS 2019b). In accordance with the coding of ethnic appearance, percentages of White (Western) children and children of color (non-Western) in the population are calculated. Percentages of White (Western) children are based on the number of children with a Dutch and first or second generation Western migration background, and percentages of children of color (non-Western) are based on the number of children with a first or second generation non-Western migration background. Non-Western is defined as countries in Africa, Latin-America, Asia, and Turkey (CBS’ definition, CBS n.a.), as well as Indonesia, Japan, and countries in Oceania (apart from Australia and New Zealand). Results are shown in Table 1 (G1-2 method). One could argue that the percentage of children of color (non-Western) might be higher in the actual population, as some children of second-generation migrant parents might also be of color. Percentages are therefore recalculated while taking third-generation backgrounds into account (CBS 2018; Table 1, G1-3 method). Conversely, one could argue that this second percentage of children of color (non-Western) overestimates the actual population, as not all children of second-generation non-Western migrants might be seen as of color. The true percentages are therefore likely to lie somewhere between the two estimates.
Table 1 Population statistics in The Netherlands
General Population
Demographics of the general population are also based on data from 2019 from CBS, without distinguishing between age groups (CBS 2019b). Again, percentages of the White population and the population of color are calculated in two ways, as for the target audience. Results are shown in Table 1.
Analyses
The final dataset consisted of 64 books and 2053 characters. Characters whose ethnic appearance was coded as unclear (n = 371) were excluded from analyses on the character level. After calculating descriptive statistics, the association between ethnicity of the authors and illustrators and ethnic diversity in the books was examined using Pearson Chi-Square tests. Fisher’s Exact Tests are reported if the expected count in more than 20% of the cells was below five. Associations between the ethnic appearance of the characters and their gender and age group were also examined using Pearson Chi-Square tests. In addition, ethnic representation among the authors, illustrators and characters was compared to population statistics using the following formulas:
$$p = \frac{X_1 + X_2}{n_1 + n_2} \tag{1}$$
$$SE = \sqrt{p\left(1 - p\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \tag{2}$$
$$z = \frac{p_1 - p_2}{SE} \tag{3}$$
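Equations 1–3 have the form of a pooled two-proportion z-test. The sketch below shows one way to compute them, under the assumption that X and n denote the count of interest and the total in each of the two groups being compared (e.g., characters in the books and people in the population), and that p1 and p2 are the corresponding proportions. The function name, the two-sided p-value and the example counts are additions for illustration and are not taken from the study.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Pooled two-proportion z-test following Eqs. 1-3.

    x1, x2: counts of the category of interest in each group (assumed meaning).
    n1, n2: group sizes.
    Returns (z, two_sided_p); the two-sided p-value is an assumption,
    as the text does not state the sidedness of the test.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    p1 = x1 / n1
    p2 = x2 / n2
    p = (x1 + x2) / (n1 + n2)                           # Eq. 1: pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))     # Eq. 2: standard error
    if se == 0:
        raise ValueError("standard error is zero; z is undefined")
    z = (p1 - p2) / se                                  # Eq. 3: test statistic
    p_value = math.erfc(abs(z) / math.sqrt(2))          # 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical counts: 20 of 100 versus 30 of 100.
z, p_value = two_proportion_z(20, 100, 30, 100)
```

With these made-up counts the pooled proportion is 0.25 and z is roughly −1.63, which would not be significant at the conventional 0.05 level for a two-sided test.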
Furthermore, Pearson Chi-Square tests were conducted to examine the relation between ethnic appearance of the characters (White or of color) and prominence factors (role of the character, having a name, being represented on the cover). In addition, a Mann–Whitney U test was conducted to examine potential differences in representation in terms of pictures for White characters versus characters of color, as the variable was highly skewed (Zskew > 3.29). These analyses were conducted separately per gender and per role of the character (protagonist, secondary, background). Moreover, books containing ethnic diversity (i.e., both White characters and characters of color) were compared to books without ethnic diversity in terms of number of characters using a Mann–Whitney U test, as this variable was also highly skewed (Zskew > 3.29).
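To illustrate why a rank-based test suits a skewed variable, the sketch below computes the Mann–Whitney U statistics for two groups from their pooled ranks, assigning tied values their average rank. It is a simplified illustration only: it returns the U statistics without a p-value, and the function name and example values are hypothetical.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two samples, using average ranks for ties."""
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")

    # Pool the observations, remembering which group each value came from.
    combined = sorted(
        [(value, 0) for value in group_a] + [(value, 1) for value in group_b],
        key=lambda pair: pair[0],
    )

    rank_sum_a = 0.0
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        # 1-based ranks i+1 .. j share their average rank.
        average_rank = (i + 1 + j) / 2
        members_of_a = sum(1 for k in range(i, j) if combined[k][1] == 0)
        rank_sum_a += average_rank * members_of_a
        i = j

    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical relative-representation scores for two groups of characters.
u_a, u_b = mann_whitney_u([0.1, 0.2, 0.3], [0.4, 0.5])
```

Because only the ordering of the values enters the calculation, a few characters with very high scores do not dominate the comparison the way they would in a test on means.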