1 Introduction

A recent census of early learning programme provision in South Africa for children under five years identified 42,420 Early Learning Programmes (ELPs) attended by approximately 1.6 million children (Department of Basic Education, 2022). Figures vary from source to source (Gustafsson, 2017), but official household surveys indicate that 45–55% of the 1.2 million South African children aged 50–59 months (the age range covered in this study) attend an ELP (Statistics South Africa, 2018).

Measurement and monitoring, including longitudinal studies that can inform policy development, are crucial actions in achieving the Sustainable Development Goals (SDGs), especially in developing nations like South Africa (Raikes et al., 2017). SDG Target 4.2 specifically requires quality national data to populate the SDG Indicator 4.2.1: The proportion of children under 5 years of age who are developmentally On Track in health, learning and psychosocial well-being, by sex.

Prior to the present survey, there was no South African population level data on this indicator. The results reported here are from the Thrive by Five Index 2021 survey (https://www.thrivebyfive.co.za), the only nationally and provincially representative survey of preschool child outcomes yet attempted in South Africa. The present results constitute the baseline for three nationally and provincially representative surveys planned for the period 2021 to 2030. These will monitor trends over time in the proportions of children between 50- and 59-months old who are On Track for their age in the areas of development specified in SDG 4.2.

We commence with an overview of evidence from the extant research literature on the relationship between key areas of functioning prior to children’s entry to school, and their future adjustment and performance.

1.1 The Importance of the Early Years

It is well-established that children who are nourished and nurtured in their earliest years, and who have opportunities for learning in the home and in ELPs, are more likely to start school On Track and are more likely to succeed in formal schooling.

Family socio-economic status, caregiver education and the quality of stimulation provided to preschool children by caregivers and other household members are all powerful predictors of cognitive and related skills such as literacy and numeracy during early childhood (Bradley & Corwyn, 2002; Duncan et al., 2012; Taylor & Yu, 2009; Tran et al., 2017).

In the sections that follow, evidence for the importance of these foundational skills for later academic achievement is presented in brief.

An early meta-analysis of the influence of early abilities and social-emotional functions on later school achievement was conducted by Duncan et al. (2007), on six American, British and Canadian longitudinal datasets. Children were first assessed between 54- and 72-months old. The strongest predictors measured at school entry were emergent mathematics skills, early reading ability and ability to pay attention (assessed through teacher reports). No relationship between measures of emotional and behavioural difficulties and later school achievement was apparent. A later, large-scale study of some 33,000 United States public pre-K children (Ricciardi et al., 2021) investigated the extent to which a range of preschool abilities assessed at age four predicted school achievement in Grade 5 (in which children are typically 10–11 years of age). They found long-term effects of early abilities on primary school outcomes similar to those reported by Duncan et al., namely that cognitive, language and fine and gross motor functioning at age four was significantly related to academic performance in Grade 5. However, in contrast to Duncan et al., Ricciardi et al. found that the social-emotional functioning of four-year-olds was associated with Grade 5 performance, although the effects were smaller than for the other predictors.

Below we summarise findings on the relationship between functioning in specific domains prior to entering school, and later school achievement (domains correspond to those measured in this study). As will be evident, domains are not functionally independent, but influence one another. Research from high- and to a lesser extent middle-income countries predominates, particularly from the USA and the UK.

South African research can contribute little to this literature. A search conducted for this article did not discover any published research from South Africa on the relationship between pre-kindergarten cognitive and non-cognitive skills and primary school performance. This is likely to be partly because, before the ELOM test used in this study was developed in 2016, there was a lack of suitable standardised tests to measure these aspects of psychological development in preschool children from the various ethnolinguistic groups in the country. The only research on the relationship between school readiness and primary school performance in Grade 1 and Grade 4 has been conducted on Grade 1 children (van Zyl, 2011). In that study, school readiness at the start of Grade 1 predicted language and mathematics performance in Grades 1 and 4.

The evidence gap will begin to be filled soon as South Africa’s first longitudinal panel study to investigate the relationship between preschool and later performance is currently under way (Hofmeyr et al., 2022). The same instrument is being used to measure preschool children’s functioning as that used in the study reported in this paper (see Method section).

1.1.1 Cognition and Executive Functioning

Behavioural and emotional regulation are important features of executive functioning (EF). With neurological maturation and ongoing socialisation, improved EF is apparent (including attention, concentration, and emotional evenness), increasing the benefit from learning opportunities in the classroom and other environments. Enhanced EF also allows more capable, active, and flexible manipulation of information using rules, working memory and other cognitive facilities (Nayfeld et al., 2013).

Research on EF is proving helpful in identifying capacities that underpin both early mathematics and literacy abilities in primary school. Fitzpatrick et al. (2014, p. 25) note that as children reach school age, “executive function skills can help [them] hold information or instructions in mind during classroom activities, focus on task-relevant stimuli during problem-solving tasks, and resist internal or external distractions”.

1.1.2 Numeracy and Mathematics

Early numeracy skills are strongly predictive of school performance in mathematics (Duncan et al., 2007; Siegler et al., 2012). In Duncan et al.’s (2007) meta-analysis, knowledge of numbers and ordinality on school entry were the strongest predictors of later mathematics achievement from Grade 3 through to early high school (when children were twelve to fourteen years old). The ability to identify numerals grows over the early childhood years and is predictive of later mathematics ability (Chard et al., 2005). Oral counting fluency and number identification are known as ‘gateway skills’ and are comparable to letter-naming fluency in reading ability. Nguyen et al. (2016) compared the extent to which four domains of mathematical knowledge in pre-schoolers (counting and cardinality, patterning, geometry, and measurement skills) predicted mathematics ability in Grade 5. While all domains were predictive, numeracy and counting skills had the strongest effects.

Other developmental abilities at ages five and six make specific contributions to later mathematics skills. Studies have pointed to the importance of fine motor skills and visual-spatial abilities in predicting Grade 2 mathematics performance (e.g., Pagani & Messier, 2012), as well as components of Executive Functioning, such as verbal working memory, short-term memory, inhibitory control and planning abilities, including flexible use of rules (Bull & Scerif, 2001; Gunderson et al., 2012; LeFevre et al., 2010).

1.1.3 Literacy and Language

Children who have an opportunity to read in their early years have a clear advantage, as early reading ability is the strongest predictor of reading ability in middle childhood (Duncan et al., 2007). Emergent literacy skills that are identified as strong predictors of later literacy achievement include having a large vocabulary, being capable of explanatory talk, demonstrating some letter identification before age five, understanding narrative and story, understanding writing functions, knowing nursery rhymes, and demonstrating phonological awareness (O’Carroll & Hickman, 2012; Strickland & Riley-Ayers, 2006). Skills related to print knowledge, phonological processing and oral language are fundamental and independent predictors of later literacy development (National Early Literacy Panel, 2008).

Many of the skills required for the formal learning of reading and writing involve specific visual motor abilities. For example, visual motor integration and hand–eye coordination have been shown to be independent predictors of handwriting ability (e.g., Tseng & Murray, 1994). Literacy skill acquisition is strongly underpinned by perceptual motor skills of the kind assessed in this study, including spatial awareness and orientation, and auditory, visual, and temporal sensory awareness (Demetriou et al., 2017; Excell & Linington, 2011; Joubert, 2015). Also, some literacy skills (e.g., print knowledge and vocabulary) are predictive of numeracy abilities, including numbering, numerical relations, and arithmetic operations. These findings show that abilities in one domain may influence functioning in another during the same developmental period (i.e., ‘horizontal decalage’, Piaget, 1970).

1.1.4 Social and Emotional Functioning

Children’s social and emotional functioning (SEF) is associated with successful transition to school (Arnold et al., 2012; Collie et al., 2019; Denham et al., 2012). Key competencies predictive of adjustment include self-control, persistence, mastery orientation, academic self-efficacy, and social competence (Child Trends, 2014). Ricciardi et al. (2021) point to the influence of SEF on academic outcomes from preschool through to the primary phase, and similarly, temperament has been found to moderate performance in both language and mathematics achievement (Valiente et al., 2021). These robust new findings, in a previously under-researched area, point to the cross-cutting importance of early childhood psychological well-being for overall performance in primary school.

1.1.5 Child Growth Status

Growth stunting or linear growth faltering (low height-for-age) is evident when the height of a child is two or more standard deviations below the World Health Organization reference norm for their age and gender (World Health Organization, 2006). Stunting is an indicator of chronic malnutrition (Boyden et al., 2019), and is a significant problem in South Africa. The 2016 South African Demographic and Health Survey found a stunting rate of 27% for the under-five population, and a 16% estimate for children aged 50–59 months (the age group of the Thrive by Five study) (Hall et al., 2019). This prevalence for the under-five population has not changed in the past forty years (Said-Mohamed et al., 2015).

The South African National Income Dynamics Study (NIDS), a nationally representative household panel, has reported stunting rates for children aged 48–72 months over five waves (2008–2017) (http://www.nids.uct.ac.za). Over the waves, the stunting rate ranges from 10 to 20%, much lower than the figure for the broader age band referred to above. It is likely that the difference reflects growth recovery (‘catch-up’) between infancy and age five. Both NIDS and the Birth to Twenty Cohort study findings provide evidence of substantial catch-up growth through early childhood (Casale, 2016, 2020; Casale & Desmond, 2016). The Young Lives study findings from four low-and-middle income countries also provide evidence of catch-up: between 27 and 40% of children recover from earlier stunting at 12 months by age five years (Benny et al., 2018).

Apart from access to adequate nutrition in the early years, other factors associated with poverty that increase the risk of growth stunting are poor maternal nutritional status both before and during pregnancy (Sanders & Reynolds, 2017), and lack of access to “improved” water and sanitation, which increases the likelihood of diarrheal disease and is proximally associated with growth stunting (Dearden et al., 2017).

The effects of early stunting depend on the child’s age and the duration of deprivation. When recovery does not occur in early childhood, growth stunting can persist throughout childhood and adolescence, compromising the ability to learn in school and eventually impacting life opportunities (Boyden et al., 2019; Georgiadis & Penny, 2017; Sánchez, 2017; Woldehanna et al., 2017).

1.1.6 Sex Differences

There has long been considerable interest in sex differences in cognitive skills (particularly in mathematics and language). The findings are mixed, for many reasons, not least being the variety of measures used. A full consideration is beyond the scope of this article. A recent analysis of longitudinal study data using measures comparable to the present study is instructive, though. Toivainen et al. (2017) report findings from 16,000 twin pairs (born in England and Wales) on both non-verbal and verbal abilities, measured from early childhood (between two and four years old) through adolescence to age sixteen. They report that girls outperform boys on both verbal and non-verbal tasks as early as 24 months old (although the differences are small at that point). Girls retain their advantage in language skills, particularly writing and language use, throughout school, and indeed the gap appears to widen with age. Boys tend to get ahead of girls in mathematics skills from around Grade 4 (Bornstein et al., 2004; Halpern et al., 2007), although this is not reported in all studies, and it does appear to depend on the measures used by the researchers (Toivainen et al., 2017).

1.2 Survey Goal

The Thrive by Five Index 2021 constitutes the first (baseline) in a series of nationally and provincially representative surveys of the development of children aged 50–59 months in South Africa. The goal is to establish the proportion of children who are developmentally On Track for their age, and to monitor change in this age group over time until 2030.

2 Method

2.1 Sample

A multistage cluster strategy was employed to recruit the sample and to permit disaggregation of findings by sex and income quintile for the country as a whole, and for each of South Africa’s nine provinces. In the first stage, 48 public and private primary schools per province were randomly selected from the National Department of Basic Education Schools Database (432 schools in total). The sample was matched as closely as possible with the income distribution of preschool children in each province by stratifying the sample by school quintile as a proxy for child socio-economic status (we did not have household level economic data). Quintile ranks are assigned to public schools in South Africa roughly according to the relative poverty levels of the population they serve, aggregated over an area within three kilometres of the school. Quintile 1 schools serve children in the poorest areas, while quintile 5 schools serve the wealthiest. Ranks are largely based on the income, education level and unemployment of households in the school catchment area, as obtained from South African national census data. As ELPs are not assigned quintiles, they were assigned the quintile of the closest school in our sampling plan.

South Africa defines early childhood development (ECD) as “an umbrella term that applies to the policies by which children from birth to at least nine years grow and thrive, physically, mentally, emotionally, spiritually, morally and socially” (Department of Basic Education, 1995, paragraph 73). Early Learning Programmes (ELPs) are the early childhood education (ECE) component of ECD. There is no state provision of ECE. Services are provided by community-based non-profit organisations and micro-social enterprises. Almost all charge fees, with a reported national average (2021) of R509 per month (US$ 1.00 = ZAR 19.34). The state provides means-tested subsidies for children in the poorest three income quintiles who attend non-profit ELPs (R17.00 [< US$1.00] per child per day; Department of Basic Education, 2022). Fees charged by ELPs attended by study children ranged from less than R210 (approximately US$ 11) to greater than R1,231 (approximately US$ 66) per month. All ELPs in the sample were privately operated community-based non-profit organisations and micro-social enterprises.

Because South Africa does not have a complete database of all ELPs, a sampling frame of programmes was built in the second stage using a multipronged approach. This included using a recently compiled national dataset of known ELPs and recruiting additional ELPs through snowball sampling (schools in the initial sample, and ELPs known to exist within the school ward, were asked to provide details on possible additional ELPs to recruit). ELPs were considered eligible if they (1) operated for more than eight hours per week, and (2) had at least six children aged 50 to 59 months (4 years old) in regular attendance, who spoke at least one of the official South African languages as their home language. Each school in stage one was associated with a cluster of identified ELPs situated within a five- to ten-kilometre radius. If there were at least three ELPs within five kilometres, we did not extend the search as far as ten kilometres. Approximately three ELPs per cluster were randomly selected for the survey, and they included centre-based preschools and community playgroups. In the final stage, four children (two boys and two girls) were randomly selected in each ELP for assessment.
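The second and third sampling stages can be sketched in code. The following is a minimal illustration only, assuming each ELP is a record with a distance from the stage-one school and the two eligibility fields; the field names, function names and the handling of clusters with fewer than three candidates are hypothetical and are not taken from the study's own procedures.

```python
import random

def eligible(hours_per_week, n_qualifying_children):
    """Eligibility rule as stated in the text: more than eight hours per week
    and at least six children who meet the age and home-language criteria."""
    return hours_per_week > 8 and n_qualifying_children >= 6

def cluster_candidates(elps, inner_km=5.0, outer_km=10.0, min_inner=3):
    """Return eligible ELPs for one school cluster.

    The search stays within inner_km when at least min_inner eligible ELPs
    are found there; otherwise it is extended to outer_km.
    Each ELP is a dict with hypothetical keys 'distance_km',
    'hours_per_week' and 'n_qualifying_children'.
    """
    ok = [e for e in elps
          if eligible(e["hours_per_week"], e["n_qualifying_children"])]
    inner = [e for e in ok if e["distance_km"] <= inner_km]
    if len(inner) >= min_inner:
        return inner
    return [e for e in ok if e["distance_km"] <= outer_km]

def draw_elps(candidates, rng, k=3):
    """Randomly select about three ELPs (fewer if the cluster is small)."""
    return rng.sample(candidates, min(k, len(candidates)))

def draw_children(boys, girls, rng, per_sex=2):
    """Randomly select two boys and two girls from the eligible lists."""
    return rng.sample(boys, per_sex) + rng.sample(girls, per_sex)
```

A seeded generator such as `random.Random(2021)` can be passed as `rng` to make a draw reproducible. Note that `draw_children` raises `ValueError` if fewer than two eligible boys or girls are listed; how such cases were handled in the field is not described in the text.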

The sampling approach reflects the Department of Basic Education’s requirement that the results should be representative at the national and provincial level, hence the sampling target of n = 576 children per province. In the absence of a sampling frame for ELPs at the time the study was designed, schools were used as a sampling proxy, and as the size of schools in terms of number of learners does not necessarily correlate with the size of the ELPs in their vicinity, it was not considered as a factor during the sampling process. Both factors were considered in the calculation of sampling weights thus accounting for the overrepresentation (or underrepresentation) of some provinces and schools.
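The provincial target follows directly from the three sampling stages. The short sketch below reproduces that arithmetic, assuming exactly three ELPs per cluster (the text says "approximately three"); the constant and function names are illustrative.

```python
# Design parameters taken from the sampling description.
SCHOOLS_PER_PROVINCE = 48   # stage one
ELPS_PER_CLUSTER = 3        # stage two ("approximately three"; assumed exact here)
CHILDREN_PER_ELP = 4        # final stage: two boys and two girls
PROVINCES = 9

def provincial_target():
    """Target number of children per province."""
    return SCHOOLS_PER_PROVINCE * ELPS_PER_CLUSTER * CHILDREN_PER_ELP

def national_target():
    """Target number of children nationally under the same assumptions."""
    return provincial_target() * PROVINCES

print(provincial_target())  # 576, matching the stated target
print(national_target())    # 5184
```

The achieved sample can differ from these targets because the number of ELPs per cluster was only approximately three.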

The final sample consisted of 5,570 children, aged 50–59 months. However, as detailed later, 348 cases were removed at the data cleaning phase, leaving 5,222 cases in the main sample prior to computing post-hoc weights (mean age = 54.74 months, SD = 2.7), of which 2,525 (48%) were boys, and 2,697 (52%) were girls. These proportions are similar to the 2019 mid-year population estimate of the 0–5 population (Statistics South Africa, 2019).

Unfortunately, there was significant missing data pertaining to age of enrolment and years of enrolment and average attendance at the ELP. In our experience over many years, this is a common issue in South African ELPs.

Table 1 presents the unweighted sample disaggregated by province, school quintile and sex. Table 2 shows the breakdown by all eleven official languages, which is very similar to the national adult language distribution (Statistics South Africa, 2018; see Footnote 1). Locations of ELPs within which children were sampled are displayed in Fig. 1 (indicated by black dots). Capital cities of provinces are also indicated on the map.

Table 1 Study Sample by Province, Quintile, and Sex
Table 2 Distribution of Home Language in the Unweighted Sample
Fig. 1 Early Learning Programme Sample Sites

2.2 Weighting

Weighting was decided in consultation between the National Department of Education and co-authors of this article. The method is described in the online supplementary materials. Applying these weights to the sample allowed us to interpret results as roughly representative of children attending ELPs near to school clusters. Statistical analyses were performed using the R statistical programming language (R Core Team, 2022), and the survey packages survey (Lumley, 2010), and srvyr (Freedman & Schneider, 2022), among others. Missing data on some of the variables used in the computation of weights resulted in the additional loss of 83 cases. There were thus 5,139 cases in the weighted sample.
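The case counts reported for each stage can be checked with simple bookkeeping. The sketch below is illustrative only; the function name is hypothetical and the figures are those given in the text.

```python
def case_flow(start, losses):
    """Return the number of cases remaining after each successive loss."""
    remaining = [start]
    for lost in losses:
        remaining.append(remaining[-1] - lost)
    return remaining

# 5,570 assessed; 348 removed in cleaning; 83 lost to missing weighting variables.
flow = case_flow(5570, [348, 83])
print(flow)  # [5570, 5222, 5139]

# Sex split of the cleaned (unweighted) sample.
boys, girls = 2525, 2697
print(round(100 * boys / (boys + girls)))   # 48
print(round(100 * girls / (boys + girls)))  # 52
```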

2.3 Measures

Indicators used in the study are provided in Table 3. Children’s learning outcomes were assessed using the Early Learning Outcomes Measure (ELOM 4&5 [Snelling et al., 2019; Dawes et al., 2020a, 2020b]). The measure has 23 items, representing five domains of development: Gross Motor Development (GMD), Fine Motor Coordination and Visual Motor Integration (FMC-VMI), Early Numeracy and Mathematics (ENM), Cognition and Executive Functioning (CEF), and Emergent Literacy and Language (ELL). In each domain, item standard scores are summed to provide a domain total score out of 20. The five domain scores are then summed to derive the ELOM 4&5 total score out of 100.
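The scoring structure described above can be illustrated as follows. This is a sketch of the summation only: the conversion of raw item responses to item standard scores and the allocation of the 23 items to domains are not given in the text, so the function takes already standardised item scores as input, and all names are hypothetical.

```python
DOMAINS = ("GMD", "FMC-VMI", "ENM", "CEF", "ELL")
DOMAIN_MAX = 20.0

def domain_total(item_standard_scores):
    """Sum the item standard scores of one domain (expected range 0 to 20)."""
    total = float(sum(item_standard_scores))
    if not 0.0 <= total <= DOMAIN_MAX:
        raise ValueError(f"domain total {total} is outside 0-{DOMAIN_MAX:g}")
    return total

def elom_total(domain_scores):
    """Sum the five domain totals to give the ELOM 4&5 total out of 100.

    domain_scores maps each domain abbreviation to its total out of 20.
    """
    if set(domain_scores) != set(DOMAINS):
        raise ValueError("expected exactly the five ELOM 4&5 domains")
    for score in domain_scores.values():
        if not 0.0 <= score <= DOMAIN_MAX:
            raise ValueError(f"domain score {score} is outside 0-{DOMAIN_MAX:g}")
    return float(sum(domain_scores.values()))
```

For example, a child scoring 10 in every domain would have a total of 50 out of 100.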

Table 3 Thrive by 5 Indicators

The ELOM 4&5 is aligned with the South African Early Learning Curriculum and was developed and standardised for use with children aged between 50- and 69-months of age. It provides a reliable and fair assessment of children regardless of their socio-economic and ethnolinguistic background and is available in all eleven official languages. Content, construct, age, and concurrent validity, as well as test–retest reliability, have been established (Anderson et al., 2021; Dawes et al., 2020a, 2020b; Snelling et al., 2019). The predictive validity of ELOM 4&5 for early academic achievement has not been assessed.

Three performance bands (see Table 3) were empirically derived from the ELOM 4&5 total score distribution developed on the standardisation sample (Dawes et al., 2020a, 2020b). Standards were set in 2016 in consultation with technical experts, following a review of other assessment tools, research literature, South African policy, and the South African National Curriculum Framework for Children from Birth to Four. Expert opinion on abilities and knowledge deemed essential for readiness to learn in Grade R, and for the skills known to be associated with academic achievement in Grades 1, 2 and 3, were also considered when setting standards.

Children’s social and emotional functioning was assessed using the ELOM Social-Emotional Rating Scale. One sub-scale measures the child’s social relations with adults and peers and the second assesses the child’s emotional readiness for school. Children are rated by their teachers. Satisfactory reliability and concurrent validity of the scales have been established (Dawes et al., 2020a, 2020b).

Lastly, children’s growth status (normal / stunted growth) was measured using the standard WHO practice for growth monitoring in children under five years (World Health Organization, 2006). Height was measured using a stadiometer and height-for-age Z-scores (HFAZ) were calculated using the R package 'anthro' (Schumacher et al., 2021).
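Once a height-for-age Z-score is available, classifying growth status is a threshold rule. The sketch below assumes the Z-score has already been computed against the WHO reference (the step performed by the 'anthro' package in the study). The cut-off of two standard deviations below the norm follows the definition given in the Introduction; the cut-off of three standard deviations for severe stunting is the commonly used WHO convention and is an assumption here, as is the function name.

```python
def growth_status(hfaz, stunted_cut=-2.0, severe_cut=-3.0):
    """Classify growth status from a height-for-age Z-score.

    stunted_cut: two or more SD below the reference norm (from the text).
    severe_cut:  three or more SD below (assumed, common WHO convention).
    """
    if hfaz <= severe_cut:
        return "severely stunted"
    if hfaz <= stunted_cut:
        return "stunted"
    return "normal"

for z in (-0.4, -2.3, -3.6):
    print(z, growth_status(z))
```

Some sources define the categories with strict inequalities (below, rather than at or below, the cut-off); the difference only matters for scores falling exactly on a threshold.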

2.4 Procedure

Seventy-two child assessors were recruited. They participated in a four-day training workshop prior to the commencement of data collection. Only those judged competent in the use of the measures proceeded to the field. Assessors worked in provincial teams under the supervision of a professional supervisor familiar with the measures. Assessors were responsible for selecting the classroom from which children would be randomly selected for the assessment, and selected classrooms with the largest number of 4-year-old children. They then randomly selected individuals from a list of eligible children. Children were assessed in their home language, and all data was captured using an electronic tablet, and saved to a central online server. Data collection took place between 20 September and 19 November 2021. Assessors adhered to COVID-19 protocols laid down by the South African government when conducting assessments. Full details of training and COVID-19 protocols are described in online supplementary material.

2.5 Compliance with Ethical Standards

2.5.1 Informed Consent

Parents (or alternative primary caregivers) were asked via letter to give informed consent for assessment of their children. They were provided with a document in which the child assessment procedures were explained. Parents were requested to indicate whether they approved their child's participation, sign the form and return it to school with the child. The form clearly stated that if it was not returned, this would indicate that the parent did not object to the child’s participation. Follow-up with caregivers who did not return forms would have proved logistically impossible given the scale of the study, the time frame and cost. The study constituted minimal risk to participants as the assessments involved activities similar to those that would occur in the normal course of an early learning programme. As approved by the Department of Basic Education, passive consent was relied on where consent forms were not returned. Children were asked to assent to assessment. Where they declined, the child was replaced. No inducement to participate was offered. ELP principals were asked for consent for assessors to test children at their facilities. Those who declined were replaced by ELPs within the same primary school catchment area.

2.5.2 Ethical Approval

The National Curriculum Framework from Birth to Four is the mandate of South Africa's National Department of Basic Education. The survey was initiated by the Department to monitor the implementation and outcomes of the Framework and to provide baseline data for a series of forthcoming surveys on early childhood programme participation, programme quality, and child outcomes. The data generated in the survey constitutes Department of Basic Education administrative data. The Deputy Director of the Department reviewed and approved the study protocol and consent procedures and approved the analysis of anonymised secondary data by the study team.

2.5.3 Research Involving Human Participants

The study was conducted in accordance with the Declaration of Helsinki (1964 and amendments) and conforms to the South African Professional Conduct Guidelines in Psychology 2007 of the Psychological Society of South Africa (see Footnote 2), and the University of Cape Town Code for Research involving Human Participants (see Footnote 3). All assessors were trained in child protection standards. All child data was anonymised prior to analyses.

2.6 Data Cleaning

Invalid cases were excluded from the final dataset, including duplicates, dummy assessments, and assessments with children who did not satisfy the eligibility criteria. Missing data was evident for variables assessing children’s emotional readiness for school and their social relations with peers and adults (373 sets of scores), and to a lesser degree for the variable recording children’s WHO Z-transformed height, and for the stunting category variable dependent on it (7 missing scores). Missing data was excluded for particular analyses, where needed.

3 Research Questions

  1. For learning outcomes overall (a combination of outcomes on the five domains listed above), what proportion of children are On Track (achieving the expected developmental standard), Falling Behind the standard, and Falling Far Behind the standard?

  2. For each of the five learning domains, what proportion of children are On Track, Falling Behind, and Falling Far Behind the standard?

  3. For the social-emotional domain, what proportion of children meet the expected score on measures of their Social Relations with Peers and Adults, and their Emotional Readiness for School?

  4. What proportion of children show normal growth, or are stunted or severely stunted in their growth?

  5. Do the predictors age, sex, quintile, growth status, and socio-emotional score combine to model the ELOM 4&5 total successfully?

4 Results

The findings that follow are reported at national level for each study question in turn. Unless indicated, these are stratified by sex and by quintile. Greyscale shades are used in figures throughout to indicate percentages of children in each ELOM 4&5 band, using the key: On Track (transparent, or 0% grey), Falling Behind the standard (light grey), and Falling Far Behind the standard (dark grey).

Percentages of children in each ELOM 4&5 band are indicated on the frequency histograms that report distributions of scores. Each frequency histogram has an estimated (Gaussian kernel) density curve overlaid, but the bandwidth is kept constant, and should be considered only a rough estimate. We report findings for ELOM total scores first, followed by findings for individual domains.
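To make the overlaid curves concrete, the following sketch evaluates a Gaussian kernel density estimate with a single fixed bandwidth, optionally using survey weights. It is a plain illustration of the technique, not the plotting code used for the figures; the bandwidth value, the grid and the function name are assumptions.

```python
import math

def gaussian_kde_fixed(scores, bandwidth, grid, weights=None):
    """Evaluate a fixed-bandwidth Gaussian kernel density estimate.

    density(x) = sum_i w_i * phi((x - x_i) / h) / (h * sum_i w_i),
    where phi is the standard normal density and h is the bandwidth.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if weights is None:
        weights = [1.0] * len(scores)
    total_weight = sum(weights)
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi) * total_weight)
    density = []
    for x in grid:
        acc = 0.0
        for x_i, w_i in zip(scores, weights):
            z = (x - x_i) / bandwidth
            acc += w_i * math.exp(-0.5 * z * z)
        density.append(norm * acc)
    return density

# Hypothetical scores on the 0-100 ELOM 4&5 total scale, evaluated every half point.
grid = [i * 0.5 for i in range(201)]
curve = gaussian_kde_fixed([40.0, 45.0, 50.0], bandwidth=5.0, grid=grid)
```

Because the same bandwidth is applied to every subgroup regardless of its size or spread, the curve may be over-smoothed for some groups and under-smoothed for others, which is why it should be read only as a rough guide to shape.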

4.1 For Learning Outcomes Overall, What Proportion of Children are On Track, Falling Behind the Standard, and Falling Far Behind the Standard?

Descriptive statistics are reported for the ELOM 4&5 scores in Table 4 (mean, standard deviation, median, and the 10th (p_10) and 90th (p_90) percentile points are included). This table shows that at the national level the mean ELOM 4&5 total score is 45.09 (SD = 14.10). Girls perform better than boys with mean scores of 46.40 (SD = 14.23) and 43.70 (SD = 13.83) respectively. This 2.7-point difference represents a sex difference of approximately 20% of a standard deviation (Cohen’s d = 0.19), which is small but statistically significant (t(422) = 5.13, p < 0.001).
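The reported effect size can be approximated from the summary statistics alone. The sketch below uses the simple equal-weight pooled standard deviation; the exact pooling used in the study (for example, whether survey weights or group sizes entered the calculation) is not stated, so this is an assumption that happens to reproduce the reported value to two decimal places.

```python
import math

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the equal-weight pooled standard deviation."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd

# Girls versus boys on the ELOM 4&5 total score.
d_sex = cohens_d(46.40, 14.23, 43.70, 13.83)
print(round(d_sex, 2))  # 0.19
```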

Table 4 National: Descriptive Statistics for Total ELOM 4–5 Score

Figure 2 shows the distribution of ELOM 4&5 total scores (i.e., summed over all the domains) for the 5,139 children in the weighted sample, and shows that 45.7% of children who attend Early Learning Programmes in South Africa are achieving the standard (On Track), while 26.3% are Falling Behind, and 28% are Falling Far Behind the standard. Overall, girls are performing better than boys, with half On Track (50.2%) and 25.8% Falling Far Behind, while only 41% of boys are On Track and 30.3% are Falling Far Behind. A contingency table analysis, with Rao & Scott’s adjustment, showed that the distributions across these categories for boys and girls were significantly different (F(1.9, 785.8) = 10, p < 0.001). This trend is also evident in four of the five ELOM 4&5 learning domains, as discussed below.
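Band percentages of this kind are weighted shares: each child contributes their sampling weight rather than a count of one. The sketch below shows the calculation on made-up data; the weights and band labels in the example are hypothetical, and the study's actual weights are described in its supplementary materials.

```python
from collections import defaultdict

def weighted_shares(bands, weights):
    """Percentage of the total weight falling in each performance band."""
    totals = defaultdict(float)
    for band, weight in zip(bands, weights):
        totals[band] += weight
    grand_total = sum(totals.values())
    return {band: 100.0 * t / grand_total for band, t in totals.items()}

def percentage_point_gap(share_a, share_b):
    """Difference between two percentages, in percentage points."""
    return share_a - share_b

# Hypothetical four-child example with unequal weights.
example = weighted_shares(
    ["On Track", "On Track", "Falling Behind", "Falling Far Behind"],
    [3.0, 1.0, 2.0, 2.0],
)
print(example["On Track"])  # 50.0

# Girls versus boys On Track, using the percentages reported in the text.
print(round(percentage_point_gap(50.2, 41.0), 1))  # 9.2
```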

Fig. 2 Proportions of Children On Track, Falling Behind and Falling Far Behind the Total ELOM 4–5 Score Standard (with Density Curves Overlaid, and Separated by Sex)

As Table 4 shows, performance on the ELOM 4&5 improves significantly from lower to higher quintiles. Mean scores and standard deviations give a good sense of differences and overlap as well as the extent of variation within each quintile. Children in quintile 1 early learning programmes score on average 42.50 (SD = 13.42), while those in quintile 5 score almost 7 points higher, with an average of 49.49 (SD = 16.03). This difference is statistically significant (t(232) = 3, p < 0.003), and represents a small to medium-sized effect (Cohen’s d = 0.47). A gradient is also evident in the proportions of children in the different quintiles achieving the ELOM 4&5 standard. We found that 57% of quintile 5 children are On Track, while only 19.5% are Falling Far Behind. In contrast, 36.5% of quintile 1 children are On Track and 33.4% are Falling Far Behind the expected standard. A contingency table analysis, with Rao & Scott’s adjustment, showed that the distributions across school quintiles for ELOM 4&5 categories were significantly different (F(5.6, 2374.2) = 6, p < 0.001).

In sum, analysis of ELOM 4&5 total scores in this study shows that fewer than 50% of South African children attending known early learning programmes are On Track for early learning outcomes. In line with the international literature, girls outperform boys, with the proportion On Track (and thus achieving the ELOM 4&5 standard) about 9 percentage points higher among girls.

4.2 What Proportion of Children is On Track, Falling Behind, and Falling Far Behind the Standard for Each of the Five ELOM 4&5 Learning Domains?

Children’s performance on all domains is summarised in Table 5. It is evident that apart from the literacy domain, where 54.7% of the children are On Track, in no other domain does the proportion exceed 50%. As all domains are important for performance in the primary phase of education, this figure is of considerable concern.

Table 5 Percentage of children On Track in each ELOM 4–5 Domain

Next, we report findings for each domain. It is useful to remember that standard scores for all ELOM 4&5 domains range from 0 to 20. We found that girls outperform boys in all domains apart from Gross Motor Development. A social gradient is evident in four out of the five domains, with children in the higher quintiles achieving higher scores than those in lower quintiles.

4.2.1 Gross Motor Development (GMD)

Findings are reported in Table 6 and Fig. 3. GMD scores have a fairly narrow range, with a mean score of 8.22 and a standard deviation of 3.91 out of a possible 20 points, and this narrowness of range is also evident within sex and quintile groups. ELOM 4&5 bands follow this trend, with 48.3% of children On Track, 24.4% Falling Behind, and 27.4% Falling Far Behind. Although boys seem to score slightly higher than girls, this difference is not significant (t(232) = 0.07, p < 0.90). Similarly, a weighted GLM shows that apparent differences across quintiles are not significant (b = 0.08, SE = 0.08, t = 1.04, p < 0.3).

Table 6 Descriptive statistics for Gross Motor Development
Fig. 3

Proportions of Children On Track, Falling Behind and Falling Far Behind for Gross Motor Development, (with Density Curves Overlaid, and Separated by Sex)

4.2.2 Fine Motor Coordination and Visual-Motor Integration (FMC-VMI)

Findings are reported in Table 7 and Fig. 4. The aggregate mean score for this domain is 11.01 (SD = 3.57). Boys score lower with a mean of 10.51 (SD = 3.47) whereas girls score higher with a mean of 11.48 (SD = 3.60). This is a fairly small but significant difference (Cohen’s d = 0.27, t(232) = 5.7, p < 0.001). FMC-VMI mean scores show a social gradient (b = 0.41, SE = 0.10, t = 4.23, p < 0.001), with the mean score increasing from 10.51 (SD = 3.51) in quintile 1 to 12.06 (SD = 3.82) in quintile 5. There is a notable difference in the distribution of ELOM 4&5 bands between boys and girls and between quintiles in this domain: girls fare better than boys with 35.1% On Track and 29.9% Falling Far Behind, while 25.5% of boys are On Track and as many as 40.6% are Falling Far Behind.

Table 7 Descriptive statistics for Fine Motor and Visual-Motor Integration
Fig. 4

Proportions of Children On Track, Falling Behind and Falling Far Behind on Fine Motor Coordination and Visual-Motor Integration (with Density Curves Overlaid, and Separated by Sex)

A contingency table analysis, with Rao & Scott’s adjustment, showed that the distributions across sex for FMC-VMI categories were significantly different (F(1.8, 761.4) = 18, p < 0.001). Children in quintile 1 have a much greater chance of Falling Far Behind (39.7%), with only 24.4% On Track, compared with those in quintile 5 (28.9% Falling Far Behind and 43.4% On Track). A contingency table analysis, with Rao & Scott’s adjustment, showed that the distributions across quintiles for FMC-VMI categories were also significantly different (F(1.8, 761.4) = 18, p < 0.001).

4.2.3 Emergent Numeracy and Mathematics (ENM)

Findings are reported in Table 8 and Fig. 5. A similar general pattern to that for the total ELOM 4&5 score distribution is evident. The mean score in this domain for the whole sample is 8.39 (SD = 4.14). Overall, 38.4% of children achieve the ENM standard and are On Track. Of some concern, given the influence of this domain on later school performance, is that 28.5% are Falling Behind and 33.1% are Falling Far Behind; that is, 61.6% of children do not meet the standard. Once again, boys score lower, with a mean of 8.08 (SD = 4.02) compared with 8.67 (SD = 4.22) for girls. This difference is very small but statistically significant (Cohen’s d = 0.14, t(232) = 3.7, p < 0.001). A greater proportion of girls are On Track (41.7%), while 30.8% are Falling Far Behind; 35% of boys are On Track and 35.5% are Falling Far Behind. A contingency table analysis, with Rao & Scott’s adjustment, showed that the distributions across sex for ENM categories were significantly different (F(1.9, 802.28) = 6.7, p < 0.002). As with the total ELOM 4&5 scores, a social gradient is evident in this domain (b = 0.38, SE = 0.10, t = 3.73, p < 0.001): the mean for quintile 1 children is 7.96 (SD = 3.93) and the mean for quintile 5 children is 9.49 (SD = 4.23), a small to moderate difference (Cohen’s d = 0.37). Judging by school quintiles, the advantage in this domain also appears to be confined to the top end of the income distribution (quintile 5), with children in quintile 4 performing very similarly to children in quintiles 2 and 3. Note the large standard deviations in relation to the mean scores here, which indicate considerable variation between individuals, even within the same quintile. In particular, the difference between the mean scores for boys and girls is considerably smaller than the variation within the samples of boys and girls.

Table 8 Descriptive statistics for Emergent Numeracy and Mathematics
Fig. 5

Proportions of Children On Track, Falling Behind and Falling Far Behind on Emergent Numeracy and Mathematics (with Density Curves Overlaid, and Separated by Sex)

4.2.4 Cognition and Executive Function (CEF)

Findings are reported in Table 9 and Fig. 6. Overall, scores are relatively low, but many children are On Track, particularly those in quintile 4 and 5 ELPs. On average, children score only 6.96 (SD = 4.01) out of a possible 20. Again, we see a sex difference, with girls outperforming boys (Cohen’s d = 0.14, t(231) = 2.5, p < 0.001). Girls achieve a mean Cognition and Executive Function score of 7.23 (SD = 4.03), with 43.8% On Track and 23.7% Falling Far Behind. Boys, on the other hand, achieve a mean score of 6.67 (SD = 3.96), and only 38.8% are On Track, 31.8% are Falling Behind, and 29.5% are Falling Far Behind. A very steep social gradient is evident for this domain (b = 0.67, SE = 0.13, t = 5.21, p < 0.001). Average Cognition and Executive Function scores increase from 6.12 (SD = 3.73) for quintile 1 children to 8.77 (SD = 4.71) for quintile 5 children. Here, attention should again be paid to the large standard deviations within quintiles, which indicate large variation in individual scores. Children in quintile 1 have a greater chance of Falling Far Behind (32.6%), with only 34.5% On Track and achieving the standard. Among quintile 5 children, only 15.8% are Falling Far Behind, 29.4% are Falling Behind, and a fairly high 54.79% are On Track to meet the expected standards for Cognition and Executive Function.

Table 9 Descriptive Statistics for Cognition and Executive Function
Fig. 6

Proportions of Children On Track, Falling Behind and Falling Far Behind on Cognition and Executive Functioning (with Density Curves Overlaid, and Separated by Sex)

4.2.5 Emergent Literacy and Language (ELL)

Findings are reported in Table 10 and Fig. 7. Children performed relatively well on the ELL domain. Across the sample, the mean score is 10.53 (SD = 4.45). The mean for boys was 10.16 (SD = 4.39) while that for girls was 10.87 (SD = 4.48), and this difference was statistically significant (Cohen’s d = 0.16, t(231) = 2.1, p < 0.04). The ELL scores show the expected significant pattern across quintiles (b = 0.43, SE = 0.14, t = 3.11, p < 0.002), with the mean for quintile 1 children being 9.92 (SD = 4.37), increasing to a mean of 11.32 (SD = 5.0) for quintile 5 children. The difference between quintiles 1 and 5 is a small-sized effect (Cohen’s d = 0.29). A larger share of children fall in the higher ELOM 4&5 bands for this domain than for others. Overall, 54.7% are On Track (57.3% of girls and 51.9% of boys), and only 19.3% are Falling Far Behind (17.7% of girls and 21% of boys). As expected, children in quintile 1 have a greater chance of Falling Far Behind (23.3%, but with 47.3% On Track), while in quintile 5, 16.2% are Falling Far Behind and 63.0% are On Track. A contingency table analysis, with Rao & Scott’s adjustment, showed that the distributions across quintiles for ELL categories were significantly different (F(2, 845) = 4.4, p < 0.01).

Poor performance in domains related to later school achievement is of considerable concern, particularly for those children in lower quintiles. That said, there is considerable inter-individual variation in performance on the ELOM 4&5, with a substantial proportion of children in all five quintiles being On Track in early learning domains.

Table 10 Descriptive Statistics for Emergent Literacy and Language
Fig. 7

Proportions of Children On Track, Falling Behind and Falling Far Behind on Emergent Literacy and Language (with Density Curves Overlaid, and Separated by Sex)

4.2.6 Social and Emotional Readiness for School

Findings are reported in Table 11. As we observed in the introduction, findings on the relationship between social and emotional functioning and primary school outcomes vary. In the present survey, we find that 68.1% of children were On Track in terms of Social Relations with Peers and Adults, and 62.6% were On Track in terms of Emotional Readiness for School. Findings are consistent with the pattern of girls being more developmentally advanced than boys in other domains. Regarding social relations, substantially more girls (72.4%) than boys (63.6%) meet the expected standard in this area. While a greater proportion of girls (66.0%) than boys (59.0%) show emotional readiness for school, the difference is relatively minor. The social gradient found for other domains was not evident here.

Table 11 Percentages of Children Meeting Social and Emotional Functioning Cutoff Points

The overall findings are positive, as both social relations and emotional readiness have been shown in some studies to be associated with adjustment to the school environment.

4.3 What Proportion of Children Show Normal Growth, or are Stunted or Severely Stunted in their Growth?

Using the WHO-reference standard (height-for-age Z-score), we find that 94.89% of children surveyed were of normal height for their age, while 4.60% were moderately stunted, and 0.51% were severely stunted. There is little difference in the proportions of boys (95.1%) and girls (94.69%) with normal growth in this national preschool sample (F(2,831) = 0.96, p < 0.40). However, more children in lower socio-economic quintiles have stunted growth (b = -0.25, t = -2.47, p < 0.02); 7.34% of children in quintile 1 (the poorest) are either moderately or severely stunted while the figure for quintile 5 children (the wealthiest) is 2.5%.
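The growth categories used above can be illustrated with a short classification sketch. It assumes the conventional WHO cut-offs for height-for-age Z-scores (below -2 SD for stunting, below -3 SD for severe stunting). The function name `classify_growth` and the example Z-scores are hypothetical and are not survey data.

```python
def classify_growth(haz):
    """Classify a height-for-age Z-score (HAZ) using conventional WHO cut-offs.

    Assumption: severe stunting is HAZ < -3, moderate stunting is
    -3 <= HAZ < -2, and all other values count as normal height for age.
    """
    if haz < -3:
        return "severely stunted"
    if haz < -2:
        return "moderately stunted"
    return "normal"


# Hypothetical Z-scores for illustration only.
sample_haz = [0.4, -1.2, -2.3, -3.5, -2.0]
labels = [classify_growth(z) for z in sample_haz]
print(labels)
```

A value of exactly -2.0 is treated as normal here because the cut-off is applied as a strict inequality; a survey protocol may handle boundary values differently.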

4.4 What is the Relation Between Early Learning Programme Quality and Performance on the ELOM?

Although it was not one of our key questions in the study reported above, it is important to know how the quality of Early Learning Centres (ELCs) relates to performance on the ELOM 4&5. We did not have quality data on the full set of ELPs from which children were sampled in the main study, but we did have access to a technical report on the development of an instrument for measuring ELP quality (Tredoux et al., 2022). That report computes correlations between five dimensions of ELC quality and the ELOM 4&5 total score for a subset of 522 of the ELCs sampled in the present study. We were thus able to examine the correlation between each dimension of ELC quality and the performance of children in the centres. These correlations are reported in Table 12; they are small in size but statistically significant for three of the quality dimensions.

Table 12 Correlations between ELC quality dimensions and Total ELOM 4–5 scores

4.5 Modelling Performance on the ELOM 4&5

We conducted a mixed linear regression analysis (or mixed linear model) to test potential predictors of the Total ELOM 4&5 Score. The outcome variable was the Total ELOM 4&5 Score, and the predictors were variables thought to be potentially important, as discussed in the literature review: age, sex, quintile, growth status (recoded as stunted vs not stunted), and socio-emotional score (the average of social and emotional functioning, as rated by teachers). We did not consider potential interactions or nonlinear relations between predictors and outcome. Mixed linear models allowed us to investigate effects at the level of the individual while also taking clustering in the data into account. In the descriptive statistics reported earlier, we used a sampling weight correction, as explained in the Method section. However, we did not use a weight correction in the regression modelling. There are several reasons for not doing so, and many authors point to disadvantages of including sampling weights in regression models (e.g., Lumley, 2010).

The results of the mixed linear analysis are shown in Table 14. Significance tests of individual predictors are reported with associated p-values. All the predictors entered into the model significantly predicted Total ELOM 4&5 Score (at p < 0.001). When running the analyses, we made use of “dummy variables” to represent categorical variables. Note that in this table specific contrasts are shown as effects, e.g., the coefficient for “Stunted” in the table is the coefficient that compares the group that is severely stunted with the reference group (normal HFA). To assess the effects of all levels of categorical variables simultaneously, we conducted omnibus ANOVA tests on predictor terms, and these are shown in Table 13. In the first row of the detailed table (Table 14), the Intercept represents the Total ELOM 4&5 Score when the continuous predictors are all set at 0 and the categorical predictors are set to their reference category (e.g., boys, in the case of sex). The following rows of the table contain raw and standardised coefficient estimates for each of the predictors in the model, along with confidence intervals and p-values for the tests that the coefficients are equal to 0. In this case, the p-values are not of particular interest to us given the large sample size: when sample sizes are very large, p-values are more likely to be very small, and over-reliance on them can lead to claiming support for results that are in fact of little practical significance. The standardised regression coefficients (Beta) are more interesting. They were obtained by dividing each numeric variable by twice its SD (following a recommendation by Gelman [2008]), so that the units of each predictor are on the same scale and the coefficients are therefore roughly comparable. They estimate the amount of change in Total ELOM 4&5 Score in SD units.
Figures 8 and 9 illustrate the effects graphically, showing their unstandardised sizes in the first of the figures, and the standardised effects in the second. Note that confidence intervals are shown to give a sense of the uncertainty in the estimates, as well as the line of null effect (0).
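The two-SD scaling mentioned for the standardised coefficients can be sketched as follows. This is an illustration rather than the authors' code: it assumes each numeric predictor is also centred on its mean before being divided by twice its sample SD, and the function name `scale_by_two_sd` and the example ages are hypothetical.

```python
import statistics


def scale_by_two_sd(values):
    """Centre a numeric predictor and divide it by twice its sample SD.

    Assumption: centring on the mean is applied as well; the text only
    mentions division by twice the SD.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return [(v - mean) / (2 * sd) for v in values]


# Hypothetical ages in months, not survey data.
ages = [50, 52, 54, 56, 58]
scaled_ages = scale_by_two_sd(ages)
print([round(x, 3) for x in scaled_ages])
```

After this transformation a numeric predictor has an SD of 0.5, which puts its coefficient on roughly the same footing as that of a balanced binary predictor coded 0/1.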

Table 13 ANOVA summary table for mixed linear modelling of ELOM 4&5 Total
Table 14 Coefficient Summary Table for Mixed Linear Modelling of ELOM 4&5 Total
Fig. 8

Effects in the model of Total ELOM 4&5 Score: unstandardised. Lines are 95% Confidence Intervals. *** p < .001 ** p < .01

Fig. 9

Effects in the model of Total ELOM 4&5 Score: standardised. Lines are 95% Confidence Intervals. *** p < .001 ** p < .01

We find that growth stunting has a strong impact on the Total ELOM 4&5 Score (Beta = -0.33), i.e., one of the largest differences in Total ELOM 4&5 Scores is found between children who are stunted and those with normal HFA. Socio-economic status (measured in our sample by school quintile) also has a strong effect on the Total ELOM 4&5 Score. Differences vary by quintile, but one can see stark differences between children in lower and upper quintiles, particularly between children in quintile 1 and those in quintile 5 (Beta = 0.45) and between children in quintile 1 and those in quintile 4 (Beta = 0.28). Socio-emotional development has a surprisingly large effect on Total ELOM 4&5 Scores (Beta = 0.31). This is followed in strength by the effects for age, and finally sex.

The marginal R² statistic represents the proportion of variance explained by the fixed effects alone, which in this case was R² = 0.18. Conditional R² estimates the proportion of variance explained by the entire model, i.e., by both fixed effects and random effects; in this model, conditional R² was 0.35.
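The distinction between the two statistics can be made concrete with a small sketch based on the variance-component definitions commonly used for mixed models. The variance components below are hypothetical values chosen only to reproduce the two reported figures; they are not estimates from the survey, and the function name `r_squared_mixed` is our own.

```python
def r_squared_mixed(var_fixed, var_random, var_residual):
    """Return (marginal, conditional) R-squared from variance components.

    Marginal: variance of the fixed-effect predictions over total variance.
    Conditional: fixed plus random-effect variance over total variance.
    """
    total = var_fixed + var_random + var_residual
    marginal = var_fixed / total
    conditional = (var_fixed + var_random) / total
    return marginal, conditional


# Hypothetical variance components, chosen to match the reported values.
marginal, conditional = r_squared_mixed(18.0, 17.0, 65.0)
print(round(marginal, 2), round(conditional, 2))
```

On this reading, the gap between the two values (0.35 - 0.18 = 0.17) is the share of variance attributable to the random effects, i.e., to clustering in the data.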

5 Discussion

We have reported detailed results from the first national South African survey of the growth and learning outcomes of children aged 50–59 months enrolled in ELPs. Prior to this study, the country lacked representative data that could be used to inform education planning and to assess the extent to which the country was on track to meet Sustainable Development Goal Target 4.2, as measured by Indicator 4.2.1.

Children were assessed in the final term of the year prior to proceeding to the Reception year of primary school (Grade R/Kindergarten). By that point, as many as possible should be achieving the ELOM 4&5 standard for the measure as a whole and for all of the domains.

Research reviewed for this article has consistently found that by 50–59 months of age, language and mathematics skills, executive functioning, fine motor coordination, and visual motor functioning are all predictive of children’s performance in the Foundation Phase of their schooling. Social and emotional development are also predictive of adjustment to school and of achievement in the primary phase, but to a lesser extent.

Analysis of ELOM 4&5 Total scores in this study shows that the development of fewer than 50% of South African children attending known early learning programmes is On Track for successful transition to Grade R. In line with the international literature, girls outperform boys, with the proportion achieving the ELOM 4&5 standard about 9 percentage points higher among girls.

In line with other studies of the relationship between socio-economic background and indicators of school readiness, clear socio-economic differences are evident (using school quintiles as a proxy for the socio-economic status of the children in the catchment area). Almost 60% of children in the wealthiest quintile 5 are On Track for early learning (they achieve the ELOM 4&5 total score standard), while fewer than 40% of children in quintile 1 are functioning at this level.

Of particular concern is that between 28 and 32% of children in quintiles 1 to 4 are Falling Far Behind the ELOM standard despite participating in an ELP.

The skills gap between wealthier and poorer children is likely to endure and impact on school achievement in a number of areas, but particularly in mathematics. That said, there is considerable variation within all quintiles and being of lower socio-economic status does not necessarily mean that a child will not be On Track—a proportion of children in all three of the lower quintiles, and never lower than 38%, achieves the ELOM 4&5 total score standard.

Regarding ELOM 4&5 domains assessed directly, only in the ELL domain are more than fifty percent (54.7%) of children achieving the expected standard. Even here, however, we observe that quintile 5 children outperform others. This is unsurprising. Studies from many countries have shown that early language and literacy proficiency varies with socio-economic status (SES) (as noted in our review of the literature). Higher SES (quintile) homes have been shown in a number of studies to provide more resources for the stimulation of early literacy (e.g., Rowe, 2008; Zuilkowski et al., 2019). A recent South African study (Dawes et al., 2020a) found that in low SES homes, caregivers spent very little time in stimulation activities with their young children (including reading). Access to books was very limited.

Regarding Gross Motor Development, we have noted that apparent quintile differences are not significant. However, South African research suggests that we might expect lower income South African children to outperform their wealthier counterparts. For example, Draper et al. (2017) found that significantly greater proportions of urban and rural low-income preschool children than their high-income urban counterparts were within the average to above average range for gross motor skill proficiency. The greatest proportion of proficient children was from low-income rural settings. Low-income children in both settings were also the most physically active children.

Of particular concern is poor performance in FMC-VMI, where only 30.5% of children achieve the standard. In ENM, 38.4% achieve the standard, while 41.4% do so in the CEF domain and 48.3% do so in GMD. In all domains apart from GMD, socio-economic differences are starkly evident, significantly disadvantaging poorer children as they proceed to school.

Children are doing relatively well in the social and emotional domains which are both expected to assist transition to formal schooling and enable learning. Some 68% of teacher ratings of children's Social Relations with Peers and Adults are at the expected level, and 63% of ratings of children are at the expected level for their Emotional Readiness for School. It is important to bear in mind that these are subjective measures. A socio-economic gradient in these domains was not evident.

Regarding children’s growth, the stunting rate found in this study is lower than rates for much the same age group reported by both the 2016 South African Demographic and Health Survey (16%) and the National Income Dynamics Study (10–20%). Our data is cross-sectional, so we do not have growth data from earlier points in development. However, it is probable that our finding of lower stunting rates than those in national surveys of children under five reflects catch-up growth. In addition, all the children in the Thrive by Five Index survey were attending ELPs which provide some form of nutrition to low-income children, thereby reducing the risk of malnutrition. Children whose parents cannot afford ELP fees are less likely to attend an ELP and are more likely to be impacted by malnutrition. If these children had been included in the sample, it is likely that the stunting rates would have been somewhat higher.

Most policy-related research on child growth focuses on the implications for development of moderate and severe growth stunting. But as Stevens et al. (2012, p. 824) point out, “the hazardous effects of undernutrition happen along a continuum of mild, moderate, and severe undernutrition.” The implication is that mild stunting (HFAZ more than 1 SD below the median of the WHO reference population) should not be ignored, as it could be a barrier to thriving. This Index found 1 in 4 children to be mildly stunted. Findings from the NIDS study (Casale, 2020) reinforce the importance of attending to this group of children. Casale classified children’s catch-up growth by 4 to 5 years as “incomplete” (p. 6) if they were mildly stunted. She reports that children in this group did worse on education outcome measures (grade completion and failure) “compared to the children who were never stunted” (p. 14). Most importantly, there was also little difference in education outcomes between this group and those who remained stunted.

In the current study, we find 5.65% of the children attending known ELPs fall below the 2 SD cut-point for moderate stunting. Their development, health, education and life chances will be negatively impacted. And the rate for those children whose caregivers cannot afford to send them to a programme is likely to be higher.

6 Implications/Conclusion

The findings of this study suggest that children who attend ELPs in South Africa are not meeting the ELOM 4&5 expected standards on a number of indicators. Indeed, the human potential of many children in this cohort is under threat. However, the hope is that the results will be used to inform interventions to improve the quality of early learning services and access to improved nutritional support. As two further waves of data using the same measures will be collected in 2024 and 2029, it will be possible to track whether these efforts are paying off.

Much of our data, and that from other countries, points to the importance of household wealth as a powerful determinant of early development (Boyden et al., 2019; Bradley & Corwyn, 2002). This is evident in the pervasive differences in performance between children from the five socio-economic levels (school quintiles) in nearly all the early learning and health outcomes we report. Some 60% of South Africa’s children live in poverty, and despite their best efforts, most caregivers cannot provide the quality of support for early learning that is available in the homes of their better resourced counterparts. The evidence seems clear that the poverty landscape of South Africa is very unlikely to change in the short to medium term. Nonetheless, we know from research in South Africa (e.g., Dawes et al., 2021), and in other countries (e.g., Melhuish et al., 2008; Nakajima et al., 2019), that poor children benefit considerably from participation in quality ELPs. Expanding preschool provision alongside ensuring access to appropriate nutrition from pregnancy through early childhood is essential if we are to improve the child outcomes reported in this study.

7 Limitations

It is most important to recognise that the data used in these analyses was collected between September and November 2021. The COVID-19 pandemic significantly affected the ELP service sector over the two-year period preceding this data collection. Closures of ELPs during various lockdown periods in 2020 and 2021 significantly disrupted the amount of programme participation possible for the cohort of children of interest here. In addition, services and preschools were subject to standard operating procedures required by the national government to manage risks of infection. The pandemic and the associated changes to the daily programmes of ELPs have likely changed the nature of children’s experience in several ways and in all probability reduced the benefit they might normally have gained. As a result, one cannot regard the findings presented as reflecting children’s development under normal societal conditions. However, as we do not have representative data on the pre-pandemic population of children under five years of age, we cannot know the extent to which these findings represent a “COVID-19” effect.

In the absence of household level data, school quintiles were used as proxies for the probable socio-economic background of the children in the analyses reported in this paper. There are recognised limitations of the quintile system. For example, while a school may be in a quintile 4 area based on the characteristics of that population, its intake may comprise significant numbers of poor children whose primary caregivers aspire to the better quality of education they perceive to be provided at these schools (van Dyk & White, 2019).

Analysis of the study data using the quantum charged for fees by ELPs has recently been conducted and reported by the Thrive by Five Index team elsewhere (Henry & Giese, 2022). This has shown a more pronounced gradient in learning outcomes as measured by the ELOM 4&5. While use of school quintiles is necessary for sampling purposes (we did not know the ELP fees until they were surveyed), ELP fees may be a more reliable proxy for children’s socio-economic backgrounds and hence analysis of their relationship to learning outcomes.

Finally, we recognise that the exclusion of children not in ELPs is a limitation, as the national skills gap between rich and poor children is likely to be underestimated. The same applies to estimates of the proportions of children who are falling behind the expected ELOM standard. The intention is to address this by enrolling children not in ELPs in the next round of data collection. This will be challenging, as home visits to assess children are unlikely to be feasible, and even if they are, it is not appropriate to test children in circumstances that might result in measurement error. Many of the poor children one would wish to reach live in overcrowded small dwellings that do not lend themselves to sound assessment. Several options are, however, being explored. These include hosting open days for children who are not currently in ELPs and who will be enrolled in Grade R the following year, and reaching children through community home visiting programmes and health facilities.