Gender gaps in early educational achievement


This paper analyzes the source of the gender gap in third-grade numeracy and reading. We adopt an Oaxaca-Blinder approach and decompose the gender gap in educational achievement into endowment and response components. Our estimation relies on unusually rich panel data from the Longitudinal Survey of Australian Children in which information on child development reported by parents and teachers is linked to each child’s results on a national, standardized achievement test. We find that girls in low- and middle-socio-economic-status (SES) families have an advantage in reading, while boys in high-SES families have an advantage in numeracy. Girls score higher on their third-grade reading tests in large part because they were more ready for school at age 4 and had better teacher-assessed literacy skills in kindergarten. Boys’ advantage in numeracy occurs because they achieve higher numeracy test scores than girls with the same education-related characteristics.
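For reference, a standard twofold form of such a decomposition can be written as follows. The notation is an assumption made here for illustration, since the abstract does not state the estimating equation: \(\bar{Y}_{g}\) and \(\bar{X}_{g}\) denote the mean test score and the mean characteristics (including a constant) of group \(g \in \{G, B\}\), for girls and boys, \(\hat{\beta}_{g}\) are group-specific OLS coefficients, and \(\hat{\beta}^{*}\) is a reference coefficient vector.

\[
\bar{Y}_{G} - \bar{Y}_{B} = \left(\bar{X}_{G} - \bar{X}_{B}\right)'\hat{\beta}^{*} + \left[\bar{X}_{G}'\left(\hat{\beta}_{G} - \hat{\beta}^{*}\right) + \bar{X}_{B}'\left(\hat{\beta}^{*} - \hat{\beta}_{B}\right)\right]
\]

The first term is the endowment component (differences in characteristics valued at the reference coefficients). The bracketed term is the response component (differences in how the same characteristics translate into test scores).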



  1.

    In the gender wage gap literature, these are often referred to as the characteristics and returns components, respectively.

  2.

    We are aware of only two exceptions. Jacob (2002) conducts an OB decomposition of the gap in men’s and women’s university attendance, while Fortin et al. (2015) adopt an extension of the OB method to decompose the gender gap in high academic achievement.

  3.

    Recent exceptions include Husain and Millimet (2009), DiPrete and Jennings (2012), Cornwell et al. (2013), and Fortin et al. (2015).

  4.

    The results, available upon request, are similar if we use an income- or education-based measure of SES.

  5.

    The correlation between the WAI and WISC scores is 0.25.

  6.

    This is based on a version of Goodman’s (1997) Strengths and Difficulties Questionnaire (SDQ) that has been adapted for toddlers.

  7.

    To facilitate interpretation, we normalize WAI and WISC scores, parental involvement, teacher-assessed absolute and relative achievement, and the SDQ measure so that each has a mean of 0 and a standard deviation of 1.

  8.

    Specifically, 17 % of cohort B children dropped out of the survey before wave 4 (prior to third grade); 5 % did not consent to the data linkage; NAPLAN test scores could not be retrieved for 9 % of cases; and reading or numeracy test scores were missing for 1 % of cases.

  9.

    The recoding indicator takes the value of 1 if information is missing in the case of dummy variables, and takes the value of 1 if information is available in the case of continuous variables.

  10.

    We compared the mean characteristics of our estimation sample to those of the full sample of respondents in cohort B. There are significant differences (at the 5 % level) in means for eight (out of 43) characteristics. Most of these differences are small and unlikely to be economically meaningful. On average, respondents in the estimation sample appear slightly more advantaged (they have a higher birth weight, more educated parents, more often live with two biological parents, and are less often indigenous), but at the same time they less often get help with homework every day from the secondary parent. Results are available upon request.

  11.

    On average, boys also have lower achievement in writing (24 points), spelling (19 points), and grammar (21 points).

  12.

    Confidence intervals are bootstrapped with 100 replications.

  13.

    Simultaneous estimation across different values of τ allows the variance-covariance matrix of the different \(\alpha_{1}^{j\tau}\) to be obtained and the significance of the gender gap in test scores at points of the achievement distribution to be tested. The equality of \(\hat{\alpha}_{1}^{j\tau}\) at all values of τ was tested and rejected using an F test.

  14.

    Analyses of grammar, spelling, and writing achievement scores result in conclusions similar to those based on reading. These additional results are available upon request.

  15.

    Following Jann (2008), we include a gender indicator variable in the pooled regression.

  16.

    Fortin et al. (2015) use an extension of the OB method to decompose mean differences in the propensity to get high and low grades. Their reweighting approach has the advantage of providing more precise estimates if the conditional mean function is not linear. We have chosen to implement the standard OB method for two reasons. First, we have no reason to believe that the conditional mean function is particularly nonlinear in our case. Second, there is debate in the literature about how sensitive the OB decomposition really is to deviations from linearity, even where they exist (see Fortin et al. 2011).

  17.

    All models are estimated in Stata 13 using the “oaxaca” command without the “categorical” option.

  18.

    The OLS coefficients underpinning the decomposition analysis are presented in Appendices 3 and 4. Analysis of the other domains of literacy including writing, spelling, and grammar resulted in similar conclusions. These results are available upon request.

  19.

    Differential expectations for boys’ and girls’ educational attainment may have long-term consequences. Fortin et al. (2015), for example, provide evidence that gender differences in students’ own post-secondary expectations are one of the most important factors underlying the relative advantage that girls now have in the grades they receive in high school.

  20.

    LSAC allows us to control for parenting style using the consistent and warmth parenting scales. In contrast, the ECLS-K does not include the scale for consistent parenting. See Appendix 2 for the details of these additional variables.
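Notes 15 to 17 describe a standard OB decomposition whose reference coefficients come from a pooled regression that includes a gender indicator. The sketch below illustrates that idea in Python with NumPy on synthetic data. It is not the authors' Stata code: the function names (`ols`, `twofold_ob`), the simulated covariates, and the twofold form of the decomposition are all assumptions made for illustration.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y = X b (X already contains a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def add_const(X):
    """Prepend a column of ones to a covariate matrix."""
    return np.column_stack([np.ones(len(X)), X])


def twofold_ob(X_g, y_g, X_b, y_b):
    """Twofold Oaxaca-Blinder decomposition of mean(y_g) - mean(y_b).

    X_g and X_b are covariate matrices without a constant column.
    Reference coefficients come from a pooled regression that also
    includes a group indicator (hypothetical illustration of note 15).
    """
    Z_g, Z_b = add_const(X_g), add_const(X_b)
    beta_g, beta_b = ols(Z_g, y_g), ols(Z_b, y_b)

    # Pooled regression: constant, covariates, and a girl indicator.
    girl = np.concatenate([np.ones(len(Z_g)), np.zeros(len(Z_b))])
    Z_pool = np.column_stack([np.vstack([Z_g, Z_b]), girl])
    beta_pool = ols(Z_pool, np.concatenate([y_g, y_b]))
    beta_ref = beta_pool[:-1]  # drop the indicator's own coefficient

    m_g, m_b = Z_g.mean(axis=0), Z_b.mean(axis=0)
    endowment = (m_g - m_b) @ beta_ref
    response = m_g @ (beta_g - beta_ref) + m_b @ (beta_ref - beta_b)
    return {
        "gap": y_g.mean() - y_b.mean(),
        "endowment": endowment,
        "response": response,
    }


# Synthetic data only: two standardized covariates per child (assumed).
rng = np.random.default_rng(0)
n = 500
X_g = rng.normal(0.3, 1.0, size=(n, 2))
X_b = rng.normal(0.0, 1.0, size=(n, 2))
y_g = 1.0 + X_g @ np.array([0.5, 0.2]) + rng.normal(0.0, 1.0, n)
y_b = 0.8 + X_b @ np.array([0.6, 0.1]) + rng.normal(0.0, 1.0, n)

result = twofold_ob(X_g, y_g, X_b, y_b)
print(result)
```

Because each group regression includes a constant, the endowment and response components sum to the raw mean gap by construction. Including the indicator in the pooled regression keeps the group difference in intercepts from being absorbed into the reference slope coefficients.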


The authors would like to thank the anonymous referees for helpful comments and suggestions. This paper uses unit record data from Growing Up in Australia: the Longitudinal Study of Australian Children, conducted in partnership between the Australian Government Department of Social Services (DSS), the Australian Institute of Family Studies (AIFS), and the Australian Bureau of Statistics (ABS). This research was supported by the Australian Research Council (ARC) Centre of Excellence for Children and Families over the Life Course (project number CE140100027). The Centre is administered by the Institute for Social Science Research at The University of Queensland, with nodes at The University of Western Australia, The University of Melbourne, and The University of Sydney. The findings and views reported in this paper are those of the authors and should not be attributed to DSS, AIFS, ABS, or ARC.

Appendix 1

Table 5 provides descriptions of the variables included in the Oaxaca decomposition. The variables in bold are shown individually in the results, while the rest are grouped in the “other” category.

Table 5 Variables included in the Oaxaca decomposition

Appendix 2

Table 6 describes the variables that are included in robustness checks but not in the main model. All variables are measured at 6 years old unless specified otherwise.

Table 6 Variables included in the robustness checks

Appendix 3

Table 7 OLS coefficients for reading test scores by SES

Appendix 4

Table 8 OLS coefficients for numeracy test scores by SES

