# Decomposing international gender test score differences

## Abstract

In this paper, we decompose worldwide PISA mathematics and reading scores. While mathematics scores are still tilted towards boys, girls have a larger advantage in reading over boys. Girls’ disadvantage in mathematics is increasing over the distribution of talents. Our decomposition shows that part of this increase can be explained by an increasing trend in productive endowments and learning productivity, although the largest part remains unexplained. Countries’ general level of gender (in)equality also contributes to girls’ disadvantage. For reading, at the upper end of the talent distribution, girls’ advantage can be fully explained by differences in learning productivity, but this is not so at lower levels.

## Keywords

Gender gap, Test scores, PISA, Mathematics, Reading

## JEL Classification

I23, I24, J16

## 1 Introduction

Consensus exists regarding significant gender test score differences in schools. Boys typically excel in mathematics and science whereas girls score better in reading and literacy subjects (e.g., Turner and Bowen 1999; Halpern et al. 2007; Ceci et al. 2009). Although girls have somewhat caught up in mathematics (Hyde and Mertz 2009), differences remain. On the other hand, there is evidence of more men or boys at the upper end of the education or professional distribution (Machin and Pekkarinen 2008), which could be attributed to the larger variance of test scores for boys. The magnitude, spread and practical significance of gender differences in educational outcomes have remained a topic of concern. This concern is important, because gender disparities in achievement at an earlier stage, particularly at the upper ends of the distribution, may impact career selection and educational outcomes at a later stage.

The previous literature mostly examined mean differences (Fryer and Levitt 2010), although quantile regressions exist for some countries: Gevrek and Seiberlich (2014), Sohn (2012), and Thu Le and Nguyen (2018) provide evidence for Turkey, Korea, and Australia, respectively. Two possible explanations have been suggested for these gender gaps, one biological or natural (Benbow and Stanley 1980; Geary 1998) and the other environmental, including family, institutional, social, and cultural influences (e.g., Fennema and Sherman 1978; Parsons et al. 1982; Levine and Ornstein 1983; Guiso et al. 2008; Pope and Sydnor 2010; Nollenberger et al. 2016). Recent studies have examined the impact of culture: Nollenberger et al. (2016) study immigrants in the U.S. to assess whether gender-related culture in the home country can explain differences in mathematics scores; similarly, Guiso et al. (2008) examine gender differences in PISA mathematics scores across 35 countries.

The present study looks at mathematics and reading scores for all countries included in the OECD’s PISA test and tries to decompose these score differences at different percentiles of the distribution through natural and environmental factors that influence the students’ mathematics and reading test scores. This decomposition research is guided by the Juhn et al. (1993) decomposition model, which extends the usual Blinder–Oaxaca decomposition by taking into account the residual distribution. Following this method, this study will decompose test score gaps between males and females to analyze how much of the test score gap can be “predicted” by observable differences across students in determining the test score production function and inequality within these classifications.

In this study, we employed international PISA data to examine test score differences between boys and girls worldwide, focusing on the differences at different quantiles of the distribution. PISA has the advantage of covering various personal, family, school system, and societal background characteristics, which enables decomposing potential differences into effects due to different endowments, institutional settings, and the productivity of learning in different situations. We adopted a decomposition following Juhn et al. (1993), which enabled us to decompose test score differentials into endowment, productivity, and unobservable components.

Our decomposition for score differentials in mathematics shows that part of the increasing disadvantage of girls over the distribution of talent can be explained by an increasing trend in productive endowments and learning productivity, although the largest part remains unexplained. Countries’ general level of gender (in)equality also contributes to girls’ disadvantage. For reading, at the upper end of the talent distribution, girls’ advantage can be fully explained by differences in learning productivity, but this is not so at lower levels. Our contribution to the literature lies in an extension of quantile regression results to practically all PISA countries, to an inclusion of country-specific gender-related variables and to an application of the Juhn, Murphy and Pierce analysis, which extends a simple decomposition to take the residual distribution into account.

The remainder of the paper is organized as follows: The next section describes the PISA database, its features and other data sources used in the study. Section 3 discusses the estimation strategy used in this paper and structures the econometric model based upon the Juhn, Murphy and Pierce decomposition method. Section 4 presents results on test score inequality for our dispersion analysis. Section 5 concludes.

## 2 Data

This paper uses the micro data of the Program of International Student Assessment (PISA) 2012 as well as data on per capita GDP (PPP), gender equality, and government expenditure on education to analyze the decomposition of gender differences in test scores. Combining the available data, the dataset contains information on 480,174 students in 65 countries pertaining to mathematics and reading literacy.

### 2.1 PISA data

PISA is a cross-national study created by the Organization for Economic Co-operation and Development (OECD) to assess students’ ability in mathematics, reading, science, and problem solving. Since its launch in 2000, the assessment has been conducted on a triennial basis. The main advantage of the program is its international comparability, as it assesses students’ ability based on a cohort of students of the same age. Moreover, there is a large volume of background information on students and schools, which may help to put student assessment into perspective. The assessment in each wave focuses on one particular subject,^{1} and tests other main areas as well. In our analysis, we employed data from the 2012 PISA wave, which focused on performance in mathematics.

The PISA 2012 dataset covers the test score performance of students from 34 OECD and 31 non-OECD countries, which includes approximately 510,000 students aged 15 or 16 years. The dataset includes a number of demographic and socioeconomic variables for these students. The instrument was paper-based and comprised a mixture of text responses and multiple-choice questions. The test takes 2 hours to complete. The questions are organized in groups based on real-life situations. A stratified sampling design was used for this complex survey: at least 150 schools were selected^{2} in each country, and 35 students were randomly selected in each school to form clusters. Because of potential sample selection problems, weights were assigned to each student and school. The PISA test scores are standardized with an average score of 500 points and a standard deviation of 100 points in OECD countries. In the PISA 2012 test, the final proficiency estimates were provided for each student and recorded as a set of five plausible values.^{3} In this study, we used the first plausible value as a measure of student proficiency.^{4}
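The role of student weights and plausible values can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the helper names `weighted_mean` and `gender_gap` are hypothetical, and the plausible values are simulated as a latent score plus independent noise, which is only a rough stand-in for the IRT-based values in PISA.

```python
import numpy as np

def weighted_mean(values, weights):
    """Weighted mean of a 1-D array using survey weights."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * values) / np.sum(weights))

def gender_gap(scores, weights, is_female):
    """Weighted boys-minus-girls difference in mean scores."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    is_female = np.asarray(is_female, dtype=bool)
    boys = weighted_mean(scores[~is_female], weights[~is_female])
    girls = weighted_mean(scores[is_female], weights[is_female])
    return boys - girls

# Synthetic illustration only (assumed values, not PISA data).
rng = np.random.default_rng(0)
n = 2000
is_female = rng.random(n) < 0.5
weights = rng.uniform(0.5, 2.0, size=n)          # stand-in for student weights
latent = 500 + 100 * rng.standard_normal(n) - 10 * is_female
# Five simulated "plausible values": latent score plus independent noise.
pvs = latent[:, None] + 30 * rng.standard_normal((n, 5))

# Gap based on the first plausible value only (the choice made in the text).
gap_first_pv = gender_gap(pvs[:, 0], weights, is_female)
# Gap averaged over all five plausible values, for comparison.
gap_all_pvs = float(np.mean([gender_gap(pvs[:, k], weights, is_female)
                             for k in range(5)]))
print(round(gap_first_pv, 2), round(gap_all_pvs, 2))
```

With a large sample the two figures tend to lie close together, which is the intuition behind using a single plausible value.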

In 2012, Shanghai scored best with 613 PISA points in mathematics, followed by Hong Kong, Japan, Taiwan, and South Korea, all high-performing East Asian countries. Among the European countries, Liechtenstein and Switzerland demonstrated the best performance, followed by the Netherlands, Estonia, Finland, Poland, Belgium, Germany, and Austria with slightly lower figures. In OECD countries, the mean score was 494 in mathematics and 496 in reading. The UK, Ireland, New Zealand, and Australia were close to the OECD average, while the USA scored lower than the OECD average with 481 PISA points.

Since the primary concern of this study is to explore the differences in mathematics and reading test scores between male and female students, the dependent variable is the student test score in PISA 2012. The rich set of covariates covers five groups of characteristics, namely individual characteristics of the students, their family characteristics, school characteristics, students’ beliefs or perceptions about learning, and country characteristics. Table 2 provides a description of all variables from the PISA data used in this study.

In the survey data, the probability that an individual is sampled depends on the survey design. To take this feature into account, students’ educational production functions were estimated using survey regression methods. This allowed us to include student weights, which reflect the sampling probabilities, and to cluster standard errors at the school level.
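A weighted regression with school-clustered standard errors can be sketched as follows. This is an illustrative assumption about what such a survey regression involves, not the authors' Stata routine: the function name `wls_cluster` is hypothetical, the data are synthetic, and the plain sandwich estimator is used without any finite-sample correction.

```python
import numpy as np

def wls_cluster(X, y, weights, clusters):
    """Weighted least squares with cluster-robust (sandwich) standard errors.

    Uses the plain cluster sandwich estimator with no finite-sample correction.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    clusters = np.asarray(clusters)
    XtW = X.T * w                      # (k, n): each observation scaled by its weight
    bread = np.linalg.inv(XtW @ X)     # (X'WX)^{-1}
    beta = bread @ (XtW @ y)
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        idx = clusters == g
        # Sum of weighted score contributions within one cluster (school).
        score = X[idx].T @ (w[idx] * resid[idx])
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))

# Synthetic illustration: 40 schools with 25 students each (assumed values).
rng = np.random.default_rng(1)
n_schools, per_school = 40, 25
school = np.repeat(np.arange(n_schools), per_school)
n = school.size
female = rng.integers(0, 2, size=n)
school_effect = rng.normal(0, 30, size=n_schools)[school]
score = 500 - 8 * female + school_effect + rng.normal(0, 80, size=n)
X = np.column_stack([np.ones(n), female])
w = rng.uniform(0.5, 2.0, size=n)

beta, se = wls_cluster(X, score, w, school)
print(beta, se)
```

Clustering matters here because students in the same school share the simulated school effect, so their residuals are correlated.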

### 2.2 Level of development, education expenditure, and gender equality data

To consider the country’s level of development in this analysis, we employed the data on GDP per capita (measured in purchasing power parity (PPP)) from the World Development Indicators 2012. Data on education expenditure was derived from the Human Development Report 2013, United Nations Development Program, while data for Jordan, Shanghai, and Macao were obtained from the World Bank database.

To explore the cultural role related to gender equality, following Guiso et al. (2008), we employed the Gender Gap Index (GGI) by the World Economic Forum (Hausmann et al. 2013). The Global Gender Gap Index was introduced in 2006 and has since been published annually by the World Economic Forum. The GGI ranks countries based on the average of four sub-indices,^{5} namely economic, political, health, and educational opportunities provided to females. A GGI of 1 reflects full gender equality and 0 total gender inequality. The top five countries in the 2012 GGI ranking were Iceland (0.86), Finland (0.85), Norway (0.84), Sweden (0.82), and Ireland (0.78). It is important to note that GGI data is only available for whole countries^{6} and not for participating economic regions in the PISA 2012 dataset (e.g., Hong Kong, Macao, and Shanghai). Furthermore, it does not seem reasonable that data for whole countries can be representative of the relevant economic regions. These regions were therefore eliminated from the data set.^{7}

## 3 Estimation strategy

In general, decomposition approaches follow the standard partial equilibrium approach in which observed outcomes of one group (i.e., gender, region, or time period) can be used to construct various counterfactual scenarios for the other group. Besides this, decompositions also provide useful indications of particular hypotheses to be explored in more detail (Fortin et al. 2011).

We show this decomposition^{8} following the description of Sierminska et al. (2010) as follows:

$$Y_{j} = X_{j}\beta_{j} + \varepsilon_{j} \tag{1}$$

where \(Y_{j}\) are the test scores for \(j = M, W\) (men and women, respectively), \(X_{j}\) are observables, \(\beta_{j}\) are the vectors of the estimated coefficients, and \(\varepsilon_{j}\) are the residuals (unobservables, i.e., unmeasured prices and quantities).

If \(F_{j}(.)\) denotes the cumulative distribution function of the residuals for group \(j\), then the residual gap consists of two components: an individual’s percentile in the residual distribution, \(p_{i}\), and the distribution function of the test score equation residuals, \(F_{j}(.)\). If \(p_{ij} = F_{j}(\varepsilon_{ij} \mid x_{ij})\) is the percentile of an individual residual in the residual distribution of model (1), by definition we can write the following:

$$\varepsilon_{ij} = F_{j}^{-1}\left(p_{ij} \mid x_{ij}\right) \tag{2}$$

where \(F_{j}^{-1}(.)\) is the inverse of the cumulative distribution function. Let \(\overline{F}(.)\) be a benchmark residual distribution (e.g., the average residual distribution over both samples) and \(\overline{\beta}\) an estimate of benchmark coefficients (e.g., the coefficients from a pooled model over the whole sample). Three sets of outcomes can then be constructed:

1. Hypothetical outcomes with varying quantities between the groups and fixed prices (coefficients) and a fixed residual distribution as

   $$y_{ij}^{(1)} = x_{ij}\overline{\beta} + \overline{F}^{-1}\left(p_{ij} \mid x_{ij}\right) \tag{3}$$

2. Hypothetical outcomes with varying quantities and varying prices and a fixed residual distribution as

   $$y_{ij}^{(2)} = x_{ij}\beta_{j} + \overline{F}^{-1}\left(p_{ij} \mid x_{ij}\right) \tag{4}$$

3. Outcomes with varying quantities, varying prices, and a varying residual distribution^{9} as

   $$y_{ij}^{(3)} = x_{ij}\beta_{j} + F_{j}^{-1}\left(p_{ij} \mid x_{ij}\right) \tag{5}$$

Writing \(Y_{j}^{(k)}\) for a summary statistic (e.g., a given percentile) of the distribution of \(y_{ij}^{(k)}\) in group \(j\), the differential \(Y_{M} - Y_{W}\) can then be decomposed as follows:

$$Y_{M} - Y_{W} = \left[Y_{M}^{(1)} - Y_{W}^{(1)}\right] + \left[\left(Y_{M}^{(2)} - Y_{W}^{(2)}\right) - \left(Y_{M}^{(1)} - Y_{W}^{(1)}\right)\right] + \left[\left(Y_{M}^{(3)} - Y_{W}^{(3)}\right) - \left(Y_{M}^{(2)} - Y_{W}^{(2)}\right)\right] = Q + P + U \tag{6}$$

where the total differential \(T = Y_{M} - Y_{W}\) is split into the effect of differences in observable quantities (Q), the effect of differences in prices or productivity (P), and the effect of differences in unobservables (U).

The major advantage of the JMP framework is that it enables us to examine how differences in the distribution affect other inequality measures and how the effects on inequality differ below and above the mean.
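The construction of the three outcome distributions and the resulting split of the gap at different percentiles can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: the data are synthetic, survey weights are ignored, residual percentiles are computed unconditionally within each group, and the benchmark residual distribution is taken to be the pooled set of both groups' residuals. The names `jmp_decompose`, `ols`, and `residual_percentiles` are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def residual_percentiles(resid):
    """Percentile position of each residual within its own group, in (0, 1)."""
    ranks = np.argsort(np.argsort(resid))
    return (ranks + 0.5) / resid.size

def jmp_decompose(X_m, y_m, X_w, y_w, quantiles):
    """Split the male-female gap at given quantiles into Q, P and U."""
    beta_m, beta_w = ols(X_m, y_m), ols(X_w, y_w)
    # Benchmark coefficients: pooled regression over both groups (assumption).
    beta_bar = ols(np.vstack([X_m, X_w]), np.concatenate([y_m, y_w]))
    res_m, res_w = y_m - X_m @ beta_m, y_w - X_w @ beta_w
    # Benchmark residual distribution: both groups' residuals pooled (assumption).
    pooled_res = np.concatenate([res_m, res_w])
    eps_m = np.quantile(pooled_res, residual_percentiles(res_m))
    eps_w = np.quantile(pooled_res, residual_percentiles(res_w))
    # Fixed prices and fixed residual distribution.
    y1_m, y1_w = X_m @ beta_bar + eps_m, X_w @ beta_bar + eps_w
    # Group-specific prices, fixed residual distribution.
    y2_m, y2_w = X_m @ beta_m + eps_m, X_w @ beta_w + eps_w

    def gap(a, b):
        return np.quantile(a, quantiles) - np.quantile(b, quantiles)

    d1, d2 = gap(y1_m, y1_w), gap(y2_m, y2_w)
    d3 = gap(y_m, y_w)  # the third outcome equals the observed scores
    return {"T": d3, "Q": d1, "P": d2 - d1, "U": d3 - d2}

# Synthetic illustration only (assumed coefficients and residual spreads).
rng = np.random.default_rng(2)

def simulate(n, beta, resid_sd):
    x = rng.normal(0, 1, size=n)
    X = np.column_stack([np.ones(n), x])
    return X, X @ beta + rng.normal(0, resid_sd, size=n)

X_m, y_m = simulate(3000, np.array([500.0, 40.0]), 90.0)
X_w, y_w = simulate(3000, np.array([495.0, 35.0]), 80.0)
qs = np.array([0.05, 0.25, 0.5, 0.75, 0.95])
parts = jmp_decompose(X_m, y_m, X_w, y_w, qs)
for name in ("T", "Q", "P", "U"):
    print(name, np.round(parts[name], 2))
```

By construction the three components add up to the total gap at every quantile, so the sketch mainly shows how each counterfactual isolates one source of the difference.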

## 4 Estimation results

### 4.1 Descriptive statistics

Table 4 contains the descriptive statistics on all the variables used in this microanalysis of the PISA 2012 dataset. The descriptive statistics are displayed by gender and separately for OECD and non-OECD countries. We imputed missing data for the variable ‘age’ and for some other variables^{10} in the schooling vector by using the mean imputation method.

Table 4 shows that students in OECD countries scored, on average, 42.12 and 46.1 points more in mathematics and reading, respectively, than students in non-OECD countries. On average, OECD girls fall behind OECD boys by 5.4 points in mathematics scores and 9 points in reading scores, while non-OECD girls remain 3.5 PISA points behind non-OECD boys in mathematics and 6.5 in reading.

In order to examine whether or not a gender difference within PISA is statistically significant at the 1%, 5% and 10% levels, we also calculated the mean difference between the girls’ and boys’ scores.^{11} It shows that significant mean differences across gender (based on the OECD and non-OECD grouping) exist for almost all variables.

### 4.2 PISA score in mathematics

In general, a strong upward trend in the total male–female test score differential (T) is evident. While there is (almost) no difference for the lowest percentiles, the female disadvantage in mathematical competence increases almost linearly to around 20 PISA points at the 95th percentile. As good mathematical knowledge, particularly at the upper percentiles, is especially valuable for getting a good job (Athey et al. 2007), it is important to explore this issue. This total (T) effect will be decomposed into an effect due to differences in observables (Q), in a productivity-effect (P) on the learning productivity of these observables, and finally, an unobservable (U) rest.

Looking first at Panel F, which includes all characteristics, this upward trend in mathematical test score differences (T) cannot easily be explained by one factor. Unobservables demonstrate a clear upward trend, but observables and productivity effects do so at a somewhat lower level. We now examine the separate contributions of individual versus school characteristics. Here, decomposing the contribution of unobservables (U) in Panels A–E does not make sense, because even if the individual contributions are orthogonal, the unobservable trends mainly measure the impact of omitted variables.

Turning to the contribution of observables (Q) towards mathematical competence, the endowment effect, Panel F indicates a negative endowment effect. In other words, females typically enjoy better endowments: around 10 PISA points at lower percentiles down to 5 PISA points at higher levels. These advantages stem from better female endowments in terms of schooling characteristics and beliefs. The slight upward trend in the contribution of observables in Panel F can mainly be attributed to an upward trend in observables in belief characteristics.

What is the contribution of learning productivity (P)? Panel F shows that the learning productivity of females increases the male–female test score gap for all percentiles, but the effect is slightly higher for higher percentiles. Panels A–E indicate similar productivity disadvantages for all included lists of characteristics.

Table 1 Ceteris paribus shifts in math and reading test scores due to a one standard deviation shift in individual variables

| Variable | Mathematics: Male | Mathematics: Female | Mathematics: Gender score difference | Reading: Male | Reading: Female | Reading: Gender score difference |
|---|---|---|---|---|---|---|
| **Individual characteristics** | | | | | | |
| Age | 1.001 | 0.930 | 0.071 | 0.731 | 0.775 | −0.567 |
| Grade | 11.66 | 9.950 | 1.71 | 12.67 | 10.24 | 2.43 |
| Country of birth | 1.675 | 1.577 | 0.098 | 1.235 | 1.098 | 0.137 |
| **Family characteristics** | | | | | | |
| Mother’s education | 4.30 | 6.09 | −1.79 | 4.706 | 5.947 | −1.241 |
| Father’s education | 5.414 | 5.457 | −0.043 | 4.180 | 3.976 | 0.204 |
| Mother’s work | 4.217 | 5.763 | −1.546 | 3.605 | 5.354 | −1.749 |
| Father’s work | 5.841 | 5.467 | 0.374 | 5.540 | 4.896 | 0.644 |
| Family structure | 1.734 | 1.178 | 0.556 | 0.930 | −0.106 | 1.036 |
| Language | 2.401 | 0.856 | 1.545 | 6.44 | 5.276 | 1.164 |
| Home possession | 16.89 | 17.83 | −0.94 | 14.98 | 17.51 | −2.53 |
| **School characteristics** | | | | | | |
| Public schools | −3.897 | −1.769 | −2.128 | −7.069 | −2.88 | −4.189 |
| | 6.370 | 7.563 | −1.193 | 5.502 | 6.234 | −0.732 |
| Class size | 9.425 | 9.122 | 0.303 | 10.44 | 7.932 | 2.508 |
| Quality of physical infrastructure | 2.904 | 2.65 | 0.254 | 2.183 | 1.534 | 0.649 |
| Percentage of girls at school | 7.983 | −0.872 | 8.855 | 8.807 | 1.667 | 7.14 |
| Certified teachers | 7.697 | 9.528 | −1.831 | 6.796 | 7.164 | −0.368 |
| Teacher–student ratio | −3.570 | −4.763 | 1.193 | −1.818 | −2.858 | 1.04 |
| Teacher–student relations | −1.409 | −0.218 | −1.191 | −1.580 | −1.120 | −0.46 |
| **Beliefs** | | | | | | |
| Difference in test efforts | −3.565 | −2.083 | −1.482 | −5.635 | −3.837 | −1.798 |
| Out of school study hours | 1.586 | 5.825 | −4.239 | 0.236 | 3.810 | −3.574 |
| Perseverance | 9.765 | 6.977 | 2.788 | 8.79 | 6.136 | 2.654 |
| Success | 16.52 | 10.85 | 5.67 | 11.85 | 6.055 | 5.795 |
| Career motive | 12.52 | 10.06 | 2.46 | 7.424 | 5.476 | 1.948 |
| Job motive | −2.88 | −4.589 | 1.709 | −8.765 | −9.541 | 0.776 |
| Subjective norms | −12.40 | −9.155 | −3.245 | | | |
| **Country characteristics** | | | | | | |
| GDP | −0.342 | 0.963 | −1.305 | 0.976 | 0.723 | 0.253 |
| GGI | −0.908 | 1.507 | −2.415 | 0.826 | 0.621 | 0.205 |
| Gender ratio at PISA | −12.21 | −10.37 | −1.84 | −7.641 | −7.302 | −0.339 |
| Education expenditure | 11.47 | 11.02 | 0.45 | 12.06 | 12.74 | −0.68 |
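As a reading aid for the table, the gender score difference column is the male shift minus the female shift, so a positive entry marks a variable whose one standard deviation shift raises boys’ scores by more than girls’. A minimal sketch using three mathematics rows copied from the table (the dictionary name `math_shifts` is hypothetical):

```python
# (male shift, female shift) in PISA points for mathematics, copied from the table.
math_shifts = {
    "Grade": (11.66, 9.950),
    "Mother's education": (4.30, 6.09),
    "Out of school study hours": (1.586, 5.825),
}

# Gender score difference = male shift minus female shift.
differences = {
    name: round(male - female, 3)
    for name, (male, female) in math_shifts.items()
}

for name, diff in differences.items():
    favors = "boys" if diff > 0 else "girls"
    print(f"{name}: {diff:+.3f} (larger shift for {favors})")
```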

### 4.3 PISA scores for reading

In reading, girls are ahead, but as in mathematics the gap shifts in boys’ favor over the distribution: the total advantage of girls (T) diminishes from around 50 PISA points at the lowest percentiles to about 20 PISA points at the highest.^{12} At the highest percentiles, this male–female differential is fully explained by productivity differentials (P), but less so at lower percentiles. Observables (Q) also contribute: the endowment of students accounts for between 6 and 12 PISA points of this female advantage. Finally, the contribution of unobservables (U) is mixed, rising from −9 to +9 PISA points.

Which factors are responsible for this difference? Our detailed analysis of the causes in Panels A–E in Fig. 3 indicates that endowment differences (Q) are strongest for schooling characteristics. Schooling characteristics, considered separately, explain between 7 and 10 PISA points, while the contributions of other domains are minor.

On the other hand, there is a large productivity (P) contribution in all separately considered domains. It is particularly high in the family, individual, belief, and country domains.

Regarding the contributions of individual items (Table 1), those favorable for boys are the percentage of girls in a classroom, success motivation, and class size. Factors favorable for girls are public schools and the amount of studying time out of school. Interestingly, a country’s GGI has no effect on the reading differential between boys and girls.

## 5 Conclusion

In this paper, we provided a decomposition of PISA mathematics and reading scores worldwide. Our contribution to the literature lies in an extension of quantile regression results to practically all PISA countries, to an inclusion of country-specific gender-related variables and to an application of Juhn et al. (1993) analysis, which extends a simple decomposition to take the residual distribution into account.

While mathematics scores are still tilted towards boys, girls have a larger advantage in reading over boys. This advantage is particularly large for low-achieving individuals. Our analysis shows that over the distribution of talent, boys’ scores increase more than girls’ in both mathematics and reading. Thus, at the highest percentiles, we see a smaller reading advantage for girls as well as a large advantage for boys in mathematics.

Our decomposition shows that part of this increase can be explained by an increasing trend in productive endowments and learning productivity, but the largest part remains unexplained. Countries’ general level of gender (in)equality also contributes towards girls’ disadvantage. For reading, at the upper end of the talent distribution, girls’ advantage can be fully explained by differences in learning productivity, although this is not so at lower levels. Education policy that aims to reduce these gender differences must support high-performing females in mathematics and science, and must attend to low-achieving boys who lag in reading and verbal expressiveness.

## Footnotes

1. The first PISA exam in 2000 focused on reading literacy, while the second focused on mathematics specialization. PISA 2012 again focused on mathematics literacy.

2. The PISA consortium decides which school will participate, and then the school provides a list of eligible students. Students are selected by national project managers according to standardized procedures (OECD 2012).

3. These plausible values are calculated by the complex item-response theory (IRT) model (see Baker 2001; Von Davier and Sinharay 2013) based on the assumption that each student only answers a random subset of questions and their true ability cannot be directly judged but only estimated from their answers to the test. This is a statistical concept, and instead of obtaining a point estimate [like a Weighted Likelihood Estimator (WLE)], a range of possible values of students’ ability with an associated probability for each of these values is estimated (OECD 2009).

4. “Working with one plausible value instead of five provides unbiased estimates of population parameters but will not estimate the imputation error that reflects the influence of test unreliability for the parameter estimation” (OECD 2009).

   As this imputation error decreases with a large sample size, the use of one plausible value with a sample size of 480,174 students will not make any substantial difference in the mean estimates and standard errors of the estimates. For details, see p. 43: https://www.oecd-ilibrary.org/docserver/9789264056275-en.pdf?expires=1537249103&id=id&accname=guest&checksum=FCF6D3D8A03AB42A0FEC82FE7E2ADF47.

5.

6. GGI data for Liechtenstein, Montenegro, and Tunisia is unavailable.

7. See Munir (2017) for details.

8.

9. These outcomes are actually equal to the originally observed values, i.e., \(y_{ij}^{(3)} = y_{ij} = x_{ij}\beta_{j} + \varepsilon_{ij}\).

10. These are school autonomy, class size, quality of physical infrastructure, proportion of girls at school, out of school study time and perseverance.

11. These results are not presented here because of space limitations but are available upon request.

12. See also Stoet and Geary (2013) for the inverse relationship between mathematics and reading assessments.

## Notes

### Authors’ contributions

The authors contributed equally towards the preparation of the paper. Both authors read and approved the final manuscript.

### Acknowledgements

We thank Nicole Schneeweis and Helmut Hofer for helpful comments.

### Competing interests

The authors declare that they have no competing interests.

### Availability of data materials

The PISA data are freely available; Stata files are available upon request.

### Funding

There is no external funding.

### Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## References

- Adams, R., Butler, J.: The impact of differential investment of student effort on the outcomes of international studies. J. Appl. Measur. **8**(3), 279–304 (2007)
- Athey, S., Katz, L.F., Krueger, A.B., Levitt, S.: What does performance in graduate school predict? Graduate economics education and student outcomes. Am. Econ. Rev. Papers Proc. **97**(2), 512–520 (2007)
- Baker, F.B.: The basics of item response theory (2001). http://ericae.net/irt/baker
- Benbow, C.P., Stanley, J.C.: Sex differences in mathematical ability: fact or artifact? Science **210**(4475), 1262–1264 (1980)
- Blinder, A.S.: Wage discrimination: reduced form and structural estimates. J. Hum. Resour. **8**, 436–455 (1973)
- Ceci, S.J., Williams, W.M., Barnett, S.M.: Women’s underrepresentation in science: sociocultural and biological considerations. Psychol. Bull. **135**(2), 218 (2009)
- Fennema, E.H., Sherman, J.A.: Sex-related differences in mathematics achievement and related factors: a further study. J. Res. Math. Educ. **9**, 189–203 (1978)
- Fortin, N., Lemieux, T., Firpo, S.: Decomposition methods in economics. Handbook Labor Econ. **4**, 1–102 (2011)
- Fryer, R.G., Levitt, S.D.: An empirical analysis of the gender gap in mathematics. Am. Econ. J. **2**(2), 210–240 (2010)
- Geary, D.C.: Male, female: the evolution of human sex differences. American Psychological Association, London (1998)
- Gevrek, Z., Seiberlich, R.: Semiparametric decomposition of the Gender Achievement Gap: an application for Turkey. Labour Econ. **31**, 27–44 (2014)
- Gneezy, U., Niederle, M., Rustichini, A.: Performance in competitive environments: gender differences. Quart. J. Econ. **118**(3), 1049–1074 (2003)
- Guiso, L., Monte, F., Sapienza, P., Zingales, L.: Culture, math, and gender. Science **320**(5880), 1164–1165 (2008)
- Halpern, D.F., Benbow, C.P., Geary, D.C., Gur, R.C., Hyde, J.S., Gernsbacher, M.A.: The science of sex differences in science and mathematics. Psychol. Sci. Public Interest **8**(1), 1–51 (2007)
- Hausmann, R., Tyson, L.D., Bekhouche, Y., Zahidi, S.: The global gender gap index 2012. In: World Economic Forum (2013)
- Hyde, J.S., Mertz, J.E.: Gender, culture, and mathematics performance. Proc. Natl. Acad. Sci. **106**(22), 8801–8807 (2009)
- Juhn, C., Murphy, K.M., Pierce, B.: Wage inequality and the rise in returns to skill. J. Polit. Econ. **101**(3), 410–442 (1993)
- Kunter, M., Schümer, G., Artelt, C., Baumert, J., Klieme, E., Neubrand, M., Prenzel, M., Schiefele, U., Schneider, W., Stanat, P., Tillmann, K.-J., Weiß, M.: PISA 2000: dokumentation der erhebungsinstrumente. Materialien aus der Bildungsforschung, Max-Planck-Institut für Bildungsforschung (2002)
- Levine, D.U., Ornstein, A.C.: Sex differences in ability and achievement. J. Res. Develop. Educ. **16**(2), 66–72 (1983)
- Machin, S., Pekkarinen, T.: Global sex differences in test score variability. Science **322**, 1331–1332 (2008)
- Machado, J., Mata, J.: Counterfactual decomposition of changes in wage distributions using quantile regressions. J. Appl. Econ. **20**, 445–465 (2005)
- Munir, F.: Essays on Labor Market Institutions, Growth and Gender Inequality (Doctoral dissertation) (2017). http://epub.jku.at/obvulihs/content/titleinfo/1873092?lang=en
- Nollenberger, N., Rodríguez-Planas, N., Sevilla, A.: The math gender gap: the role of culture. Am. Econ. Rev. **106**(5), 257–261 (2016)
- Oaxaca, R.: Male–female wage differentials in urban labor markets. Int. Econ. Rev. **14**, 693–709 (1973)
- OECD: PISA Data Analysis Manual (Second edition): SPSS. OECD (2009). http://www.oecd-ilibrary.org/education/pisa_19963777;jsessionid=4gvjps237hiqq.x-oecd-live-02
- OECD: Bildung auf einen Blick 2012: OECD-Indikatoren (2012)
- Pope, D.G., Sydnor, J.R.: Geographic variation in the gender differences in test scores. J. Econ. Perspect. **24**(2), 95–108 (2010)
- Parsons, J.E., Meece, J.L., Adler, T.F., Kaczala, C.M.: Sex differences in attributions and learned helplessness. Sex Roles **8**(4), 421–432 (1982)
- Sierminska, E., Frick, J., Grabka, M.: Examining the gender wealth gap. Oxford Economic Papers **62**, 669–690 (2010)
- Sohn, K.: A new insight into the gender gap in math. Bull. Econ. Res. **64**(1), 135–155 (2012)
- Stoet, G., Geary, D.C.: Sex differences in mathematics and reading achievement are inversely related: within- and across-nation assessment of 10 years of PISA data. PLoS ONE **8**, e57988 (2013)
- Thu Le, H., Nguyen, H.T.: The evolution of the gender test score gap through seventh grade: new insights from Australia using unconditional quantile regression and decomposition. IZA J. Labor Econ. (2018). https://doi.org/10.1186/s40172-018-0062-y
- Torppa, M., Eklund, K., Sulkunen, S., Niemi, P., Ahonen, T.: Why do boys and girls perform differently on PISA Reading in Finland? The effects of reading fluency, achievement behavior, leisure reading and homework activity. J. Res. Read. **41**(1), 122–139 (2018)
- Turner, S.E., Bowen, W.G.: Choice of major: the changing (unchanging) gender gap. Ind. Labor Relat. Rev. **52**(2), 289–313 (1999)
- Von Davier, M., Sinharay, S.: Analytics in international large-scale assessments: Item response theory and population models. In: Handbook of international large-scale assessment: Background, technical issues, and methods of data analysis, pp. 155–174 (2013)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.