We observe five main outcomes for each child: the test score, which serves as an objective assessment of ability at the end of primary school (6th grade); the primary school teacher’s assessment, which serves as a subjective assessment of ability (6th grade); track allocation in the first and third year of secondary education (7th and 9th grade); the results from a cognitive test in 9th grade; and track switching in the first three years of secondary education (7th–9th grade). We now present information on these measures and on the Dutch education system.
Dutch Education System
Countries differ in the age at which they first track children into different types of schools. In the majority of OECD countries, tracking takes place between the ages of 14–16. Some countries, including the Netherlands, undertake the first tracking at the age of 12 when children progress from primary to secondary school (i.e., from 6th to 7th grade; see footnote 5). We take advantage of this system by studying the allocation in secondary school, the transition from primary to secondary school and performance in 9th grade.
In the Netherlands, primary education consists of eight years, of which the first two are spent in kindergarten. From the third year of primary school (i.e., 1st grade) onward, children formally learn how to read and write. Most children start kindergarten at the age of 4, enter 1st grade at the age of 6 and finish primary school at the age of 12. From secondary school onward, children are allocated to tracks. The track allocation decision is made by the secondary school. It is based on test scores and the teacher’s assessment. Some schools set a threshold test score below which children are not allowed to enter a certain level of secondary education.
The Dutch secondary education system is hierarchically structured by ability and consists of three main tracks that differ in duration and qualification (see the left-hand column of Fig. 1). The four-year track (VMBO or T1) qualifies children for vocational education, the five-year track (HAVO or T2) qualifies children for higher vocational education and the six-year track (VWO or T3) qualifies children for university education. The next column in Fig. 1 shows four sub-tracks at the lowest level of secondary education (T1a–T1d). The difference between these four sub-tracks is the importance of a practical versus theoretical focus in the curriculum. Time spent on more theoretically oriented courses increases with the tracks from T1a to T1d (see footnote 6).
The third column of Fig. 1 shows all possible tracks, some of which are combinations of the three major tracks. Both the objective and subjective assessment measures are tailored toward allocating children into one of these 11 track combinations. The 6th grade test distinguishes brackets that are consistent with these 11 track combinations, shown in the fourth column of Fig. 1. The teacher’s assessment is measured on the same 11-point scale.
Background
In 6th grade, all children have to take an objective assessment test. Schools are free to choose which test their children take. Approximately 85% of all Dutch children complete the Cito Eindtoets. We use results from this Cito Eindtoets for our analysis. The children in our data took this test in 2009. The test is standardized, meaning that the test procedure is the same for the whole country. During the assessment children have to answer questions in the areas of math, reading, study skills and science. Performance is measured on a scale between 501 and 550.
The aim of the test score is to provide an independent and appropriate perspective on children’s expected performance and their best track placement in secondary education. The test institute offers guidelines for children’s track allocation by reporting brackets of scores and accompanying track assessments. We follow these guidelines and use the brackets as our objective assessment measure.
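The conversion from a raw score to a bracket can be sketched as follows. This is a minimal illustration only: the function name `score_to_bracket` and, in particular, the cut-off values are hypothetical placeholders and are not the test institute's published guidelines.

```python
from bisect import bisect_right

# Hypothetical lower bounds of brackets 2..11 on the 501-550 scale.
# Placeholder values for illustration, NOT the test institute's guidelines.
PLACEHOLDER_LOWER_BOUNDS = [506, 511, 516, 521, 526, 531, 536, 540, 544, 548]


def score_to_bracket(score, lower_bounds=PLACEHOLDER_LOWER_BOUNDS):
    """Map a test score (501-550) to an ordinal bracket 1..len(lower_bounds)+1."""
    if not 501 <= score <= 550:
        raise ValueError("score must lie between 501 and 550")
    # bisect_right counts how many lower bounds the score has reached,
    # so a score below the first bound falls into bracket 1.
    return bisect_right(lower_bounds, score) + 1
```

With ten cut-offs the function returns a value on an 11-point ordinal scale, which is the form in which both assessment measures enter the analysis.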
High scores on the standardized test are an important way in which primary schools try to signal the quality and value-added of their educational efforts. Primary schools seem to use their average scores on this test to attract new children. In addition, the Dutch Education Inspectorate uses these results, controlled for individual characteristics, as one of the inputs for their overall evaluation of the school’s quality and value-added. Children also have an incentive to obtain a high test score because it is an important signal of their ability. In that sense, the test is a high-stakes assessment.
In addition to the objective test score, teachers make a personal assessment of each child’s level of ability. The assessment is based on the teacher’s experience and interaction with the child, observable demographic and socio-economic factors and the child’s performance throughout all grades in primary school. Teachers also know the test score at the time they make their assessment of the child’s ability. The subjective assessment was provided in the spring of 2009, before children applied to a secondary school. The teacher’s assessment is provided in brackets similar to those of the objective assessment and fits with possible track allocations in secondary schools.
Primary school teachers do not have a strong incentive to behave strategically by overstating a child’s ability. The teacher’s compensation scheme does not depend on the assessments made. Furthermore, a primary school’s population usually goes to the same secondary schools every year. This means that over time the information asymmetry between the primary and the secondary schools diminishes and secondary schools learn how to interpret the assessments from primary schools. In addition, each year children are assigned to a class in which they are taught by one primary school teacher, who teaches all subjects. Differences in test scores across different parts of the test are therefore unlikely to be driven by different teacher characteristics.
Secondary schools allocate children to tracks. They obtain the information about the objective and subjective assessment measures. Secondary schools have an incentive to allocate children to the track that matches their ability level. Inputs for the Education Inspectorate evaluation of secondary schools’ performance include the percentage of children who graduate every year as well as the percentage of children who switch tracks. Allocating children to tracks that are too high (too low) leads to switching downward (upward) and would induce negative (positive) evaluations on this part of the performance assessment. Nevertheless, secondary schools also benefit from having more children in the higher tracks as this is beneficial for signalling the quality of the schools’ education, which potentially helps to attract more children.
Strategy
Our analysis first focuses on the way in which both the objective and subjective assessment measures in 6th grade help to explain track placement in 7th grade. Both assessment measures aim to measure ability, but face the problem that the true underlying and unobserved level of ability is unknown. The test score \(({ TS}_i )\) is used as the objective assessment measure of the child’s (i) ability. This test score depends on the child’s true and unobserved ability \((A_i )\), a vector of observed characteristics \((X_i)\) and the primary school he attends \((P_i )\).
In an ideal world \({ TS}_i =A_i \). In practice this is not the case because \({ TS}_i \) is measured with noise and observed characteristics \(X_i \) are likely to influence the measurement of \(A_i \) by \({ TS}_i \). In addition, characteristics of primary schools, some of which we cannot observe, can be related to the test results of children in 6th grade and can thereby influence \({ TS}_i \). Hence, we add school fixed effects \((\beta _{P_i})\) (or dummies in our cross-sectional specifications) to the model.
The teacher’s assessment \(({ TA}_i )\) is used as the subjective assessment measure of the child’s ability. This measure includes the same ingredients plus the observed test score. The information about the child’s test score influences the teacher’s assessment of the child’s ability. Because children are assigned to one teacher in the final year of primary education and most primary schools have only one 6th grade class, potential teacher effects are captured by school fixed effects. This is also the reason for indexing all variables with child i and for ignoring teacher j effects.
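As a stylised summary, the two signals described above could be written as follows. This is only a sketch in our own notation: the linear form, the coefficients \(c_0 , d_0 , \theta , \lambda , \delta , \pi \), the separate school effects \(\beta ^{TS}_{P_i }\) and \(\beta ^{TA}_{P_i }\), and the noise terms \(e_i \) and \(u_i \) are illustrative assumptions and are not equations that we estimate.

$$\begin{aligned} { TS}_i&=c_0 +A_i +\theta X_i +\beta ^{TS}_{P_i } +e_i ,\\ { TA}_i&=d_0 +\lambda A_i +\delta X_i +\pi { TS}_i +\beta ^{TA}_{P_i } +u_i . \end{aligned}$$

Under these assumptions, the ideal case \({ TS}_i =A_i \) corresponds to \(c_0 =0\), \(\theta =0\), \(\beta ^{TS}_{P_i } =0\) and \(e_i =0\), and the term \(\pi { TS}_i \) reflects that teachers know the test score when they form their assessment.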
Finally, we observe the child’s initial track placement \(({ TP}_i )\) in 7th grade (at secondary school) and subsequently the allocation in 9th grade. The decision about initial placement is made by the secondary school and takes into account the objective and subjective assessment measures. Adding secondary school fixed effects to the model would create additional endogeneity as not all schools offer the same track levels and secondary school fixed effects are related to \({ TS}_i\) and \({ TA}_i \).
In the first part of the empirical analysis we analyse whether there are any systematic differences between the test score, the teacher’s assessment and track placement for various socio-economic background characteristics of children. We are primarily interested in explaining track placement. We estimate equation (1) with an ordered probit model to find the determinants of track placement in both 7th and 9th grade. We do not observe \(A_i \) but two signals \({ TS}_i \) and \({ TA}_i \). In the empirical analysis we incorporate the possibility that secondary schools take into account both \({ TS}_i \) and \({ TA}_i \), although \({ TA}_i \) includes information about \({ TS}_i \). In this way we have the two signals competing with each other. A statistically significant coefficient of \({ TS}_i\) would in all likelihood suggest that secondary schools put weight on both signals of ability. The equation we first estimate is:
$$\begin{aligned} { TP}_i =C_1 +a_1 X_i +a_2 { TS}_i +a_3 { TA}_i +\beta _{P_i } +\varepsilon _i, \end{aligned} \tag{1}$$
where \(\varepsilon _i\) is the error term.
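To make the ordered probit interpretation of Eq. (1) concrete, the following minimal sketch computes the implied probability of each track category for a given value of the linear index. It is an illustration under stated assumptions, not our estimation code: the function names and the example cut-points are hypothetical, and it relies on the standard ordered probit property that the constant is absorbed into the cut-points.

```python
from math import erf, log, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def ordered_probit_probs(index, cutpoints):
    """Return P(TP = k) for every track category k.

    index     : linear part of Eq. (1) for one child (constant excluded)
    cutpoints : strictly increasing thresholds between adjacent tracks
    """
    # Cumulative probabilities P(TP <= k), padded with 0 and 1 at the ends.
    cumulative = [0.0] + [normal_cdf(c - index) for c in cutpoints] + [1.0]
    # Each category probability is the difference of adjacent cumulative values.
    return [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]


def log_likelihood(indices, outcomes, cutpoints):
    """Sum of log P(TP_i = observed category); categories are coded 0, 1, 2, ..."""
    return sum(
        log(ordered_probit_probs(index, cutpoints)[k])
        for index, k in zip(indices, outcomes)
    )


# Hypothetical example with three categories and two cut-points.
probs = ordered_probit_probs(0.0, [-1.0, 1.0])
```

Estimation chooses the coefficients and cut-points that maximise this log-likelihood; a higher index shifts probability mass toward the higher tracks.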
In the second part of the empirical analysis we estimate the determinants of track switching in the first three years of secondary school (i.e., in the period spanning 7th, 8th and 9th grade). To do so, we first estimate a set of probit models in which we show what type of children tend to switch tracks. Second, we estimate probit models in which we estimate the probability of switching tracks \((SW_i )\) for child i:
$$\begin{aligned} SW_i =C_2 +b_1 X_i +b_2 { TS}_i +b_3 { TA}_i +b_4 { TP}_i +\gamma _{S_i } +\mu _i , \end{aligned} \tag{2}$$
where \(\mu _i\) is the error term. We estimate different versions of the model in which the dependent variable is switch, switch up or switch down. We only include secondary school fixed effects \((\gamma _{S_i})\) because switching takes place in secondary school. We show below that primary school fixed effects have no impact on track placement, which makes us confident that including only secondary school fixed effects is sufficient to estimate the determinants of switching.
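The three dependent variables can be illustrated with a short sketch. It assumes, for illustration only, that switching is inferred by comparing the ordinal track code in 7th grade with the one in 9th grade; the function name and this coding are hypothetical, and intermediate moves in 8th grade would not be captured this way.

```python
def switch_indicators(track_grade7, track_grade9):
    """Return (switch, switch_up, switch_down) as 0/1 indicators.

    Tracks are ordinal codes in which a larger number means a higher track.
    """
    switch_up = int(track_grade9 > track_grade7)
    switch_down = int(track_grade9 < track_grade7)
    # A child switches if the track moved in either direction.
    return switch_up + switch_down, switch_up, switch_down
```

Each indicator can then serve as \(SW_i \) in a separate probit version of Eq. (2).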
Finally, we estimate models to investigate to what extent test scores on an identical (low-stakes) test in 9th grade are correlated with the ability signals from the teacher and the test score in 6th grade. We also estimate to what extent switching is correlated with test scores in 9th grade.
The strength of the data at our disposal is that we are able to observe performance in both primary and secondary school. In addition, we have detailed information about teacher assessments and initial track placement in secondary school. This is a unique feature in the literature. Nevertheless, the analysis is constrained by the fact that we are not able to identify a source of exogenous variation in our data. Ideally, one would want to conduct an experiment in which a random part of the sample is placed according to the test score’s signal, another part according to the teacher’s signal and the remainder as is currently done (i.e., as decided by the secondary school). The alternative is to find instruments to deal with the “self-fulfilling prophecy” that creates endogeneity. The “self-fulfilling prophecy” is the idea that when a child is placed on a higher or lower track than he should be according to his true ability, the child is more likely to switch back to the track that matches his true ability. However, such instruments are not readily available. Our analyses focus on outcomes between 6th and 9th grade. In order to find exogenous variation we would need a set of instruments related to (one of) our assessment measures and at the same time unrelated to any unobservable variables that might influence our outcome variable. Since children’s true ability is unobserved, this is problematic. This is to be kept in mind when interpreting the results in the switching section. Furthermore, all tests, as well as the teacher’s assessment, contain measurement error, and analyses concerning ability suffer from omitted variable bias.
We are aware of the endogeneity concerns arising from omitted variable bias, and of the fact that potential measurement error is an important caveat when interpreting the estimated coefficients. We try to address both as well as possible by using primary and secondary school fixed effects and a rich set of covariates, including track placement in 7th grade.