1 Introduction

The youth unemployment rate has been rising since 2004, pre-dating the 2008 recession, following a fairly predictable pattern with regard to cyclical downturns (Petrongolo and van Reenen 2011).Footnote 1 Although it is difficult to pin down the causes of the rise in youth unemployment, one possible cause highlighted by Petrongolo and van Reenen (2011) is the quality of schooling. Furthermore, what is very clear from their analysis of UK LFS data is that the unemployment rate for 16–17 year olds was as high in 2010 as it was in the last major recession in 1980—exceeding 30%. However, official measures of youth unemployment, and teenage unemployment in particular, are likely to understate the true magnitude of joblessness for this group given the propensity of some youths to drop out of the labour market and remain economically inactive for periods of time. A better measure of the labour market fortunes of youths is therefore likely to be the proportion of the group who are ‘Not in Education, Employment or Training’ (NEET)—the unemployed and economically inactive. Since this group of young people are not engaged in skill formation of any kind, they are most likely to be ‘scarred’ by this early labour market experience.

Previous research has, in fact, shown that a poor start to a young person’s career can lead to an increased probability of unemployment, as well as a negative effect on future earnings. A lack of investment in human capital, such as vocational skills acquired through work and training, or through education, causes such detrimental effects. For instance, Arulampalam et al. (2001) have shown that earnings can be 6% lower on re-entry to a job and 14% lower after 3 years, and that the most damaging spell of unemployment is the first. Clark, Georgellis and Sanfey (2001) also show that past unemployment is correlated with current life satisfaction, an additional dimension to the scarring effect, although Knabe and Ratzel (2011) have recently shown that this effect operates via a fear of future unemployment (see also Bell and Blanchflower 2011, 2012). Mroz and Savage (2007) provide a counter argument suggesting that, whilst earnings growth can be retarded, young workers who experience spells of unemployment can respond by acquiring human capital which reduces the risk of future spells of unemployment.

This paper focuses on the labour market outcomes of the teenage group (16–17 year olds) and assesses the relationships between test scores and truancy behaviour, each of which is a function of school quality, family background and personal characteristics, including a pupil’s tastes for schooling, and the risk of unemployment or NEET several months after leaving school.Footnote 2 There has been considerable debate in the press and amongst politicians in recent years about pupils who persistently miss school due to unauthorised absence (truancy), which has, in several high profile cases, led to fines for parents.Footnote 3 Pupils who miss schooling because of truancy are likely to have lower test scores than pupils who do not miss school. When truancy reaches high levels, such that attendance at school is infrequent, then truants are more like high school dropouts, a phenomenon that has received a lot of attention in the USA and Europe. The consequences for this group of truants in terms of adverse labour market outcomes are likely to be similar to those for high school dropouts. Pupils who truant less frequently, whose behaviour is amenable to teacher interventions, are likely to have different and more positive labour market outcomes, suggesting that we need to disaggregate truants by the frequency of this event. Furthermore, it is not clear whether truancy has a direct effect on the risk of unemployment or NEET. There are several possible mechanisms through which truancy could have a direct effect on the risk of unemployment or NEET. First, a higher propensity to truant could increase the probability of unemployment or NEET insofar as it could act as a negative productivity signal to employers and training providers, providing of course that employers actually receive this signal. However, there is no guarantee that pupils would provide evidence of truancy from school during the job selection process.
Second, truants may mix with other young people who are unemployed or inactive, and this peer effect may increase their risk of unemployment or NEET. Third, schooling not only creates human capital in the form of cognitive skills, it also increases soft skills which employers value. Truants could miss out on the acquisition of these soft skills, which increases their risk of unemployment or NEET. Fourth, pupils who truant may fail to develop social networks which are useful for finding work, and their lack of motivation at school could be signalled via teacher references to employers, thereby increasing the risk of unemployment or NEET. Finally, a contrary argument is that pupils who have truanted could search more intensively for jobs because they have ‘switched off’ from school and want to work. Unfortunately, our data do not permit us to investigate these different channels. Nevertheless, if truancy does lead to lower test scores, then there is a possible indirect, and positive, effect of truancy on the risk of unemployment or NEET. In contrast, there is a large literature which demonstrates a strong link between low high school test scores and a higher probability of unemployment and NEET (see Sect. 2).

We argue that because decisions regarding truancy, which could be seen as a proxy for effort at school, and performance in tests affect the subsequent transition from school, then these behavioural outcomes (decisions) are simultaneously determined. To capture this simultaneity, a three-equation model is estimated in which we allow for correlation in unobservables between models for truancy, test scores, both of which are ordered categorical variables, and unemployment (or NEET), which are binary variables. Therefore, unlike virtually all the previous literature, our models acknowledge the endogeneity of truancy and test score performance for early labour market behaviour; specifically, we model the endogeneity of truancy for test scores, and we model the joint endogeneity of test scores and truancy for early labour market behaviour. By building more comprehensive models, we are able to obtain more insight into the determinants of early labour market behaviour, and get closer than other researchers have to uncovering causal effects when using cross-sectional data.Footnote 4

To estimate our model, we use pupil-level data from the Youth Cohort Studies (YCS), specifically YCS6 to YCS12, which cover the period of the early 1990s and early 2000s. To each of these datasets, we append detailed information on the characteristics of the school attended which was obtained from the School Performance Tables and Schools Census.

Closely related papers include Boardman et al. (1977), which models the educational process with six simultaneous equations for student’s achievement, motivation, expectations, efficacy, and perceived parents’ and teachers’ expectations. However, the closest paper to our own is that of Buscha and Conte (2013), who estimate a bivariate simultaneous equation model for the discrete ordered responses of student’s truancy and test scores. Using YCS data, their model has the advantage that it allows for a correlation between the unobservables and for the endogeneity of the latent value of truancy in the equation for the latent value of test scores.

Our paper differs from Buscha and Conte (2013) in a number of respects. We extend their approach by estimating a trivariate model, primarily to explore the direct effects of truancy and test scores on the risk of unemployment and NEET. Also, we present an alternative identification strategy to that proposed by Buscha and Conte (2013); their model relies on local labour market conditions, that is, part-time pay, to identify the latent variable on truancy in the latent model for test scores. It is difficult to believe, however, that local labour market conditions do not have, in their own right, a direct effect on test scores. Working part-time (e.g. in a supermarket) whilst still at school may encourage students to work harder at school so as to avoid this kind of job post-school. Our approach to identification differs from theirs insofar as we include the actual values, or direct effects, of the endogenous variables, truancy and test scores, in our sub-models for test scores and labour market outcomes (see below for a fuller discussion of our identification strategy). A further difference between the two approaches is that we investigate the direction of ‘causation’ between truancy and test scores on the risk of unemployment and NEET. We also estimate a range of alternative specifications of our system of equations. Finally, we have a richer set of covariates than Buscha and Conte (2013) because we map detailed school-level data on to the pupil-level YCS data.Footnote 5

The findings from our preferred model (Heterogeneous Model 1) suggest that truancy works primarily through test scores (i.e. an indirect effect), having a ‘weak’ direct effect on labour market outcomes. However, truancy also has an unobserved effect on the risk of unemployment and the risk of NEET insofar as the correlations between the latent variables for truancy and labour market outcomes are positive and statistically significant. Test scores have a direct effect on labour market outcomes, and through the estimation of ATTs, we show that a good performance in high stakes tests (i.e. GCSEs) can mitigate the effect of truanting from school on labour market outcomes. In sum, truancy need not be a significant problem for young people in terms of their post-school outcomes, so long as this behaviour does not reduce test score performance. This makes sense insofar as employers observe test score performance in the job/training selection process whereas they are less likely to observe truancy behaviour. We find no evidence of ‘reverse causality’, i.e. that test scores determine truancy. We draw out the implications for policy in our conclusions.

The remainder of the paper is structured as follows. In Sect. 2 we briefly discuss the existing literature on the determinants of test scores, truancy and unemployment or NEET. Section 3 describes the institutional background and the data used in our econometric analysis. This is followed in Sect. 4 by the specification of our simultaneous model, a trivariate-ordered probit model, and in Sect. 5 we present our results. This is followed by our conclusions.

2 A review of the literature

There is a large literature which investigates the determinants of the school-to-work transition, including the risk of unemployment and NEET (see Bradley and Nguyen 2004 for a review). Many of these papers estimate reduced-form, single-equation models, where the role of test scores features prominently as a determinant of a successful school-to-work transition (Lynch 1987; Andrews and Bradley 1997; Crawford et al. 2011; Duckworth and Schoon 2012). Coles et al. (2010) argue that the main determinants of NEET occur pre-school leaving and refer to different forms of ‘educational disaffection and educational disadvantage’. Ermisch and Janatti (2012) go further and argue that low test scores are a key mechanism that perpetuates disadvantage across the generations.

It is also worth noting that Coles et al. (2010) see a direct correlation between educational disaffection and the probability of NEET. This is important in our context because educational disaffection refers to involuntary exclusion from school as well as what they refer to as ‘self-exclusion’, that is, truanting from school. Duckworth and Schoon (2012) also find this effect.

School effects on the school-to-work transition have also been identified (Micklewright 1989; Rice 1987, 1999; Dolton et al. 1999), and gender and ethnic differences are also evident. Non-white girls are more likely to stay on beyond compulsory school leaving age to avoid unemployment (Leslie and Drinkwater 1999), an effect which is larger for Indian and Chinese pupils than for Black Caribbeans (Bradley and Taylor 2002).

Not surprisingly, the probability of staying on at school, and hence avoiding unemployment or NEET, is higher for young people from a professional family background, and much lower if their father is a manual worker (Rice 1987, 1999; Crawford et al. 2011). Young people from single parent families and those with unemployed heads of household also tend to leave school early, partly because of financial constraints on the household, and enter NEET (Coles et al. 2010). Duckworth and Schoon (2012) show that, for cohorts of school leavers in the 1970s, having parents with low education and living in social housing increased the likelihood of NEET. However, this finding disappears for cohorts of school leavers in the 1990s, suggesting some degree of educational mobility in more recent years.

In terms of the determinants of test scores and truancy, many studies have shown that a similar set of variables influence these outcomes. Family background is of prime importance as a determinant of test scores (Hanushek 1986, 1992). Dustmann et al. (1998) distinguish between financial and time resources allocated to the child. Financial resources enable parents to choose better schools for their child and provide a more suitable environment for studying, whereas time resources are related to the help given in explaining homework, for instance. These effects are often proxied by a wide range of parental and household variables, which also affect truancy behaviour. There are clear differences in the effect of parental occupation on test scores and truancy (Feinstein and Symons 1999; Bosworth 1994; Ermisch and Francesconi 2001; Fuchs and Woessmann 2004). Pupils with parents in professional occupations, for instance, have higher test scores and a lower probability of truanting, whereas pupils whose parents are in manual occupations are significantly more likely to be absent from school. Experience of life in a single parent family reduces test scores and increases the probability of truanting (Bosworth 1994; Ermisch and Francesconi 2001; Robertson and Symons 1996). The structure and state of the local labour market also play a part in determining test scores and truancy. For instance, McIntosh (2001) investigates the effect of labour market conditions on transitions into training and finds only a small effect, whereas expected returns to continued schooling and prior academic attainment are more important determinants.

Steele et al. (2007) investigate the effect of a school’s resources on pupil test scores, and find that expenditure per pupil and the pupil–teacher ratio, which captures average class size, affect test score performance in mathematics and science. More generally, Gibbons and McNally (2013) review the evidence on the causal relationship between school resources, including class size, and test scores.

In sum, there is a considerable literature on the school-to-work transition and on the determinants of test scores, though there is less analysis of truancy behaviour and post-school outcomes. Few papers have analysed the determinants of the risk of NEET. Much of the existing literature finds that a similar set of covariates ‘determine’ the school-to-work transition and schooling outcomes which makes the identification of a system of equations more challenging. However, the more recent, and smaller, literature has sought to advance the literature by estimating systems of equations, and it is in this context that the current paper should be seen.

3 The data and institutional background

3.1 Institutional background

During the period of our study, young people could leave compulsory schooling at the age of 16 and could then continue into further education, enter the labour market for work or a government sponsored training programme, or become unemployed or economically inactive. Selection for entry into jobs with training, for instance apprenticeships, or government sponsored training programmes with good prospects started before young people left school and start dates typically occurred up to September of any particular year.Footnote 6 Similarly, young people who wished to continue onto further education also gained admission prior to leaving school and also started around September. The remaining young people who entered the labour market had to compete for available jobs, usually with relatively poor career prospects, or enter government sponsored training programmes where career development was equally uncertain. Consequently, for these young people entry to spells of unemployment was common and, in the absence of unemployment benefits for 16–18 year olds, some would simply give up searching and become economically inactive. The inactive would often re-enter the labour market and re-commence their search for work or a ‘good’ Youth Training scheme. It is therefore important to consider this group in our analysis, as well as those registered as unemployed, who together comprise the NEET group.

3.2 The data

The data used in the following analysis have been obtained from several sources. First, pupil-level data are extracted from the Youth Cohort Study (YCS) for England and Wales, which refers to Cohorts 6–12, covering the time period 1989–90 to 2000–01. The YCS is a nationally representative, longitudinal, sample survey, typically with three waves, and respondents complete a questionnaire at each wave and the information covers the age range 15/16–18/19.Footnote 7 The YCS contains detailed information on the young person’s family background, personal characteristics as well as their propensity to truant, their test scores in GCSE subjects and their destination post-school, that is, whether they are employed, unemployed, in training or further education, or whether they are economically inactive. The latter is a heterogeneous group including those young people who are caring for family members, for instance. We regard the NEET group as a joint category for the economically inactive and unemployed young people.

Second, we map additional school information not present in the YCS to each pupil, which is obtained from the School Performance Tables and the Schools Census, both of which were obtained from the Department for Education and Skills (DfES). The School Performance Tables contain information about the type of school, the number of pupils and the gender composition, whereas the Schools Census provides additional information on the proportion of qualified teachers, support staff hours and the proportion of pupils on free school meals. From these data, we are able to construct measures of school background and quality, as well as the pupil’s peer group.

The dataset also contains information on truancy at school, test score and labour market outcomes for nearly 70,000 young people, which is a major strength of these data when compared to other survey-based datasets.

Test scores are recorded for all of the GCSE subjects that a young person studies, not all of which are eventually examined, and graded from ‘non-exam/fail’ to ‘A*’. We combine the grade and number of GCSE subjects studied to form an ordinal scale of test scores, and our classification system has the advantage that it covers the full range of the ability distribution, including the category ‘5 or more GCSE grades A* to C’. At the pupil level, this is a very important threshold because performance at this level, in addition to successful study at A Level, permits entry to University, whereas at a school level the higher the proportion achieving in this category, the more ‘successful’ the school is deemed to be. The propensity of a pupil to truant is measured on an ordinal scale ranging from ‘never truant’ to ‘truants for weeks at a time’. Table 1 shows the relationship between the frequency of truancy and test scores for males and females separately.

Table 1 The relationship between pupil test scores and truancy

There is an almost monotonic increase in the level of test score performance as the frequency of truancy decreases and there appears to be a significant break in this relationship between ‘Particular days’ and ‘Several days’. In general, Table 1 does suggest a very clear negative relationship between the frequency of truancy and test scores—higher truancy is associated with lower test scores. It should be noted, however, that the number of observations in some of the cells in Table 1 is relatively small—for males, see category ‘Never’ truant and test score ‘None’, and for females truanting ‘Weeks at a time’ and test scores ‘10+ A*–C’. We address this issue in Sect. 4.

The risk of unemployment or NEET doubles as the level of truancy increases (see Table 2). For instance, with respect to the risk of unemployment, the proportion of females unemployed is 7.2% in truancy category ‘Particular days’, compared with 14% in category ‘Several days’. Similar effects are found for males. However, for both males and females, at higher levels of truancy the rate of increase in the risk of unemployment and NEET slows down. The risk of unemployment and NEET does differ between males and females and is almost always greater for males. For instance, for those pupils who truant for weeks at a time, the risk of unemployment is 8 percentage points higher for males, whereas for NEET it is 5 percentage points higher.

Table 2 Truancy behaviour and labour market outcomes

Table 3 shows a clear negative relationship between the level of test scores and the risk of unemployment and NEET. In fact, these risks fall close to zero for the very highly qualified simply because they have more options after leaving school, such as college or employment. This is not the case for the unqualified, for whom the risk of unemployment or NEET after leaving school is between 20% and 33%.

Table 3 Test scores and labour market outcomes

In Table 4, we show the combined effect of truancy and test scores on labour market status, separately for females (Panels A and B) and males (Panels C and D). Panels A and C report the risk of unemployment, where the risks are calculated row-wise implying a direct relationship between truancy and test scores. Panels B and D show the risk of NEET. The pattern of risks in Table 4 now differs from those reported in Tables 2 and 3. For instance, the risk of unemployment for the unqualified (Test scores = ‘None’) who have never truanted is 7.3% for females and 9.8% for males, as compared with 1.4% and 2.2% in Table 2. In contrast, unqualified females who truant for weeks at a time have a risk of unemployment of 63.6% and a risk of NEET of 61.2%, which are far higher than those reported in Tables 2 and 3 for the two-way cross-tabulations. The corresponding figures for males are slightly higher, at 66.7% and 63%. As we move up the test score distribution, there is wider variation in terms of the risk of unemployment and NEET than is implied by estimates in Table 3, simply because of the additional effect of truancy behaviour on those risks. For instance, Table 3 shows that the average risk of unemployment for young people with 5–9 GCSE grades A*–C is around 0.9%, and 2.9% for the risk of NEET. However, Table 4 shows that for females the risk of unemployment with 5–9 GCSE grades A*–C ranges from 2.8% to 19.8%, depending on whether the young person had never truanted or had truanted for weeks at a time. In terms of the risk of NEET, the corresponding figures are 3% to 32.8%. Similar findings are observed for males. These findings suggest that the relationships between truancy, test scores and labour market status are complex, and there are differences in the effects of truancy and test scores on the risk of unemployment and the risk of NEET.

Table 4 The relationship between truancy and test scores by labour market status

“Appendix A”, Table 9, contains the sample proportions for the explanatory variables used in the statistical models. The control variables used in our analysis refer to personal, family, school and cohort effects, reflecting the education production process. The analysis of the data has been carried out for males and females separately (see below for a justification).

4 Statistical methodology

4.1 The relationship between test scores, truancy and unemployment or NEET

Our literature review shows that very few studies have examined the effects of test scores and truancy on the risk of youth unemployment or NEET in a simultaneous equations framework. In this section, we discuss the possible relationships between these three variables.

Two effects of truancy on test scores can be identified. There is a direct effect, whereby repeated absence from school leads to the acquisition of less knowledge, culminating in lower test scores. Since we observe in the data the incidence and duration of truancy, we can measure this effect on test scores. However, it is likely that truancy also reflects a latent, unobservable, negative attitude to schooling, such as a dislike of studying and of school discipline or school ethos. Moreover, whilst it is highly likely that truancy will reduce test scores, its effect on the risk of unemployment or NEET is ambiguous (see the Introduction). Truancy could be treated as a negative signal of productivity, hence increasing the risk of unemployment and inactivity, or truants’ dislike of school could reflect a strong desire to work or train (see Mroz and Savage 2007), leading to increased search effort and hence a lower risk of unemployment or NEET. Nevertheless, truancy could still affect the risk of unemployment and NEET indirectly via its effect on test scores. As the literature review shows, the effect of lower test scores on the risk of unemployment is well documented, but less so with respect to the risk of NEET. Our modelling strategy attempts to identify these direct, indirect and unobserved effects on the risk of unemployment and NEET.

The general specification of the model with latent variables \(\left( {Y_{ti}^{*} ,Y_{ei}^{*} ,Y_{ni}^{*} } \right)\) is as follows:

$$Y_{ti}^{*} = \beta_{1t}^{\prime} \left( {{\mathbf{Family}}_{i} ,{\mathbf{Personal}}_{i} ,{\mathbf{School}}_{i} ,{\mathbf{Place}}_{i} } \right) + \epsilon_{ti} = \eta_{ti} + \epsilon_{ti} , \tag{1}$$

$$Y_{ei}^{*} = \beta_{1e}^{\prime} \left( {{\mathbf{Family}}_{i} ,{\mathbf{Personal}}_{i} ,{\mathbf{School}}_{i} ,{\mathbf{Place}}_{i} } \right) + \sum\limits_{k = 2}^{3} \gamma_{ek} Y_{ti}^{k} + \epsilon_{ei} = \eta_{ei} + \sum\limits_{k = 2}^{3} \gamma_{ek} Y_{ti}^{k} + \epsilon_{ei} , \tag{2}$$

$$\begin{aligned} Y_{ni}^{*} & = \beta_{1n}^{\prime} \left( {{\mathbf{Family}}_{i} ,{\mathbf{Personal}}_{i} ,{\mathbf{School}}_{i} ,{\mathbf{Place}}_{i} } \right) + \sum\limits_{k = 2}^{3} \gamma_{nk} Y_{ti}^{k} + \sum\limits_{l = 2}^{4} \theta_{nl} Y_{ei}^{l} + \epsilon_{ni} \\ & = \eta_{ni} + \sum\limits_{k = 2}^{3} \gamma_{nk} Y_{ti}^{k} + \sum\limits_{l = 2}^{4} \theta_{nl} Y_{ei}^{l} + \epsilon_{ni} \end{aligned} \tag{3}$$

where \(Y_{ti}^{*}\) refers to a pupil’s latent propensity to truant, \(Y_{ei}^{*}\) is the latent test score performance of the young person, and \(Y_{ni}^{*}\) represents the latent probability of becoming unemployed or entering NEET on completion of schooling. The \(\left( {\epsilon_{ti} ,\epsilon_{ei} ,\epsilon_{ni} } \right)\) are from a trivariate standard normal distribution with correlation matrix \(\varSigma_{ten}\), implying that the observed responses \(\left( {Y_{ti} ,Y_{ei} ,Y_{ni} } \right)\) are from a trivariate-ordered probit model. We have used \(\eta_{zi}\), \(\left( {z = t,e,n} \right)\), to represent the linear predictors of the exogenous covariates \(\left( {{\mathbf{Family}}_{i} ,{\mathbf{Personal}}_{i} ,{\mathbf{School}}_{i} ,{\mathbf{Place}}_{i} } \right).\) The exogenous covariates in the model represent a standard set of variables included in many specifications of the so-called education production function (see “Appendix A” for variables and descriptive statistics). The linear predictors do not contain constants as these are not identified.

4.2 The trivariate model

The YCS data contain 5 levels of school truancy \( \left( {Y_{t} } \right) \) at age 16 for student \( i \) at school, as follows:

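$$\begin{array}{cl} \text{Response } Y_{ti} & \text{Description} \\ \hline 1 & \text{Never truant} \\ 2 & \text{Odd days} \\ 3 & \text{Particular days} \\ 4 & \text{Several days} \\ 5 & \text{Weeks at a time} \\ \end{array}$$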

Truancy from school is self-reported. The ordered nature of the observed \(Y_{ti}\) suggests we treat it as an ordered response with 5 categories. In addition to modelling truancy as a response, we also treat truancy as an endogenous variable in the linear predictors of the models for test scores at age 16 \(\left( {Y_{ei} } \right)\) and for subsequent unemployment and NEET \(\left( {Y_{ni} } \right)\). To simplify the joint estimation of the endogenous truancy effects and the correlations in the random effects, we use a reduced number of endogenous dummy variables in the linear predictors for \(Y_{ei}\) and \(Y_{ni}\). Specifically, we combine categories 2 and 3 and also categories 4 and 5, with \(Y_{ti}^{1}\) taken as the reference category. Given the relatively low number of observations in some categories (see Sect. 3), this should ensure that we obtain more precise estimates. Note that we do not collapse the categories for the response variable because these are representations of our underlying latent variable, the propensity to truant.

In the UK, a pupil’s performance at school is typically measured by the level of attainment in public examinations. In this paper, test scores refer to the number of, and grade in, the General Certificate of Secondary Education (GCSE), obtained from the YCS, which is classified into one of six levels \( l \) of educational attainment at age 16 \( \left( {Y_{e} } \right) \). These are as follows:Footnote 8

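$$\begin{array}{cl} \text{Response } Y_{ei} & \text{Description} \\ \hline 1 & \text{no GCSEs} \\ 2 & \text{1–4 D–G GCSEs} \\ 3 & \text{5+ D–G GCSEs} \\ 4 & \text{1–4 A*–C GCSEs} \\ 5 & \text{5–9 A*–C GCSEs} \\ 6 & \text{10+ A*–C GCSEs} \\ \end{array}$$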

The nature of \(Y_{ei}\) suggests that we treat it as an ordered response with six categories. Again, to simplify the joint estimation of the endogenous test score effects and correlation in the random effects of the other responses, and to ensure that we obtain more precise estimates, we also use a reduced number of dummy variables for GCSE scores in the linear predictors for \(Y_{ni} .\) Specifically, for \(Y_{ei}\) we combine categories 1 and 2 together, which becomes \(Y_{ei}^{1}\), and categories 5 and 6 together, where \(Y_{ei}^{1}\) is treated as the reference category. As for truancy, we do not collapse the categories of the response variable.
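The collapsing of categories described above for truancy and test scores can be sketched in a few lines of Python. This is only an illustration of the grouping rule stated in the text; the names `TRUANCY_GROUP`, `TEST_SCORE_GROUP` and `endogenous_dummies` are hypothetical and do not come from the paper or from any estimation software.

```python
# Hypothetical names; the groupings follow the description in the text.
# Truancy: categories 2-3 and 4-5 are combined, category 1 is the reference.
TRUANCY_GROUP = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3}
# Test scores: categories 1-2 and 5-6 are combined, group 1 is the reference.
TEST_SCORE_GROUP = {1: 1, 2: 1, 3: 2, 4: 3, 5: 4, 6: 4}


def endogenous_dummies(category, grouping):
    """Return 0/1 dummies for every collapsed group except the reference group 1."""
    group = grouping[category]
    n_groups = max(grouping.values())
    return [1 if group == g else 0 for g in range(2, n_groups + 1)]


# 'Several days' (truancy category 4) falls in collapsed group 3.
print(endogenous_dummies(4, TRUANCY_GROUP))
# '5+ D-G GCSEs' (test score category 3) falls in collapsed group 2.
print(endogenous_dummies(3, TEST_SCORE_GROUP))
```

Under this grouping, truancy contributes two dummies (k = 2, 3) and test scores three (l = 2, 3, 4), which matches the summation limits in Eqs. 2 and 3, while the response variables themselves keep all 5 and 6 categories.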

In this analysis, we will use two levels of response for post 16 labour market outcomes \( \left( {Y_{n} } \right) \) at age 16 for individual \( i, \) as follows:

$$\begin{array}{cl} \text{Response } Y_{ni} & \text{Description} \\ \hline 1 & \text{Education, employment or training} \\ 2 & \text{Unemployed or NEET} \\ \end{array}$$

The nature of \(Y_{ni}\) means we can treat it as an ordered response with just 2 categories (binary). Clearly, we do not treat labour market outcomes as an endogenous variable in the models for truancy and educational attainment.

There are various joint models that can be used for trivariate-ordered responses; the most widely used assumes that observed responses \(\left( {Y_{ti} ,Y_{ei} ,Y_{ni} } \right)\) are obtained from underlying normally distributed variables \(\left( {Y_{ti}^{*} ,Y_{ei}^{*} ,Y_{ni}^{*} } \right)\). The continuous latent variables, e.g. \(Y_{ti}^{*}\), are observed in one of the \(K\) categories (in this case \(K = 5\)) through a censoring mechanism, that is:

$$\begin{aligned} Y_{ti} & = 1\quad {\text{if}}\quad c_{t0} < Y_{ti}^{*} \le c_{t1} \\ Y_{ti} & = 2 \quad {\text{if}}\quad c_{t1} < Y_{ti}^{*} \le c_{t2} \\ Y_{ti} & = 3\quad {\text{if}}\quad c_{t2} < Y_{ti}^{*} \le c_{t3} \\ Y_{ti} & = 4 \quad {\text{if}}\quad c_{t3} < Y_{ti}^{*} \le c_{t4} \\ Y_{ti} & = 5\quad {\text{if}}\quad c_{t4} < Y_{ti}^{*} \le c_{t5} \\ \end{aligned}$$

where the \(c_{tk}\), \(k = 1, \ldots ,4\), are finite cut points or thresholds of the latent variable \(Y_{ti}^{*} ,\) with \(c_{t0} = - \infty\) and \(c_{t5} = \infty\). In this paper, we assume that the cut points \(\left( {c_{tk} ,c_{el} ,c_{nm} } \right)\) do not vary across individuals \(\left( i \right).\) Ordered responses based on latent variables can be given a utility maximisation interpretation; see Bhat and Pulugurta (1998).
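The censoring mechanism above can be illustrated with a minimal sketch. The function name and the threshold values below are hypothetical and chosen only for illustration; they are not estimates from the paper. The sketch maps a latent value \(Y_{ti}^{*}\) to its observed category using the finite cut points, with the outer thresholds implicitly set to \(- \infty\) and \(\infty\).

```python
from bisect import bisect_left


def ordered_category(y_star, cut_points):
    """Return k such that c_{k-1} < y_star <= c_k.

    cut_points holds only the finite thresholds c_1 < ... < c_{K-1};
    c_0 = -inf and c_K = +inf are implicit.
    """
    if any(a >= b for a, b in zip(cut_points, cut_points[1:])):
        raise ValueError("cut points must be strictly increasing")
    # bisect_left counts the thresholds strictly below y_star, so a value
    # equal to c_k falls in category k (intervals are closed on the right).
    return bisect_left(cut_points, y_star) + 1


# Hypothetical thresholds for the five truancy categories (K = 5).
cuts_t = [-1.0, 0.0, 0.8, 1.6]
categories = [ordered_category(y, cuts_t) for y in (-2.3, -1.0, 0.5, 1.6, 3.0)]
print(categories)  # [1, 1, 3, 4, 5]
```

Because the intervals are closed on the right, a latent value exactly equal to a threshold is assigned to the lower category, which matches the inequalities in the display above.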

As suggested earlier, there are likely to be unobserved effects that determine truancy, test scores and the transition from school, such as attitudes to school discipline and ethos, summarised as ‘tastes for schooling and motivation’. These unobserved effects may bias the estimates of the variables of interest, truancy and test scores. Therefore, disentangling the observable direct and indirect effects from the unobservable effect requires the simultaneous estimation of Eqs. 1–3, where test scores and truancy are treated as endogenous variables in our models of the risk of unemployment or NEET. This is shown in Eq. 4.

The probabilities of the observed responses for truancy, \(Y_{ti}\), test scores, \(Y_{ei}\) and labour market outcomes, \(Y_{ni}\), are given by a triple integral which does not have a closed form, so for example if \(Y_{ti} = 2,Y_{ei} = 3,Y_{ni} = 2\) then this individual’s contribution to the likelihood is given by:

$$L_{i} = \Pr \left[ {Y_{ti} = 2,Y_{ei} = 3,Y_{ni} = 2} \right] = \mathop \int \limits_{{c_{t1} - \eta_{ti} }}^{{c_{t2} - \eta_{ti} }} \mathop \int \limits_{{c_{e2} - \eta_{ei} - \gamma_{e2} }}^{{c_{e3} - \eta_{ei} - \gamma_{e2} }} \mathop \int \limits_{{c_{n1} - \eta_{ni} - \gamma_{n2} - \theta_{n3} }}^{{c_{n2} - \eta_{ni} - \gamma_{n2} - \theta_{n3} }} \phi \left( {\epsilon_{t} ,\epsilon_{e} ,\epsilon_{n} ;\varSigma_{ten} } \right){\text{d}}\epsilon_{n} {\text{d}}\epsilon_{e} {\text{d}}\epsilon_{t} .$$

where \(\phi \left( {\epsilon_{t} ,\epsilon_{e} ,\epsilon_{n} ;\varSigma_{ten} } \right)\) is a trivariate standard normal density function with the \(3 \times 3\) correlation matrix \(\varSigma_{ten} .\) Note that \(\gamma_{ek}\) and \(\gamma_{nk}\) refer to the direct effect of truancy on test scores and the direct effect of truancy on labour market outcomes, respectively, and \(\theta_{nl}\) refers to the direct effect of test scores on labour market outcomes. The log likelihood for all individuals is then

$$\log L = \mathop \sum \limits_{i = 1} \log L_{i}$$
(4)

The log likelihood is maximised to provide the parameter estimates using CMP in Stata 14 (Roodman 2011). To evaluate the three-dimensional cumulative normal distributions, CMP uses simulated likelihood methods, specifically the Mata function \(ghk2()\), which implements the Geweke–Hajivassiliou–Keane algorithm with Halton sequences (Geweke 1989, Hajivassiliou and McFadden 1998). For instance, for Heterogeneous Model 1 for the NEET and unemployment outcomes, this involves 390 draws per observation for females, and 355 draws per observation for males. We let CMP decide on the number of draws to be used. As a check on the adequacy of the simulated likelihood approach, we also evaluated the log-likelihoods at the solutions using the NAG Fortran Library.Footnote 9
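To make the simulation step more concrete, the following is a minimal sketch of a GHK-type simulator with quasi-random (Halton) draws for a rectangle probability of the kind that appears in \(L_{i}\). It is not the code used by CMP or \(ghk2()\); the function names, the number of draws and the prime bases are illustrative assumptions, and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def halton(index, base):
    """Return element `index` (1-based) of the Halton sequence in `base`."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * fraction
        fraction /= base
    return value


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = sqrt(matrix[i][i] - partial)
            else:
                chol[i][j] = (matrix[i][j] - partial) / chol[j][j]
    return chol


def ghk_probability(lower, upper, corr, draws=500, bases=(2, 3, 5)):
    """Simulate Pr(lower < eps <= upper) for eps ~ N(0, corr) with GHK."""
    chol = cholesky(corr)
    dim = len(lower)
    total = 0.0
    for r in range(1, draws + 1):
        weight, z = 1.0, []
        for j in range(dim):
            # Bounds for the j-th independent normal, given earlier draws.
            shift = sum(chol[j][k] * z[k] for k in range(j))
            p_lo = STD_NORMAL.cdf((lower[j] - shift) / chol[j][j])
            p_hi = STD_NORMAL.cdf((upper[j] - shift) / chol[j][j])
            weight *= p_hi - p_lo
            if weight <= 0.0:
                weight = 0.0
                break
            if j < dim - 1:
                # Draw from the truncated normal by inverting the CDF.
                u = p_lo + halton(r, bases[j]) * (p_hi - p_lo)
                u = min(max(u, 1e-12), 1.0 - 1e-12)
                z.append(STD_NORMAL.inv_cdf(u))
        total += weight
    return total / draws


# Check against a known case: with all correlations equal to 0.5, the
# positive-orthant probability of a trivariate normal is exactly 0.25.
inf = float("inf")
corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
orthant = ghk_probability([0.0] * 3, [inf] * 3, corr)
print(round(orthant, 3))
```

For a likelihood contribution, `lower` and `upper` would hold the three pairs of integration limits (cut points minus linear predictors and endogenous effects) and `corr` would be \(\varSigma_{ten}\).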

4.3 Alternative specifications

Equations 1–3 are estimated several times, each with a slightly different specification, and we refer to these as models. In the Homogenous Model, we actually estimate Eqs. 1–3 separately, which is consistent with much of the existing literature. We then estimate several different versions of Eq. 4. In model 1, we allow for correlation between the random effects of each sub-model and include all direct effects of endogenous variables. In model 2, we drop the direct effect of truancy on youth unemployment (and NEET), which means that the impact of truancy behaviour at school on labour market outcomes is picked up via its effect on test scores (the indirect effect) and through the unobserved effects. Model 3 drops the unobserved effect in the truancy equation. By dropping the direct effect of truancy on unemployment (and NEET) and the correlation between Eqs. 1 and 3, we can determine whether test scores play a more important role than truancy. Model 4 takes a different approach. In this model, we re-introduce the direct effect of truancy in Eq. 3 and the correlation between the unobservables in Eqs. 1 and 3; however, we explore the possibility of reverse causation between test scores and truancy. In this case, \(Y_{e}\) is inserted in Eq. 1 and \(Y_{t}\) is dropped from Eq. 2.
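The differences between the specifications can be summarised compactly. The sketch below is only a reading aid: the class and field names are hypothetical, and it records just those features that the description above states explicitly (it takes no position on the remaining correlations).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    joint: bool                 # estimated jointly (Eq. 4) rather than separately
    truancy_in_test_eq: bool    # Y_t enters Eq. 2
    test_in_truancy_eq: bool    # Y_e enters Eq. 1 (reverse causation)
    truancy_in_labour_eq: bool  # direct effect of Y_t in Eq. 3
    rho_tn: bool                # correlation between Eqs. 1 and 3


SPECS = {
    "Homogenous Model": Spec(False, True, False, True, False),
    "Model 1": Spec(True, True, False, True, True),
    "Model 2": Spec(True, True, False, False, True),
    "Model 3": Spec(True, True, False, False, False),
    "Model 4": Spec(True, False, True, True, True),
}

# Joint models in which truancy has no direct effect on labour market outcomes.
no_direct_truancy = [
    name for name, spec in SPECS.items()
    if spec.joint and not spec.truancy_in_labour_eq
]
print(no_direct_truancy)  # ['Model 2', 'Model 3']
```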

Reverse causality could arise if pupils who systematically fail at school eventually reduce effort and start to truant. This is plausible given the ‘teaching to test’ that has arisen since the introduction of competition between schools and School Performance Tables following the 1988 Education Reform Act. However, we argue that this reverse causality should be less of an issue in our data for two reasons. First, our measure of test score is a summative statement of performance measured primarily at the end of compulsory schooling at age 16 when pupils sit for their GCSE examinations, whereas our measure of truancy refers to behaviour between the ages of 14 and 16. Second, it is more likely that poor performance in coursework could increase the incidence of truancy because this does contribute to the final GCSE grades. But, performance in tests in GCSE subjects is still weighted heavily and this implies that truancy behaviour will therefore affect overall performance in GCSE examinations at age 16. Nevertheless, we do investigate the issue of the direction of causation between \(Y_{ti}\) and \(Y_{ei}\) in our modelling.

As we show below, our preferred model is Heterogeneous Model 1.

4.4 Identification

Identifiability in structural equation models with discrete outcomes has been widely discussed. Wilde (2000) shows that the existence of one varying exogenous regressor in each equation is sufficient to avoid small variation identification problems in multiple equation probit models with endogenous dummy regressors. Thus, in our models we include the actual measures of truancy, \(Y_{ti}^{k}\), albeit in collapsed form for reasons explained above, for our model of educational attainment \(Y_{ei}^{*}\) and early labour market behaviour \(Y_{ni}^{*}\), and similarly with the effect of dummy endogenous variables for the actual test scores or GCSE levels, \(Y_{ei}^{l}\), in our models for early labour market behaviour \(Y_{ni}^{*} .\)Footnote 10 Wilde also notes that his result applies to other distributions besides the normal (multivariate probit), but these models have some additional constants in them. Our approach implies that we rule out reverse causality between \(Y_{ti}^{k}\) and \(Y_{ei}^{l}\) from the outset; however, there are two reasons why this is not the case and hence why it is important to explore the issue of reverse causality in these data. First, we can regard \(Y_{ti}^{k}\) and \(Y_{ei}^{l}\) as joint, or mutually dependent, measures of educational experience and outcomes. The GCSE outcome is the final result of studying between the ages of 14 and 16, in the same way that truancy reflects behaviour which translates into an ‘educational experience’ between the ages of 14 and 16. Ideally, we would like to allow for reciprocal dependence, rather than recursive dependence, as in our models. However, reciprocal multivariate probit models with dummy endogenous variables cannot be identified whatever instruments one might use (Heckman 1978). As a compromise, we estimate alternative models in which either \(Y_{ti}^{k}\) or \(Y_{ei}^{l}\) is treated as predetermined. However, our prior is that \(Y_{ei}^{l}\) occurs at the end of the education production process, whereas \(Y_{ti}^{k}\) is an intermediate part of that process. Second, although we do not base our identification strategy on this, there are measures of school-level and neighbourhood-level truancy in our model for \(Y_{ti}^{k}\) which are excluded from our model for \(Y_{ei}^{l}\).

In sum, our approach to identification is analogous to that of Elbers and Ridder (1982) and Heckman and Singer (1984) for heterogeneity in duration data. Other papers in the education economics literature that adopt the same identification strategy as ours are Evans and Schwab (1995) and Neal (1997), whereas Goldman et al. (2001) is an example from the statistics literature.

4.5 Measuring the average treatment effect on the treated

It is not straightforward to estimate marginal effects holding everything else constant for the endogenous covariates in our models, because of the correlation between responses. To aid the interpretation of the endogenous effects, we compute the average treatment effect on the treated (ATT). In our case, the levels of test scores \(\left( {Y_{ei}^{l} } \right)\) and truancy \(\left( {Y_{ti}^{k} } \right)\) are different treatments for unemployment and NEET \(\left( {Y_{ni} } \right)\). To obtain the treatment effects for the expanded levels of truancy and test scores, we use the joint model for the various observable treatments \(\left( {Y_{ti} ,Y_{ei} } \right)\), and the unobservable counterfactual treatments for the same unemployment response. The counterfactual is obtained by setting the parameters for the endogenous effects \(\left( {\gamma_{nk} ,\theta_{nl} } \right)\) to zero. For our example, with \(Y_{ti} = 2,Y_{ei} = 3,Y_{ni} = 2\) we have:

$$\Pr \left[ {Y_{ti} = 2,Y_{ei} = 3,Y_{ni} = 2|\gamma_{n2} = 0,\theta_{n3} = 0} \right] = \mathop \int \limits_{{c_{t1} - \eta_{t} }}^{{c_{t2} - \eta_{t} }} \mathop \int \limits_{{c_{e2} - \eta_{e} - \gamma_{e2} }}^{{c_{e3} - \eta_{e} - \gamma_{e2} }} \mathop \int \limits_{{c_{n1} - \eta_{n} }}^{{c_{n2} - \eta_{n} }} \phi \left( {\epsilon_{t} ,\epsilon_{e} ,\epsilon_{n} ;\varSigma_{ten} } \right){\text{d}}\epsilon_{n} {\text{d}}\epsilon_{e} {\text{d}}\epsilon_{t} .$$

The joint probability of the \(\left( {Y_{ti}^{k} ,Y_{ei}^{l} } \right)\) treatment is:

$$\Pr \left[ {Y_{ti} = 2,Y_{ei} = 3} \right] = \mathop \int \limits_{{c_{t1} - \eta_{t} }}^{{c_{t2} - \eta_{t} }} \mathop \int \limits_{{c_{e2} - \eta_{e} - \gamma_{e2} }}^{{c_{e3} - \eta_{e} - \gamma_{e2} }} \phi \left( {\epsilon_{t} ,\epsilon_{e} ;\varSigma_{te} } \right){\text{d}}\epsilon_{e} {\text{d}}\epsilon_{t} .$$

where \(\varSigma_{te}\) is the \(2 \times 2\) correlation matrix for \(\left( {\epsilon_{t} ,\epsilon_{e} } \right).\) The treatment effect on the treated, i.e. when \(Y_{ti} = 2,\) and \(Y_{ei} = 3,\) for individual \(i\) is

$$TT_{23i} = \frac{{\Pr \left[ {Y_{ti} = 2,Y_{ei} = 3,Y_{ni} = 2} \right] - \Pr \left[ {Y_{ti} = 2,Y_{ei} = 3,Y_{ni} = 2|\gamma_{n2} = 0,\theta_{n3} = 0} \right]}}{{\Pr \left[ {Y_{ti} = 2,Y_{ei} = 3} \right]}}$$

This estimate of the treatment effect varies by young person, \(i\), because the exogenous covariates vary with \(i.\) The sample average of the treatment effects (e.g. when \(Y_{ti} = 2,\) and \(Y_{ei} = 3\)) gives the average treatment effect for unemployment or NEET \(\left( {Y_{n} } \right)\) on the treated (in this example ATT23). The reference groups for the endogenous dummy variables are \(Y_{ti}^{1} ,\)\(Y_{ei}^{1}\) and \(Y_{ei}^{2} ,\) so ATT11 = ATT12 = 0. The reference categories act as a control group.
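The structure of the \(TT_{23i}\) ratio and its sample average can be sketched as follows. To keep the example short, it makes the simplifying assumption that the three errors are uncorrelated, so that the joint probabilities factorise into univariate normal probabilities; in the paper's models the integrals are evaluated with the full correlation matrix \(\varSigma_{ten}\). It also assumes \(c_{n2} = \infty\), following the pattern of \(c_{t5}\). All parameter values and linear predictors below are hypothetical.

```python
from statistics import NormalDist, fmean

PHI = NormalDist().cdf


def interval_prob(lower, upper):
    """Pr(lower < eps <= upper) for a standard normal eps."""
    return PHI(upper) - PHI(lower)


def tt_23(eta_t, eta_e, eta_n, p):
    """TT_23i under the ASSUMPTION of uncorrelated errors (illustration only)."""
    pr_t = interval_prob(p["c_t1"] - eta_t, p["c_t2"] - eta_t)
    pr_e = interval_prob(p["c_e2"] - eta_e - p["gamma_e2"],
                         p["c_e3"] - eta_e - p["gamma_e2"])
    # Y_n = 2 is the top category, so its upper limit is taken as +infinity.
    effects = p["gamma_n2"] + p["theta_n3"]
    pr_n_treated = 1.0 - PHI(p["c_n1"] - eta_n - effects)
    pr_n_counterfactual = 1.0 - PHI(p["c_n1"] - eta_n)  # gamma_n2 = theta_n3 = 0
    joint_treated = pr_t * pr_e * pr_n_treated
    joint_counterfactual = pr_t * pr_e * pr_n_counterfactual
    return (joint_treated - joint_counterfactual) / (pr_t * pr_e)


# Hypothetical parameters and linear predictors (eta_t, eta_e, eta_n).
params = {"c_t1": 0.2, "c_t2": 0.9, "c_e2": -0.3, "c_e3": 0.4, "c_n1": 1.2,
          "gamma_e2": -0.2, "gamma_n2": 0.1, "theta_n3": -0.4}
individuals = [(0.1, 0.3, -0.2), (0.5, -0.1, 0.4), (-0.3, 0.6, 0.0)]

att_23 = fmean(tt_23(et, ee, en, params) for et, ee, en in individuals)
print(round(att_23, 4))
```

Because the combined endogenous effect in this hypothetical example is negative, each individual effect, and hence the average, is negative.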

5 Econometric results

We begin our discussion of the results by first discussing our preferred models and the reasons for this selection. This will allow us to focus our discussion of the results on particular models, whilst also comparing the findings from our preferred models with other models. Table 10 in “Appendix B” compares the various models and shows the results of a number of likelihood ratio tests. Based on the likelihood ratio tests, it is clear that Heterogeneous Model 1 substantially outperforms the Homogenous Model for males and females, and for the unemployment and NEET outcomes. This is a significant finding in the sense that much of the existing literature has ignored the simultaneous nature of the relationship between truancy, test scores and early labour market outcomes. However, Table 10 also compares each of the other heterogeneous models with Heterogeneous Model 1, and the model which comes closest is Heterogeneous Model 2. In this model, we drop the direct effect of truancy on unemployment and NEET; however, the likelihood ratio test shows that Model 2 is still rejected against Model 1, albeit marginally for the male unemployment model and the female NEET models. Models 3 and 4 are easily rejected when compared with Heterogeneous Model 1, and so we focus our discussion of the results on Heterogeneous Model 1, drawing a particular comparison with Model 2 where appropriate.
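For readers unfamiliar with the mechanics of such comparisons, a likelihood ratio test for nested models can be sketched as below. The log-likelihood values are hypothetical and are not taken from Table 10; the two degrees of freedom are an assumption corresponding to two collapsed truancy dummies being dropped from the labour market equation. The chi-square tail probability is computed with the standard library only, using the closed forms for one and two degrees of freedom and a standard recurrence for higher integer degrees of freedom.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    # Closed forms for the base cases df = 1 and df = 2.
    k = 1 if df % 2 else 2
    tail = erfc(sqrt(x / 2.0)) if k == 1 else exp(-x / 2.0)
    # Recurrence Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1).
    while k < df:
        tail += (x / 2.0) ** (k / 2.0) * exp(-x / 2.0) / gamma(k / 2.0 + 1.0)
        k += 2
    return tail


def lr_test(loglik_restricted, loglik_unrestricted, df):
    """Return the LR statistic and its asymptotic chi-square p-value."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods: restricted (Model 2) vs unrestricted (Model 1).
stat, p_value = lr_test(-10250.0, -10246.2, df=2)
print(round(stat, 2), round(p_value, 4))
```

A small p-value leads to rejection of the restricted model in favour of the unrestricted one, which is the sense in which Model 2 is said to be rejected against Model 1.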

Table 11 in “Appendix B” tests for differences between the male and female models. It is clear from the likelihood ratio tests that there are statistical differences between the male and female models in all cases, but the combined Heterogeneous Model 1 still outperforms the combined Homogenous Model. Together with the well-established behavioural differences between male and female students vis-à-vis test score performance (see Andrews et al. 2004), it is clearly important to also perform this analysis separately for males and females.

5.1 The determinants of test scores

The main focus of this paper is on the effects of test scores and truancy on the probability of unemployment or NEET several months after leaving school. However, given that we estimate a system of equations it is important to briefly assess the sub-model for test scores.

Table 12, “Appendix B”, reports the effects of truancy on test scores for males/females and by NEET/unemployment outcomes. This table shows that the estimated effects of the endogenous truancy indicators on test scores change from negative to positive when we allow for a correlation in the unobservables of truancy and test scores, that is, in models 1, 2 and 3. This feature of these models may seem counter-intuitive; however, the correlation between the unobservable components of the latent variables is large and negative, and for Heterogeneous Model 1 the correlation ranges from −0.464 to −0.512 across the male/female and unemployment/NEET models (see Table 5). This implies that the overall relationship between truancy and test scores, combining the direct and correlated effects, will still be negative in our preferred model. Stated differently, the positive direct effects of the endogenous truancy indicators are not large enough to dominate the large negative correlation in the unobservables for our preferred model, and indeed for all heterogeneous models.

Table 5 Unemployment Ordered Probit Model Results

It is worth noting that in model 4, we include our test score variable to investigate whether test score performance affects truancy, i.e. causality runs in the opposite direction. In general, our estimates (not reported) are statistically significant and suggest that pupils who (ultimately) achieve higher test scores are less likely to truant. This implies that causation could run from test scores to truancy. However, as suggested in Table 10, “Appendix B”, this model does not fit the data as well as Heterogeneous Model 1; hence, we reject model 4.

5.2 The direct effects of test scores and truancy on the risk of unemployment and NEET

Tables 5 and 6 show the estimated effects of truancy and test scores on the probability of a young person becoming unemployed (Table 5) or entering the NEET category (Table 6) several months after leaving school. The estimated effects for the homogenous model are fairly standard findings in the cross-sectional literature and so are a good place to start the discussion of our findings.

Table 6 NEET ordered probit model results

In the Homogenous Model, higher levels of truancy increase the probability of unemployment and NEET for both males and females. Conversely, the higher a pupil's test scores, the lower the likelihood of unemployment and NEET, which is consistent with the view that these pupils have more choices after leaving compulsory schooling insofar as they can continue their education, enter a training programme or get a job. These effects can be regarded as direct effects of truancy and test scores on labour market outcomes. However, for the Homogenous Model, truancy also reduces test scores (not shown), and so there is also an additional indirect effect of truancy on the probability of a young person becoming unemployed or NEET. The total effect of truancy on labour market outcomes would therefore be underestimated by simply looking at its direct effect.

Of course, these homogenous models do not take account of the effect of unobservables, which imply correlations in the latent variables of truancy, GCSE, and unemployment and NEET. Tables 5 and 6 also report the estimated direct effects of truancy and test scores on youth unemployment and NEET for the various heterogeneous models. Recall that our preferred model is Heterogeneous Model 1. For this model, the direct effect of test scores remains negative and statistically significant, but for females in particular, the estimates are smaller than the equivalent estimates from the homogenous model. Allowing for the correlations in the unobservables of models 1–4, \(\left( {\varSigma_{ten} } \right),\) may therefore produce more precise estimates of the direct effect of test scores on the risk of entering unemployment or NEET on leaving school. The estimated effects for test scores in Heterogeneous Model 1 also differ between the unemployment and NEET models.

Turning to the direct effect of truancy, we observe that for males in both the unemployment and NEET models the effects are negative and generally insignificant (see Heterogeneous Model 1). For females at least one of the estimated effects of truancy, \(Y_{t45}\), is positive but is statistically insignificant. Interestingly, for the lower levels of truancy, \(Y_{t23}\), the effects are negative and statistically significant for females in the case of unemployment, which is also the case for males with respect to the risk of NEET. Thus, the effects for truancy are more mixed when compared with the test score results. A more parsimonious model may therefore be one that drops the direct effects of truancy (see Heterogeneous Model 2), which implies that the effect of truancy only works through test scores (an indirect effect) and through the correlations (the unobservables). There is little evidence for this indirect effect in Heterogeneous Model 2 for females, where the correlations for \({\text{Rho}}_{tn}\) are almost identical when compared to those for Heterogeneous Model 1, whereas in the case of males the size of the correlations actually falls. Similarly, the sizes of the estimates for truancy in the test score models (see Table 12) are almost identical in Heterogeneous Models 1 and 2, suggesting that the indirect effects of truancy on labour market outcomes are unchanged. In view of this, and given our likelihood ratio tests where we reject Model 2, we argue that the direct truancy effects are at least jointly significant and should be retained.

Tables 5 and 6 also report the correlations between the unobservables in the various branches of the model, which pick up the effect of unobservable differences between students, e.g. differences in motivation. We compare the correlations for the unemployment outcome for females and males, respectively, in Table 5. Table 6 shows the equivalent results for NEET. What is clear for Heterogeneous Model 1 is that, when one compares the estimates of the correlations for the unemployment (Table 5) and the NEET (Table 6) outcomes for each gender, there are only small differences.

In both Tables 5 and 6 for Model 1, there is a negative and statistically significant correlation between the unobserved effects in the truancy and test score sub-models (see \({\text{Rho}}_{te}\)). Pupils who are unobservably more likely to truant, perhaps because they are demotivated by school, are also unobservably less able and so their test scores are lower. (This effect is almost identical in terms of magnitude for models 2 and 3.) With regard to the correlations between unobservables for the truancy and unemployment and NEET models (see \({\text{Rho}}_{tn}\)), the estimates for Heterogeneous Model 1 are positive and statistically significant, suggesting that students who are unobservably more likely to truant are more likely to become unemployed or NEET. A lack of motivation at school translates into poor entry into the job market, possibly because of poor motivation to find a job or training place, or because employers are able to screen out such youngsters during the selection process. (Again this result is consistent; see models 2 and 4.) Finally, we consider the correlations between the unobservables for examination scores and the probability of unemployment or NEET (see \({\text{Rho}}_{en}\)). There is some variation in the estimated correlations; however, the correlations between the unobserved effects are negative and statistically significant. Unobservably more able students are less likely to become unemployed or economically inactive.

In this analysis, unobservable effects do matter, and in terms of the magnitudes of the effects that we estimate, it is \({\text{Rho}}_{tn}\) that has the largest effect. Disentangling these effects allows us to tell a richer story of the relationships between truancy behaviour and test score performance at school on the one hand, and their impact on labour market outcomes on the other.

5.3 Calculating the magnitude of the main effects of truancy and test scores

We turn now to the estimation of the ATTs for our preferred model, Heterogeneous Model 1, in order to gauge how important, from a quantitative perspective, the effects of truancy and test scores are for the risk of unemployment and NEET. Our approach allows us to calculate the ATTs for the observed levels of truancy (5 levels) and test scores (6 levels). For example, the ATT for the unemployed is the probability of a flow into unemployment and is obtained by estimating a model where the test score variable is initially nonzero. We then estimate a model where the test score variable is set to zero, and the ATT is the conditional difference between the two. This is repeated for the NEET category. The ATTs are calculated for the categories of the response variables; they reflect the direct, indirect and unobservable effects of the endogenous variables on these categories.

Tables 7 and 8 report the estimated ATTs for the unemployment and NEET models, for males and females separately. There are differences in the ATT effects between these groups; however, in all cases the effects of test scores and truancy are negative. The effects can be interpreted as follows: for a particular level of test score and truancy, and when compared with the control group, the negative effect is to reduce the risk of unemployment when compared to the base category. This implies that the effect of test score dominates the effect of truancy, that is, we can think of test score performance as compensating for poor attendance. This is best seen by looking at low levels of truancy (e.g. ‘Odd days’). For females, Table 7, Panel A reports the ATTs for unemployment, and the compensating effect of test scores is modest, as expected, because truancy at this level is not so much of a problem (e.g. the ATT for test scores = ‘5+ D-G’ is −0.069 versus −0.100 for test scores = ‘10+ A*–C’), whereas for truancy = ‘Weeks at a time’ and test scores = ‘5+ D-G’ the ATT is −0.113 versus an ATT of −0.236 where test scores = ‘10+ A*–C’.

Table 7 Estimated ATTs for Model 1, unemployment
Table 8 Estimated ATTs for Model 1, NEET

The ATT effects for unemployed males (Table 7, Panel B) are higher than those for females, whereas for NEET models (Table 8) the ATTs are similar in magnitude for males and females. Note, however, that truancy also impacts indirectly via its effect on test scores and through the correlation in the unobservables. Nevertheless, it is still the case that the test score effect dominates the truancy effect in terms of labour market outcomes. Of course, this is not to deny that reducing truancy is important; reducing truancy is likely to improve test scores which in turn improves labour market prospects.

6 Conclusion

In this paper, we investigate the effects of test scores and truancy behaviour on the labour market outcomes of teenagers in England and Wales. We also investigate the interdependencies, and the direction of causation, between truancy behaviour and test score performance. This is because it may be that truancy has a direct effect on the risk of unemployment or NEET amongst young people as well as an indirect effect via the effect of truancy on test scores. If truancy is regarded as a proxy for effort at school, and performance in tests affects the subsequent transition from school, then all three behavioural outcomes (decisions) are jointly determined. Consequently, to capture this joint nature, a three-equation model was estimated in which we allow for correlation in the unobservables between the latent variable models for truancy and test scores, which are ordered categorical variables, and for unemployment (or NEET), which is a binary variable. Several models are estimated which allow for different specifications of the relationships between the three outcome measures. To estimate our models, we use detailed pupil and school-level data from the Youth Cohort Studies (YCS), specifically YCS6 to YCS12, as well as school performance and school census data, which cover the period of the late 1990s and early 2000s.

The findings from our preferred model (Heterogeneous Model 1) suggest that truancy works through test scores (i.e. an indirect effect) with only a weak direct effect on labour market outcomes. However, truancy also has an unobserved effect on the risk of unemployment and the risk of NEET insofar as the correlations between the latent variables for truancy and labour market outcomes are positive and statistically significant. Test scores have a direct effect on labour market outcomes, and through the estimation of ATTs, we show that a good performance in high-stakes tests (i.e. GCSEs) can mitigate the effect of truanting from school on labour market outcomes. Truancy need not be a significant problem for young people in terms of their post-school outcomes so long as this behaviour does not reduce test score performance. This makes sense insofar as employers observe test score performance in the job/training selection process whereas they are less likely to observe truancy behaviour.

The popular view that truancy is universally bad for young people is open to question according to our findings. The story is more complex and it is important to simultaneously track academic performance (an ‘intervening’ variable) rather than focus on truancy per se. This is not to say that the government, schools, and parents should ignore truancy behaviour; it matters where test score performance will be adversely affected because this will lead to poor labour market outcomes. We also expect that the determinants of truancy behaviour and its effect on academic performance, and hence test scores, go back further into the educational careers of young people than we are able to control for. Nevertheless, our analysis of the latter part of the educational process between the ages of 14 and 16 has helped to shed some light on the complex interaction between truancy behaviour, test score performance and early labour market outcomes.