Introduction

As Blumstein et al. [1] argued in the original report on criminal careers, age of onset is one of the strongest predictors of long-term offending, with a youth’s prognosis worsening with earlier initiation. Given this knowledge, juvenile risk assessment treats the onset age indicator as an essential item in evaluating the likelihood of future offending [2, 3]. Likewise, developmental researchers studying antisocial and criminal behavior acknowledge this as an important correlate of long-term patterns of behavior and have made it a central demarcation point in theoretical explanations [4, 5]. While this knowledge informs the inclination toward early intervention with at-risk youth to forestall persistent offending over time, in and of itself, age of onset is merely a risk marker and not a causal factor. This means that the link between age of onset and persistent offending across developmental stages can paradoxically be considered both conventional wisdom and a black box in terms of exactly how the mechanics of the relationship work [6–8].

Despite the abundance of research that examines the relationship between early age of onset and future offending, researchers have yet to provide a full accounting of its underlying causal mechanisms [6]. In short, age of onset is often seen as a proxy for several other factors that are likely to impact child and adolescent development (e.g., temperament, cognitive skills), which in turn affect prosocial or antisocial behavior as a part of persistent individual differences. Still, there are also questions about whether it plays a more active role in influencing later outcomes in a dynamic state dependence process.

To better specify how age of onset matters in understanding the development and persistence of delinquent and criminal behavior, this study utilizes longitudinal data from the Pathways to Desistance research to examine the relationship between age of onset and other individual and social influences as they jointly affect long-term trajectories of offending behavior. Specifically, we consider the direct and mediated impact of onset age in relation to long-term trends in offending, investigating one of the key questions in developmental and life-course criminology in the process [9].

Assessing the Role of Onset Age

Although early age of onset has receded somewhat in its prominence as a topic of interest, it is an area that has generated some insights of broad significance as well as questions that still await resolution. Much of what is known points to its consistent association with both later offending patterns and earlier individual and social deficits, but the unknowns pertain more to (a) whether it affects later behavior and, if so, (b) how that effect works.

Age of Onset and Developmental Mechanisms

A series of studies by Tolan and colleagues (see [8, 10, 11]) looked at early age of onset in terms of its correlates and implications for later delinquent behavior. Tolan [10] studied a sample of 337 boys in middle and high school and found that there were significant differences in frequency, seriousness, and variety of delinquent behavior among early and later starters. He also used discriminant analysis and found that a combination of individual, school, family, and demographic variables was useful in distinguishing onset groups; family functioning and academic performance and behavior were among the primary discriminating factors. He concluded that early onset has an impact on later behavior, but this may reflect continuity in childhood risk rather than a causal effect (i.e., population heterogeneity).

Nagin and Farrington [12] assessed the notion of population heterogeneity effects, which were prominently featured in works by Gottfredson and Hirschi [13] and Wilson and Herrnstein [14], as relates to early onset and later offending. In particular, they studied the role of static versus dynamic explanations for chronic offending. A theory drawing on population heterogeneity effects asserts that later variables will not have much impact on behavior beyond enduring individual differences (i.e., early starters are just those more crime prone to begin with; [15, 16]). Using data from the Cambridge Study on Delinquent Development (CSDD), Nagin and Farrington [12] specified a multivariate regression model to determine whether the effect of early onset on later behavior remains significant after controlling for stable between-individual differences. They also assessed the degree to which key covariates from the individual (e.g., psychological history) and social (e.g., parental supervision) domains had differential effects on the onset and continuance of delinquent behavior. They found that a statistical model that controlled for population heterogeneity rendered the coefficient estimate for early onset nonsignificant, implying that there was no causal impact. But, in other multivariate analyses, they found that the determinants of continued offending and its onset differed. While the former finding suggests that early onset is a marker of population heterogeneity rather than a causal factor, the latter suggests that fully explaining delinquency over time requires a developmental explanation accounting for dynamic factors.

A later study by Tolan and Thomas [8] looked at that particular issue in more depth. They found that early starters tend to have greater frequency, seriousness, and chronicity of offending later on in adolescence, but, like Nagin and Farrington [12], they also suggest that the important question is whether early onset has a causal impact on later offending or is just a marker of an antisocial disposition. Their study used five waves of National Youth Survey data and defined “early onset” (EO) as self-reported offending at age 12 or earlier. The analysis showed that the EO group had more serious offending patterns and also more chronic offending over the observed time window. Their multivariate analysis also found that psychosocial indicators remained statistically significant in predicting those later offending patterns after controlling for EO and that age of onset had an independent, additive impact on later offending. This implies that it was not simply acting as a stand-in for those early psychosocial risk factors. Tolan and Thomas [8] concluded that onset age is not simply a part of an unfolding trajectory and suggest testing possible mechanisms connecting early onset to later behavior patterns.

Although there have been fewer studies on this question recently, Bacon et al. [6] used data from the 1958 Philadelphia Cohort Study to examine whether the population heterogeneity or state dependence perspective more adequately explains the relationship between age of onset and future offending. Their preliminary results replicate the common finding that age of onset is significantly and negatively associated with future offending. However, when the authors controlled for unobserved criminal propensity, although the relationship between age of onset and future offending remained (supporting state dependence perspectives), they found that boys whose first contact with the police occurred later in adolescence (i.e., a late age of onset) were at an increased risk of future offending. This unexpected finding led Bacon and colleagues to emphasize the importance of further research examining the complexities of the relationship between onset age and later offending (Footnote 1).

Theory and Early Onset

What is known to date suggests that early age of onset can be an important indicator in assessing the possibility of long-term offending and directing prevention efforts. Furthermore, early onset is undoubtedly associated with a number of other influences on offending. Still, it is important to move past this general associative relationship to one that is more sensitive to the mechanisms that might be generating the observed offending trajectories [17]. Some theoretical perspectives on the development of antisocial behavior suggest that early problems, or markers for them like age of onset, will be useful in understanding mechanisms underlying delinquent and criminal behavior. “Launch” and “cascade” theories, for example, suggest that long-term behavioral patterns will be impacted considerably by antecedent factors [18, 19]. Moffitt’s developmental taxonomy [5] and the integrative perspective of Lahey et al. [4] also provide a sense of intervening factors that might connect an early age of onset to long-term offending.

The “launch” perspective proposes that, beyond their immediate effects, early risk factors have an influence that carries forward to later behavioral trajectories [19, 20]. In the context of onset age, its propositions would suggest that there are direct effects on both short- and long-term behavioral trends. Though similar in terms of anticipated outcome, a launch perspective differs somewhat from a population heterogeneity model in that it suggests a linkage between a causal antecedent and the later offending trajectory. A population heterogeneity perspective often sees that earlier indicator as a marker of the same underlying propensity driving the offending trajectory. The launch perspective is most similar to a pure population heterogeneity perspective in that it is expected that the earlier influence will set a course that is impervious to shifts in later environmental circumstances [20]. More recently, Blokland and Nieuwbeerta [21] identified a process of “continuous change” and stated that criminal involvement can be predicted by higher levels of past criminal involvement. In this sense, while the direct effect of age of onset itself might be short-lived, it can lead to long-term offending by starting a process of stability at an earlier age.

The cascade theory proposes that early behavioral problems, which may be associated with precocious delinquency, impact later developmental trends through a series of chain reactions linking early deficits to problems in early childhood development (e.g., parental deficits to child conduct problems) that in turn block development of cognitive and social skills. This ultimately manifests itself in behavioral problems in adolescence and beyond [18]. Similarly, embedded within the discussion of state dependence and population heterogeneity perspectives is the concept of “inertia,” described by Nagin and Paternoster [15] as “the idea that delinquent involvement is determined not only by an individual’s current social circumstances and state of mind but also by prior levels of those influences” (p. 179). Like dominoes falling, youths’ early problems have consequences that play out over years and across domains.

Probably the most prominent perspective related to the long-term implications of age of onset is the developmental taxonomy of antisocial behavior proposed by Moffitt [5]. In Moffitt’s taxonomy, age of onset can be viewed as an important diagnostic factor distinguishing “life-course-persistent” offenders from those whose antisocial behavior is “adolescence-limited.” In that sense, those youth with earlier ages of onset are likely to carry far more pronounced risk profiles, which have accumulated since infancy or even prenatally. Consequently, in a given sample of delinquents, those with earlier ages of onset are more likely to be those who will continue offending well into the future. Of note, she suggests that there is a cumulative continuity in early problems and their manifestation (e.g., early contact with the juvenile justice system), which serve to limit the opportunities for youth to develop prosocial skills and ultimately leave antisocial behavior behind. This is also compatible with the narrowing of life chances and cumulative consequences inherent in Sampson and Laub’s [22] age-graded social control perspective.

In their integrative model of the development of antisocial behavior, Lahey et al. [4], like Moffitt, present age of onset as a central element in that the strength of the various causal factors of antisocial behavior—antisocial propensity and environmental factors—varies depending on when youth begin such behavior. They argue that for those with an early age of onset, antisocial propensity is the result of factors like temperament and cognitive ability “that are transformed into antisocial behavior through successive transactions with the social experiment” (p. 672). However, the causal import of these influences decreases as age of onset increases as, in those cases, environmental factors such as peers become more salient.

Age of Onset and Essential Risk Factors

In considering these perspectives on how age of onset may figure into long-term patterns of offending, it is essential to look at the specific developmental factors that might be impacted by variation in onset age. The Study Group on Very Young Offenders examined numerous risk factors and social, psychological, and criminal outcomes for young offenders whose offending behavior began before age 13 [23]. Among their most important findings were that (1) early and persistent offending denied youths the opportunity to learn prosocial behaviors; (2) persistent offending led to poor relationships with relatives, peers, and employers; (3) early onset offending led to low interest and motivation in school, resulting in an increased chance of dropping out; and (4) early onset offenders have a greater risk of becoming addicted to illicit substances. The authors conclude that early onset offenders whose delinquent behavior persists through adolescence experience “cumulative and cascading negative consequences for a person’s life” ([23], p. 746). In other words, factors such as those identified by the study group may each serve as a mediating factor in the relationship between age of onset and long-term offending trajectories.

Together, the theoretical frameworks and previous analysis of risk factors suggest some reason(s) why further consideration of the mechanism surrounding the link between the age of onset of offending and its long-term course is valuable. They also lend some insight into possible ways that age of onset may be viewed in a model of long-term offending.

The Current Study

Despite the role of early onset as part of some continuous pattern of behavior, Nagin and Farrington [12] and Tolan and Thomas [8] indicate that developmental theories may be important in considering connections between that status and later offending. Patterson et al. [7] mention the importance of accounting for both consequences of early onset and later effects, while also considering enduring differences that may be apparent in childhood and adolescence (i.e., population heterogeneity). This raises an important point in that much of the previous literature seems to consider the factors expected to lead up to or occur alongside early onset, but there is not as much work that takes that status and then moves forward to investigate what happens thereafter. Tolan and Thomas [8], for example, use terms like “spurs on later involvement” (p. 176) or “boosting” (p. 179) to hypothesize how early onset might affect later behavior. In developmental terms, there is evidence to suggest that early onset might matter beyond its role as a marker of persistent heterogeneity and that there is a need to understand the “transactional” effects of early onset, psychosocial factors, and later offending patterns. This is particularly relevant for a population of serious delinquents, as understanding the processes that lead to more or less extensive patterns of offending among these youth is important both for assessing theoretical mechanisms relevant to developmental and life-course criminology and for informing intervention.

The current study investigates the ramifications of early age of onset on long-term offending trajectories, while elaborating some key mechanisms that connect the former to the latter. According to Tolan and Thomas [8], the central question in this area is whether “onset or previous criminal behavior influences individual and other psychosocial characteristics that then, in turn, increase subsequent involvement level.” Here that “involvement level” is captured by developmental trajectories of offending behavior [24] coupled with a mediation analysis of key covariates [25, 26]. We focus on three research questions (see Fig. 1 for our conceptual model). First, is age of onset associated with membership in groups estimated from longitudinal offending trends? Here, we expect a negative relationship between age of onset and groups with chronic and/or frequent offending patterns. Second, is age of onset associated with key individual and social influences? Based on the previous literature, we expect that such associations do exist. Third, is the relationship between age of onset and trajectory group mediated in ways that suggest it has an influence beyond its status as a risk marker? The expectation is that there will be some mediating effects consistent with state dependence mechanisms. The previous literature and acknowledgment of a mixed perspective on continuity in offending (see [16]) suggest that a direct effect will likely remain, however. As seen in Fig. 1, we expect age of onset to have a direct effect on trajectory group membership (and thus, long-term offending patterns), as well as an indirect effect on group membership and offending patterns via key mediating variables.

Fig. 1 Conceptual model

Methods

To address these questions, we use data from the Pathways to Desistance (Pathways) study, a longitudinal investigation of serious offenders transitioning from adolescence to young adulthood. Participants in the Pathways study are adolescents who were found guilty of a serious offense (almost entirely felony offenses) in Maricopa County, AZ or Philadelphia County, PA. These youth were ages 14 to 17 at the time of the study index offense. A total of 1,354 adolescents were enrolled in the study initially, representing approximately one in three adolescents adjudicated in each locale during the recruitment period (November 2000 through January 2003). The sample was predominately minority (41.3 % African American, 33.5 % Hispanic) and male (86.4 %).

The data comprise extensive individual and social history interviews over ten waves spread across 84 months (Footnote 2). The extensive individual and social histories of the Pathways study participants allowed for varied measures of early onset, a lengthy follow-up to observe offending trajectories, and a host of relevant variables that might be part of a state dependence process linking onset age with later continuity in offending. The analytic sample for this study comprises 792 males who had data on at least 70 % of the possible assessments. These sample restrictions (removing females, restrictions on complete assessments) were imposed to have these analyses correspond as closely as possible to previous studies using Pathways data (see [29]). We also compress the follow-up periods that were 6 months long into yearly windows (6 and 12, 18 and 24, and 30 and 36 months) to be consistent with the approach taken in previous research (e.g., [30]).

Dependent Variable

At each wave, the respondents were asked a series of questions about the types of offenses they may have committed during the recall period. At the first wave, they were asked to report their offending over the previous year. The respondents answered questions about 22 types of offenses, including destroying/damaging property, burglary, shoplifting, drug dealing, drunk driving, rape, murder, robbery, and fighting. Similar to previous research with the Pathways data (see [29, 31]), we compiled the responses to create a variety score of self-reported offending that ranges from 0 to 22 for each respondent at each wave. This method distinguishes the most serious offenders from the least serious offenders and allows us to study the variation in offending over time. Prior research has found that variety scores are a valid and reliable way of measuring offending and hold advantages over frequency scores or dichotomies [32, 33].
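As a minimal sketch of how such a variety score can be computed, the following Python function counts the number of distinct offense types a respondent endorsed in one wave. The function name and the boolean input format are illustrative assumptions, not the Pathways coding scheme.

```python
from typing import Sequence

N_OFFENSE_TYPES = 22  # number of offense items described in the text


def variety_score(endorsed: Sequence[bool]) -> int:
    """Count the distinct offense types reported in one recall period.

    `endorsed` holds one True/False flag per offense item (hypothetical
    input format). The result ranges from 0 to 22.
    """
    if len(endorsed) != N_OFFENSE_TYPES:
        raise ValueError(f"expected {N_OFFENSE_TYPES} items, got {len(endorsed)}")
    return sum(1 for item in endorsed if item)


# Example: a respondent who reported three of the 22 offense types.
example_wave = [True, True, True] + [False] * 19
print(variety_score(example_wave))
```

Because each offense type counts once regardless of how often it was committed, a high-frequency minor offense cannot dominate the score.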

Independent Variables

Exposure Time

Given the possible differences in identified trajectory patterns for offenders when considering time spent in the community [34], we incorporate a measure of “exposure time” in a given year as a time-varying control in the model. We measure this as the proportion of the recall period that the respondents spent outside of secure correctional facilities. Respondents who spend a larger proportion of the recall periods outside of secure facilities will have more opportunities to offend.
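The exposure measure described above can be sketched as a simple proportion. The day-based inputs below are an assumption made for illustration; the study's own unit of measurement is not specified here.

```python
def exposure_proportion(days_in_recall: int, days_in_secure_facility: int) -> float:
    """Proportion of the recall period spent outside secure facilities.

    Both arguments are hypothetical day counts. Returns a value in [0, 1].
    """
    if days_in_recall <= 0:
        raise ValueError("recall period must be positive")
    if not 0 <= days_in_secure_facility <= days_in_recall:
        raise ValueError("facility days must lie within the recall period")
    return (days_in_recall - days_in_secure_facility) / days_in_recall


# Example: 73 of 365 days spent in a secure facility.
print(exposure_proportion(365, 73))
```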

Age of Onset

At the first wave, when each respondent answered the self-reported offending questions, they were also asked to report the age at which they first committed each offense they reported. We use the youngest age reported across offenses by each respondent to denote their age of onset. Respondents who committed their first offense at the age of 9 or younger were coded as “9” in the original data set, so we keep that measurement scheme. This variable ranges from 9 to 17 with a mean of 10.3 and a standard deviation of 1.7. Comparing this to the mean baseline age of the respondents (16), it appears that most of the respondents began engaging in delinquent behavior several years before the study began. The difference between the age at the initial point in this analysis and the reported onset age was fairly substantial (mean = 6.2 years; SD = 1.98 years), indicating some degree of temporal order (Footnote 3). Past studies of age of onset have found that future offending is related to both official records (e.g., [35, 36]) and self-reported measures (e.g., [8, 10]) of age of onset. However, past research has also defined onset through the age when youths display conduct problems, which would typically occur before an act of delinquency or contact with the system [18] (Footnote 4).
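The onset coding described above (youngest reported age, with ages of 9 or younger coded as 9) can be sketched as follows. The function name and the use of `None` for offenses never committed are illustrative assumptions.

```python
from typing import Iterable, Optional

FLOOR_AGE = 9  # per the text, first offenses at age 9 or younger are coded as 9


def age_of_onset(first_ages: Iterable[Optional[int]]) -> Optional[int]:
    """Return the youngest reported age at first offense, floored at 9.

    `first_ages` holds one entry per offense type; None marks an offense
    the respondent never reported (hypothetical input format).
    """
    reported = [age for age in first_ages if age is not None]
    if not reported:
        return None  # no offense reported, so onset is undefined
    return max(FLOOR_AGE, min(reported))


# Example: first offenses at 14, 12, and 16 give an onset age of 12.
print(age_of_onset([14, None, 12, 16]))
```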

Mediator Variables

Given our desire to understand the possible linkages between early onset and later offending trajectories, we consider several covariates that may (a) have an effect on subsequent delinquency and (b) perhaps be influenced by whether a youth engaged in delinquent behavior earlier or later. Each of the covariates is measured at the first point of contact with respondents.

Domains of Social Support

The Pathways researchers asked a series of questions about the amount of social support respondents received from their families. The respondents were asked if there were any adults in their families whom they could talk to if they needed advice, go to if they had troubles at home, tell about awards or accomplishments, discuss important decisions with, or depend on for help. In addition, respondents were asked if there were any adults in their family who cared about them or were their role models. Each domain of social support was scored as a “1” if the respondent reported having an adult family member who fit that description. We use the total number of domains of social support from adult family members because an early onset of antisocial behavior might attenuate familial relationships. This measure ranges from 0 to 8 with a mean of 6.2 and a standard deviation of 2.

Perceptions of Procedural Justice

Each respondent was presented with a series of statements about procedural justice and system legitimacy and was asked to report their level of agreement. These items were coded on a four-point Likert scale (1 = strongly disagree to 4 = strongly agree). System legitimacy, which is the mean of 11 items, captures respondents’ belief in the system and whether they feel that people should support its actors (α = 0.80). This measure has a mean of 2.3 and a standard deviation of 0.6, indicating that the respondents in the sample tend to have negative views of the legitimacy of the justice system. Respondents who view the criminal justice system as illegitimate are likely to have higher levels of offending (e.g., [38]) (Footnote 5).

Peer Antisocial Behavior

Respondents answered a series of questions about the prevalence of antisocial behavior in their peer groups. They were asked about 12 different types of peer antisocial behavior and answered these questions on a five-point Likert scale (1 = none of them to 5 = all of them). The peer antisocial behavior covariate is the mean of these responses (α = 0.92). This covariate has a mean of 2.4 and a standard deviation of 0.9, indicating that the respondents in the sample tended to have relatively few antisocial peers. Respondents with an early onset of antisocial behavior may be more likely to associate with antisocial peers [40–42] and, in turn, have higher observed levels of offending over time [43].
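Several of the covariates in this section are item means reported alongside Cronbach's alpha. The sketch below shows the standard alpha formula, k/(k − 1) × (1 − Σ item variances / variance of the summed score), together with a mean scale score. The function names and the toy data are assumptions for illustration, not Pathways data.

```python
from statistics import mean, variance
from typing import Sequence


def scale_score(item_responses: Sequence[float]) -> float:
    """Mean of one respondent's item responses."""
    return mean(item_responses)


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of responses."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: three respondents, three perfectly consistent items.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(scale_score(toy[1]), cronbach_alpha(toy))
```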

Unsupervised Routine Activities

Respondents answered four questions about their activities using a five-point Likert scale (“never” to “almost every day”). For instance, the respondents were asked how often they spent time with friends informally. The covariate used in this study is the mean of the responses to those four questions (α = 0.62). It has a mean of 3.8 and a standard deviation of 0.8, which indicates that at the initial wave, the respondents’ routine activities were quite often unsupervised. Respondents with unstructured routine activities are likely to have higher levels of offending than those with more structured routine activities (e.g., [44, 45]). Youth with an early onset of offending may associate with antisocial peers, who are less likely to engage in structured routine activities.

Motivation to Succeed

Respondents were asked a series of questions about how optimistic they were about achieving their future goals and the perceived opportunity of success for people who live in their neighborhood. The respondents reported their level of agreement with several statements on a five-point Likert scale. Example items include: “In my neighborhood, it’s pretty easy for a young person to get a good-paying, honest job” and “My chances of getting ahead and being successful are not very good.” After reverse coding for directional consistency, the scale comprises the mean of the responses to six statements. Those with higher scores on this scale are more optimistic about their future success and have greater motivation to succeed. The overall motivation to succeed measure has a mean of 3.3, a standard deviation of 0.7, and a Cronbach’s alpha of 0.66. An early onset of offending could limit prosocial opportunities and decrease motivation to succeed. In general, youths who are less engaged and motivated in school are less likely to have academic success (e.g., [46, 47]). In turn, academic failure and dropping out of school are associated with criminal behavior later in life (e.g., [48]). Thus, youths with lower motivation to succeed and fewer perceived opportunities may have greater continuity in offending over time.

School Performance

Respondents were asked about their grades in school. This measure ranged from 1 to 8, with “1” indicating that most of a respondent’s grades were A’s and “8” indicating that most of a respondent’s grades were below D’s. This measure had a mean of 4.8 and a standard deviation of 1.9, which indicates that respondents in the sample tended to have about a “C” average in school. An early onset of antisocial behavior can lead to rejection by teachers, which decreases school engagement and increases the risk of academic failure [49]. This academic failure is often associated with criminal behavior later in life [48].

Moral Disengagement

At the initial wave, respondents were presented with 32 statements pertaining to moral detachment that they answered on a three-point Likert scale. Examples include moral justification, displacement of responsibility, and distorting responsibility. The moral disengagement variable used in this analysis is the mean of all 32 items (α = 0.88). The mean value was 1.6 with a standard deviation of 0.4. This measure is similar to Sykes and Matza’s [50] concept of techniques of neutralization. Respondents with higher levels of moral disengagement are expected to have higher levels of offending over time (e.g., [51, 52]).

Substance Use

At each wave, respondents were asked to report how frequently they used the following substances: alcohol, marijuana, sedatives, stimulants, cocaine, opiates, ecstasy, hallucinogens, inhalants, and amyl nitrate. Then, respondents were asked whether they used any other drug to get high (0 = no; 1 = yes). Similar to the self-reported offending variety score, we created a substance use variety score that distinguishes the most serious drug users from the least serious. This measure ranges from 0 to 7 with a mean of 0.9 and a standard deviation of 1.1, which indicates that drug use is relatively rare among the respondents in our analytic sample. Past research has found that individuals with higher levels of substance abuse tend to have more stable patterns of offending over time [19].

Analytic Plan

We utilized latent class growth analysis (LCGA) models, estimated in Mplus 7.1, to assess variation in offending trends across the measurement window in the Pathways data. LCGA estimates a model based on an assumption that observed longitudinal trends in response (e.g., self-reported variety of offending) might be explained by underlying categorical groupings [24, 53]. We focused in particular on three-, four-, five-, and six-class versions of the model (see [29, 31]). All models use a full-information maximum likelihood (FIML) estimator for data missing at random (MAR) to accommodate missing responses and attrition among participants [54]. Random starting values (n = 500) were used to avoid local maxima in the estimation process. Again, to best approximate previous studies using these data (see [29, 31]), we utilize a cubic function of time and a zero-inflated Poisson distribution for the self-reported variety score measures.
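For readers less familiar with this specification, a generic zero-inflated Poisson trajectory with a cubic time function can be written as below. This is the standard textbook form rather than the exact Mplus parameterization used here. All symbols are notation assumed for illustration: y_it is the variety score for person i at time t, k indexes the latent class, π_kt is the probability of a structural zero, and λ_kt is the Poisson mean.

```latex
\[
P(Y_{it} = y \mid C_i = k) =
\begin{cases}
\pi_{kt} + (1 - \pi_{kt})\, e^{-\lambda_{kt}}, & y = 0, \\[4pt]
(1 - \pi_{kt})\, \dfrac{\lambda_{kt}^{\,y}\, e^{-\lambda_{kt}}}{y!}, & y = 1, 2, \dots
\end{cases}
\]
\[
\log \lambda_{kt} = \beta_{0k} + \beta_{1k}\, t + \beta_{2k}\, t^{2} + \beta_{3k}\, t^{3}
\]
```

Each class has its own growth parameters (β_0k through β_3k), which is what allows the classes to follow differently shaped offending trajectories.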

Model selection was based on several measures of fit and classification quality. The Bayesian information criterion (BIC), which is the most often-used benchmark in determining the number of classes, is based on the log likelihood value of the fitted model adjusted for the number of estimated parameters [55, 56]. Lower values on information criteria suggest a better fitting model. Next, the Lo-Mendell-Rubin (LMR) and bootstrapped likelihood ratio (BLR) tests compare the specified model to a “k-1” class version (e.g., five classes vs. four). Lower observed probability values on these tests suggest that the more parsimonious model can be rejected in favor of the one with an additional class [56, 57]. The quality of classification based on the assignment of individuals to latent classes using model estimates and observed longitudinal response patterns was assessed by the “entropy” statistic. Its value ranges between “0” and “1” with those values closer to “1” suggesting clearer placement of individuals into classes [58]. The agreement between predicted and actual classification was examined through the overlap of the two in each of the classes (i.e., marginal values in classification table). Higher values suggest greater correspondence, with values over 0.80 preferred [24].
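Two of the statistics described above have simple closed forms. The sketch below uses the usual BIC definition, −2 log L + p ln(n), and a common definition of relative entropy based on posterior class probabilities. Whether the software applies exactly this entropy formula is an assumption, and the input values are invented for illustration.

```python
import math
from typing import Sequence


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion; lower values indicate better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def relative_entropy(posteriors: Sequence[Sequence[float]]) -> float:
    """Classification entropy scaled to [0, 1]; 1 means perfectly clear classes.

    `posteriors` holds one row of class-membership probabilities per person.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("entropy requires at least two classes")
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:  # treat 0 * log(0) as 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n * math.log(k))


# Invented values: compare two hypothetical models fitted to 792 cases.
print(bic(-5200.0, 20, 792), bic(-5150.0, 25, 792))
print(relative_entropy([[0.95, 0.05], [0.10, 0.90], [0.80, 0.20]]))
```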

The key estimates produced within the LCGA process are (a) latent growth factors and (b) latent class probabilities. The model estimates latent growth factors that draw on all observed time points, allowing individuals to have their own trajectories (i.e., intercept, slope), which are summarized by a sample mean and variance. In LCGA, subgroups have their own intercept (initial level) and slopes (pattern of change over time), which are representative of their members. Individual cases are assigned to the latent class for which their estimated membership probability is highest (i.e., modal assignment), reflecting where their growth factors suggest they belong. The patterns are generally summarized visually, but the growth factors comprise the statistical estimates underlying the group-based trajectories (Footnote 6). Latent class probabilities index the relative prevalence of the groups in the sample.
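Modal class assignment and the related classification checks can be sketched as follows. The posterior probabilities are invented. The share of cases per assigned class is only an approximation of the model-estimated latent class probabilities, and the function names are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence


def modal_class(posterior: Sequence[float]) -> int:
    """Index of the class with the highest posterior probability."""
    return max(range(len(posterior)), key=lambda k: posterior[k])


def class_shares(posteriors: Sequence[Sequence[float]]) -> Dict[int, float]:
    """Share of cases assigned to each class by modal assignment."""
    counts = Counter(modal_class(p) for p in posteriors)
    return {k: c / len(posteriors) for k, c in counts.items()}


def mean_assigned_probability(posteriors: Sequence[Sequence[float]]) -> Dict[int, float]:
    """Average posterior probability of the assigned class, by class.

    Values above roughly 0.80 are usually read as adequate classification.
    """
    grouped: Dict[int, List[float]] = {}
    for p in posteriors:
        k = modal_class(p)
        grouped.setdefault(k, []).append(p[k])
    return {k: sum(v) / len(v) for k, v in grouped.items()}


posts = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8], [0.6, 0.4]]
print(class_shares(posts), mean_assigned_probability(posts))
```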

Mediation tests were conducted at the final stage of the analytic process. Specifically, we wished to evaluate possible relationships among age of onset, possible mediator variables that may be affected by early onset, and estimated trajectory group membership to establish the degree to which there may be intermediate factors affecting the relationship between age of onset and long-term offending patterns [25, 26]. To that end, following the estimation and assessment of the LCGA for offending (step 1), we investigated the degree to which the age of onset measure affected the relative likelihood of membership in certain groups in a multinomial logistic regression framework (step 2). From there, we added the potential mediator variables described above to (a) determine the degree to which they impacted the relative likelihood of membership in certain trajectory groups and (b) assess whether age of onset had a significant, indirect effect on the likelihood of latent class membership through intermediate factors.Footnote 7

This mediation analysis was carried out via path modeling procedures where each of the covariates described above was regressed on age of onset (see Fig. 1). The logged odds of class membership were then regressed on the direct effect of age of onset and each of the mediator variables. This testing presents challenges in the context of the latent class growth curve modeling process. Given the difficulties of simultaneously estimating the mediating process inherent in the linkage between age of onset, intervening variables, and long-term patterns of offending, we estimated indirect effects in separate path models based on the approach suggested by Clark and Muthén [59].Footnote 8 The standard errors for indirect effects were estimated using 1,000 bootstrap draws (see [25]).
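To make the logic of a bootstrapped indirect effect concrete, the sketch below estimates a single-mediator path model by ordinary least squares and resamples cases 1,000 times to obtain a standard error for the product of the two paths. It is a simplified illustration, not the path model estimated in the study: the data are simulated, there is one mediator rather than several in parallel, and all function and variable names are hypothetical.

```python
import random
import statistics

def _cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def path_estimates(x, m, y):
    """OLS paths for x -> m -> y with a direct x -> y path: (a, b, direct)."""
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = sxm / sxx                              # mediator regressed on x
    denom = sxx * smm - sxm ** 2
    b = (sxx * smy - sxm * sxy) / denom        # y on mediator, holding x fixed
    direct = (smm * sxy - sxm * smy) / denom   # y on x, holding mediator fixed
    return a, b, direct

def bootstrap_indirect(x, m, y, draws=1000, seed=0):
    """Indirect effect a*b and its bootstrap standard error."""
    rng = random.Random(seed)
    n = len(x)
    resampled = []
    for _ in range(draws):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        a, b, _direct = path_estimates([x[i] for i in idx],
                                       [m[i] for i in idx],
                                       [y[i] for i in idx])
        resampled.append(a * b)
    a, b, _direct = path_estimates(x, m, y)
    return a * b, statistics.stdev(resampled)

# Simulated data (not Pathways data): later onset lowers the mediator and a
# higher mediator raises the outcome, so the true indirect effect is -0.1.
sim = random.Random(1)
onset = [sim.uniform(8.0, 16.0) for _ in range(300)]
mediator = [5.0 - 0.2 * x + sim.gauss(0.0, 1.0) for x in onset]
outcome = [0.5 * m - 0.1 * x + sim.gauss(0.0, 1.0)
           for x, m in zip(onset, mediator)]
indirect, se = bootstrap_indirect(onset, mediator, outcome)
```

Dividing `indirect` by `se` gives a rough test statistic; bootstrap percentile intervals are a common alternative when the product of coefficients is not symmetric.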

Ideally, the dependent variable in this mediation analysis should be normally distributed because this is a linear model. Unfortunately, the outcome measures tended not to be normally distributed here. There is no foolproof solution in this analytic situation, as estimating the model with an alternate distribution (e.g., censored normal) does not allow for the use of the mediation test due to the different measurement scales involved in computing the indirect effect estimate.

Results

Estimating Latent Classes

We tested up to six offending trajectory classes (model fit statistics for the three through six class models are shown in Table 1). The BIC values displayed in Table 1 indicate that the four class model is a better fit than the three class model. However, the BIC values for the five and six class models are lower than that for the four class model, which suggests use of alternative fit indices [56]. The lower observed probability value associated with the LMR for the four class model (p = 0.13) versus that for the five class model (0.29) indicates that the three class model can be rejected in favor of the four class, but the four class version cannot be rejected in favor of five classes. Next, the entropy value for the four class model (0.84) indicates better delineation than the five and six class models; however, the value for the three class model is slightly higher (0.87) than that for the four class model.

Table 1 Model fit statistics for iterative latent class growth analysis

Based on this triangulation, the four class model provides a solid fit to the data on all measures, whereas there are inconsistencies for the other specifications. As a final check on fit for the four class version of the model, we ran the bootstrapped likelihood ratio test (BLRT), which utilizes bootstrapping procedures to create a difference in log likelihood test [56] and produces an associated p value in which lower values indicate better fit for the k model over the k − 1 model. Accordingly, the p value of the BLRT for the four class model (0.00) indicates that it is a good fit to the data.

Figure 2 provides a graphical representation of the four latent offending trajectories.Footnote 9 The first trajectory class is loosely referred to as high, chronic and consists of 12.7 % of the sample. This class has the highest self-reported offending rate at the baseline interview (10.50) and at every subsequent wave. The second trajectory class is labeled high, desisting and comprises 25.8 % of the sample. This class also had a high baseline self-reported offending (SRO) rate (8.19), but unlike youth in the high, chronic group, on average, these youth tended to commit little crime in the later time periods. The third trajectory class, moderate, stable, consists of 16.5 % of the sample. This group had a moderate baseline SRO rate (3.98) that remained relatively stable through the subsequent observations. The fourth trajectory class, low, no offending, includes the remaining 45.0 % of the sample. This class had the lowest baseline SRO rate (2.79). After the initial drop in SRO between the baseline and wave 1—which was present to varying degrees in each of the trajectory groups—youths in the low, no offending class committed very few offenses across the remaining interview points.Footnote 10

Fig. 2 Overview of group-based longitudinal trends from four class model

Results: Research Question 1

Table 2 addresses research question 1 regarding the relationship between age of onset and likely membership in the trajectory groups. Working from left to right in the table, when using the low, no offending class as a reference, we see that a one unit increase (i.e., a year) in age of onset is associated with significantly lower relative odds (shown in parentheses) of being in the high, desisting (−25 %) or high, chronic (−35 %) classes as compared to the low, no offending class. There is also a significant effect for the high, chronic group using the moderate, stable class as a reference. Specifically, a year increase in age of first delinquent act suggests a 29 % reduction in the odds of being in the high, chronic class relative to the moderate, stable offending class. This analysis suggests that onset age does help to distinguish cases that belong in different latent classes, albeit not in every comparison.
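The percentage changes in odds reported here follow from exponentiating a logit coefficient. The short sketch below shows the conversion; the coefficient is a hypothetical value chosen so that the odds ratio equals 0.65, which corresponds to the 35 % reduction quoted above, and it is not taken from Table 2. The function names are illustrative.

```python
import math

def pct_change_in_odds(b):
    """Percent change in relative odds for a one-unit increase: 100 * (exp(b) - 1)."""
    return 100.0 * (math.exp(b) - 1.0)

def flip_reference(odds_ratio):
    """Odds ratio for the reverse comparison (focal and reference class swapped)."""
    return 1.0 / odds_ratio

# Hypothetical logit coefficient that yields an odds ratio of 0.65
b = math.log(0.65)
change = pct_change_in_odds(b)          # percent change in odds per year of onset age
reverse = flip_reference(math.exp(b))   # same contrast seen from the other class
```

Swapping the reference class inverts the odds ratio, so an odds ratio of 0.65 in one direction is about 1.54 in the other, provided both come from the same model.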

Table 2 Logistic regression—age of onset and trajectory group membership

Results: Research Question 2

We then considered the bivariate relationship between age of onset and the key individual and social factors that might act as mediators (research question 2). Of the eight covariates included in the analysis, only two—domains of social support and grades in school—were not significantly associated with age of onset. Each relationship was in the hypothesized direction. For example, youth with later ages of first reported delinquency tend to report less frequent drug use (r = −0.13, p = 0.00), delinquent behavior on the part of their peers (r = −0.20, p = 0.00), and moral disengagement (r = −0.15, p = 0.00). These youths also tended to have higher scores with respect to their perceptions of system legitimacy (r = 0.16, p = 0.00) and motivation to succeed (r = 0.13, p = 0.00).

Results: Research Question 3

Our third research question considers the possibility that the effect of onset age might be mediated by other individual and social influences, suggesting that it may be a part of a state dependence process. The results from this analysis are shown in Table 3. The table displays logit coefficients, standard errors, and odds ratios using the trajectory groups with the lowest (low, no offending) and highest (high, chronic) self-reported offending as reference points while controlling for baseline age. Accordingly, the logit coefficients indicate the estimated change in the log odds of placement into a given class relative to the reference class for a unit change on a covariate.
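Reporting the same multinomial logit model against two reference classes amounts to re-expressing one set of coefficients, as the sketch below illustrates. The coefficients, class labels, and function names are hypothetical and intercepts are omitted for brevity; none of the values are taken from Table 3.

```python
import math

def rebase(coefs, new_ref):
    """Re-express coefficients relative to a different reference class."""
    shift = coefs[new_ref]
    return {cls: b - shift for cls, b in coefs.items()}

def class_probabilities(coefs, x):
    """Multinomial logit class probabilities for covariate value x."""
    weights = {cls: math.exp(b * x) for cls, b in coefs.items()}
    total = sum(weights.values())
    return {cls: w / total for cls, w in weights.items()}

# Hypothetical coefficients for one covariate, with "low" as the reference (0.0)
vs_low = {"low": 0.0, "moderate": -0.1, "desisting": -0.3, "chronic": -0.4}
vs_chronic = rebase(vs_low, "chronic")

# The predicted probabilities do not depend on which class is the reference
p_low_ref = class_probabilities(vs_low, 2.0)
p_chronic_ref = class_probabilities(vs_chronic, 2.0)
```

Because only the contrasts change, the choice of reference class affects which comparisons are displayed and tested, not the fitted model itself.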

Table 3 Logistic regression of latent classes on age of onset and potential mediators

Several covariates included in the models did help distinguish likelihood of class membership. Using the high, chronic class as a reference, youth who reported a later age of onset were more likely to be members of the low, no offending (b = 0.33, standard error (SE) = 0.14) and high, desisting (b = 0.24, SE = 0.12) groups, the same result found in the unconditional model that did not include any covariates. Specifically, a 1-year increase in age of onset produced a 38 % increase in the odds of being in the low, no offending class and a 28 % increase in the odds of being placed in the high, desisting class relative to the high, chronic offending class. As youths’ scores increased one unit on the motivation to succeed scale, they had significantly higher odds of being placed in the low, no offending (103 %), moderate, stable (98 %), and high, desisting (63 %) groups relative to the high, chronic group. Youths who used drugs in the past 6 months at elevated levels had significantly lower relative odds of being placed in these three groups (78, 60, and 23 %, respectively). Similarly, youths with a higher degree of moral disengagement were significantly less likely to be placed in the low, no offending group (b = −2.13, SE = 0.57) relative to the high, chronic group. Finally, youth who associated with antisocial peers had significantly lower odds of being placed in either the low, no offending (−83 %) or moderate, stable groups (−73 %), while those youths who perceived the justice system as having greater legitimacy had greater odds of placement in one of these two groups (171 and 135 %, respectively). Similar results were found when using the low, no offending group as the reference class. Youths who associated with antisocial peers, those who had a higher degree of moral disengagement, and those who used drugs in the past 6 months at higher levels had significantly higher odds of being placed in any of the three higher offending groups relative to the low, no offending group. In addition, youth who viewed the justice system as more legitimate had lower relative odds of being placed in the high, desisting (−60 %) or high, chronic groups (−63 %).

The previous results establish that there are clearly relationships between early onset and offending trajectory groups, early onset and some potential mediators, and mediators and offending trajectory groups, which is often viewed as a part of assessing mediation [64]. Furthermore, the shift in estimated effects from Table 2 to Table 3 implies that there may be some mediating effects at work, as the size of the onset age estimate diminishes in all relevant comparisons. Still, more formal tests are generally preferred [25, 26]. Table 4 displays the direct and indirect effects found in such a mediation analysis.

Table 4 Mediation analysis: direct and indirect effects of age of onset on the logged odds of class placement

The first model uses the log-odds transformed measure of being placed in the high, chronic class as a dependent variable. This path model holds age of onset as fully exogenous and the various individual attitude and social influences as endogenous. It includes parallel mediation relationships from age of onset through the covariates described above and direct effects from those covariates as well (see [25]). After accounting for the mediation processes, age of onset no longer has a significant direct effect on Pathways respondents’ likelihood of being placed in the high, chronic group (b = −0.17, SE = 0.09). There are, however, three significant indirect relationships where the effect of age of onset on the likelihood of placement in this class is mediated by other variables. Given each of the relationships presented in Table 4, the significant indirect effect from age of onset through peer delinquency (b = −0.11, SE = 0.03) indicates that a later age of onset reduces the level of peer delinquency to which a youth is exposed, which in turn lowers the logged odds of placement in the high, chronic offending class because peer delinquency is positively related to that outcome. Similar protective effects of a later age of onset can be observed for the substance use (b = −0.11, SE = 0.04) and motivation to succeed (b = −0.05, SE = 0.02) measures as well.

Somewhat different results emerge when considering the path model that uses the logit of being placed in the low, no offending class as a dependent variable. In this model, age of onset maintains its significant direct effect on youths’ likelihood of being placed in the low, no offending class (b = 0.28, SE = 0.13) after accounting for mediation processes. In addition to its significant direct effect, age of onset has a significant indirect effect on the log-transformed odds of placement in this class via three mediator variables: peer delinquency (b = 0.23, SE = 0.05), moral disengagement (b = 0.09, SE = 0.03), and drug use (b = 0.13, SE = 0.04). In all cases, a year increase in age of onset increases the likelihood of placement in the low, no offending class by virtue of its indirect and negative impact on those mediators. In other words, a later age of onset predicts a lower level of peer delinquency where a higher level of peer delinquency also predicts a lower likelihood of placement in this latent class.

When looking at the model that uses the transformed probability of being placed in the moderate, stable class as the dependent variable, there is no significant direct effect of age of onset on being placed in this class (b = −0.08, SE = 0.10) nor are there any significant indirect effects via the covariates. Similarly, there is no direct effect of onset age on class membership for the model using the probability of membership in the high, desisting group as the dependent variable (b = 0.01, SE = 0.01). There is, however, a significant indirect effect through peer delinquency (b = −0.13, SE = 0.03). This indicates that, similar to the high, chronic model discussed above, an increase in age of onset reduces interaction with delinquent peers as well as the relationship between peer delinquency and the transformed odds of being placed in the high, desisting class.Footnote 11

Discussion

Key Findings

Although a great deal of prior research has concluded that age of onset is one of the strongest predictors of future offending [1, 5, 9], there are still gaps in knowledge about its role in understanding continuity in offending. A population heterogeneity explanation suggests that the relationship between age of onset and offending trajectories is spurious due to persistent individual differences that underlie both onset age and future offending. In other words, there is no causal link between onset age and long-term offending. Conversely, a state dependence explanation suggests that early onset age has a causal effect on long-term offending trajectories via its negative consequences on other aspects of a youth’s life. This study attempted to address this gap in knowledge by using longitudinal data from the Pathways to Desistance study and group-based trajectory and mediation models to expand our understanding of the importance of age of onset by examining the direct and mediated effects of onset age and individual and psychosocial factors to determine their effects on long-term offending trajectories.

We focused on three primary research questions. First, we considered whether onset age was associated with membership in trajectory groups estimated from longitudinal offending trends. Our results show that a four class model best captures the data and that age of onset is significantly associated with membership in the four trajectory groups. Specifically, as youths’ age of onset increases, their odds of being placed in the more serious offending trajectory groups (high, chronic and high, desisting) decrease and their odds of being placed in the less serious groups (moderate, stable and low, no offending) increase. Our second research question addressed whether age of onset was associated with key social and individual covariates. All but two of the included covariates are significantly correlated with age of onset. Interestingly, the two nonsignificant covariates (domains of social support and grades in school) are those that tap the youths’ social support and attachment to prosocial institutions, which have traditionally been considered vital correlates of juvenile delinquency [46]. Greater correlations were found among covariates that address youths’ attitudes (e.g., motivation to succeed, moral disengagement) and activities (e.g., drug use), suggesting that onset age is related to some more proximal influences on long-term offending trajectories that could mean it is part of a chain of relationships as opposed to solely a marker of persistent population heterogeneity.

Our final research question focused on possible mediating factors in the relationship between self-reported age of onset and the probability of assignment to a given offending trajectory group. Our analyses indicated that some of the potential mediators did help to predict most likely class membership. For example, relative to the high, chronic group, youths with a higher motivation to succeed and those who reported less drug use in the past 6 months had significantly greater odds of being placed in each of the three lower offending trajectory groups. The indirect effects of age of onset on trajectory class membership via eight social and individual factors were then examined more formally. Onset age maintained its significant direct effect on class membership in only one case (low, no offending), and there were significant indirect effects present in each model except for the moderate, stable class. The mediation analysis and the dissipation of the direct effect of onset age provide some evidence that there are state dependence processes that may be affected by a youth’s age of onset. Still, the fact that the direct effect for onset age remained when predicting the relative likelihood of placement in the low, no offending class suggests that it is likely serving as a factor that distinguishes less and more serious offending, regardless of the dynamic processes that may be triggered by when that onset happens.

Limitations

While this study provides meaningful insight into the relationship between age of onset and later offending, the results should be considered in light of some limitations. There are two limitations related to the initial and analytic sample. This study uses data for a sample of offenders aged 14–17 at baseline who were adjudicated for a serious offense, while many of the studies discussed above examined age of onset in general population samples. The current approach is beneficial in terms of its focus on a sample that includes offenders who all had some age of onset. Still, including only serious offenders may exclude some segment of juvenile offenders who commit relatively minor crimes, as well as limit the generalizability of our findings for less serious offenders or the juvenile population as a whole. As such, long-term offending trajectory groups identified in this study may be different when using a combined sample of serious and minor offenders. A second limitation is that we used a subsample (n = 792) of the Pathways data due to the inclusion of only those youth who had complete data for at least 70 % of the possible assessments, as well as excluding the small number of female offenders. These inclusion criteria were used so that our analyses might correspond as closely as possible to previous studies using the Pathways data. Although a check of attrition showed no significant differences between our sample and the full Pathways sample on age of onset, offending variety score, or any of the mediating variables, the exclusion of roughly 30 % of the cases could potentially bias both the formation of the latent trajectory classes and the mediation analyses. Similarly, it is possible that the exclusion of these cases, along with attrition in the sample, may explain, in part, a portion of the downward slopes of the trajectory curves.

There were three limitations that emerged in the measurement of the study’s focal variables. The onset age measure required recall on the part of the respondent at the baseline interview and, consequently, it may not be as precise as other measurement approaches. Additionally, although the potential mediators were measured at the start of the study and therefore after age of onset, it is possible that earlier manifestations of the mediating variables may have influenced the onset of offending (see the discussion of “unobserved initial conditions” in [65] for a somewhat analogous problem). Still, results from ancillary analysis show that most key results held (see footnote 9). Finally, we included only eight possible mediating variables in our analyses based on prior research. Future analyses should consider incorporating more factors in order to develop a more complete understanding of their possible mediating effect on the relationship between onset age and long-term offending trajectories.

Conclusions

Although previous research has established a strong link between early onset and long-term offending, much of this research identifies onset age simply as a risk marker for future antisocial and illegal behavior. It is important, however, to expand our understanding of how onset age affects various offending pathways. Overall, using a sample of adolescent offenders, this study lends further support to the idea of a mixed perspective on continuity in offending: age of onset seems to be a marker for high delinquent propensity and related choices (e.g., association with delinquent peers), which reflects a population heterogeneity perspective. In our mediation analysis, age of onset generally maintained its significant direct effect on the low, no offending group, which is an especially important test with respect to the effect of onset age on later offending patterns. Onset age also might be an early link in a chain of factors that increases the likelihood of becoming a serious long-term offender (i.e., state dependence). Our results indicate that the relationship between age of onset and long-term offending trajectories is mediated, in part, by certain social and individual factors, supporting a state dependence perspective. Peer delinquency and drug use appear to be consistent mediators of this relationship, while moral disengagement and motivation to succeed have moderate mediating effects. In addition, after accounting for the mediation processes, age of onset no longer had a direct effect on offending class placement for three of the four latent classes.

Based on these findings, this study offers some insights into delinquency prevention and intervention. For example, the finding that the relationship between early onset and high levels of long-term offending is heavily mediated by drug use suggests the need to monitor early offenders in order to intervene to ameliorate that criminogenic need. The results from this study suggest that this early intervention can potentially reduce the likelihood of serious future offending. A similar intervention strategy for early offenders could also be used in attempts to limit association with delinquent peers. Furthermore, juvenile risk assessments should not consider an early onset age as just a risk marker. Instead, consideration should be given to the factors that mediate the relationship between onset and future offending, and these factors should also be included in updated assessment tools. In other words, the consequences of that early onset—including the inherent effects of system involvement (see [66, 67])—need to be explicitly considered in any efforts to prevent later delinquent behavior. Overall, the study findings suggest that the general correlation between age of onset and long-term offending trajectories is a surface indicator of a relationship that is rich in its meaning for developmental and life-course criminology and in implications for responding to offending. While the presence of this relationship can be characterized as conventional wisdom at this point, it is important that it is probed further just the same.