Participants and procedure
Data were obtained from the Quebec Longitudinal Study of Child Development (QLSCD), which was approved by the ethics committees of the Quebec Institute of Statistics and the St-Justine Hospital Research Center. A population sample of 2120 children born in 1997/1998 in the province of Quebec was identified through birth registries. Families were included if the pregnancy lasted between 24 and 42 weeks and the mother could speak either French or English. Data were collected through structured interviews conducted by trained researchers. Relevant health and sociodemographic characteristics of the children, family and parents were obtained at 5 months. Behavioral ratings of hyperactivity–impulsivity and inattention were obtained from mother reports (1.5, 2.5, 3.5, 4.5, 5, 6 and 8 years), teacher reports (6, 7, 8, 10, 12 and 13 years), and participant-reports (10, 12, 13, 15 and 17 years). Written informed consent was provided by parents at each interview.
Symptom ratings were derived from items in the early childhood behavior scale from the Canadian National Longitudinal Study of Children and Youth [15]. The instrument incorporates items from the Child Behavior Checklist [16], the Ontario Child Health Study Scales [17], and the Preschool Behavior Questionnaire [18]. Assessments at ages 15 and 17 years were made using the Mental Health and Social Inadaptation Assessment for Adolescents [19]. Hyperactivity–impulsivity items were: Can’t sit still, is restless or hyperactive; Impulsive, acts without thinking; and Difficulty waiting his/her/your turn in games/activities. Inattention items were: Cannot concentrate, cannot pay attention for long; Is inattentive; and Easily distracted, difficulty pursuing any activity. These items correspond to DSM-5 criteria for ADHD and correlate highly with those used in other standardized measures of childhood behavioral problems such as the Strengths and Difficulties Questionnaire [20]. These measures have been used extensively in ADHD research as proxies for ADHD diagnosis, particularly in epidemiological samples from the general population, and treat ADHD as a quantitative trait [9, 21]. Symptoms were rated on a frequency scale (never/not true = 0, sometimes/somewhat true = 1, often/very true = 2). Item ratings were summed, divided by the number of items, and then standardized on a 0–10 scale. Alpha coefficients for hyperactivity–impulsivity and inattention were, respectively, 0.85 and 0.89 for mother ratings (1.5–8 years), 0.91 and 0.93 for teacher ratings (6–13 years), and 0.74 and 0.82 for participant-reports (10–17 years). Correlations for overlapping mother–teacher ratings were, for hyperactivity–impulsivity and inattention, respectively, 0.36 and 0.36 at 6 years and 0.37 and 0.44 at 8 years; correlations for overlapping teacher–participant-reports were 0.24 and 0.38 at 10 years, 0.30 and 0.37 at 12 years, and 0.29 and 0.29 at 13 years.
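The scoring rule described above (item ratings of 0, 1 or 2, summed, divided by the number of items, and expressed on a 0–10 scale) can be illustrated with a minimal sketch. The function name is hypothetical, and the linear rescaling of the 0–2 mean onto 0–10 is an assumption, since the text does not state the exact transformation used.

```python
def symptom_score(item_ratings):
    """Return a 0-10 symptom score from item ratings coded 0, 1 or 2."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    if any(r not in (0, 1, 2) for r in item_ratings):
        raise ValueError("ratings must be 0, 1 or 2")
    # Sum of the ratings divided by the number of items (range 0-2).
    mean_rating = sum(item_ratings) / len(item_ratings)
    # Assumption: linear rescaling of the 0-2 mean onto a 0-10 scale.
    return mean_rating / 2 * 10


# Three hyperactivity-impulsivity items rated "often", "sometimes", "never".
print(symptom_score([2, 1, 0]))
```

Under this assumption, a child rated "often/very true" on every item scores 10 and a child rated "never/not true" on every item scores 0.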
Baseline characteristics and early risk factors
Information on family and child characteristics was obtained from parents at 5 months. For categorical variables, the presence of risk was coded as 1 and its absence as 0.
Child characteristics
The sex of the child was coded as 1 for boys and 0 for girls. Methylphenidate hydrochloride (Ritalin) use was coded as 1 for any methylphenidate taken between 6 and 15 years (14.4% of the sample). Child IQ was assessed at 41 months using the Wechsler Intelligence Scale for Children Block Design [22]. Child temperament was assessed at age 5 months using the difficult temperament scale from the well-validated Infant Characteristics Questionnaire [23].
Prenatal and perinatal factors
Information about the child’s birth was obtained from medical records. Premature birth was defined as birth before the 37th week of gestation (4.9% of children) and low birthweight as < 2500 g (3.3% of children). Information on parental tobacco, alcohol and street drug use during pregnancy was collected when the child was 5 months old. Tobacco, alcohol and drug exposure were coded as 1 if the mother, respectively, smoked at least one cigarette per day (25.3% of mothers), drank alcohol at least once per week (3.3% of mothers) or used any drugs (1.4% of mothers) during pregnancy.
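As an illustration of the binary coding of these prenatal and perinatal factors, the following sketch applies the thresholds stated above. The function and argument names are hypothetical, and the inputs are assumed to be already-cleaned numeric values.

```python
def code_perinatal_risks(gestation_weeks, birthweight_g, cigarettes_per_day,
                         drinking_occasions_per_week, used_drugs):
    """Code each factor as 1 (risk present) or 0 (risk absent)."""
    return {
        # Premature birth: before the 37th week of gestation.
        "premature": int(gestation_weeks < 37),
        # Low birthweight: under 2500 g.
        "low_birthweight": int(birthweight_g < 2500),
        # Tobacco exposure: at least one cigarette per day during pregnancy.
        "tobacco": int(cigarettes_per_day >= 1),
        # Alcohol exposure: drinking at least once per week during pregnancy.
        "alcohol": int(drinking_occasions_per_week >= 1),
        # Drug exposure: any street drug use during pregnancy.
        "drugs": int(bool(used_drugs)),
    }


print(code_perinatal_risks(36, 2600, 0, 0, False))
```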
Perinatal social factors
Family socioeconomic status (SES) was calculated from the family’s overall income and from the mother’s and father’s number of years of education and occupational prestige [24]. SES scores were standardized with a mean of 0 and standard deviation of 1. Family structure was coded as 1 if the family was not intact (i.e., child not living with both biological parents; 21.0% of the sample) and 0 if the family was intact (child living with both biological parents irrespective of conjugal relationship). Insufficient household income (24.5% of the sample) was calculated based on Statistics Canada’s guidelines, which account for family area of residence, number of occupants in the household, and family income over the past year. Early motherhood was coded as 1 if the mother was 21 years or younger at the birth of her first child (22.5% of the sample). Low parental education was coded as 1 if the mother/father had never obtained a high school diploma (16.0% of mothers, 17.6% of fathers).
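Standardization to a mean of 0 and a standard deviation of 1, as described for the SES scores, can be sketched as follows. The function name is hypothetical, and the use of the sample (n − 1) standard deviation is an assumption; the text does not say which denominator was used.

```python
from statistics import mean, stdev


def z_standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # Assumption: sample standard deviation (n - 1).
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - m) / s for v in values]


# Hypothetical raw scores, for illustration only.
print(z_standardize([1.0, 2.0, 3.0, 4.0, 5.0]))
```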
Postnatal family factors
Family dysfunction at age 5 months was assessed using the McMaster Family Assessment Device [15]. The 12-item instrument measures communication, showing and receiving affection, control of disruptive behavior, and problem resolution. Scores were z-standardized. Mother–child interactions were assessed using the responsiveness scale of the Home Observation for Measurement of the Environment, infant version [25]. Hostile–reactive parenting, overprotection, parental self-efficacy, and perceived parental impact were assessed using the Parental Cognition and Conduct Toward the Infant Scale [26]. Scores for each dimension were z-standardized.
Parental psychopathology
Parents were asked whether, before completing high school, they had displayed any of five different conduct problems matching DSM-IV criteria for conduct disorder and antisocial personality disorder. Parental depression, also obtained at 5 months, was assessed using the abbreviated (12-item) version of the Center for Epidemiologic Studies Depression Scale [27]. Parents reported the frequency of depressive symptoms in the past week. Items were coded on a 4-point scale and scores were z-standardized.
Trajectory modeling
Ratings of hyperactivity–impulsivity and inattention between 1.5 and 17 years were modeled using group-based multi-trajectory modeling [28, 29]. The method, based on finite mixture modeling, identifies groups of distinctive developmental trajectories over age or time. The approach uses a generalization of the basic trajectory model in which trajectory groups are defined by multiple trajectories. In the present application, each group is defined by trajectories obtained from annual symptoms from three raters: mothers (1.5–8 years), teachers (6–13 years) and participant-reports (10–17 years). The approach generates a set of trajectory groups that represent the continuous symptom course from 1.5 to 17 years. The trajectory groups are displayed separately for each rater (see figures). Model selection was based on methodological as well as substantive considerations. At the methodological level, it was based on Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC) values and on model adequacy tests; at the substantive level, the model was selected for parsimony and maximum explanatory power given what is already known about symptom change across development [29]. Further details about model selection, including model fit statistics for the two next best fitting models, are presented in the supplementary material (eTable 1). Separate models were estimated for hyperactivity–impulsivity and inattention symptoms.
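For readers unfamiliar with the information criteria mentioned above, the sketch below gives their standard textbook definitions. It does not reproduce the trajectory software's output: the text does not state which sample size enters the BIC penalty or whether a rescaled form of the criteria was reported, so the function names and arguments here are illustrative assumptions.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion (textbook form; lower is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion (textbook form; lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical log-likelihood and parameter count, for illustration only.
print(aic(-100.0, 5), bic(-100.0, 5, 100))
```

Both criteria penalize the maximized log-likelihood for model complexity, with BIC applying a penalty that grows with sample size.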
Multivariable analyses
To test whether individual risk factors significantly distinguished among the six trajectory groups, we ran a series of Wald chi-square tests. Risk factors that were significant at the 0.05 level were then entered into a multivariable model to identify those that remained significant after adjustment for the others. Significant predictors were again identified by Wald tests. An important limitation of these tests is that they do not indicate which trajectories a statistically significant risk factor distinguishes. From the perspective of developing population-based preventive interventions, we were specifically interested in identifying risk factors for following high-symptom trajectories. In this context, groups of children with atypical (i.e., elevated) symptom levels will be larger than groups of children with the most extreme (i.e., clinical) symptom levels. Thus, to identify children following persistently high symptom trajectories, we combined groups 5 and 6 into a single high-symptom group and collapsed the remaining four trajectories into a low-symptom group. We then repeated the risk factor analysis within a stepwise logistic regression framework, separately for hyperactivity–impulsivity and for inattention, and then again for children who followed high-symptom trajectories for both simultaneously. For this second-stage analysis, participants were assigned to the trajectory group they most likely followed based on the posterior probability of group membership [29], a step that was not required for the analysis of risk factors distinguishing the six trajectory groups.
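The assignment step described above can be sketched as follows: each participant is placed in the group with the largest posterior probability, and groups 5 and 6 are then collapsed into a high-symptom category. The function names are hypothetical, and the sketch assumes the six posterior probabilities are listed in group order 1 to 6.

```python
HIGH_SYMPTOM_GROUPS = {5, 6}


def assign_group(posterior_probs):
    """Return the 1-based group with the highest posterior probability."""
    if len(posterior_probs) != 6:
        raise ValueError("expected one posterior probability per group (6)")
    best_index = max(range(len(posterior_probs)),
                     key=lambda i: posterior_probs[i])
    return best_index + 1


def is_high_symptom(group):
    """Code 1 for the combined high-symptom group (5 or 6), else 0."""
    return int(group in HIGH_SYMPTOM_GROUPS)


# Hypothetical posterior probabilities for one participant.
group = assign_group([0.05, 0.10, 0.10, 0.05, 0.60, 0.10])
print(group, is_high_symptom(group))
```

Hard assignment of this kind ignores uncertainty in group membership, which is one reason it is needed only for the second-stage logistic models.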
Three logistic regression models were used to examine early risk factors for high-symptom trajectories: one for inattention, a second for hyperactivity–impulsivity, and a third for participants who followed high-symptom trajectories in both symptom categories simultaneously. In each model, risk factors were identified in two steps. First, we selected variables by running bivariate logistic regressions between each predictor and the outcome (high vs. low trajectory). Variables with p values < 0.25 were included in an initial multivariable model (model 1). Second, backward selection (variables were removed if p ≥ 0.05) was used together with step-by-step confounding control (model 2) [30]. Results are presented as adjusted odds ratios.
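The two-step selection logic can be outlined in a short sketch. It covers only the p-value thresholds (screening at p < 0.25, backward removal at p ≥ 0.05) and not the step-by-step confounding control. The function names are hypothetical, and `fit_p_values` stands for any user-supplied routine that refits the logistic model on the given predictors and returns their p values; the numbers in the example are invented for illustration.

```python
def screen_predictors(bivariate_p, threshold=0.25):
    """Step 1: keep predictors whose bivariate p value is below threshold."""
    return [name for name, p in bivariate_p.items() if p < threshold]


def backward_select(candidates, fit_p_values, alpha=0.05):
    """Step 2: repeatedly drop the least significant predictor (p >= alpha)."""
    kept = list(candidates)
    while kept:
        p_values = fit_p_values(kept)  # refit the model on current predictors
        worst = max(kept, key=lambda name: p_values[name])
        if p_values[worst] < alpha:
            break  # every remaining predictor is significant
        kept.remove(worst)
    return kept


# Hypothetical p values; a real fit would change as predictors are removed.
bivariate = {"sex": 0.001, "ses": 0.01, "prematurity": 0.20,
             "maternal_depression": 0.10, "family_structure": 0.60}
adjusted = {"sex": 0.001, "ses": 0.03, "prematurity": 0.40,
            "maternal_depression": 0.08}


def stub_fit(names):
    """Stand-in for a real model fit; returns fixed p values."""
    return {name: adjusted[name] for name in names}


model_1 = screen_predictors(bivariate)
model_2 = backward_select(model_1, stub_fit)
print(model_1, model_2)
```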
Participants with at least two data points for hyperactivity–impulsivity and inattention for each rater were included in the trajectory modeling (missing data patterns are reported in eTable 2). To examine the effects of missing data on the risk factor analysis, inverse probability weights were generated (predictors of missingness were sex, insufficient income and maternal depression) and applied to the multivariable logistic regression models as a sensitivity analysis. Variables used in the risk factor analysis had between 1.9 and 11.6% missing data. Data were considered missing at random, i.e., missingness was assumed to be explained by other observed variables [31]. All analyses were conducted using Stata 14. Statistical significance was set at 0.05.
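The inclusion rule and the weighting idea can be sketched as below. This is an illustration under stated assumptions rather than the study's procedure: the inclusion rule is read as requiring at least two non-missing assessments from every rater, missing assessments are represented as `None`, and the probability of being observed is taken as given (in practice it would be predicted from a model of missingness). All names are hypothetical.

```python
def has_enough_data(ratings_by_rater, minimum=2):
    """True if every rater contributed at least `minimum` assessments."""
    return all(
        sum(rating is not None for rating in ratings) >= minimum
        for ratings in ratings_by_rater.values()
    )


def inverse_probability_weight(p_observed):
    """Weight a participant by the inverse of their probability of being observed."""
    if not 0 < p_observed <= 1:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / p_observed


# Hypothetical participant with one missing mother rating.
child = {"mother": [1.0, None, 2.0], "teacher": [0.5, 0.5],
         "participant": [None, 3.0, 1.0]}
print(has_enough_data(child), inverse_probability_weight(0.5))
```

Participants who resemble those with missing data (low probability of being observed) receive larger weights, so the weighted sample better approximates the full cohort.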