Sample and procedure
This study is based on data collected as part of the TRacking Adolescents’ Individual Lives Survey (TRAILS), an ongoing cohort study investigating mental health and social development from early adolescence into adulthood. The study consists of two prospective cohort studies, a population-based cohort (N = 2230) and a clinical cohort (N = 543). TRAILS was approved by the Dutch Central Committee on Research Involving Human Subjects (CCMO), participants were treated in accordance with the Declaration of Helsinki, and written consent was acquired from all adolescents and their parents.
The data collection in both cohorts involved largely the same measures, and participants were assessed at largely the same ages, every two or three years [39]. The specific questionnaires and tasks used were described in a previous report [39]. For the present study, we used data from the first (T1) and fourth (T4) waves of both cohorts. The participants of the population cohort were recruited from primary schools (response rate 90 %) in five municipalities in the northern region of the Netherlands. Of all eligible children, 2230 (76 %) agreed to participate. For more details on the selection procedure, see De Winter and colleagues [40]. At T1, which ran from March 2001 until July 2002, the mean age of the population cohort was 11.1 years (SD 0.6), and 51 % were female. At T4 (from October 2008 until September 2010), 1881 adolescents participated again (retention rate 84 %), the mean age was 19.1 years (SD 0.6), and 52 % were female. Participants of the clinical cohort had been in contact with a specialized mental health service in the north of the Netherlands before the age of ten. Of all eligible participants asked, 543 (43 %) agreed to participate in the study. As expected, non-response in this group was larger than in the population cohort. However, no significant differences were found between responders and non-responders in age, gender, parental education, age of referral to mental health services, or teacher reports on mental health and school achievement (except for lower mathematics performance in non-responders) [41]. At T1 (from September 2004 until December 2005), the mean age of the clinical cohort was 11.1 years (SD 0.5) and 34 % were female; at T4 (from September 2012 until April 2014), 422 adolescents participated again (retention rate 78 %), the mean age was 19.1 years (SD 0.7), and 34 % were female.
The larger proportion of boys than girls in the clinical cohort arises because children with pervasive developmental disorder, attention deficit/hyperactivity and externalizing problems are referred to mental health services more often than those with internalizing problems [42, 43], and these problems are more common in boys than in girls [43–45].
From both cohorts, we selected all participants who: (1) completed the facial emotion identification task at T1; (2) had been subjected to the World Health Organization Composite International Diagnostic Interview (CIDI) at T4; and (3) had not had a depressive disorder, i.e., major depressive disorder, minor depressive disorder or dysthymia, as measured by the CIDI retrospectively at T4, prior to taking the facial emotion identification task. This yielded a sample of 1840 participants (81 % of the remaining population cohort at T4, 76 % of the remaining clinical cohort at T4).
Since the TRAILS study covered numerous research questions, no a priori power analysis was performed regarding our specific research question. For the present study, a post hoc power analysis for logistic regression [46] with an alpha set to 0.05, 1840 included participants, a proportion of lifetime depressed participants of 0.20, and a predefined effect of 20 % increased risk of depressive disorder per SD increase in the predictor variable, yielded an estimated power of 0.88. To detect an effect of 10 % increased risk the power decreased to 0.37. For outcomes with a proportion of about 0.35, like lifetime symptoms of anhedonia and sadness, the estimated power for effects of 20 and 10 % increased risk was, respectively, 0.96 and 0.49.
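The order of magnitude of these power estimates can be checked with a short sketch. It assumes a common large-sample approximation for a single standardized continuous predictor, power = Φ(√(n·P·(1 − P))·|ln OR| − z), with z the two-sided critical value; whether the method of reference [46] is exactly this formula is an assumption, as is reading "20 % increased risk per SD" as an odds ratio of 1.2. The function name `approx_power` is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def approx_power(n, event_proportion, odds_ratio_per_sd, alpha=0.05):
    """Approximate two-sided power for one standardized predictor.

    Large-sample approximation (an assumption, not necessarily the method
    of reference [46]):
        power = Phi(sqrt(n * P * (1 - P)) * |ln OR| - z_{1 - alpha/2})
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    effect = abs(log(odds_ratio_per_sd))         # log odds ratio per SD
    z_power = sqrt(n * event_proportion * (1 - event_proportion)) * effect - z_alpha
    return std_normal.cdf(z_power)


# Sample size and outcome proportions taken from the text above.
for proportion in (0.20, 0.35):
    for odds_ratio in (1.2, 1.1):
        power = approx_power(1840, proportion, odds_ratio)
        print(f"P = {proportion:.2f}, OR = {odds_ratio:.1f}: power ~ {power:.2f}")
```

Under these assumptions the approximation gives values close to the reported ones (about 0.88 and 0.37 for a proportion of 0.20, and about 0.96 and 0.50 for a proportion of 0.35); small deviations are to be expected because the outcome proportion of 0.35 is itself approximate.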
Measures
Facial emotion identification
Facial emotion identification was measured by means of the ‘Identification of Facial Expressions’ (IFE) task at T1. This task was the last of seven tasks selected from the Amsterdam Neuropsychological Tasks program (ANT) [47], which in total took approximately 70 min to complete. Detailed information on the ANT testing procedures and the IFE task is provided in Online Resource 1.
Our hypotheses concerned the facial emotions happiness, sadness, anger and fear. Participants were included if they had completed the IFE task on at least one of these four emotions. For each of the four emotions we calculated the error proportion (EP) and the reaction time (RT). EPs were calculated as the mean proportion of misses and false alarms: \(\text{EP} = \left( \frac{\text{misses}}{\text{misses} + \text{hits}} + \frac{\text{false alarms}}{\text{false alarms} + \text{correct rejections}} \right) / 2.\) RTs were calculated as the mean RT across hits and correct rejections. Subsequently, EPs and RTs of more than four standard deviations above the mean [48] or EPs indicating performance at chance level, i.e., of 50 % or higher, were considered outliers and treated as missing. Because EP and RT potentially influence each other, outliers in one outcome parameter were also considered missing in the other. The percentage of missing EPs and RTs, including outliers, was less than 1.3 % for each facial emotion.
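The EP definition and the outlier rule can be illustrated with a minimal sketch for a single emotion. The function names, the use of the sample standard deviation, and the choice to compute both cut-offs in a single pass are assumptions made for illustration; the trial counts in the example are made up.

```python
from statistics import mean, stdev


def error_proportion(hits, misses, false_alarms, correct_rejections):
    """Mean of the miss rate and the false-alarm rate for one emotion."""
    miss_rate = misses / (misses + hits)
    false_alarm_rate = false_alarms / (false_alarms + correct_rejections)
    return (miss_rate + false_alarm_rate) / 2


def mask_outliers(eps, rts, n_sd=4.0, chance_level=0.5):
    """Set EP and RT to None when either is an outlier.

    A value is an outlier if it lies more than n_sd sample SDs above the
    mean, or if the EP is at or above chance level. Masking is applied to
    both measures jointly, as described in the text.
    """
    ep_cut = mean(eps) + n_sd * stdev(eps)
    rt_cut = mean(rts) + n_sd * stdev(rts)
    cleaned_eps, cleaned_rts = [], []
    for ep, rt in zip(eps, rts):
        is_outlier = ep > ep_cut or rt > rt_cut or ep >= chance_level
        cleaned_eps.append(None if is_outlier else ep)
        cleaned_rts.append(None if is_outlier else rt)
    return cleaned_eps, cleaned_rts


# Made-up counts: 2 misses out of 20 targets, 1 false alarm out of 20 non-targets.
example_ep = error_proportion(hits=18, misses=2, false_alarms=1, correct_rejections=19)
print(round(example_ep, 3))  # 0.075
```

The sketch does not guard against a participant with no target or no non-target trials, in which case the corresponding rate is undefined.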
Depressive disorder and symptoms of anhedonia and sadness
At wave T4, the World Health Organization Composite International Diagnostic Interview (CIDI) version 3.0 [49] was used to assess onset of psychiatric disorders. The CIDI is a structured diagnostic interview which has been shown to have good reliability and validity in assessing current and lifetime DSM-IV disorders [50–52]. The interview started with a screening section for all participants, meant to determine which of the subsequent sections on specific disorders should be included in the interview. For each of these specific disorders age of onset was also registered.
In the present study, we were interested in the following outcome measures: (1) depressive disorder and (2) symptoms of anhedonia and sadness regardless of depressive disorder. The screening part of the CIDI depression section contained three questions: (1) ‘Have you ever in your life had a period lasting several days or longer when most of the day you felt sad, empty or depressed?’; (2) ‘Have you ever had a period lasting several days or longer when most of the day you were very discouraged about how things were going in your life?’; (3) ‘Have you ever had a period lasting several days or longer when you lost interest in most things you usually enjoy like work, hobbies, and personal relationships?’. All participants who endorsed at least one of these symptoms entered the whole depression section of the CIDI, which allowed classifying the participants according to DSM-IV criteria for major depressive disorder, minor depressive disorder, dysthymia, recurrent brief depression, and bipolar disorder. For the present study, depressive disorder was operationalized as the occurrence of at least one of the following disorders: major depressive episode, minor depressive disorder or dysthymia, and first onset of depressive disorder as the first onset of any of these three disorders. Since we were interested in predicting the incidence of depressive disorder after T1, we excluded the participants whose age of onset of depressive disorder was lower than or equal to their age at the time they took the facial emotion identification test at T1.
Within depressive disorders, correlations between anhedonia and sadness are high (in our sample, r = 0.78, p < 0.001), leaving little power to test anhedonia and sadness separately. Moreover, subclinical expressions of anhedonia and sadness were considered informative too. Therefore, we used CIDI screening items to determine the presence of symptoms of anhedonia and sadness regardless of the diagnostic status. Anhedonia was measured by the item ‘Have you ever had a period lasting several days or longer when you lost interest in most things you usually enjoy like work, hobbies, and personal relationships?’, and sadness by the item ‘Have you ever in your life had a period lasting several days or longer when most of the day you felt sad, empty or depressed?’. The correlation between these items was 0.37 (p < 0.001).
Since we were interested in predicting the incidence of symptoms of anhedonia and sadness after T1, we omitted participants who had already reported these symptoms in the Youth Self-Report (YSR) [53] at T1, when they were asked to report about the 6 months prior to T1. More specifically, we excluded participants with high scores (i.e., ‘clearly or often’) on T1 YSR item ‘I enjoy very little’, and participants with high scores (i.e., ‘clearly or often’) on T1 YSR item ‘I am sad, unhappy or depressed’ from the regression analyses of symptoms of anhedonia or sadness on facial emotion identification speed.
Statistical analysis
Using SPSS version 22.0, we performed a series of logistic regression analyses to determine whether facial emotion identification speed (RTs) predicted onset of depressive disorder, anhedonia and sadness. Facial emotion identification can be assessed by RTs (speed) and by EPs (accuracy). We focused on speed rather than accuracy, because the task used (static facial emotion expressions presented at full intensity) is relatively easy for 11-year-old adolescents, and we therefore expected that identification speed would have more discriminative power than identification accuracy. To account for possible associations between speed and accuracy, we adjusted for accuracy.
Standardized RTs were used to be able to compare odds ratios (ORs) across different emotions. All analyses consisted of two steps: first, the effects of the RTs for happy, sad, angry and fearful emotions were tested separately, adjusted for the respective EPs, gender, and age at the time of the IFE task. Second, we started with a full model including the EPs and RTs for all facial emotions and ran a backward conditional logistic regression analysis (again adjusting for gender and age) to estimate the combined effect of emotion identification speed of multiple emotions. In the final models, we always adjusted for the EPs of all of the RTs in the model, to ensure that found effects could not have been driven by EPs rather than RTs.
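The reason standardization makes ORs comparable can be shown in a few lines: after z-scoring, a one-unit change in every predictor equals one SD, so each OR refers to the same relative shift regardless of the raw RT scale. The sketch below is illustrative only; the function names and the numbers are hypothetical and do not come from the TRAILS data.

```python
from math import exp
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (value - mean) / sample SD."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]


def odds_ratio_per_sd(coef_per_unit, sd):
    """Convert a logistic coefficient per raw unit into an OR per SD."""
    return exp(coef_per_unit * sd)


# Hypothetical reaction times in milliseconds.
example_rts = [700.0, 800.0, 900.0, 1000.0, 1100.0]
z_scores = standardize(example_rts)

# Hypothetical coefficient of 0.001 per ms with an SD of 200 ms.
print(round(odds_ratio_per_sd(0.001, 200.0), 3))  # 1.221
```

Fitting the model on the z-scores directly yields the per-SD coefficient without the conversion step.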
In the first step of our analysis, i.e., testing the emotions separately, significance was set at 0.05, and in the second step, i.e., the backward conditional logistic regression analyses, the entry criterion was set at 0.05 and the removal criterion at 0.10. The choice of a backward rather than a forward selection procedure was motivated by the idea that forward selection involves a higher risk of excluding predictors with a suppressor effect (i.e., predictors that are only significant if certain other predictors are included in the model as well), which we did not want to ignore beforehand because of the exploratory nature of this part of our study. The exploratory nature of this study was also the reason for choosing a removal criterion of 0.10 in the backward conditional logistic regression and for not correcting for multiple tests in our initial analyses. The latter also implies that our results should not be interpreted in a formal discriminatory way, which is why we did not focus on single significant results but on more general patterns. The false discovery rate (FDR) method [54] was employed post hoc to indicate which effects meet multiple test correction criteria. The maximum acceptable FDR was set to 0.05.
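As an illustration of an FDR correction, the sketch below implements the Benjamini-Hochberg step-up procedure with a maximum FDR of 0.05. That reference [54] refers to this particular procedure is an assumption, and the p values in the example are made up.

```python
def benjamini_hochberg(p_values, max_fdr=0.05):
    """Flag the tests that pass the Benjamini-Hochberg step-up procedure.

    Sort the m p values, find the largest rank k for which
    p_(k) <= (k / m) * max_fdr, and reject the k smallest p values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * max_fdr:
            largest_k = rank
    rejected = [False] * m
    for index in order[:largest_k]:
        rejected[index] = True
    return rejected


example_p = [0.001, 0.03, 0.035, 0.9]  # made-up p values
flags = benjamini_hochberg(example_p)
print(flags)  # [True, True, True, False]
```

The example shows the step-up property: 0.03 exceeds its own threshold (0.025) but is still flagged, because the next-ranked p value (0.035) falls below its threshold (0.0375).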
Since our analyses on the core symptoms anhedonia and sadness were primarily aimed at identifying symptom-specific facial emotion identification patterns, the effects of anhedonia and sadness were corrected for each other in these models. For the purpose of identifying potential gender differences, gender*RT interactions were tested for all separate emotion models. Practical limitations prohibited the inclusion of interactions with gender in the multi-emotion backward selection models.
Several additional specificity and sensitivity analyses were performed. To check whether findings pertained specifically to depression, all associations were tested both regardless of comorbid anxiety and after exclusion of all participants with T4 retrospective CIDI-based lifetime diagnoses of social phobia (SP) or generalized anxiety disorder (GAD). Finally, we checked whether adjusting for baseline speed or cohort status (population cohort or clinical cohort) changed the main results.