Diagnostic instruments for the assessment of disruptive mood dysregulation disorder: a systematic review of the literature

Disruptive mood dysregulation disorder (DMDD) involves non-episodic irritability and frequent severe temper outbursts in children. Since the inclusion of the diagnosis in the DSM-5, no gold standard has been established for the assessment of DMDD. In this systematic review of the literature, we provide a synopsis of existing diagnostic instruments for DMDD. Bibliographic databases were searched for any studies assessing DMDD. The systematic search of the literature yielded K = 1167 hits, of which n = 110 studies were included. The most frequently used measure was the Kiddie Schedule for Affective Disorders and Schizophrenia DMDD module (25%). Other studies derived diagnostic criteria from interviews not specifically designed to measure DMDD (47%), chart review (7%), clinical diagnosis without any specific instrument (6%) or did not provide information about the assessment (9%). Three structured interviews designed to diagnose DMDD were used in six studies (6%). Interrater reliability was reported in 36% of studies (ranging from κ = 0.6–1) while other psychometric properties were rarely reported. This systematic review points to a variety of existing diagnostic measures for DMDD with good reliability. Consistent reporting of psychometric properties of recently developed DMDD interviews, as well as their further refinement, may help to ascertain the validity of the diagnosis. Supplementary Information: The online version contains supplementary material available at 10.1007/s00787-021-01840-4.


Introduction
Disruptive mood dysregulation disorder (DMDD) is a relatively new diagnosis, which was introduced to the domain of depressive disorders in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) in 2013 [1]. The diagnosis was endorsed by DSM-5 work groups to address concerns that children with pathological irritability and temper outbursts/anger were being inappropriately diagnosed with bipolar disorder [2]. The diagnosis of bipolar disorder did not accurately capture the non-episodic nature of those children's symptoms and might therefore have led to questionable treatment decisions [3]. The development of the DMDD diagnosis was based on the description of a broad phenotype of pediatric bipolar disorder called severe mood dysregulation (SMD) by Leibenluft and colleagues in 2003 [4]. In addition to irritability and anger, the latter required symptoms of chronic hyperarousal (e.g. agitation, distractibility, racing thoughts, insomnia, pressured speech or intrusiveness). Increasing evidence of the clinical distinction between episodic and non-episodic irritability and anger as well as distinct pathophysiology finally led to the formulation of the new diagnosis [2,5,6,7].
DMDD involves non-episodic anger or irritability and frequent severe temper outbursts over a period of at least one year in pediatric patients aged 6-18 years [1]. Temper outbursts occur on average three or more times per week, can occur verbally or behaviorally (e.g. physical aggression towards objects or persons), their duration or intensity is inappropriate to the situation and they are inconsistent with the child's developmental level. DMDD is characterized by persistent irritable and angry mood between temper outbursts in at least two of three settings (i.e. at home, at school, with peers). While the average age of onset is suggested to be 5 years of age [2], the diagnosis is assigned from age 6, as the identification of pathology before this age is difficult due to normal variations in preschool behavior [8].
The prevalence of DMDD ranges from 0.8% to 3.3%, with 2-3% in preschool children, 1-3% in 9-12 year-olds, and 0-0.12% in adolescents [9][10][11]. Although the prevalence of DMDD decreases with increasing age, individuals with a history of DMDD are at higher risk for adult depression and anxiety, adverse health outcomes, low educational attainment, poverty, and reported police contact, compared to healthy and clinical controls with other psychiatric conditions [11]. Prevalence estimates differ between studies because there is substantial diagnostic variability in the adherence to DSM-5 criteria with respect to the frequency of outbursts, the duration of irritability or the exclusion criteria.
Comorbidity is one of the obstacles which have been reported around the DMDD diagnosis [12]. The majority of patients with DMDD have at least one other comorbid psychiatric disorder, of which oppositional defiant disorder (ODD) or depressive disorders are most commonly reported [10]. In addition, there is substantial diagnostic overlap with childhood psychiatric disorders such as ODD, intermittent explosive disorder or attention deficit hyperactivity disorder (ADHD), questioning the validity of the diagnosis as a distinct disorder [13][14][15]. Correspondingly, in the International Classification of Diseases and Related Health Problems (ICD-11), DMDD will be listed as a subtype "with chronic irritability-anger" of oppositional defiant disorder [16].
The diagnostic challenges may, at least in part, be due to difficulties in its assessment [17]. As such, symptoms of DMDD are not unique to children referred for psychiatric services. Hence, many existing measures provide questions which assess symptoms relevant to DMDD (e.g. irritability is measured but considered a nonspecific indicator and is related to several other psychiatric disorders) [12]. Moreover, structured interviews or questionnaires specifically developed to diagnose DMDD are still in their infancy. Consequently, there is currently no gold standard or broad consensus regarding the clinical assessment of DMDD.
In this systematic review of the literature, we aimed to provide a synopsis of all measures that have been used in diagnosing DMDD since the advent of the diagnosis in 2013. Study characteristics of the included studies, quantities of used diagnostic measures, and psychometric properties, where applicable, are reported and discussed. The results of this systematic review of the literature might guide future research in the selection of appropriate tools to diagnose DMDD in the clinical and research setting.

Methods
This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) checklist [18]. The protocol was pre-registered in the International Prospective Register of Systematic Reviews (PROSPERO) and may be accessed under the registration number CRD 42020165496.

Literature search
The goal of the literature search was to identify any studies assessing DMDD. Therefore, a broad search strategy was formulated. The full electronic search strategy of the systematic literature search in the PubMed database (https://pubmed.ncbi.nlm.nih.gov) was: ("Disruptive Mood Dysregulation*") OR ("DMDD"). No limits or filters were added to this search. PubMed, Embase, PsycINFO, and Web of Science databases were scrutinized for relevant literature published from 2013 to 31st March 2020. We used identical search terms in all databases. Further, reference lists of publications identified through database search were screened for potentially pertinent studies not identified in the initial search. To reflect the broadest use of tools to diagnose DMDD, in research as well as in the clinic, we included any regular article, case report, or conference abstract published in any of the searched databases.

Study selection
Studies were excluded if they (a) did not include patients with diagnosed DMDD; or (b) a full text was not available. Prior to a full-text review, the titles, abstracts, and methods sections of the articles identified through database searches were screened for the eligibility criteria outlined above by two independent reviewers until consensus was reached.

Data extraction
A digital data extraction sheet was developed and refined during the data extraction process. The following data were extracted if available: general information and identifying features of the study, i.e., full reference, year of publication, and country of study origin. Additionally, the article type was identified, comprising regular articles, conference abstracts, or case reports. All article types were included to cover the full breadth of tools available for research and clinical purposes. Magnitudes and percentages of all outcome variables were given for all study types included as well as for abstracts only. Further data extracted comprised details on the study design, study population, sample size, and age range. The main outcome was the tool used to diagnose DMDD, including the rater (clinician, parent, self) and whether psychometric properties had been assessed. Where possible, information about the number of items, administration time, and availability of the tool (licensed vs. free of cost) in different languages was obtained. Authors were contacted to provide details if any of the information of interest was not provided in the study.

Search results
The first literature search, conducted on January 22, 2020, yielded K = 1149 records (PubMed k = 168, PsycINFO k = 471, Web of Science k = 201, Embase k = 309). Search updates identical to the first search were carried out on May 26, 2020, yielding an additional k = 18 records. K = 351 duplicates were removed from the K = 1167 records screened for eligibility. Of the k = 172 full-text articles screened for eligibility, a further k = 53 studies were excluded as they did not include patients with diagnosed DMDD and k = 9 because a full text was not obtainable. The PRISMA flow diagram of the full process of study selection is depicted in Fig. 1.
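The record counts reported above can be checked with simple arithmetic. The following is a minimal Python sketch and not part of the review's methods; the variable and function names are illustrative. The two intermediate figures (records remaining after duplicate removal and records excluded before full-text review) are derived here by subtraction and are not stated in the text.

```python
# Record counts as reported in the search results paragraph.
INITIAL_HITS = {"PubMed": 168, "PsycINFO": 471, "Web of Science": 201, "Embase": 309}
UPDATE_HITS = 18            # additional records from the search update
DUPLICATES = 351            # duplicates removed
FULL_TEXT_SCREENED = 172    # full-text articles screened for eligibility
EXCLUDED_NO_DMDD = 53       # no patients with diagnosed DMDD
EXCLUDED_NO_FULL_TEXT = 9   # full text not obtainable


def prisma_flow():
    """Recompute the study-selection flow from the reported counts."""
    first_search = sum(INITIAL_HITS.values())
    identified = first_search + UPDATE_HITS
    # Derived by subtraction; these two values are not reported in the text.
    after_deduplication = identified - DUPLICATES
    excluded_before_full_text = after_deduplication - FULL_TEXT_SCREENED
    included = FULL_TEXT_SCREENED - EXCLUDED_NO_DMDD - EXCLUDED_NO_FULL_TEXT
    return {
        "first_search": first_search,
        "identified": identified,
        "after_deduplication": after_deduplication,
        "excluded_before_full_text": excluded_before_full_text,
        "included": included,
    }


flow = prisma_flow()
for stage, count in flow.items():
    print(f"{stage}: {count}")
```

Under these assumptions, the reported totals (1149 records from the first search, 1167 overall, and 110 included studies) are mutually consistent.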

Included studies
From the initial base of records, k = 110 studies fulfilled all inclusion criteria and were retained for qualitative syntheses.
In most of the measures used in the included studies, a clinician rated the patients' and participants' statements and behavior (n = 91, 82.7%), while others consisted of a parent- (n = 3, 2.7%) or self-rating (n = 4, 3.6%). No information about the rater was given in k = 10 (9.1%) studies.

Psychometric properties
In k = 79 studies (71.8%; k = 17 abstracts, 15.5%), any information on the presence or absence of psychometric properties of the measure used to diagnose DMDD was given or obtained from the authors. Of those, in k = 39 (35.5%; k = 4 abstracts, 3.6%) no psychometric properties have been obtained or reported as part of the study or using the study data. In the remaining k = 40 studies (36.4%, k = 13 abstracts, 11.8%), the most commonly reported psychometric property was reliability, with k = 33 (30.0%; k = 13 abstracts, 11.8%) reporting inter-rater reliability ranging from κ = 0.6 to 1 and k = 29 (26.4%; k = 11 abstracts, 10.0%) reporting intraclass correlation coefficients. Three studies assessed internal consistency with Cronbach's alpha = 0.92 for a Spanish version of the K-SADS-PL modified under the DSM-5 to diagnose DMDD [38], and Cronbach's alpha = 0.75 for the PAPA [39] and 0.98 for the E-SWAN DMDD scale [27]. In the studies of the NIMH group around Dr. Ellen Leibenluft (n = 25, 22.7%), raters were trained to reach inter-rater reliability with κ ≥ 0.9, before they contributed to interviews/data collection for the respective studies. Cases were further discussed in conference with other reliable clinicians and in a lab meeting where leading clinicians reviewed the core criteria before diagnosis was made. The same group also provided ICCs ≥ 0.9 differentiating the DMDD module from the mania/hypomania part of the K-SADS-PL. One study examined consensus validity between a clinical psychiatric interview based on DSM-5 diagnostic criteria and the Turkish version of the DSM-5 version of the K-SADS-PL (K-SADS-PL-DSM-5-T), led by two independent clinician-researchers [40]. A consensus of 96%, κ = 0.63 was reached. Further, concurrent validity was evaluated with the Affective Reactivity Index (ARI), κ = 0.70. One study generated Receiver Operating Characteristic (ROC) curves to obtain Area Under the Curve (AUC) for their diagnostic instrument, as a measure of predictive validity. With an AUC value of 0.85, the E-SWAN DMDD scale performed equally well in predicting diagnoses compared to the Affective Reactivity Index [27].
Table excerpt:
Sparks et al. [56]; 2014; SCID-IV; Sections from K-SADS-PL and ODD module, and review of narrative summaries of clinical presentations; Clinician; NA
Grau et al. [36]; 2018; Set of questions; Six questions referring to current severe temper outbursts and severe temper outburst during primary school to determine whether DSM-5 criteria were met; Self-rating; NA
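A raw agreement of 96% alongside κ = 0.63 may look contradictory, but Cohen's kappa corrects observed agreement for the agreement expected by chance, which is high when a diagnosis is rare. The following Python sketch illustrates this with two hypothetical 2x2 tables; the counts are invented for illustration and are not the data of any study cited here, and the function name is an assumption.

```python
def cohens_kappa(both_pos, only_rater1_pos, only_rater2_pos, both_neg):
    """Cohen's kappa for two raters assigning a binary diagnosis.

    Arguments are the four cell counts of the 2x2 agreement table.
    Undefined (division by zero) if chance agreement equals 1.
    """
    n = both_pos + only_rater1_pos + only_rater2_pos + both_neg
    observed = (both_pos + both_neg) / n
    rater1_pos = (both_pos + only_rater1_pos) / n
    rater2_pos = (both_pos + only_rater2_pos) / n
    # Agreement expected if the two raters decided independently.
    expected = rater1_pos * rater2_pos + (1 - rater1_pos) * (1 - rater2_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts: both tables have 96% raw agreement (96 of 100 cases).
rare_diagnosis = cohens_kappa(4, 2, 2, 92)       # diagnosis given to about 6% of cases
balanced_diagnosis = cohens_kappa(48, 2, 2, 48)  # diagnosis given to about 50% of cases

print(f"rare diagnosis:     kappa = {rare_diagnosis:.2f}")
print(f"balanced diagnosis: kappa = {balanced_diagnosis:.2f}")
```

With identical raw agreement, kappa is markedly lower in the table where the diagnosis is rare, which is why kappa values are difficult to compare across samples with different base rates.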

Discussion
Evidence from this systematic review points to a variety of different measures used for the evaluation and diagnosis of DMDD. The majority of studies used clinician-rated structured interviews in combination with DMDD-specific symptom checklists. Few studies employed questionnaires or interviews specifically designed to measure DMDD or its severity. In the following, some of the most used measures are presented in more detail, before practical aspects, such as available languages and cost, as well as diagnostic challenges and future directions are discussed. By far the most often used instrument was the K-SADS-PL in combination with the DMDD module. The K-SADS-PL is a semi-structured interview to diagnose mental disorders in children aged 6-18. Administration time is estimated to be about 75 min for psychiatric patients and 35-45 min for healthy control subjects. It is freely available for download online. It has high inter-rater reliability and good to excellent test-retest reliability [19]. The DMDD module has been developed by a workgroup around Leibenluft, in collaboration with the K-SADS developer Kaufman. A prior version of this module was based on a research diagnosis coined severe mood dysregulation (SMD) [4]. The DMDD module is a checklist consisting of four items probing for the DSM-5 criteria to be met (Fig. 2, see supplementary material for the DSM-5 diagnostic criteria A-K). With training and case discussion, the module can be administered with high inter-rater reliability [41]. It has further been shown to differentiate DMDD well from other mood disorders such as mania/hypomania.
Our study's findings revealed different methodological approaches to diagnosing DMDD. Some of the instruments utilized in the reviewed studies consisted of a symptom checklist. This was the case not only for the K-SADS-PL DMDD module but also for its precursor, the SMD module, or the ODD module. While the checklist format might suggest simplicity, it is most often used in the context of the more comprehensive K-SADS-PL semi-structured interview, which is used by raters to create a proxy diagnosis using a combination of ODD, depression, or mania criteria, and thereby empirically derive a DMDD diagnosis. Moreover, a combination of comprehensive structured or semi-structured interviews (e.g., K-SADS-PL, SCID, DISC or CIDI) and self-made checklists or clinical evaluation to probe for DSM criteria has been employed. An approach that has further been adopted in some of the reviewed studies was to search established interviews or questionnaires (CBCL-DP, Conners, ChIPS, MINI or PAPA/CAPA/DIPA) for items relevant to the DMDD diagnosis. This approach likely stems from the fact that these studies assessed DMDD retrospectively in data not collected with the focus of determining the prevalence of DMDD.
Few instruments have been deliberately designed to diagnose DMDD. Those identified by this systematic review were the K-SADS-PL DMDD module, the Breton, Bergeron and Labelle DMDD scale (available as a semi-structured interview and questionnaire), the E-SWAN DMDD module (interview) and the DAWBA DMDD section (interview; see Table 3 for an overview of instruments designed to diagnose DMDD). The instruments contain 4-34 items assessing occurrences, frequencies, and circumstances of temper tantrums/outbursts and irritable or angry mood. All instruments are available in the English language. The Breton, Bergeron and Labelle DMDD Scale is additionally available in French, and the DAWBA DMDD section additionally exists in Danish and Portuguese. The E-SWAN and DAWBA scales are freely available online or upon request to the authors. Indicated age ranges are similar, encompassing preschool age to early adulthood. While the K-SADS-PL DMDD module, the Breton, Bergeron and Labelle DMDD Scale, and the DAWBA DMDD section provide categorical outcomes, the E-SWAN DMDD module is designed to capture DMDD symptoms dimensionally. This scale reconceptualizes each diagnostic criterion for DMDD as a behavior, which can range from high (strengths) to low (weaknesses).
Regarding the psychometric properties, it seems that the DMDD module has been evaluated most often, as high levels of reliability are reported in many studies. However, these reliabilities have been reached artificially by training raters to differentiate K-SADS-PL DMDD from mania modules. Although useful for the clinic, this approach does not correspond to the evaluation of reliability as a measure of consistency between raters for a certain diagnostic instrument used in a study. Therefore, a more comprehensive psychometric evaluation of this widely used measure is necessary. Besides the DMDD module, psychometric properties have been reported for the E-SWAN DMDD module. The reliability of this scale has been reported to be excellent (Cronbach's alpha = 0.98 [27]). Reporting of psychometric properties of the other DMDD scales is still pending. Studies using tools to diagnose DMDD followed a broad spectrum of study objectives and hypotheses. Thus, the DMDD measure and its psychometric properties might not have been the focus of attention, which might be the reason for not providing this information. However, to determine gold-standard measurement, psychometric evaluation of the currently used diagnostic measures is necessary.
When assessing the psychometric properties of the instruments used in the included studies, mainly measures of reliability have been considered and reported. However, the psychometric evaluation of a diagnostic tool ideally also contains the assessment of its validity. Neither content-related (e.g., construct validity, factorial structure) nor criterion-related types of validity (e.g., concurrent or predictive validity) have been considered broadly in existing studies. One study reported substantial consensus validity (κ = 0.63) and concurrent validity (κ = 0.70) of a Turkish version of the K-SADS-PL [40]. A further study showed substantial predictive validity of the E-SWAN DMDD module (AUC = 0.85) [27]. Consequently, measures of validity require more attention in future research on the measurement of DMDD and should guide the reporting of respective measures in future studies.
Table note: If not provided in the publication, this information was obtained through direct contact with study authors.
Fig. 2 (excerpt) DMDD module checklist items: 1. Criterion A-D have been present for 12 months or more, no period of three or more consecutive months without symptoms. 2. Criterion A-D are present in at least two of the three settings listed below (specify: Home, School, Peers). 3. Onset of Criterion A-E before age of 10.
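An AUC such as the one reported for the E-SWAN DMDD module can be read as the probability that a randomly chosen child with the diagnosis scores higher on the scale than a randomly chosen child without it. The following Python sketch computes the AUC in exactly this pairwise way. The scores are hypothetical and were chosen only so that the result lands at 0.85; they are not data from the cited study, and the function name is an assumption.

```python
def pairwise_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the share of (case, control) pairs in which the case has the
    higher score; tied pairs count as half.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical symptom scores (higher = more severe), for illustration only.
cases = [7, 5, 6, 3]
controls = [1, 2, 4, 3, 5]

auc = pairwise_auc(cases, controls)
print(f"AUC = {auc:.2f}")
```

A value of 0.5 would indicate no discrimination between the two groups, and 1.0 would indicate perfect separation.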

Evidence of Disruptive Mood Dysregulation Disorder
Given the aim of the present systematic review, to provide an overview of existing instruments for the assessment of DMDD and their use in the diagnostic process, we refrained from conducting a formal risk of bias assessment of included studies. The potential risk of bias does not interfere with the aim of the present review and was thus deemed irrelevant.
Since the advent of DMDD, clinicians and researchers have noted various challenges and the diagnosis is not without controversy [17]. The characteristic symptoms of DMDD, namely irritable mood and temper outbursts, are observed across multiple disruptive behavior and mood disorders and the validity of DMDD as a distinct diagnosis has been questioned [13,42,43]. Further, DMDD could not be distinguished from ODD based on symptomatology alone in a population-based study [44]. It has further been criticized that alternative thresholds for defining DMDD, as well as a closer investigation of clinically relevant thresholds, have so far only partly been considered in the existing literature [45]. The lack of precision in diagnosing DMDD might in part account for the criticism voiced about the clinical entity of DMDD. Similarly, the heterogeneity in measurement of DMDD to date, as found in the present systematic review of the literature, might account for variations in current prevalence and comorbidity rates as well as findings on associations with risk factors or functional outcomes in individuals with DMDD. Studies designed a priori with appropriate instruments to capture DMDD are therefore necessary [46].
While the diagnostic entity of DMDD may be a useful clinical heuristic, many researcher-clinicians focus their efforts on broader transdiagnostic constructs, such as irritability [8]. Irritability has been defined as a heightened proneness to anger relative to peers [47,48] which can be seen as a personality trait with a continuous distribution across the population. In children and adolescents with DMDD, by definition, irritability is severe and expressed stably across time. In the last decade, there has been a marked increase in irritability research and there have been neuroscientific as well as treatment-related approaches to understanding pathophysiological mechanisms [41,49]. Until now, whether persistent irritability between temper outbursts and the outbursts themselves are independent of each other, or whether the mood between outbursts is rather a concatenation of less severe tantrums, remains unknown.
In addition to further psychometric evaluation of current diagnostic measures and the development of a gold-standard diagnostic measure, adjuvant measurement approaches have become popular in the last decade. One promising approach to describe the full spectrum of irritability and temper outbursts in patients' everyday lives is ecological momentary assessment (EMA; also known as experience sampling method or ambulatory assessment). This involves the repeated sampling of patients' experiences or mood, performed via a handheld device such as a mobile phone. This measurement method has high ecological validity, avoiding biases due to retrospective assessments [50]. The repeated measurement of affect, with multiple measurements during the day over several days, potentially in children or their parents, might be insightful in the characterization of hourly and daily fluctuations of mood in patients with irritability and/or DMDD.
To inform the debate around the diagnostic entity of DMDD, the application of Research Domain Criteria (RDoC) constructs may yield greater clarity in terms of underlying processes and thus inform nosology as well as appropriate interventions [51]. The constructs of frustrative non-reward (Negative Valence Domain), reward prediction error (Positive Valence domain), attention and language (Cognitive domain) as well as arousal (Arousal and Regulatory systems) have been found to be particularly promising in this regard.

Limitations of the review
The present systematic review encompasses literature involving instruments for the categorical diagnosis of DMDD. In view of the described developments regarding dimensional aspects of DMDD, a systematic review of the literature on dimensional constructs, such as irritability, would be informative and topical. Similarly, a comprehensive overview of the examination of developmentally inappropriate temper tantrums would be of interest in this regard.
A substantial proportion of the studies included in this systematic review stems from one laboratory in the United States. More studies evaluating the reliability and validity of the DMDD diagnosis should be conducted in other laboratories, to reduce the potential bias of findings and address cultural differences.
Psychological assessment should not be made based on any one instrument in isolation. Rather, test findings should be integrated with information from personal and educational histories and in collaboration with other clinicians [52,53]. Consequently, using any current instruments to evaluate DMDD will require additional query and clinical evaluation. For research purposes, however, standardized assessment methods are indispensable.

Conclusion and future directions
A variety of different measures have been used for the evaluation of DMDD. The most commonly used and established instrument consists of a symptom checklist, while more recently developed structured interviews and questionnaires are still to establish their reliability and validity in diagnosing DMDD. Dimensional and experimental approaches to assessing irritability and temper outbursts as well as their interrelation might bring forth more clarity about DMDD symptomatology in children.