Introduction

Mood disorders comprise a large group of psychiatric diseases, of which major depressive disorder (MDD), bipolar disorder (BD), and cyclothymia can be identified using the diagnostic criteria of the DSM-IV (Diagnostic and Statistical Manual of Mental Disorders, 4th Edition) [1]. BD often goes undiagnosed, which makes treating patients difficult [2]; delays in diagnosis in turn delay treatment. Although the lifetime prevalence of type 1 BD (BD-I) is about 1%, the prevalence of all types of BD is significantly higher [3]. In most cases, the correct diagnosis of BD is delayed by about 10 years [4]. Accurate and concise tools designed for screening BD include the Mood Disorder Questionnaire (MDQ), a checklist of 13 questions derived from the DSM-IV criteria and clinical experience, and the Hypomania Checklist-32 (HCL-32).

Studies of patients from different countries who completed the MDQ concluded that its sensitivity and specificity are in the range of 73–76% and 86–90%, respectively [5,6,7,8]. Moreover, the HCL-32 was reported to have a sensitivity of 48–66% and a specificity of 59–71% for screening BD [9, 10]. Thus, both the MDQ and the HCL-32 have relatively acceptable sensitivity and specificity in screening for BD. The bipolarity index (BI), an auxiliary diagnostic method, is a clinician-rated tool that focuses on five clinical domains: signs and symptoms, age at onset, course of the disease, treatment response, and family history [11]. Given the clinical domains covered by the BI, this diagnostic method may be more useful than the MDQ and HCL-32; previous work reported a specificity of 100% for the BI in the differential diagnosis of BD [12].

Various studies have shown that about 40–50% of patients with BD are undiagnosed at the time of referral and are often treated for depression, with differing clinical outcomes [13, 14]. Because a large number of patients with BD suffer serious complications and consequences from the lack of a proper diagnosis, accurately diagnosing these disorders requires, in addition to a clinical interview, an appropriate diagnostic tool with sound psychometric properties. Because the results of previous studies were not very satisfactory and considered only a limited number of parameters, the present meta-analysis was conducted to determine the diagnostic accuracy and psychometric properties of the BI in people with BD.

Methods

This systematic review and meta-analysis was conducted according to the Meta-analyses Of Observational Studies in Epidemiology (MOOSE) [15], Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) [16], and Synthesizing Evidence from Diagnostic Accuracy TEsts (SEDATE) [17] guidelines.

Search strategy

We systematically searched Scopus, ISI Web of Science (WOS), PubMed/Medline, Embase, and PsycINFO using the standard search terms "Bipolarity index"[Text] AND (((("Bipolar Disorder"[Mesh]) OR "Bipolar and Related Disorders"[Mesh]) OR "Mood Disorders"[Mesh]) OR "Mania"[Mesh]) OR ("Depression"[Mesh] OR "Depressive Disorder"[Mesh]). Articles relevant to the subject published between May 1990 and 30 July 2020 were collected and reviewed. There was no restriction on language.

Inclusion and exclusion criteria

Studies of individuals with BD, and prospective, national, population-based studies using the BI tool for diagnosis, were included. Articles with incomplete or unidentified data or other study designs, as well as congress abstracts, reviews, case reports, letters, and duplicate publications, were excluded.

Study selection

After removing duplicate studies, two authors (MS and FR) independently screened the titles and abstracts of potential papers against the pre-defined inclusion and exclusion criteria. Any disagreements were resolved by either re-evaluating the source article or consulting a third author (ME).

Data extraction

The extracted information included the author's name, publication year, country, age, sample size, and study design.

Methodological quality assessment

Two reviewers (MS and FR) performed the quality assessment of included studies using the Newcastle–Ottawa Scale (NOS) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tools. Disagreements were resolved by either discussing or re-evaluating the original article with a third reviewer (ME).

Ethical consideration

Ethical committee approval and informed consent were not required because the analysis was based on previously published studies.

Statistical analysis

We used either a random-effects or a fixed-effect model, depending on the level of heterogeneity, to evaluate the diagnostic utility of the BI in the screening and diagnosis of individuals with BD [18]. Heterogeneity across studies was measured using Cochran's Q statistic and the I2 statistic. When I2 exceeded 50%, indicating high heterogeneity, subgroup analyses were performed to identify its source. A hierarchical summary receiver-operating characteristic (HSROC) curve and a summary receiver-operating characteristic (SROC) curve were fitted, with each study plotted as a circle on the HSROC curve. The area under the curve (AUC) was computed to determine diagnostic accuracy: an AUC approaching 1.0 indicates excellent performance, whereas an AUC approaching 0.5 indicates poor performance. The 95%CI of the AUC was compared across subgroups. When sensitivity and specificity were not directly available, they were calculated from the numbers of true positives (TP), false negatives (FN), true negatives (TN), and false positives (FP) as sensitivity = TP/(TP + FN) and specificity = TN/(FP + TN). Publication bias was assessed using Deeks' regression test [19]. The analysis was conducted using Meta-DiSc version 1.4 (https://meta-disc.software.informer.com/1.4/) [20] and RevMan 5.3.
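The sensitivity and specificity formulas and the heterogeneity statistics described above can be illustrated with a minimal Python sketch. It is not the Meta-DiSc or RevMan implementation: the function names are hypothetical, the counts and effect estimates are invented for illustration, and I2 is computed with the common definition I2 = max(0, (Q − df)/Q) × 100 under inverse-variance weighting, which is assumed here rather than taken from the software used in this review.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of subjects with the disorder that the test flags: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of subjects without the disorder that the test clears: TN / (FP + TN)."""
    return tn / (fp + tn)


def cochran_q_and_i2(estimates, variances):
    """Return (Q, I2 in percent) using inverse-variance weights.

    estimates: per-study effect estimates on a common scale
    variances: the corresponding within-study variances
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I2 is truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical 2x2 counts for a single study (not data from this review).
tp, fn, tn, fp = 80, 20, 70, 30
print(sensitivity(tp, fn), specificity(tn, fp))

# Hypothetical per-study estimates and variances (not data from this review).
q, i2 = cochran_q_and_i2([0.0, 2.0], [0.5, 0.5])
print(q, i2)
```

Under the rule stated above, an I2 value above 50% from such a calculation would point toward subgroup analyses to look for the source of heterogeneity.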

Results

Search results

Two hundred and ninety-six records were found through the initial search. Of 679 articles, 25 duplicate studies were found, and 70 were omitted due to irrelevant titles and abstracts. The remaining 450 entered full-text screening, of which 186 were excluded based on the pre-defined inclusion and exclusion criteria (Fig. 1). Ultimately, 15 studies on 6525 patients were included (Table 1) (Additional file 1: Table S2).

Fig. 1 Flow diagram of the selection process

Table 1 Characteristics of included studies

The methodological quality of included studies

The methodological quality of the included studies is shown in Fig. 2. Four studies were at low risk of bias in the participant selection domain [21, 25, 31, 33], five studies were at low risk of bias in the reference standard domain [12, 27, 28, 32, 33], and seven studies were at low risk of bias in the flow and timing domain [11, 21, 22, 28, 29, 31, 32]. Three studies were at low risk of bias for all index tests other than one threshold [25, 28, 32] (Additional file 1: Fig. S1). There was no need to contact the authors of the selected papers in this study.

Fig. 2 Risk of bias and applicability concerns summary: review authors' judgments about each domain for each included study

Pooled sensitivity and specificity

The sensitivity and specificity, along with the 95% confidence interval (CI), for each of the main analyses are shown in a forest plot (Fig. 3). The pooled sensitivity of the BI in the diagnosis of BD was 0.82 (95%CI: 0.81–0.83, P < 0.0001, I2 = 99%) (Additional file 1: Fig. S1). The pooled specificity was 0.73 (95%CI: 0.72–0.74, P < 0.001, I2 = 99%) (Additional file 1: Fig. S1). The pooled negative likelihood ratio (NLR) and positive likelihood ratio (PLR) are presented in Additional file 1: Fig. S2.
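For readers unfamiliar with likelihood ratios, the following Python sketch shows the standard definitions PLR = sensitivity/(1 − specificity) and NLR = (1 − sensitivity)/specificity. The function names are hypothetical. Applying the formulas to the pooled point estimates above is only a rough illustration: the pooled PLR and NLR in Additional file 1: Fig. S2 were estimated by the meta-analysis software across studies and need not equal the values implied by the pooled sensitivity and specificity.

```python
def positive_likelihood_ratio(sens: float, spec: float) -> float:
    """PLR = sensitivity / (1 - specificity)."""
    if spec >= 1.0:
        raise ValueError("PLR is undefined when specificity equals 1")
    return sens / (1.0 - spec)


def negative_likelihood_ratio(sens: float, spec: float) -> float:
    """NLR = (1 - sensitivity) / specificity."""
    if spec <= 0.0:
        raise ValueError("NLR is undefined when specificity equals 0")
    return (1.0 - sens) / spec


# Pooled point estimates reported in the text; the derived ratios are
# illustrative only and are not the pooled ratios from the supplement.
pooled_sens, pooled_spec = 0.82, 0.73
print(round(positive_likelihood_ratio(pooled_sens, pooled_spec), 2))
print(round(negative_likelihood_ratio(pooled_sens, pooled_spec), 2))
```

A PLR above 1 means a positive result raises the odds of BD, and an NLR below 1 means a negative result lowers them.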

Fig. 3 Forest plot of bipolarity index including sensitivity and specificity of included studies

Diagnostic accuracy

Nine studies reported the diagnostic accuracy of the BI for the detection of BD (Fig. 4). The BI was significantly more accurate than the other tests, with a pooled diagnostic odds ratio (DOR) of 47.2 (95%CI: 12.01–85.52, P < 0.0001, I2 = 99.2%) (Additional file 1: Fig. S3).
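As a reminder of what a DOR measures, the sketch below computes it for a single study from a 2x2 table as (TP × TN)/(FP × FN), with a 95% CI on the log scale. The function name and counts are hypothetical, and the 0.5 continuity correction for zero cells is a common convention assumed here; this is not the pooling procedure used to obtain the DOR reported above.

```python
import math


def diagnostic_odds_ratio(tp, fp, fn, tn, z=1.96):
    """Return (DOR, lower, upper) for one 2x2 table.

    The CI uses SE(log DOR) = sqrt(1/TP + 1/FP + 1/FN + 1/TN).
    If any cell is zero, 0.5 is added to every cell (assumed convention).
    """
    cells = [tp, fp, fn, tn]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    tp, fp, fn, tn = cells
    dor = (tp * tn) / (fp * fn)
    se_log = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(dor) - z * se_log)
    upper = math.exp(math.log(dor) + z * se_log)
    return dor, lower, upper


# Hypothetical counts for a single study (not data from this review).
dor, lower, upper = diagnostic_odds_ratio(tp=80, fp=30, fn=20, tn=70)
print(round(dor, 2), round(lower, 2), round(upper, 2))
```

A DOR of 1 means the test does not discriminate between subjects with and without the disorder; larger values indicate better discrimination.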

Fig. 4 Summary estimates and 95% confidence region (ellipses) of the meta-analyses showing diagnostic test accuracies of bipolarity index

In our pooled analysis, patients with BD had, as expected, a significantly higher average BI score than subjects with other mood disorders, with a mean difference (MD) of 31.36 (95%CI: 29.40–33.33, P < 0.0001, I2 = 49%) (Fig. 5).

Fig. 5 Forest plot of mean difference of bipolarity index between patients with bipolar disorder and those with other mood disorders. CI confidence interval, IV inverse variance

Direct comparison

Comparison of the BI with HCL-32 for the detection of bipolar disorder

The BI curve was consistently above the HCL-32 curve in the region encompassing most of the observed data (Fig. 6A).

Fig. 6 The difference of bipolarity index compared to HCL-32 (A), BSDS (B), and MDQ (C). CI confidence interval, IV inverse variance

Comparison of the BI with BSDS for the detection of bipolar disorder

The BI curve was consistently above the BSDS curve in the region comprising most of the observed data (Fig. 6B).

Comparison of the BI with MDQ for the detection of bipolar disorder

The BI curve was consistently above the MDQ curve in the region involving most of the observed data (Fig. 6C).

Discussion

The present meta-analysis was conducted to determine the diagnostic accuracy and psychometric properties of the BI in people with BD, and showed that the utility and diagnostic accuracy of the BI were significantly higher than those of other tools. BD differs from other chronic mental disorders such as schizophrenia, but its symptoms are sometimes confused with theirs, even though these disorders are distinct and each is classified in a different group. If a psychiatrist does not obtain a good clinical history or does not pay attention to the context of the patient's current life situation, misdiagnosis may occur. A substantial rate of misdiagnosis between BD and other chronic mental disorders, especially mood disorders, may delay proper and timely treatment and symptom control.

Our meta-analysis showed a sensitivity of 0.82 and a specificity of 0.73 for the BI at the recommended cut-off in psychiatric services. In this context, Carvalho et al. performed a meta-analysis comparing the diagnostic accuracy of the bipolar spectrum diagnostic scale (BSDS), HCL-32, and MDQ, and reported summary sensitivities of 81%, 66% and 69%, as well as specificities of 67%, 79% and 86%, for the HCL-32, MDQ, and BSDS in psychiatric services, respectively [10]. Thus, the BI could be more accurate than the other available tools for the detection of BD in primary care or general population settings. Given that the BSDS, HCL-32, and MDQ were proposed to advance the diagnosis of less exuberant forms of BD [34, 35], this may explain why the other tools are less accurate than the BI for the detection of BD.

The age of onset is very important in the BI: earlier ages of onset receive higher scores, which point toward a greater probability of BD [36]. Proper care of people with BD requires an in-depth understanding of the subtleties of symptoms at different ages, as well as consideration of the age of onset, which may have etiological significance [37]. Indeed, from a clinical practice point of view, a recent meta-analytical study showed that an early age of onset is associated with longer delays in diagnosis and treatment, more severe depression, higher levels of anxiety, and substance use [38]. Another study noted that, given the differences in both age of onset and initial treatment, intervention in early-onset cases may reduce the severity of the disorder and prevent secondary symptoms, although this should be determined in the clinical practice context [39]. Therefore, it is very important to pay attention to the age of onset when using the BI in combination with clinical psychiatric assessment.

Present classification systems discriminate BD from MDD on the basis of polarity rather than recurrence [40]; thus, BD is at risk of being misdiagnosed as MDD [41]. The BI uses the recovery period between recurrent, distinct manic episodes as a discriminating factor, with the highest score (20) describing full recovery; this makes this factor and an earlier age of onset useful predictors of bipolar diathesis. In this context, Mossolov et al. conducted a non-interventional diagnostic study of 409 patients with recurrent depressive disorder (RDD) and showed that 40.8% of these patients had a diagnosis of bipolar disorder [26]. This means that focusing on the course of the disease helps not only to diagnose BD but also to understand its features and traits.

Limitations

The present study has some limitations. First, the sample size was relatively small: only 9 studies were involved in the meta-analysis. Second, the data collection method may affect the results; e.g., diverse cut-off criteria can lead to different diagnostic indices (sensitivity, specificity, PPV, NPV, and accuracy). Third, the use of different diagnostic criteria for BD may influence the diagnostic indices. Fourth, there was high heterogeneity and a certain risk of bias in the data among the studies included in the meta-analysis. Thus, the findings should be generalized with caution before they are applied in clinical practice.

Conclusions

The BI appears to be a useful screening instrument with suitable psychometric properties for identifying BD compared to both the MDQ and the HCL-32. It should be noted that the number of false-positive cases could be far higher when applying screening instruments for BD. Consequently, patients detected by the BI should be confirmed through a diagnostic interview. More studies are also needed to explore the optimal cut-off values of the BI among screened populations during long-term follow-up, since a considerable portion of individuals primarily diagnosed with MDD could have BD.