Introduction

β-thalassemia trait (βTT) and iron deficiency anemia (IDA) are among the most frequently reported microcytic anemias1,2. IDA is prevalent in developing countries, whereas βTT is predominant in regions such as the Mediterranean, the Middle East, and Southeast Asia3,4,5,6,7. Discriminating between βTT and IDA is clinically important but challenging, because the two disorders often present with similar clinical and laboratory findings8,9,10. If a patient with IDA is misidentified as having βTT, he or she is deprived of iron therapy. Conversely, although βTT itself does not require treatment, misdiagnosing a patient with βTT as having IDA carries the risk, in premarital genetic counseling, of the birth of a child with thalassemia major11,12,13. Differentiating effectively between these two hematologic disorders requires, in addition to a complete blood count (CBC), time-consuming and costly tests. The definitive diagnosis is confirmed by blood tests that measure HbA2, serum iron, serum ferritin, transferrin saturation, and total iron binding capacity (TIBC); these parameters are typically considered the gold standards for discriminating between the two disorders9,14,15,16,17,18.

Because discriminating between these two disorders is important, and the tests needed to differentiate them are costly and time-consuming, several discrimination indices have been proposed in large-scale research since 1973 for rapid and inexpensive differentiation between these two common hematologic disorders. These indices are based on blood parameters obtained from automated blood cell counters, traditionally hemoglobin (Hb), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), red blood cell distribution width (RDW), mean corpuscular hemoglobin concentration (MCHC), and red blood cell count (RBC)19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41. Several studies have examined the diagnostic accuracy of these indices and reported differing results, and none of the indices showed a sensitivity and specificity of 100%3,6,17,32,40,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56. Therefore, the purpose of this study was to evaluate the diagnostic performance of 26 different discrimination indices in patients with microcytic anemia using accuracy measures, and to propose two new discrimination indices for differentiating between βTT and IDA.

Material and Methods

Population evaluated to develop the new index

In this study, a total of 907 patients older than 18 years and diagnosed with IDA or βTT were selected to develop the new discrimination indices. Hematological parameters including hemoglobin (Hb), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), red blood cell distribution width (RDW), mean corpuscular hemoglobin concentration (MCHC), and red blood cell count (RBC) were measured using a Sysmex KX-21 automated hematology analyzer.

Inclusion criteria

In the IDA group, patients had hemoglobin (Hb) levels below 12 g/dL for women and 13 g/dL for men. Mean corpuscular volume (MCV) and mean corpuscular hemoglobin (MCH) were below 80 fL and 27 pg, respectively, for both sexes, and for men a ferritin level of <28 ng/mL was considered indicative of IDA. In the βTT group, patients had an MCV below 80 fL, and patients with HbA2 levels of >3.5% were considered βTT carriers.

Exclusion criteria

For the IDA group, patients with mutations associated with αTT (3.7, 4.2, 20.5, MED, SEA, THAI, FIL, and Hph) were excluded, so that individuals presenting with both diseases simultaneously were not selected. For the βTT group, patients with αTT confirmed by the presence of mutations in molecular analysis were excluded. All patients with malignancies or inflammatory/infectious diseases, diagnosed on the basis of clinical data and personal information obtained from medical records, were also excluded.

Ethical consideration

This study was approved and supported by the Ethics Committee of Ahvaz Jundishapur University of Medical Sciences (AJUMS), Ahvaz, Iran. Written informed consent was obtained before enrollment. All methods were performed in accordance with the relevant guidelines and institutional regulations.

Development of the new index

Twenty-six discrimination indices proposed in the literature and two new indices introduced in this study (CRUISE index and index26) were evaluated for differentiating between βTT and IDA using the following accuracy measures: sensitivity, specificity, false positive and false negative rates, positive and negative predictive values, Youden’s index, accuracy, positive and negative likelihood ratios, diagnostic odds ratio (DOR), and area under the curve (AUC).

$${\rm{Sensitivity}}\,({\rm{True}}\,{\rm{Positive}}\,{\rm{Rate}})=\frac{{\rm{True}}\,{\rm{Positive}}}{({\rm{True}}\,{\rm{Positive}}\,+\,{\rm{False}}\,{\rm{Negative}})}$$
$${\rm{Specificity}}\,({\rm{True}}\,{\rm{Negative}}\,{\rm{Rate}})=\frac{{\rm{True}}\,{\rm{Negative}}}{({\rm{True}}\,{\rm{Negative}}\,+\,{\rm{False}}\,{\rm{Positive}})}$$

\({\rm{False}}\,{\rm{Negative}}\,{\rm{Rate}}=(1-{\rm{Sensitivity}})\)

\({\rm{False}}\,{\rm{Positive}}\,{\rm{Rate}}=(1-{\rm{Specificity}})\)

$${\rm{Positive}}\,{\rm{Predictive}}\,{\rm{Value}}\,({\rm{PPV}})=\frac{{\rm{True}}\,{\rm{Positive}}}{({\rm{True}}\,{\rm{Positive}}\,+\,{\rm{False}}\,{\rm{Positive}})}$$
$${\rm{Negative}}\,{\rm{Predictive}}\,{\rm{Value}}\,({\rm{NPV}})=\frac{{\rm{True}}\,{\rm{Negative}}}{({\rm{True}}\,{\rm{Negative}}\,+\,{\rm{False}}\,{\rm{Negative}})}$$

\({\rm{Youden}}\mbox{'}{\rm{s}}\,{\rm{Index}}={\rm{Sensitivity}}+{\rm{Specificity}}-1\)

$${\rm{Accuracy}}=\frac{({\rm{True}}\,{\rm{Negative}}+{\rm{True}}\,{\rm{Positive}})}{({\rm{True}}\,{\rm{Negative}}+{\rm{True}}\,{\rm{Positive}}+{\rm{False}}\,{\rm{Positive}}+{\rm{False}}\,{\rm{Negative}})}$$
$${\rm{Positive}}\,{\rm{Likelihood}}\,{\rm{Ratio}}\,({\rm{LR}}+)=\frac{{\rm{Sensitivity}}}{(1-{\rm{Specificity}})}$$
$${\rm{Negative}}\,{\rm{Likelihood}}\,{\rm{Ratio}}\,({\rm{LR}}-)=\frac{(1-{\rm{Sensitivity}})}{{\rm{Specificity}}}$$
$${\rm{Diagnostic}}\,{\rm{Odds}}\,{\rm{Ratio}}\,({\rm{DOR}})=\frac{{\rm{Positive}}\,{\rm{Likelihood}}\,{\rm{Ratio}}}{{\rm{Negative}}\,{\rm{Likelihood}}\,{\rm{Ratio}}}$$
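As an illustration, the measures defined above can be computed directly from the four cells of a 2 × 2 confusion matrix. The following Python sketch is not the code used in this study (the analysis relied on the epiR package in R); the function name `diagnostic_measures` and the example counts are hypothetical, and reporting an undefined ratio as infinity is an illustrative convention.

```python
import math


def diagnostic_measures(tp, fp, tn, fn):
    """Compute the accuracy measures defined above from confusion-matrix counts.

    Hypothetical helper for illustration only. Assumes at least one truly
    positive case, one truly negative case, one positive test result and
    one negative test result, so that no denominator below is zero.
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    fnr = fn / (tp + fn)           # equals 1 - sensitivity
    fpr = fp / (tn + fp)           # equals 1 - specificity
    # A likelihood ratio with a zero denominator is reported as infinity
    # (an assumption made for this sketch, not a rule from the study).
    lr_pos = sensitivity / fpr if fpr > 0 else math.inf
    lr_neg = fnr / specificity if specificity > 0 else math.inf
    dor = lr_pos / lr_neg if lr_neg > 0 else math.inf
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fnr": fnr,
        "fpr": fpr,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "youden": sensitivity + specificity - 1,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts, not taken from the study tables.
example = diagnostic_measures(tp=90, fp=20, tn=80, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, sensitivity is 0.9 and specificity is 0.8, so LR+ is 4.5, LR− is 0.125, and the DOR is 36.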

A discrimination index whose sensitivity, specificity, positive and negative predictive values, Youden’s index, and accuracy are close to 1 has better differential performance. A discrimination index with a positive likelihood ratio greater than 10, a negative likelihood ratio lower than 0.1, and a high diagnostic odds ratio has good diagnostic performance in differentiating between βTT and IDA57. In addition, receiver operating characteristic (ROC)58 curve analysis was used to calculate the AUC and to compare the AUCs of the discrimination indices. A higher AUC indicates better overall performance of a discrimination index, and a perfect discrimination index has an AUC equal to 1. The relationship between the AUC and diagnostic accuracy is defined as follows: 0.9 < AUC < 1: excellent; 0.8 < AUC < 0.9: very good; 0.7 < AUC < 0.8: good; 0.6 < AUC < 0.7: sufficient; 0.5 < AUC < 0.6: bad; AUC < 0.5: index not useful57.
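The AUC scale above can be illustrated with a short Python sketch. It computes the AUC as the proportion of (positive, negative) pairs in which the positive case has the higher index value, with ties counted as one half, which is equivalent to the area under the empirical ROC curve. This is not the pROC implementation used in the study. The function names and the example values are hypothetical, and the assignment of boundary values (for example, AUC exactly 0.8) to the higher category is an assumption, since the scale above is stated with strict inequalities.

```python
def auc_pairwise(positive_scores, negative_scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def grade_auc(auc):
    """Map an AUC to the verbal scale given in the text.

    Assumption: boundary values fall into the higher category.
    """
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "very good"
    if auc >= 0.7:
        return "good"
    if auc >= 0.6:
        return "sufficient"
    if auc >= 0.5:
        return "bad"
    return "index not useful"


# Hypothetical index values, not study data.
auc = auc_pairwise([3, 4, 5], [1, 2, 3])
print(round(auc, 3), grade_auc(auc))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; rank-based formulas give the same value more efficiently.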

Herein, two new discrimination indices (CRUISE index and index26) were proposed for differentiating between βTT and IDA. The CRUISE index was created using the CRUISE tree algorithm59,60, and the normalized importance of the variables was used to determine the coefficients of the hematological parameters in this index. Index26 was computed in the same way as the Janel (11 T) index41, but by combining 26 indices (all indices except the Janel (11 T) index), whereas the Janel (11 T) index combines 11 indices (England and Fraser, RBC, Mentzer, Shine and Lal, Srivastava, Green and King, RDW, RDWI, Ricerca, Ehsani, and Sirdah). The optimum cut-off for index26 was calculated using Youden’s index; that is, the optimum cut-off is the one with the maximum Youden’s index.
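The cut-off selection by maximum Youden’s index can be sketched as follows. This is a minimal Python illustration, not the OptimalCutpoints implementation used in the study. The function name, the example data, and the classification direction (a case is called positive when its index value is at or above the cut-off) are assumptions; for an index where lower values indicate the positive group, the comparison would be reversed.

```python
def youden_optimal_cutoff(scores, labels):
    """Return (cutoff, youden_index) maximizing sensitivity + specificity - 1.

    scores: index values, one per patient.
    labels: 1 for the positive group, 0 for the negative group.
    Assumption for this sketch: score >= cutoff is classified as positive.
    If several cut-offs tie, the smallest one is returned.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both groups must be present")
    best_cutoff, best_j = None, float("-inf")
    # Every observed value is tried as a candidate cut-off.
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        sensitivity = tp / n_pos
        specificity = 1 - fp / n_neg
        j = sensitivity + specificity - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical index values and group labels, not study data.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
print(youden_optimal_cutoff(scores, labels))
```

In this made-up example the cut-off 5 gives sensitivity 0.8 and specificity 1, so it has the largest Youden’s index (0.8).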

Cluster analysis was also used to extract homogeneous groups of discrimination indices with similar diagnostic performance, based on the accuracy measures described above.

Cluster analysis is a technique for extracting homogeneous subgroups of observations in a data set containing n samples and p predictor variables. Several algorithms are available for cluster analysis, including hierarchical algorithms such as single-linkage, complete-linkage, average-linkage, and Ward’s method, and non-hierarchical algorithms such as k-means61. In this study, we propose applying cluster analysis with the accuracy measures as predictor variables, which can be a useful approach for identifying discrimination indices with similar performance. In previous studies, these indices were compared only subjectively, according to accuracy measures such as sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios, accuracy, Youden’s index, and AUC3,6,17,32,40,42,56. We used a hierarchical algorithm (complete-linkage), and the optimal number of subgroups of indices with similar performance was selected using the NbClust package in R. This package includes 30 measures for determining the optimal number of subgroups, and we selected the optimal number according to the majority rule.
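The two steps described above, complete-linkage clustering and selection of the number of clusters by majority rule, can be sketched in plain Python. This is an illustration only and does not reproduce NbClust: the function names, the Euclidean distance, the example rows of accuracy measures, and the example votes are all assumptions.

```python
import math
from collections import Counter


def complete_linkage_clusters(points, k):
    """Agglomerative clustering with complete linkage, stopped at k clusters.

    points: list of equal-length numeric tuples (for example, one row of
    accuracy measures per discrimination index).
    Returns a sorted list of clusters, each a sorted list of row positions.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[a], points[b]) for a in c1 for b in c2)

    while len(clusters) > k:
        # Merge the pair of clusters with the smallest complete-linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


def majority_rule(votes):
    """Return the number of clusters proposed most often.

    votes: one proposed number of clusters per validity measure. If several
    values tie, the one encountered first is returned.
    """
    return Counter(votes).most_common(1)[0][0]


# Hypothetical (sensitivity, specificity) rows for five indices.
rows = [(0.90, 0.90), (0.88, 0.92), (0.50, 0.40), (0.52, 0.43), (0.10, 0.95)]
k = majority_rule([3, 3, 2, 3, 4, 2])   # hypothetical votes from six measures
print(k, complete_linkage_clusters(rows, k))
```

Complete linkage tends to produce compact groups because two clusters are merged only when even their most distant members are close, which suits the goal of grouping indices whose accuracy measures are all similar.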

Validation of the CRUISE Index and Index26

To validate the CRUISE index and index26, a cross-sectional study was performed in a referral center (Boghrat clinical center) in Tehran, Iran. A total of 6103 outpatients were screened, among whom 907 cases with anemia were included in this study. Patients were classified as having IDA or βTT according to the WHO diagnostic criteria62. Among the 907 patients with anemia, 370 were classified as having IDA and 537 as having βTT (Fig. 1).

Figure 1

Design of the study used for the validation of the CRUISE index and index26. Hb: hemoglobin; MCV: mean corpuscular volume; MCH: mean corpuscular hemoglobin; IDA: iron deficiency anemia; βTT: β-thalassemia trait.

Statistical analysis

Descriptive statistics, namely the mean, standard deviation (SD), median, and interquartile range (IQR), were calculated for the hematological parameters and age. The normality of the data was evaluated using the Shapiro–Wilk test. Because the distributions of these parameters were non-normal, the Mann–Whitney U test was used to compare them between the two groups (βTT and IDA). The association between sex and disease group (βTT or IDA) was tested using the chi-square test.

Data were analyzed using the free statistical software R, version 5.3.0. The epiR package was used to calculate the accuracy measures with their exact 95% confidence intervals. ROC curve analysis was performed using the pROC package. The OptimalCutpoints package was used to calculate the cut-off values of the new discrimination indices using Youden’s index. The optimal number of clusters, that is, of homogeneous groups of discrimination indices with similar performance, was determined using the NbClust package. P < 0.05 was considered statistically significant.

Results

A total of 537 (59%) patients with βTT (299 (56%) women and 238 (44%) men) and 370 (41%) patients with IDA (293 (79%) women and 77 (21%) men) participated in this study, which evaluated the diagnostic performance of 28 discrimination indices (including the two new indices, CRUISE index and index26). The chi-square test showed a statistically significant association between sex and disease group (χ2(1) = 53.41, P < 0.001). Descriptive statistics of the hematological parameters and age for the study groups (βTT and IDA) are displayed in Table 1. According to this table, all variables except HCT and RDW differ significantly between the groups (P < 0.001).
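The reported chi-square statistic can be checked from the counts given above. The Python sketch below applies the standard Pearson formula for a 2 × 2 table without a continuity correction; the absence of a correction is an assumption, although it is the variant that reproduces the reported value of 53.41. The function names are hypothetical, and the P value uses the identity that, for one degree of freedom, the upper tail probability equals erfc(√(χ2/2)).

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def chi_square_p_value_1df(statistic):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Rows: βTT, IDA; columns: women, men (counts from the text above).
chi2 = pearson_chi_square_2x2(299, 238, 293, 77)
p_value = chi_square_p_value_1df(chi2)
print(round(chi2, 2), p_value < 0.001)  # 53.41 True
```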

Table 1 Descriptive statistics of hematological parameters and age variable of study groups (IDA and βTT).

The discrimination indices and their cut-off values are shown in Table 2. The numbers of true positives, true negatives, false positives, and false negatives, and the total number of correctly identified patients (true positives + true negatives), are displayed in Table 3 for each discrimination index. Table 4 shows the sensitivity, specificity, false positive and false negative rates, and positive and negative predictive values of the 28 discrimination indices, and Table 5 ranks these indices according to the accuracy measures.

Table 2 Discrimination indices for differentiating between βTT (n = 537) and IDA (n = 370) in patients with microcytic anemia.
Table 3 True positives and negatives (TP and TN), false positives and negatives (FP and FN), and total number of correctly identified patients (TP + TN) for each discrimination index in differentiating between βTT (n = 537) and IDA (n = 370) in patients with microcytic anemia.
Table 4 Sensitivity (TPR), specificity (TNR), false positive and negative rates (FPR and FNR), and positive and negative predictive values (PPV and NPV) of each discrimination index for differentiating βTT (n = 537) from IDA (n = 370) in patients with microcytic anemia, with their exact 95% confidence intervals.
Table 5 Ranking of the diagnostic performance of discrimination indices for differentiating βTT (n = 537) from IDA (n = 370) in patients with microcytic anemia, based on sensitivity (TPR), specificity (TNR), positive and negative predictive values (PPV and NPV), Youden’s index, accuracy, diagnostic odds ratio (DOR), and area under the curve (AUC) (a lower rank indicates better diagnostic performance).

Table 4 shows that none of the discrimination indices has 100% specificity or 100% positive predictive value. In addition, none of the indices except Shine and Lal (S&L) has 100% sensitivity and 100% negative predictive value, but this index has a very high false positive rate. According to Tables 4 and 5, Shine and Lal (S&L) and Bessman show the highest and lowest sensitivity (the lowest and highest false negative rate), respectively, for diagnosing βTT, and index26 and the Telmissani–MCHD index show the highest and lowest specificity (the lowest and highest false positive rate), respectively, for diagnosing IDA. Index26 and Bessman showed the highest and lowest positive predictive value, respectively, and Shine and Lal (S&L) and Pornprasert the highest and lowest negative predictive value, respectively (Tables 4 and 5).

Tables 5 and 6 show that the Pornprasert index has the lowest Youden’s index and index26 the highest. These tables also show that Kerman II and Pornprasert have the highest and lowest accuracy, respectively, and that the highest DOR belongs to index26 and the lowest to Pornprasert. The two new indices (CRUISE index and index26) perform better than some of the discrimination indices listed in Table 2 (Table 5). According to the findings, none of the indices has LR+ > 10, and only the Kerman I index has LR− < 0.1.

Table 6 Youden’s index, accuracy, positive and negative likelihood ratios (LR+ and LR−), and diagnostic odds ratio (DOR) of each discrimination index for differentiating βTT (n = 537) from IDA (n = 370) in patients with microcytic anemia, with their exact 95% confidence intervals.

The AUC of each discrimination index is shown in Table 7. Fig. 2 shows the ROC curves for the discrimination indices with an AUC higher than 0.8 (Kerman II, Ehsani, Sirdah, Janel (11 T), Mentzer, Green and King (G&K), Nishad, Keikhaei, and Sehgal) and for the two new indices (CRUISE index and index26). Indices with an AUC higher than 0.8 have very good diagnostic accuracy in discriminating between βTT and IDA, and the CRUISE index has good diagnostic accuracy. The AUCs of all indices except Telmissani–MCHD differed significantly from 0.5 (P < 0.001) (Table 7), and the AUCs of Bessman and Pornprasert were significantly lower than 0.5 (P < 0.001). As shown in Tables 5 and 7, index26 has the highest AUC and the Pornprasert index the lowest. Comparisons between the AUCs of the discrimination indices with an AUC higher than 0.8 and the two new indices are displayed in Table 8. The AUC of the CRUISE index was significantly lower than those of these indices (P < 0.001) (Table 8), but higher than those of the remaining indices listed in Table 2 (Table 7). Table 8 also shows that the AUC of index26 is significantly higher than those of Green and King (G&K), Keikhaei, Nishad, Sehgal, Janel (11 T), and the CRUISE index (P < 0.05), but does not differ significantly from those of Mentzer, Kerman II, Ehsani, and Sirdah (P > 0.05).

Table 7 Area under the curve (AUC) of each discrimination index for differentiating βTT (n = 537) from IDA (n = 370) in patients with microcytic anemia, with their 95% confidence intervals (SE: standard error; CI: confidence interval).
Figure 2

Receiver operating characteristic curves of the discrimination indices with area under the curve (AUC) higher than 0.8 (index26, Kerman II, Ehsani, Sirdah, Janel (11 T), Mentzer, Green and King (G&K), Nishad, Keikhaei, Sehgal, and CRUISE).

Table 8 Comparison between the area under the curve (AUC) values of discrimination indices with AUC higher than 0.8 for differentiating βTT (n = 537) from IDA (n = 370) in patients with microcytic anemia (AUCd = AUCrow − AUCcolumn; SE: standard error of AUCd).

The dendrogram of the cluster analysis (a plot representing the steps of the analysis) is presented in Fig. 3. Cluster analysis extracted three homogeneous groups. The first group includes the Pornprasert, Bessman, Huber–Herklotz, and Sirachainan indices. The second group includes Ricerca, Telmissani–MCHD, Shine and Lal (S&L), and Das Gupta. The third group includes Bordbar, Sehgal, Jayabose, Kerman I, RBC, Keikhaei, Wongprachum, index26, Sirdah, Janel (11 T), Green and King (G&K), Nishad, Mentzer, Kerman II, Ehsani, England and Fraser (E&F), Telmissani–MDHL, Srivastava, and CRUISE. Thus, the two new indices introduced in this study perform similarly to the indices of the third homogeneous group.

Figure 3

Dendrogram from the cluster analysis used to extract homogeneous groups of discrimination indices with similar performance (each rectangle contains discrimination indices with similar performance).

Discussion

βTT and IDA are common causes of microcytic anemia, and these two hematologic disorders typically present with similar clinical and laboratory findings. The definitive diagnosis of βTT is based on an increased HbA2 level17,18, and the principal methods for diagnosing IDA are based on an increase in TIBC together with a decrease in serum iron, serum ferritin, and transferrin saturation9.

Exact discrimination between these two hematologic disorders is vital, because it allows correct treatment and, through premarital genetic counseling, helps prevent the birth of a child with thalassemia major. Given the importance of differentiating between βTT and IDA, several indices have been proposed in large-scale studies; however, these indices showed different diagnostic performance, and none of them provided a definitive diagnosis across studies.

With a high-performance index, it is possible to discriminate between βTT and IDA without expensive tests. We presented two new indices for discriminating between these two common microcytic anemias and compared their performance with that of 26 published indices. The findings of this study indicated that none of the discrimination indices provided both 100% sensitivity and 100% specificity. The Shine and Lal index showed 100% sensitivity and negative predictive value but, with respect to the AUC, performed poorly in differentiating between βTT and IDA. It is important to note that this index has been reported as the best discrimination index for differentiating between βTT and IDA in previous studies9,50,63. Shen et al. also reported a low AUC for the S&L index, as in this study55. In the present study, index26 had the highest specificity and positive predictive value. In addition, according to Youden’s index, DOR, and AUC, this index has superior performance in differentiating between βTT and IDA. Accuracy measures such as Youden’s index, accuracy, DOR, and AUC take both sensitivity and specificity into account, so they reflect the performance of discrimination indices more accurately than other criteria. According to these criteria and Table 6, index26 performs better than the other discrimination indices.

In addition, comparison of the AUCs of the various discrimination indices showed that index26 performed significantly better than indices such as Green and King, Keikhaei, Nishad, Sehgal, and Janel (11 T). Despite the value of index26 in this study, it is still difficult to calculate; we are developing a calculator based on the discrimination indices described in the Results and will introduce it in future work to address this problem. With this calculator, the accuracy and outcome of each index can be determined easily and quickly. It can thus be concluded that the Mentzer, Kerman II, Ehsani, Sirdah, Janel (11 T), and index26 indices are reliable for discriminating between βTT and IDA. The other proposed index, CRUISE, showed good diagnostic performance, but its AUC was significantly lower than those of the indices with very good diagnostic performance (AUC > 0.8). Nevertheless, this index performs better than some of the previously published indices. Several studies proposed new discrimination indices for differentiating between βTT and IDA using discriminant analysis (the Nishad, Matos and Carvalho, Sirachainan, and Das Gupta indices)27,35,39,64,65. We used the CRUISE tree algorithm to derive a new discrimination index because tree-based methods are non-parametric and have several advantages over traditional statistical methods such as discriminant analysis: they require no assumptions about the functional form of the relationship between the outcome and predictor variables, they can handle nonlinear relationships and high-order interactions, and they are robust to outliers and multicollinearity. In this study, the CRUISE index showed a higher AUC than the Sirachainan and Das Gupta indices.

Various studies have assessed the diagnostic performance of discrimination indices for differentiating between βTT and IDA in different populations, with differing results. Below, for a number of these studies, we list the index with the best diagnostic performance based on the highest AUC or Youden’s index.

Iranian population: Ghafouri et al.46 (2006): Mentzer index; Rahim and Keikhaei45 (2009): Shine and Lal index in patients < 10 years old, and RDW and RDWI indices in patients aged 10 to 57 years; Ehsani et al.33 (2009): Mentzer index and Ehsani index; Ahmadi et al.44 (2009): Shine and Lal index; Keikhaei34 (2010): Keikhaei index; Sargolzaie and Miri-Moghaddam53 (2014): Green and King index; Bordbar et al.40 (2015): Bordbar index. Thai population: Sirachainan et al.39 (2014): Sirachainan index. Indian population: Tripathi et al.66 (2015): Mentzer index; Piplani et al.67 (2016): Mentzer index. Turkish population: Demir et al.17 (2002): RBC index; Beyan et al.48 (2007): RBC index; Vehapoglu et al.56 (2014): Mentzer index. Italian population: Ferrara et al.68 (2010): England and Fraser index. Kuwaiti population: AlFadhli et al.49 (2006): England and Fraser index. Sri Lankan population: Nishad et al.35 (2012): Nishad index. Palestinian population: Sirdah et al.32 (2007): Sirdah index. Brazilian population: Matos et al.54 (2013): Green and King index. Chinese population: Shen et al.55 (2010): Green and King index. French population: Janel et al.41 (2011): 11 T, Green and King, RDWI, and Sirdah indices. Saudi Arabian population: Jameel et al.69 (2017): RDWI index.

Conclusion and future directions

This cross-sectional study was conducted on Iranian patients diagnosed with βTT or IDA. Two new discrimination indices were proposed for differentiating between βTT and IDA, and according to cluster analysis these indices showed diagnostic performance broadly similar to that of several indices reported in the literature. Index26 performed better than the other discrimination indices. This low-cost index can be useful for differentiating between βTT and IDA, and its use can reduce health system costs in regions with limited financial resources. The results also showed that data mining methods such as tree-based classification models can be used to derive new discrimination indices for differentiating between βTT and IDA. The CRUISE index was found to perform better than some of the existing discrimination indices. This was also the first study in which cluster analysis was applied to identify homogeneous subgroups of discrimination indices with similar diagnostic performance. Accordingly, we recommend cluster analysis for identifying discrimination indices with similar diagnostic performance in future studies.