Introduction

The Ishihara test is widely used to evaluate human color vision [1,2,3,4,5,6,7]. The test is composed of pseudoisochromatic plates, and the perceptual task depends on the chromatic differences between the stimulus target and the background [7]. Agencies differ in the number of errors they require to label a subject as congenitally colorblind [4], and the accuracy of the Ishihara test has been the focus of many investigations [1, 2, 4, 8,9,10,11,12,13,14,15], since different plates have different hit rates for trichromat subjects [1, 2, 4, 8,9,10,11,12,13,14].

Although the probability of a correct response differs between plates, the current evaluation of the Ishihara test gives all plates the same weight, and the sum of hits is the result of the test. Rodriguez-Carmona et al. [4] introduced an evaluation of the Ishihara test based on the probability of error on each plate, which they called the severity index. For a normal trichromat, the index uses the probability of a correct response to a plate in a normal trichromat population to weight that plate’s contribution to the final result. They applied the same rationale to protan and deutan (congenital colorblind) subjects.

Considering that a perfect plate for separating normal trichromats from other color vision phenotypes would be one on which the normal trichromat makes the correct decision and the subject with a color vision deficiency makes the wrong decision, it is reasonable that both pieces of information should be included in the equation of the severity index. In the present study, we introduced a correction to the equation of Rodriguez-Carmona et al. [4] to quantify the severity index of the Ishihara test result.

Main text

Methods

Subjects

Our database comprised 60 normal trichromats and 107 congenital dichromats (42 protans and 65 deutans). The mean age of the normal trichromat subjects was 23.78 ± 7.24 years, and that of the dichromat subjects was 32.72 ± 10.84 years. We evaluated both eyes, but we randomly chose one eye for the analysis. All subjects had normal or corrected visual acuity better than 20/30 and no neurological or systemic diseases. No subject’s corrective lenses had any tint.

Ishihara test application

We used the 38-plate Ishihara test (1997 edition) [16] and presented plates 1 to 25 to the subjects. The plates were 75 cm from the subjects’ eyes under the illumination of fluorescent lamps, whose relative spectral radiance is shown in Fig. 1. We used a spectroradiometer (PR715 model, Photo Research, NY, USA) driven by SpectraWin 2 to record the spectral radiance of the light. We positioned the book at around 45º from the table, and each plate was shown for 3 s. The subject was instructed to read the number on the plate during its presentation. We compared the responses for each plate with those indicated by the book manufacturer, and both a lack of reading and a misreading were considered errors. We analysed plates 2 to 21 to estimate the test accuracy and to apply the mathematical analyses, and plates 22 to 25 to classify the color vision phenotype of the participants. To be considered colorblind, the subject had to make 8 or more errors on the Ishihara test, and to classify subjects as deuteranopic or protanopic, we used the classification plates (plates 22 to 25) of the Ishihara test. The classification indicated by the Ishihara test was confirmed by at least one more color vision test (Farnsworth-Munsell 100 hue test or HMC anomaloscope).

Fig. 1 Relative spectral radiance of the white fluorescent bulb used to illuminate the Ishihara plates during the test

Data analysis: indexes calculation

We analyzed the Ishihara test results of each subject by calculating one non-weighted index and two weighted indexes. We attributed the value 1 to plates with correct responses and the value 0 to plates with wrong responses. The non-weighted index, which corresponds to the conventional evaluation of the Ishihara test, was calculated as the sum of the values attributed to the subject’s responses divided by the number of plates (Eq. 1).

$$Non{-}weighted\;index = \frac{{\sum\nolimits_{i = 2}^{21} {R_{i} } }}{n}$$
(1)

where \(R\) is the response for each plate (1 for correct, 0 for incorrect response), \(i\) is the number of the plate, \(n\) is the total number of plates.
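
Equation 1 can be sketched in a few lines of Python. This is only an illustration: the function name `non_weighted_index` and the example responses are hypothetical and do not come from the study data.

```python
def non_weighted_index(responses):
    """Eq. 1: sum of per-plate scores (1 = correct, 0 = incorrect)
    divided by the number of plates."""
    if not responses:
        raise ValueError("at least one plate response is required")
    return sum(responses) / len(responses)


# Hypothetical subject: 20 analysed plates (plates 2 to 21), three misread.
example_responses = [1] * 17 + [0] * 3
example_index = non_weighted_index(example_responses)  # 17 / 20
```

With this scoring, a subject who reads every plate correctly obtains an index of 1, and a subject who misses every plate obtains 0.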

The weighted indexes were calculated as follows:

  i.

    Using the equations suggested by Rodriguez-Carmona et al. [4]. For each plate, we again attributed the value 1 to a correct response and the value 0 to a wrong response. Then, if the subject was a normal trichromat, we multiplied the value of the subject’s response by the hit rate of the plate in a normal trichromat population. If the subject was protan or deutan, we multiplied the value of the subject’s response by the hit rate of the plate in a population with the same color vision phenotype. We summed the weighted responses to all plates and divided the total by the number of plates (Eqs. 2a, 2b, 2c).

    $$Weighted\;\;index\_i_{t} = \frac{{\mathop \sum \nolimits_{i = 2}^{21} R_{i} \times PT_{hit} }}{n}$$
    (2a)
    $$Weighted\;\;index\_i_{p} = \frac{{\mathop \sum \nolimits_{i = 2}^{21} R_{i} \times PP_{hit} }}{n}$$
    (2b)
    $$Weighted\;\;index\_i_{d} = \frac{{\mathop \sum \nolimits_{i = 2}^{21} R_{i} \times PD_{hit} }}{n}$$
    (2c)

    where \(Weighted\;\;index\_i_{t}\) is the weighted index for normal trichromats based on Rodriguez-Carmona et al. [4], \(R\) is the response for each plate (1 for correct, 0 for incorrect response), \(i\) is the number of the plate, \(n\) is the total number of plates, \(PT_{hit}\) is the probability of correct response in a trichromat population, \(Weighted\;\;index\_i_{p}\) is the weighted index for protan subjects based on Rodriguez-Carmona et al. [4], \(PP_{hit}\) is the probability of correct response in a protan population, \(Weighted\;\;index\_i_{d}\) is the weighted index for deutan subjects based on Rodriguez-Carmona et al. [4], and \(PD_{hit}\) is the probability of correct response in a deutan population.

  ii.

    Adjusting the equations suggested by Rodriguez-Carmona et al. [4] to include the information from both the normal trichromat and the dichromat populations. For each plate, we again attributed the value 1 to a correct response and the value 0 to a wrong response. We then multiplied the value of the subject’s response by the hit rate of the plate in a normal trichromat population and by the error rates of the plate in the protan and deutan populations. We summed the weighted responses to all plates and divided the total by the number of plates (Eq. 3).

    $$Weighted\;\;index = \frac{{\mathop \sum \nolimits_{i = 2}^{21} R_{i} \times PT_{hit} \times PP_{error} \times PD_{error} }}{n}$$
    (3)

    where \(R\) is the response for each plate (1 for correct, 0 for incorrect response), \(i\) is the number of the plate, \(n\) is the total number of plates, \(PT_{hit}\) is the probability of correct response in a trichromat population, \(PP_{error}\) is the probability of wrong response in a protan population, and \(PD_{error}\) is the probability of wrong response in a deutan population.
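
The two weighting schemes (Eqs. 2a to 2c and Eq. 3) can be sketched as follows. This is a minimal Python illustration, not the code used in the study: the function names are hypothetical, and the three-plate rates in the example are invented for readability rather than taken from Table 1.

```python
def weighted_index_i(responses, own_group_hit):
    """Eqs. 2a-2c: weight each response by the hit rate of that plate
    in the subject's own phenotype population (trichromat, protan, deutan)."""
    if not responses or len(responses) != len(own_group_hit):
        raise ValueError("responses and hit rates must be non-empty and equal in length")
    return sum(r * p for r, p in zip(responses, own_group_hit)) / len(responses)


def weighted_index_ii(responses, pt_hit, pp_error, pd_error):
    """Eq. 3: weight each response by the trichromat hit rate and by the
    protan and deutan error rates of that plate."""
    n = len(responses)
    if n == 0 or not (n == len(pt_hit) == len(pp_error) == len(pd_error)):
        raise ValueError("all inputs must be non-empty and equal in length")
    total = sum(
        r * t * p * d
        for r, t, p, d in zip(responses, pt_hit, pp_error, pd_error)
    )
    return total / n


# Hypothetical three-plate example (rates as proportions, not percentages).
responses = [1, 1, 0]          # third plate misread
pt_hit = [1.0, 0.9, 0.7]       # assumed trichromat hit rates
pp_error = [1.0, 0.5, 0.8]     # assumed protan error rates
pd_error = [1.0, 0.6, 0.9]     # assumed deutan error rates

index_i = weighted_index_i(responses, pt_hit)                        # (1.0 + 0.9) / 3
index_ii = weighted_index_ii(responses, pt_hit, pp_error, pd_error)  # (1.0 + 0.27) / 3
```

In this sketch, `weighted_index_i` needs the hit rates of the subject’s own group, whereas `weighted_index_ii` takes the same three rate vectors for every subject.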

We applied descriptive statistics to the scores of the non-weighted and weighted indexes, using the results of all Ishihara test plates and of the more efficient plates for each group. The significance level was 5%. Statistical analysis was performed using the software Biostat 5.0.

Results

Accuracy of the test: hit rate

Table 1 shows the hit rates of the Ishihara plates. The normal trichromat subjects gave 100% correct responses to plates 2, 3, 4, 7, 8, 11, and 16. The plates with the lowest hit rates were 12, 17, 19, 20, and 21, ranging between 66.7 and 75%. Of the normal trichromat subjects, 26.67% responded correctly to all plates.

Table 1 Hit rate to each plate of Ishihara test

For the protan group, we observed no hits (0%) on plates 5, 6, 9, 11, 12, 13, 14, 16, and 17, and plates 3 and 21 had the highest hit rates. For the deutan group, the plates with 0% hits were 5, 10, 11, 12, 13, 14, 16, and 17, and the plates with the highest hit rates were 7, 8, 18, 19, 20, and 21.
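
For readers who want to reproduce per-plate hit rates of this kind from raw responses, a minimal Python sketch follows. The function name `plate_hit_rates` and the small response matrix are hypothetical. The sketch assumes one row per subject and one 0/1 entry per plate.

```python
def plate_hit_rates(response_matrix):
    """Return the percentage of correct responses for each plate.

    response_matrix: list of rows, one row per subject, each row holding
    one score per plate (1 = correct, 0 = incorrect).
    """
    n_subjects = len(response_matrix)
    if n_subjects == 0:
        raise ValueError("at least one subject is required")
    n_plates = len(response_matrix[0])
    if any(len(row) != n_plates for row in response_matrix):
        raise ValueError("every subject must have a score for every plate")
    return [
        100.0 * sum(row[j] for row in response_matrix) / n_subjects
        for j in range(n_plates)
    ]


# Hypothetical group of four subjects tested on three plates.
group = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 1, 0],
]
rates = plate_hit_rates(group)  # [100.0, 75.0, 25.0]
```

Dividing these percentages by 100 gives the hit probabilities used as weights in Eqs. 2a to 2c and 3. The corresponding error probabilities are one minus the hit probabilities.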

Comparison of the non-weighted and weighted indexes

Figure 2 shows the partial index for each plate calculated by the non-weighted and weighted equations. Comparing the functions obtained for each index, we observed noteworthy differences in the index values for some plates.

Fig. 2 Indexes estimated from normal trichromat (black circles and lines), protan (red circles and lines), and deutan (green circles and lines) populations for each plate using the non-weighted method (conventional evaluation), the weighted i method (based on the Rodriguez-Carmona equation), and the weighted ii method (modified from the Rodriguez-Carmona equation)

The normal trichromat group exhibited higher index values than the groups with congenital color vision deficiency (p < 0.05) while there were no differences between the groups with congenital colorblindness (p > 0.05). The weighted index values were significantly lower than the non-weighted values for all groups (p < 0.05).

Receiver Operating Characteristic (ROC) analysis

We performed the ROC analysis to evaluate the performance of a classifier based on each of the severity indexes used in the present investigation. For the ROC analysis, we took the cumulative distribution of the indexes estimated from the trichromats as the true positive rate and the cumulative distribution of the indexes estimated from each group of subjects with color vision deficits as the false positive rate. The area under the curve was calculated using MATLAB routines with the trapezoid function, and for all indexes the area under the curve was 1.
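
The idea behind this analysis can be sketched in Python. The study itself used MATLAB routines, so the code below is an analogue under stated assumptions rather than the authors’ implementation. It sweeps a threshold over the index values, treats an index at or above the threshold as a “trichromat” decision, and integrates the resulting curve with the trapezoidal rule. The function name `roc_auc` and the example index values are hypothetical.

```python
def roc_auc(trichromat_indexes, deficient_indexes):
    """Area under the ROC curve by the trapezoidal rule.

    True positive rate: fraction of trichromat indexes >= threshold.
    False positive rate: fraction of color-deficient indexes >= threshold.
    """
    if not trichromat_indexes or not deficient_indexes:
        raise ValueError("both groups must contain at least one index")
    thresholds = sorted(set(trichromat_indexes) | set(deficient_indexes), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every observed index
    for t in thresholds:
        tpr = sum(s >= t for s in trichromat_indexes) / len(trichromat_indexes)
        fpr = sum(s >= t for s in deficient_indexes) / len(deficient_indexes)
        points.append((fpr, tpr))
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0  # area of one trapezoid
    return auc


# Hypothetical, fully separated groups: the area is 1 (up to rounding).
separated = roc_auc([0.90, 0.85, 0.80], [0.20, 0.10, 0.05])

# Hypothetical overlapping groups: the area drops below 1.
overlapping = roc_auc([0.9, 0.4], [0.5, 0.1])  # 0.75
```

When no color-deficient index reaches the lowest trichromat index, every threshold between the two groups classifies all subjects correctly, so the area is 1 whichever weighting is used.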

Discussion

The present investigation proposed a modification of the Rodriguez-Carmona equation [4] for calculating the severity index of Ishihara test results. In our modification, we included information from both normal trichromats and congenital colorblind subjects. Our main result was that, for some plates, the proposed correction significantly changed the value of the index (for example, plate 1), but the overall benefits were small. We interpret this as follows: the data distributions of congenital colorblind subjects and normal trichromats are widely separated, so all the methods for calculating the severity index studied in the present investigation had the same performance (ROC analysis) in distinguishing the two color vision phenotypes. However, we consider that the rationale of the Rodriguez-Carmona equation is partially correct and that, with our modification, it would be adequate for other situations in which the distance between the distributions of normal trichromats and subjects with acquired color vision deficiency is smaller.

Many studies have shown that the plates of the Ishihara test do not have the same efficiency in eliciting correct responses from normal trichromats and colorblind subjects [1, 2, 4, 8,9,10,11,12,13,14,15]. Probable reasons for misreading Ishihara plates are lens pigmentation due to aging [17], abnormal contrast sensitivity [18], decreased visual acuity [19], the illuminant [7], and incorrect illumination and administration of the test [20]. Moreover, variations in color printing between one copy of the test and another may also affect the subject’s ability to read the plates [4, 9, 21, 22].

The non-weighted method is equivalent to the method of quantification suggested by the Ishihara test manufacturer. Rodriguez-Carmona et al. [4] proposed a severity index for the Ishihara test. The advantage of this index is that it weights the number of errors in the Ishihara test by the probability of correct responses in particular color vision phenotype populations. The results of a normal trichromat were weighted by the probability of hits on the plate in a normal trichromat population, and the results of protan and deutan subjects were weighted by the probability of hits in their own populations. The method proposed in the present investigation took the idea of a weighted evaluation of the Ishihara test from Rodriguez-Carmona et al. [4] and modified it to match what is expected of a perfect plate. We considered that a “perfect” plate would be seen by all normal trichromats and by no subject with congenital colorblindness. We therefore weighted the responses by the hit probability in normal trichromats and by the error probability in subjects with congenital colorblindness. The modification made it possible to apply the same equation to any subject (normal trichromat or not), which is not possible with the Rodriguez-Carmona equation.
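
As a worked illustration of Eq. 3, consider two hypothetical plates (the values are chosen for illustration and are not measured rates). A “perfect” plate, read by every normal trichromat and by no protan or deutan subject, contributes its full weight when answered correctly:

$$R_{i} \times PT_{hit} \times PP_{error} \times PD_{error} = 1 \times 1 \times 1 \times 1 = 1$$

whereas a plate that every subject reads correctly, regardless of phenotype, contributes nothing even when answered correctly:

$$R_{i} \times PT_{hit} \times PP_{error} \times PD_{error} = 1 \times 1 \times 0 \times 0 = 0$$

Under Eq. 2a, both plates would contribute the same value, 1, to the index of a normal trichromat, because only \(PT_{hit}\) enters the weight.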

Limitations

We had a smaller sample size than previous studies [4, 13], but we consider that a larger sample would further improve the good separation between the color vision phenotype groups that we already observed. Six of the nine authors acted as examiners for the color testing, which could have introduced some examiner bias in the way the test was applied. As all the examiners had the same training, we expect that this bias did not have a great impact on the results. Fluorescent illumination is not recommended by the manufacturer, and some investigations have discussed the use of this kind of illumination with the Ishihara test and have indicated some influence on the results [11, 23]. The Ishihara test was designed for natural daylight or CIE standard illuminant C, and international standards for colour vision testing have been published [24]. Fluorescent bulbs have pronounced energy at some wavelengths and differ from natural daylight, which has a similar energy distribution across the spectrum [25]. Comparisons between the results of pseudoisochromatic tests under natural daylight (or a Macbeth easel lamp) and under a fluorescent daylight lamp have shown variable results [11, 25, 26]. In the present study, the influence of the fluorescent lamp was minimized by confirming the color vision phenotype with other color vision tests.

All the limitations of the present study had little or no impact on the differences between our results and those of Rodriguez-Carmona et al. [4], since the accuracy of the results for trichromats and dichromats is similar to that shown in a previous study [1]. We consider that the differences between the two studies were due to the different ways of calculating the weighted methods used to quantify Ishihara test performance (Additional file 1).