figure a

Introduction

Glaucoma is a progressive optic neuropathy characterized by the irreversible loss of retinal ganglion cells and their axons, leading to visual field defects [1]. An accurate and early diagnosis of this chronic disease is crucial, as no cure is established.

Currently, early diagnosis of glaucoma, besides functional tests, is performed based on imaging techniques such as optical coherence tomography (OCT) [2]. This is done by measuring the circumpapillary retinal nerve fiber layer (RNFL) thickness and optic disc (OD) parameters and comparing the measurement results with a normative database. Despite a high reproducibility [3, 4] of RNFL thickness measurements, intersubject variance [5] due to the influence of numerous individual factors hampers the interpretation and use of imaging techniques, especially in the early diagnosis of glaucoma. Individual biomarkers, such as sex, age, and ethnicity [6,7,8,9], are typically taken into account by the normative databases. However, other factors are not, including optic disc and fovea parameters, retinal vessel position, axial length, and refractive error, although they show an association with circumpapillary RNFL thickness and distribution [10,11,12,13,14,15,16,17,18,19].

In a previous publication, which is in line with previous research, we could show a significant correlation of retinal blood vessel position and peripapillary RNFL thickness distribution [20,21,22]. To enable a more comprehensive analysis of this factor, we recently have developed the parameter of circumpapillary retinal vessel density (RVD)—a function dependent on thickness and distribution of the major circumpapillary retinal blood vessels [23]. We have shown that interindividual variance may be reduced, on average, by 11% and up to 20% when taking RVD into account.

In a further study [24], this parameter, together with seven other anatomical features (corresponding to OD and fovea descriptors, as well as age and refractive error), were included in a multivariate model. The sample used to develop the multivariate model consisted of 101 healthy subjects and the independent sample used to validate the model consisted of a different set of 101 healthy subjects. The model achieved a reduction of the coefficient of variation of 18% on average (up to 29%), when tested in the independent validation sample. This work provides strong evidence of the major impact that anatomical factors, especially the distribution of the retinal vessels, have on the RNFL thickness profile in healthy volunteers. The reduction of the coefficient of variance demonstrates that intersubject variance in the RNFL thickness profile of healthy subjects can be decreased, which may improve the imaging support of glaucoma diagnosis.

In the present study, we have included glaucoma suspects, an interesting patient group, as the differentiation between normal and suspect-looking optic discs in early glaucoma is very much complicated especially due to interindividual differences. Our previously introduced model, compensating for intersubject variability, was applied to circumpapillary RNFL values measured with OCT (Cirrus®) in glaucoma suspects with or without prior progressive OD change in a series of at least 4 confocal scanning laser tomography (CSLT) (HRT®III) measurements. The aim of the present study was to investigate whether glaucoma suspect patients with or without prior OD change are behaving differently in RNFL measurement with OCT and if reducing interindividual variability of RNFL measurements through using a multivariate compensation method might improve detection of glaucoma among glaucoma suspects.

Materials and methods

Subjects

This prospective study was performed in collaboration with the Department of Ophthalmology and Optometry and the Section for Medical Statistics, Center for Medical Statistics Informatics and Intelligent Systems, Medical University Vienna, Vienna, Austria, for data analysis.

Inclusion and exclusion criteria

Forty-two glaucoma suspects were included. They were recruited from an ophthalmology institute, where follow-up examinations with the CSLT (HRT) during their glaucoma routine checks were done at least once a year prior to this study.

Inclusion criteria for glaucoma suspects were suspect appearance of the OD of at least one eye (either large excavation > 0.5, or asymmetry of excavation > 0.2, or localized rim loss, or failure of ISNT rule, or baring of circumlinear vessels) and reproducible normal visual fields (VF) in standard automated perimetry (SAP). Progressive OD change during a series of 3 of at least 4 consecutive CSLT follow-up measurements was determined with strict, moderate, and liberal criteria of the topographic change analysis (TCA) [25].

Patients with or without prior progressive OD change were divided into 2 groups, progressive or stable. This was done for all three TCA criteria.

We excluded all subjects with any evidence of other ocular pathology, history of ocular trauma or intraocular surgery, ocular inflammation or infection within the last 3 months, astigmatism more than + 2.0 diopters, and ametropia of more than ± 5.0 diopters. In addition to these criteria, only optic disc examinations acquired with Fourier Domain OCT (FD-OCT, Cirrus® Carl Zeiss Meditec Inc.) with a quality index higher than 5 were included (quality index was ranging from 0 to 10), as image quality may have an impact on RNFL thickness measurements [26]. Moreover, scans with movement artifacts within the measurement circle were excluded.

Experimental paradigm

Initially, a complete ophthalmological examination was performed, including medical history, best-corrected Snellen visual acuity, slit-lamp examination, fundoscopy, measurement of IOP by Goldmann applanation tonometry, and VF examination.

Subjects eligible for participating in the study according to the inclusion/exclusion criteria were included. If both eyes were includable, one eye was selected randomly.

The study was performed at the Department of Ophthalmology and Optometry of the Medical University of Vienna.

Methods

The included glaucoma suspect patients had CSLT follow-up measurements (at least 4 measurements) during their glaucoma routine checks at an ophthalmology institute. OD morphology with HRT III (Heidelberg Engineering GmbH, Heidelberg, Germany; software version 1.7) was measured without pupil dilation. In brief, a 3-dimensional topographic image consisting of 384 × 384 × 16 up to 384 × 384 × 64 voxels was constructed from multiple focal planes axially along the optic nerve head. To visualize the disc borders, the HRT images were inspected using the 3D image display. To define the contour line, six or more points were positioned at the inner margin of the scleral ring. This was done by one trained technician. Once the contour line was drawn, the software automatically calculated all the optic disc parameters. The reference plane is defined at 50 µm posterior to the mean retinal height between 350 and 356 degrees along the contour line. The area above the reference is defined as the rim and below as the cup. The standard protocol and the extended parameter table were then exported for statistical analysis.

OD change during the CSLT follow-up was determined with strict, moderate, and liberal criteria of the TCA [25]. The TCA takes into account the size and depth of the largest contiguous area or cluster of significant change within the optic disc border. According to that, patients were divided into a stable and progressive glaucoma suspect group.

On the study day, an automated visual field testing was performed with the Humphrey field analyzer II (program 30–2). Visual field eligibility criteria were less than 33% false-positive responses, less than 33% false-negative responses, and less than 33% fixation losses. Subjects showing either pathologic glaucoma hemifield test or a cluster of 3 points in pattern deviation plot significant at 0.5% not located at the border of the visual field, were excluded. As well, Cirrus® -SD-OCT (Carl Zeiss Meditec, Dublin, CA, USA) measurements centered in the macula (Macular Cube 200 × 200) and centered on OD were performed by one technician, using the optic disc cube 200 × 200. With this program, a data cube of 6-mm2, which acquires as a series of 200 horizontal scan lines, each composed of 200 A-scans was recorded. Measurements of the OD parameters were automatically generated by a Carl Zeiss Meditec OD analysis algorithm developed for Cirrus® -SD-OCT (software version 10.0.0.14618) without the interaction of the technician [27]. The RNFL was measured at a 3.4-mm-diameter circle around the OD. For both parameters, the software calculates the average, superior, nasal, inferior, and temporal quadrant values. Model compensation (MC) as well as age compensation (AC) was applied to the global, quadrants, and clock hour sectors of these RNFL thickness values.

Statistical analysis

Statistical analysis was performed using the SPSS® software package (Version 26.0.0, SPSS Inc., USA) and R Version (3.6.0) [28]. Multivariate model compensation (MC), (for details see our previous manuscript) [24] as well as age compensation (AC) of RNFL (RNFLMC vs. RNFLAC) was applied to the global, quadrants, and clock hour sectors (CHR) of the peripapillary RNFL thickness values, for both patient groups, stable and progressive. Parameters included in the multivariate model were age, spherical refractive error, retinal vessel density, optic disc area, orientation, and ratio (quotient between major and minor axis), as well as disc-fovea distance and angle. These parameters are known to correlate with peripapillary RNFL distribution. To compensate for the effect of the above eight parameters on RNFL thickness, we used the multivariate model to calculate for each subject a model-compensated RNFL thickness (RNFLMC). For age compensation, we used the data of the instrument’s normative database. Differences between groups regarding RNFL (RNFLMC and RNFLAC) and age were assessed using the Mann–Whitney U tests and Student’s t test. P values are provided as an explorative tool and were not corrected for multiple testing. P values smaller than 0.05 indicate the difference between groups.

The diagnostic performance of RNFLMC vs RNFLAC vs originally measured RNFL was tested with the area under the receiver operating characteristic (AUROC) and compared between methods using DeLong’s tests computed with the R package pROC [29].

Results

From the 54 glaucoma suspects screened, 12 had to be excluded due to insufficient image quality and/or retinal pathologies (for example vitreomacular traction) or visual field defects. A total of 42 glaucoma suspects with or without prior progressive OD change in a series of at least 4 CSLT (HRT®III measurements) were included. The mean follow-up time of measurements with HRT was 104.4 months. For statistical analysis, the last 3 measurements next to the study day were used. Subjects’ baseline demographics are provided in Table 1.

Table 1 Demographic data. Patient characteristics, n = 42 glaucoma suspects. SD standard deviation, IOP intraocular pressure, MD mean deviation, C/D cup/disc ratio

OD change during the CSLT follow-up was determined with strict, moderate, and liberal criteria of the TCA. According to that, patients were divided into a stable and a progressive patient group for each of the three criteria. The number of glaucoma suspects with prior progressive optic disc change during CSLT follow-up was 38, 29, and 16 for liberal, moderate, and strict criteria of the TCA, respectively. The number of stable glaucoma suspects was 4 for the liberal, 13 for the moderate, and 26 for the strict criterion.

The application of the strict criterion of TCA leads to fewer numbers of progressive glaucoma suspect patients due to its conservative definition of requiring a significant cluster of ≥ 2% of the disc area and a depth change of ≥ 100 µm. In contrast, the liberal criterion required a significant cluster of ≥ 0.5% of the disc area and a depth change of ≥ 20 µm and the moderate criterion a significant cluster of ≥ 1% of the disc area, and a depth change of ≥ 50 µm.

MC as well as AC of RNFL was applied to the global peripapillary RNFL thickness values measured with OCT, as well as to the quadrants thereof. For each instance, the differences in RNFL thickness and also in age between the two glaucoma suspect groups (stable versus progressive) were assessed with t-tests (Table 2).

Table 2 Median values of peripapillary retinal nerve fiber layer thickness values (RNFL) in µm (and interquartile range) using original, model compensated (MC), and age compensated (AC) values for the two groups of glaucoma suspect patients, with (= progressive) or without (= stable) prior progressive optic disc change during the CSLT follow-up

Glaucoma suspect patients without prior progressive OD change during the CSLT follow-up (= stable) were significantly younger and had thicker RNFL thickness values in most areas and for all progression criteria. The only exception was the temporal quadrant for the moderate and strict criteria for originally measured RNFL as well as RNFLMC and RNFLAC, where the stable patients had thinner RNFL values, although generally not statistically significant. For the liberal criterion, stable glaucoma suspects (this were only 4 patients) had statistically significant larger RNFL thickness values globally and in the superior and inferior quadrant, originally measured but also after applying MC and AC. For the strict criterion, progressive glaucoma suspects had statistically significant thinner RNFL thickness values in the inferior quadrant, again for all three methods, originally measured and after applying MC and AC.

Diagnostic performance of RNFLMC vs RNFLAC was quantified with area under the receiver operating characteristic (AUROC) and compared between methods using DeLong’s test (Table 3). Due to the significant age differences between the stable and the progressive groups, we do not report in detail on the AUROC as obtained by the originally measured RNFL.

Table 3 Comparison of diagnostic performance of multivariate model compensation (MC) as well as age compensation (AC) of retinal nerve fiber layer (RNFL) thickness parameters using area under the receiver operating characteristics (AUROC) for liberal, moderate, and strict criteria of the topographic change analysis in glaucoma suspect patients

For glaucoma suspects, in general, liberal criteria of HRT progression performed better concerning AUROC than moderate or strict criteria (Table 3), regardless of the compensation method. But the interpretation that the liberal criterion is better regarding diagnostic separation is speculative, as a small sample size in the stable group (n = 4) impedes significant results.

MC AUROC of global RNFL and of the inferior quadrant performed statistically significantly better than AC AUROC, for the moderate criterion (see Fig. 1a–c for the AUROCs of the RNFL global).

Fig. 1
figure 1

a AUC global RNFL thickness parameters, liberal criteria. Blue, RNFL_Average (original); red, RNFL_Average (model compensated, MC_AUROC); green, RNFL_Average (age compensated, AC_AUROC). b AUC global RNFL thickness parameters, moderate criteria. Blue, RNFL_Average (original); red, RNFL_Average (model compensated, MC_AUROC); green, RNFL_Average (age compensated, AC_AUROC). c AUC global RNFL thickness parameters, strict criteria. Blue, RNFL_Average (original); red, RNFL_Average (model compensated, MC_AUROC); green, RNFL_Average (age compensated, AC_AUROC)

In general, original RNFL and MC AUROC performed better than AC AUROC for the moderate progression criterion. For the strict criterion, these differences were less pronounced.

AUROC for originally measured RNFL performed statistically significantly better than AC AUROC for the superior, nasal, and inferior quadrant for moderate and strict criteria and for RNFL global for the moderate criterion. But taking into account the age distribution for stable and progredient glaucoma suspects, it became obvious that stable glaucoma suspects were statistically significantly younger than progredient ones by 10 or 8 years for the moderate and the strict criteria. This leads to bias in favor of the originally measured RNFL. In Fig. 1, the AUROCs for global RNFL thickness parameters are shown and the respective curves for the originally measured RNFL do not take into account the age differences, while both the AC and the MC do correct for age differences.

Discussion

In this paper, we use our prior published multivariate model that facilitates a clinically meaningful reduction of the intersubject variance of peripapillary RNFL parameters, by compensating for variation of parameters correlating with RNFL thickness distribution. This model includes physiological parameters, automatically extracted from OCT fundus images such as retinal vessel density, OD shape, fovea location, and additionally age and refractive error [23, 24].

This model (MC) was applied on RNFL parameters measured with Cirrus OCT in patients with suspicion of glaucoma. This patient collective was divided into 2 groups, a group with or without prior progressive OD change in a series of at least 4 HRT®III measurements. Change during the CSLT follow-up was determined with strict, moderate, and liberal criteria of the TCA. MC as well age compensation (AC) of RNFL was applied to the global, quadrants, and clock hour sectors of RNFL thickness values. It is to be expected that the MC leads to a better diagnostic separation. For glaucoma suspect patients, trends are seen in the present paper of MC performing better in regards of AUROC in comparison to AC, for all progression criteria, with statistically significant differences for global RNFL and the inferior quadrant (a clinically relevant region for glaucoma damage) for the moderate criterion. The lack of significance for all other parameters might be on the one hand due to the small sample size, especially for stable patients according to the liberal criteria. On the other hand, the limitation of HRT as a method to detect glaucoma progression (OD change over time) might have had an influence.

There is no consensus regarding the ability of HRT to detect progression among the prior literature. One study showed that TCA parameters observed longitudinally can discriminate between progressing and healthy eyes [25]. However, another prospective study suggested that TCA progression criteria do not predict photographic evaluation or visual field progression [30]. Similarly, a retrospective study demonstrated that the stereometric parameters of the HRT did not have a high enough sensitivity and specificity to detect glaucomatous progression that was otherwise detected by serial stereoscopic OD photography evaluated by experienced masked observers [31]. There are certain factors that can influence measurement reliability. The HRT estimates the height location of surface pixels indirectly through the use of a reference plane, which is set at an arbitrarily determined depth below the outer limit of the RNFL in the papillomacular bundle. This reference plane may be subject to alteration due to progressing glaucomatous damage at this location. This in turn may affect changes in pixel height as assessed by TCA.

In the present paper, 10 patients were treated with eye pressure–lowering eye drops, only 1 patient needed therapy enhancement (from mono-therapy to a combination treatment), and no glaucoma surgery was performed during the HRT measurement period. An influence of treatment on the measurement results seems thus unlikely.

The better performance of the AUROCs for liberal criteria in glaucoma suspect patients when compared to the other two criteria might indicate that in our setting a large proportion of patients with progressive OD change according to liberal criteria truly have glaucoma whereas, for strict criteria, many of them are labeled as (false) negatives. On the other hand, only when progression was defined by the moderate criterion the inferior quadrant RNFL MC and global RNFL MC performed significantly better than AC.

In our data, AUROC for originally measured RNFL performed statistically significantly better than RNFL AC AUROC for the superior, nasal, and inferior quadrant for moderate and strict criteria. However, lack of age compensation introduces a relevant bias when using the originally measured RNFL values in our data, where progressive glaucoma suspects were older by 8 to 10 years compared to stable glaucoma suspects. Furthermore, it is the AC compensation method, which is commonly applied during OCT data evaluation in daily routine, whereas the originally measured RNFL values, without age correction or age-dependent normative values, are generally neither clinically nor scientifically used.

A sample bias might also have had an influence on the study results, as patients with newly developed visual field defects were excluded. Only patients with normal visual fields were recruited, but if they had reproducible scotomas on the study day, they were excluded from further study participation. This was the case in 10 glaucoma suspects, which might lead to an inclusion of only stable glaucoma suspects. For these 10 glaucoma suspects, we also executed the strict, moderate, and liberal criteria of the TCA. For the liberal criterion, all of these glaucoma suspects were labeled as progressive (10 out of 10). Regarding the moderate criterion, 6 out 10, and, for the strict criteria, 4 out of 10, glaucoma suspects were classified with progressive optic disc change. The interpretation of this analysis could be that with strict criteria of the HRT-TCA real glaucoma cases might be missed; and on the other hand, for the liberal criterion, healthy subjects might be labeled as glaucoma. CSLT is able to detect optic disc change over time, but its clinical impact regarding real glaucoma disease progression should be interpreted with caution.

In conclusion, we found some evidence that the reduction of interindividual variability of RNFL measurements through using a multivariate compensation method might improve the detection of glaucoma among glaucoma suspect patients. Further studies are warranted to validate the present results.