With the improvement in the health of human immunodeficiency virus (HIV)-positive patients brought about by highly active antiretroviral therapy (HAART), it has become important to determine whether retinal damage continues to occur under HAART, which implies good immune control, and to identify methods for detecting such damage. In HIV patients without a history or evidence of retinitis, previous studies by us and others have disclosed structural damage to the retina and the functional deficits resulting from such damage [1–3]. Individuals with HIV retinopathy without retinitis and with low CD4 T-lymphocyte counts show deficits in visual function even though central vision may be preserved [1, 2]. In the era before HAART became available, damage in such eyes was suggested by studies showing that HIV-positive patients have reduced visual field sensitivity [3, 4], decreased color and contrast sensitivity on tests of central vision [5, 6], and altered retinal processing on electrophysiological testing [7]. Further studies from our group have shown particular topographic patterns of this visual field (VF) loss [8]. Retinal cotton wool spots, microaneurysms, capillary drop-out, and ischemia are assumed to damage the ganglion cell layer and retinal nerve fiber layer (RNFL) [1, 9, 10].

Even in the HAART era, damage has been shown to occur in HIV patients managed with HAART [11]. Using high-resolution optical coherence tomography and scanning laser polarimetry, we found thinning of the retinal nerve fiber layer in HIV patients with low CD4 counts [12, 13]. The multifocal electroretinogram (mfERG) showed abnormalities in the second-order kernel (inner retina) in HIV populations [14]. Second-order kernel abnormalities in mfERGs indicated that not only low CD4 patients but also high CD4 patients underwent detectable electrophysiological alteration, and possibly damage, in the inner retina [15].

HIV patients with high CD4 counts may also have signs of retinopathy. Although most of these patients do not have visual symptoms, a few notice visual field changes. Automated perimetry is currently the most widely used method to detect the functional deficits that anatomic changes in this population might cause. Because these changes are usually subtle, the deficits are difficult to detect by human observers, including perimetric experts.

Pattern recognition techniques, especially machine learning classifiers (MLCs), have previously been applied to ophthalmologic problems, such as the interpretation and classification of visual fields [16, 17], detection of visual field progression [18, 19], assessment of the structure of the optic nerve head [20, 21], measurement of retinal nerve fiber layer thickness [22, 23], and separation of noise from visual field information [24]. From previous studies in glaucoma, we found the support vector machine (SVM) to be particularly effective for discriminating between normal and glaucomatous visual fields [16, 25]. MLCs can be trained to distinguish the group identity of patterns, sometimes with greater sensitivity than a human expert [25–28].

In this study, we applied SVM with the Gaussian kernel to determine whether visual fields in HIV subjects differ from visual fields in normal subjects. Since immune function presumably was better in the high CD4 group, we expected HIV retinopathy damage to be less in the high CD4 group than in the low CD4 group. We assumed that there was enough information in the visual fields to distinguish low CD4 patients from HIV-negative subjects, and we anticipated that there might be enough information to discriminate high CD4 patients from HIV-negative subjects. The Statpac global indices, mean deviation (MD) and pattern standard deviation (PSD), are widely available and in common use for interpreting automated perimetry in glaucoma. We compared MLCs with MD and PSD in their ability to separate fields of low CD4 from those of high CD4 HIV patients.

Machine learning classifiers have evolved to approach the theoretical limit in finding differences between classes. With these theoretically more effective MLCs, we (1) seek differences in visual fields between normal eyes and eyes of HIV patients, (2) try to find the effect of immunodeficiency on visual fields, as reflected in CD4 count, and (3) compare the effectiveness of MLCs with commonly-used Statpac global indices in analyzing standard automated perimetry (SAP).

Methods

Patients

The HIV-positive patients come from an Institutional Review Board-approved, National Institutes of Health-sponsored longitudinal study of HIV disease at the University of California, San Diego (UCSD). The research followed the tenets of the Declaration of Helsinki. Non-HIV controls were age-matched healthy participants in the HIV study as well as non-glaucomatous age-matched healthy controls from the National Eye Institute-sponsored ongoing longitudinal Diagnostic Innovations in Glaucoma Study (DIGS).

The patients were divided into three groups. The high CD4 group (H) consisted of HIV-positive patients with good immune status, whose medical records showed that their CD4 counts had never fallen below 100 cells/µl (0.1 × 10⁹/l). The low CD4 group (L) consisted of HIV-positive patients whose CD4 counts had been below 100 cells/µl (0.1 × 10⁹/l) for at least 6 months at some point in their medical history. Of the 59 eyes in this subgroup, 32 had signs of HIV retinopathy at the time of examination (n = 6) or based on their medical records (n = 26). None of the eyes had evidence of retinopathy caused by other viruses. All HIV patients were on HAART prior to the time of the examination, and a substantial portion of these patients had a recovery in their CD4 counts. The HIV individuals had no confounding ocular disease or eye surgery. The normal group (N) consisted of HIV-negative individuals without evidence of ocular damage; 17% of this control group had a lifestyle similar to the HIV groups, and 83% came from DIGS.

Ophthalmologic evaluation

All patients had a complete ocular examination, including indirect ophthalmoscopy and morning intraocular pressure measurement. The exclusion criteria were inability to perform visual field testing, corrected visual acuity worse than 20/40, spherical refraction beyond ±5 diopters, cylindrical correction greater than 3 diopters, unclear ocular media, concurrent or healed CMV retinitis (a fellow eye without retinitis was eligible), scotopic pupil size <3 mm, glaucoma or suspicion of glaucoma by disk or field or intraocular pressure greater than 21 mmHg on two visits, and diseases that can cause retinopathy, like diabetes or uncontrolled hypertension.

Visual field testing and data input for the classifier

We used a Humphrey Visual Field Analyzer (model 620; Carl Zeiss Meditec, Dublin, CA, USA) with the standard automated perimetry (SAP) full-threshold program 24-2 and routine settings for evaluating visual fields. Visual fields were taken within 1 week of the ophthalmologic examination. The examination was paused for a rest period before testing the second eye. Only reliable VFs, defined as those with less than 33% false-positives, 33% false-negatives, and 33% fixation losses, were used; on this criterion, eight eyes were excluded from analysis. Naïve visual fields were not analyzed; fields were included only after initial practice.

The absolute sensitivity (in decibels) at 52 visual field locations (the 54 tested locations minus the two at the blind spot) formed a feature vector in 52-dimensional input space for each of the 124 SAP fields of normal and HIV eyes [16, 25]. SAPs from left eyes were mirrored into right-eye format so that all fields appeared as right eyes for input to the SVMs.
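The left-to-right mapping amounts to mirroring each test location about the vertical midline. A minimal sketch (a hypothetical helper, not the study's code) assuming the field is stored as a mapping from test-location coordinates in degrees to sensitivity in dB:

```python
def mirror_left_to_right(field):
    """Mirror a left-eye field into right-eye format by negating the
    horizontal coordinate of each test location. This moves the left
    eye's blind spot (15 deg temporal, plotted at x = -15) to the
    right-eye position at x = +15.

    field: dict mapping (x_deg, y_deg) -> sensitivity in dB.
    """
    return {(-x, y): db for (x, y), db in field.items()}

# two illustrative locations from a left-eye field
left = {(-15, 3): 0.0, (9, 3): 30.0}
right = mirror_left_to_right(left)
```

Flattening the mirrored dictionary in a fixed location order (excluding the two blind-spot points) then yields the 52-dimensional feature vector.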

Machine learning classifiers

Pattern recognition can be performed with machine learning classifiers. The support vector machine (SVM) is a classification method that seeks the boundary that best separates the two classes, concentrating on the sparse samples near the boundary that are hardest to separate [29, 30]. SVM learning adapts to the data and often outperforms other classifiers; its reliance on this sparse subset of the data helps it learn efficiently. Support vector techniques have been used for various classification applications in clinical medicine, including the detection of glaucoma and of HIV-related ocular disease [16, 21, 31]. The support vector method was implemented with Platt's sequential minimal optimization algorithm in commercial software (MATLAB, version 7.0; MathWorks, Natick, MA, USA). For classification of the SAP data, Gaussian (nonlinear) kernels of various widths were tested, and the chosen Gaussian kernel width was the one that gave the greatest area under the receiver operating characteristic curve (AUROC), using 10-fold cross-validation to separate training and test samples.
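The kernel-width search can be sketched as follows. This is a simplified Python illustration using scikit-learn's `SVC` as a stand-in for the study's MATLAB/SMO implementation; the synthetic 52-dimensional data and the candidate gamma grid are assumptions for demonstration, and out-of-fold decision scores are pooled into a single AUROC:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# synthetic stand-in for 52-location sensitivity vectors (dB)
X = np.vstack([rng.normal(30, 2, (60, 52)),    # "normal" fields
               rng.normal(27, 3, (60, 52))])   # "HIV" fields
y = np.array([0] * 60 + [1] * 60)

best_gamma, best_auc = None, -1.0
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for gamma in [1e-4, 1e-3, 1e-2, 1e-1]:        # candidate kernel widths
    clf = SVC(kernel="rbf", gamma=gamma)
    # out-of-fold decision scores from 10-fold cross-validation
    scores = cross_val_predict(clf, X, y, cv=cv,
                               method="decision_function")
    auc = roc_auc_score(y, scores)
    if auc > best_auc:
        best_gamma, best_auc = gamma, auc
```

In scikit-learn's RBF parameterization, larger `gamma` corresponds to a narrower Gaussian kernel; the loop retains whichever width maximizes the cross-validated AUROC.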

Training and testing machine learning classifiers

In this study, we used 10-fold cross validation, which randomly split each class into ten equal subsets. The classifier was trained on a set that combined nine of the ten partitions, and the 10th partition served as the testing set. This procedure was performed ten times, with each partition having a chance to serve as the test set.
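The partitioning step can be sketched directly (a hypothetical helper shown for one class of 100 samples; the study split each class separately before combining partitions):

```python
import numpy as np

def ten_fold_partitions(n_samples, seed=0):
    """Shuffle the sample indices and split them into ten roughly
    equal partitions."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), 10)

folds = ten_fold_partitions(100)
for i, test_idx in enumerate(folds):
    # the remaining nine partitions form the training set;
    # each partition serves exactly once as the test set
    train_idx = np.concatenate(
        [f for j, f in enumerate(folds) if j != i])
```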

Performance measure of trained machine learning classifiers

Receiver operating characteristic (ROC) curves display the discrimination of each classifier as the separation threshold is moved from one end of the data to the other. We also generated an ROC curve representing a chance decision, to permit comparison of the machine learning classifiers against chance; a predictor performing at chance has an AUROC of 0.5, while an ideal classifier gives an AUROC of 1.0. We computed p-values to test the null hypothesis of no difference when comparing the AUROCs of classifiers [25, 32]. We trained and tested SVM to distinguish fields of normal subjects from those of the high CD4 group, and fields of normal subjects from those of the low CD4 group.
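The threshold sweep that generates an ROC curve, and the trapezoidal AUROC under it, can be computed directly. This is a simplified sketch (tied scores are not given special handling; the function names are illustrative):

```python
import numpy as np

def roc_curve_points(scores, labels):
    """Sweep the decision threshold from the highest score to the
    lowest; at each step accumulate the true-positive rate (TPR)
    and false-positive rate (FPR)."""
    order = np.argsort(-np.asarray(scores, float))
    y = np.asarray(labels)[order]
    tpr = np.cumsum(y) / y.sum()
    fpr = np.cumsum(1 - y) / (1 - y).sum()
    # prepend the (0, 0) corner so the curve starts at the origin
    return np.concatenate([[0.0], fpr]), np.concatenate([[0.0], tpr])

def auroc(scores, labels):
    """Area under the ROC curve by the trapezoidal rule."""
    fpr, tpr = roc_curve_points(scores, labels)
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
```

A perfectly separating score list gives an AUROC of 1.0, and randomly interleaved scores fall toward 0.5, matching the chance baseline described above.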

Dimension reduction by feature selection

We trained the machine learning classifiers with the full feature set (SVM full) and, in an effort to improve performance, with a performance-peaking subset of near-optimal features derived by feature selection [21]. To create small subsets with the best features, we used backward elimination with SVM (SVM back); previous research found backward elimination to work better than forward selection on visual field data [21]. Backward elimination started with the full feature set. At each step, the feature whose removal maximally increased (or minimally decreased) SVM performance was eliminated, and the process was repeated sequentially down to one feature. A near-optimal feature set was then identified by choosing the reduced subset with peak performance.
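The elimination loop can be sketched as follows. This is an illustrative Python version, with scikit-learn's `SVC` standing in for the study's SMO implementation; the `cv_auc` helper, the fixed `gamma`, and the tiny synthetic data set are assumptions for demonstration:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import roc_auc_score

def cv_auc(X, y, feats):
    """Cross-validated AUROC of an RBF-kernel SVM on a feature subset."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    s = cross_val_predict(SVC(kernel="rbf", gamma=0.1), X[:, feats], y,
                          cv=cv, method="decision_function")
    return roc_auc_score(y, s)

def backward_elimination(X, y):
    """At each step remove the one feature whose removal yields the
    highest AUROC; record (subset, auc) from the full set down to
    a single feature, then pick the subset at the peak AUROC."""
    feats = list(range(X.shape[1]))
    path = [(list(feats), cv_auc(X, y, feats))]
    while len(feats) > 1:
        best_auc, drop = max((cv_auc(X, y, [f for f in feats if f != r]), r)
                             for r in feats)
        feats.remove(drop)
        path.append((list(feats), best_auc))
    return path

# tiny synthetic demo: 4 features, the first two informative
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (40, 4)), rng.normal(0, 1, (40, 4))])
X[40:, :2] += 1.5                      # shift the informative features
y = np.array([0] * 40 + [1] * 40)
path = backward_elimination(X, y)
```

Note the cost: eliminating from d features requires on the order of d² cross-validation runs, which is why the demo uses only four features rather than 52.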

Results

There were 132 subjects (118 men, 14 women; 190 eyes) in the HIV group: 70 Hispanic, 48 Caucasian, 12 African-American, and two Asian-Pacific patients. The normal group consisted of 52 HIV-negative individuals (61 eyes) with a mean age ± standard deviation of 48.5 ± 8.2 years and a mean spherical equivalent of −0.70 ± 0.43 diopters (Dsph). The high CD4 group had 39 patients (70 eyes), with a mean age of 47.1 ± 8.1 years and a mean spherical equivalent of −1.20 ± 0.46 Dsph. There were 38 patients (59 eyes) in the low CD4 group, with a mean age of 46.5 ± 7.8 years and a mean spherical equivalent of −0.91 ± 0.57 Dsph. The AUROCs and their standard deviations for the various combinations of CD4 level and feature set size are shown in Table 1, which also lists the p-values for selected comparisons.

Table 1 AUROCs and p-values for comparisons of ROC curves generated by classifiers separating HIV-positive and normal eyes (maximum AUROC in italics, p ≤ 0.05 in bold face)

HIV patients with low CD4 counts

With SVM full, the AUROC was 0.790 ± 0.042 (Table 1, Fig. 1a), differing significantly from a chance decision (p < 0.00005) (Table 1). Backward elimination selected a peak-performing 11-feature subset (arrow in Fig. 2a). The AUROC for SVM back improved significantly to 0.833 ± 0.037 (p = .050) compared with SVM full. The bold dashed curve is the average of the curves generated by the standard method of backward elimination [29] (Fig. 2). The eight most significant field locations were mapped onto the standard visual field display; eight were chosen to match the size of the best feature subset for high CD4 patients (see below). The majority of the top eight features were located near the blind spot, with a preponderance superiorly and temporally (Fig. 3a).

Fig. 1

Receiver operating characteristic (ROC) curves for support vector machine (SVM) and the Statpac global indices, mean deviation (MD) and pattern standard deviation (PSD), in human immunodeficiency virus (HIV)-positive patients. SVM full: ROCs generated by SVM trained on all 52 field locations. SVM back: ROCs generated from the subset with peak performance. The chance curve is the effect of SVM learning to distinguish classes with data randomly distributed between them. a ROCs distinguishing low CD4 eyes from normal. b ROCs distinguishing high CD4 eyes from normal

Fig. 2

Performance curves measuring the area under the receiver operating characteristic curve (AUROC) for the best feature combination at each subset size generated by backward elimination, from one feature to all 52 features. The bold curve averages the curves (thin dark gray curves) derived from standard backward elimination. The peak (arrow) marks the subset size with the best performance. a Curves generated by backward elimination applied to low CD4 vs normal eyes. b Curves generated by backward elimination applied to high CD4 vs normal eyes

Fig. 3

Ranking by backward elimination. a Location of field defect in low CD4 group showing that the top eight field locations tend to be clustered superior temporally, close to the blind spot. b Location of field defect in high CD4 group showing that the top eight field locations tend to be without discernable pattern

MD and PSD produced AUROCs of 0.813 ± 0.039 and 0.723 ± 0.047, respectively. MD was better than PSD (p = .03). SVM back was significantly more effective than PSD (p = .004) (Table 1), but not significantly better than MD (p = .41).

HIV patients with high CD4 counts

The AUROC with SVM trained on the full feature set of 52 SAP locations was 0.664 ± 0.047 (Table 1, Fig. 1b), significantly better than chance (p = 0.041). Backward elimination produced subsets whose performance peaked at eight features (arrow in Fig. 2b); the AUROC with the eight-feature subset was 0.733 ± 0.044. The top eight visual field locations were diffusely scattered (Fig. 3b).

The Statpac indices, MD and PSD, generated AUROCs of 0.651 ± 0.048 and 0.587 ± 0.050, respectively. SVM back was significantly better than PSD (p = .0007) but not MD (p = .10).

Discussion

This study in the HAART era confirmed previous reports that eyes of HIV patients with low CD4 T-lymphocyte counts have retinopathy damage that affects the visual field [3, 4, 8]. SVM trained with the full set of visual field locations, optimized SVM trained on the best subset of locations, MD, and PSD all distinguished visual fields of HIV subjects with low CD4 counts from fields of normal eyes. SVM and optimized SVM conferred no advantage over MD; a larger number of examples in each group would be necessary to determine whether optimized SVM differs from MD.

MD outperformed PSD in low CD4 eyes (p = .03). As a global measure of decreased field sensitivity, MD does not indicate whether the depressions are focal, regional, or diffuse. PSD, designed to suppress global depression, is more responsive to local and regional field depression, and is more sensitive than MD to glaucomatous field defects, which tend to be regional [25]. The better performance of MD in low CD4 HIV eyes suggests that the field defects may be diffusely scattered and less likely to be focal or regional. Also, since these patients were not old, PSD's ability to discount diffuse depression from cataract conferred no benefit.
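The contrasting behavior of the two indices can be illustrated with simplified, unweighted analogs (the actual Statpac formulas are age-corrected and variance-weighted; the deviation values below are illustrative only):

```python
import numpy as np

def simple_indices(total_deviation_db):
    """Unweighted analogs of the Statpac global indices:
    MD ~ mean of the pointwise deviations from age-normal sensitivity;
    PSD ~ standard deviation of those deviations (localized loss)."""
    d = np.asarray(total_deviation_db, float)
    return d.mean(), d.std(ddof=1)

diffuse = np.full(52, -3.0)              # uniform 3 dB depression
focal = np.zeros(52)
focal[:6] = -26.0                        # deep loss at six locations

md_d, psd_d = simple_indices(diffuse)
md_f, psd_f = simple_indices(focal)
# both fields share MD = -3 dB, but only the focal field has a large PSD
```

The two fields depress MD equally, yet only the focal loss raises PSD, which is consistent with the interpretation that MD's advantage here reflects diffusely scattered rather than regional defects.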

Optimizing SVM significantly improved its performance on low CD4 eyes (p = .05), reducing the likelihood that the choice of the top field locations was due to the vagaries of the data set. The standard backward elimination curve found no increase in accuracy with the use of more than the top 11 locations (Fig. 2a). The eight most important field locations for distinguishing the low CD4 HIV eyes from normal tended to be superior, temporal, and close to the blind spot (Fig. 3a). This tendency located the retinal damage in low CD4 eyes to regions mostly close to the optic nerve, inferior, and nasal. It is not clear whether the damage was most prominent near the disk, or whether it was just more easily detected there.

SVM, optimized SVM, and MD were able to distinguish eyes of HIV patients with high CD4 T-lymphocyte counts from normal eyes, though with less assurance than for low CD4 eyes. PSD was no better than chance in making the distinction in high CD4 eyes (p = .17), though it was better than chance for low CD4 eyes (p = .002). The diminished assurance indicates that the visual field defects were fewer and shallower in high CD4 eyes than in low CD4 eyes. It is unclear whether the smaller difference from normal in high CD4 eyes reflects resolution of some defects in that group, or a relatively greater depression of the field around the optic nerve in the low CD4 subjects. This is a cross-sectional analysis; resolving this question would require longitudinal observation and a larger data set. A comparison of field defects between patients whose CD4 counts remained <100 at the time of testing and those whose counts recovered would be interesting, as it bears on whether these defects are reversible with recovery of the CD4 count.

Optimizing SVM did not significantly improve performance on high CD4 eyes compared with SVM full. This reduces confidence in the feature ranking for high CD4 eyes and leaves the location of the significant defects uncertain. The eight most important field locations were scattered without a discernible pattern.

Our previous observation at the beginning of the HAART era showed a pattern of visual loss sparing the papillomacular bundle, with associated damage to the inferior retina external to the posterior pole [8]. Similarly, the papillomacular area was spared in this cohort, as was the inferior retina outside the arcades. The diffuse pattern of damage has also been shown when analyzing only one eye per patient, in a similar but not identical HIV-positive cohort [32]. It is tempting to speculate that HAART may affect the extent of retinal damage; longitudinal observation could shed more light on this complex problem.

HIV retinopathy is a microvasculopathy that causes peripapillary hemorrhages, microangiopathy, and cotton-wool spots in retinae that have not been secondarily infected [9, 10, 33, 34]. Inner retinal thinning was previously reported with OCT and scanning laser polarimetry of low CD4 eyes, with inferior thinning being more prominent [12, 13]. RNFL thinning was found even in patients with good immune status in the HAART era [35]. Retinal microinfarctions may be responsible for the RNFL defects and field deficits. Similar findings were also reported in HIV-positive children using the third-generation OCT [36]. Although fields from high CD4 eyes appear mostly normal to human perimetric experts, this study found that the trained machine learning classifiers and MD could each distinguish between eyes from high CD4 patients and normal eyes.

Pattern recognition has proved extremely useful in this clinical scenario. Even if testing the visual field in HIV patients currently does not have the same relevance for managing patients as it does in glaucoma, it has served to uncover information about the disease process. To assist the management of patients with HIV, a future approach could be the establishment of threshold values for MD or for SVM that would enable identification of individuals who have retinal damage.

SVM, especially when the feature set is optimized by dimensionality reduction, is a sensitive classification method that approaches the performance of the theoretical optimal classifier for classifying visual fields [16, 18, 24, 25]. Optimized machine learning classifiers appear to be a valid approach to detecting subtle abnormalities in medical tests with complex multidimensional measurements. This concept was demonstrated in our previous report analyzing complex datasets from mfERG in HIV-positive patients [15].

In summary, we have confirmed that eyes from low CD4 HIV patients have visual field measurements indicating retinal damage, and that high CD4 eyes also have retinal damage. We have demonstrated that a generalized learning classifier, SVM, is effective at learning which eyes have field defects, even when these defects are subtle, and we have discovered that MD, a statistical classifier tuned to visual field data, is also effective in distinguishing both high CD4 fields and low CD4 fields from normal. An important message to people at risk of HIV and to their providers is that HIV infection may produce ocular damage under HAART, even if there is good immune status.