Key Summary Points

Dark adaptation has been proposed as a possible functional biomarker for age-related macular degeneration (AMD).

The last review on the measurement of dark adaptation in people with AMD was performed in 2006, and since then, new studies have been published and new instruments to measure dark adaptation have been created.

This systematic review found overwhelming evidence in the literature of an association between impaired dark adaptation and AMD.

Further study is needed to investigate its discriminatory power to better differentiate early-stage AMD and normative macular ageing, along with additional data on repeatability and reproducibility of measurement.

There is evidence that structural abnormalities such as reticular drusen are associated with a prolongation of dark adaptation time, while few studies have explored an association between dark adaptation and other measures of visual function or patient-reported outcome measures.

Digital Features

This article is published with digital features, including a summary slide, to facilitate understanding of the article. To view digital features for this article go to https://doi.org/10.6084/m9.figshare.13353146.

Introduction

Age-related macular degeneration (AMD) is the primary cause of loss of sight in ageing populations of the developed world [1]. Currently, effective treatment can slow the progression of neovascular (wet) AMD, and extensive research is underway to develop effective interventions for non-neovascular (dry) AMD [2, 3]. It is therefore important to identify AMD in its earliest stages so disease progression can be potentially delayed. In order to reduce the duration and costs associated with trials for new treatments of AMD, sensitive biomarkers for disease progression are also required [4]. The retinal pigment epithelium (RPE) is believed to be the fulcrum of the AMD disease process [5], whilst the provision of 11-cis retinal from the RPE to the photoreceptors is the limiting factor in the rate of visual pigment regeneration in photoreceptors. Therefore, dark adaptation (DA) has been proposed as a functional biomarker for AMD onset and progression [6].

The term ‘dark adaptation’ describes the recovery of visual sensitivity to light stimuli in a dark environment after light exposure has bleached a significant proportion of visual pigment. DA is determined in part by the rate (speed) of visual pigment regeneration in photoreceptors, which is also dependent on the functional integrity of the choroidal circulation, Bruch’s membrane and, crucially, the RPE [7]. Abnormal DA manifests in reduced rates of recovery of rod and cone sensitivity, delayed rod-cone break (RCB) and/or an increased photoreceptor absolute threshold [6].

The measurement of DA can likely be traced back to the nineteenth century, when Aubert [8] first described DA using the glow of a just visible platinum wire as a stimulus. Developments have since been made, from the early DA curves plotted by Hecht, Haig and Chase [9] to the landmark mathematical model of DA by Lamb and Pugh [7]. More recently, dark adaptometers have been developed for clinical use, including instruments designed to assess DA in conditions like AMD.

It has been suggested that delayed DA may manifest before the appearance of clinical features of AMD such as drusen and focal pigmentary changes [10, 11]. Indeed, DA has been reported to be a diagnostic indicator of AMD [12, 13]. Time taken for DA to occur has also been shown to progressively increase with increasing severity of AMD, suggesting it has potential not only as a diagnostic tool but also as a biomarker for disease progression [14,15,16].

The most recent systematic review considering DA assessment in people with AMD was published as part of a wider review on visual function tests in 2006 [17]. Since then, new studies on DA measurement have been published. Moreover, some instruments for measuring DA have become commercially and widely available, such as the AdaptDx adaptometer (MacuLogix, Hummelstown, PA, USA) [12, 13]. DA is also one of the candidate biomarkers being examined by a large multi-centre longitudinal study aiming to find better ways of detecting progression in AMD [4].

In this systematic review we aim to evaluate current methodology used to assess DA in people with AMD. The review specifically examines the evidence of diagnostic precision of these methods in detecting the onset of AMD, as well as the sensitivity to disease progression. The report is focused on the AdaptDx instrument as the device which is used most widely in the recent literature. We also explore evidence of associations between DA and functional vision measures, structural measures, patient-reported outcome measures (PROMs) and outcomes from performance-based studies (involving assessment of visually guided tasks such as face recognition or mobility assessments). This systematic review follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, and a detailed protocol has been published on PROSPERO (registration number: CRD42019129486).

Methods

Eligibility Criteria for Considering Studies for this Systematic Review

To be eligible for inclusion, studies had to: (1) be published in the English language; (2) be dated from January 2006 to January 2020, to include studies following the review by Hogg and Chakravarthy [17]; (3) include participants with AMD (of any stage); and (4) include a dynamic measurement of rod and/or cone DA. Studies were excluded if they were review articles, letters to the editor, published protocols or conference abstracts.

Search Methods for Identifying Studies

The following databases were searched for articles published between 01/01/2006 and 27/01/2020: CINAHL, MEDLINE, PsycINFO and PsycARTICLES (via EBSCO) and EMBASE and AMED (via OVID). An indicative list of search terms and the search query used is provided in Table S1 (see electronic supplementary material for details). Key terms regarding AMD, DA and dynamic photostress testing were used. The reference lists of the included literature were examined as a further source of relevant studies. Covidence software [18] (Veritas Health Innovation, https://www.covidence.org) was used for extraction, organisation and screening of the literature.

Study Selection

Duplicates were automatically removed by Covidence software [18]. Two authors (BEH and DJT) independently assessed eligibility for inclusion by screening titles and abstracts. The same two authors then independently read the full texts of potentially eligible studies, with any disagreements about inclusion resolved through discussion and then arbitration by a third author (AMB).

Data Collection and Risk of Bias Assessment

Two authors (BEH and DJT) independently evaluated the quality of the included studies. The Critical Appraisal Skills Programme tool [19] (CASP; Critical Appraisal Skills Programme. https://casp-uk.net/) was used to assess cohort, case–control and randomised controlled trial study designs. The Joanna Briggs Institute critical appraisal tool [20] (JBI; Joanna Briggs Institute, https://joannabriggs.org) was used to assess cross-sectional and case-series design types. These tools are recommended by the UK National Institute for Health and Care Excellence [21] (NICE; National Institute for Health and Care Excellence, https://www.nice.org.uk/) guidelines. The summarised quality appraisal results are shown in Table S2 (see electronic supplementary material for details).

Data Synthesis and Analysis

Study characteristics were extracted into a data synthesis table (Table S3; see electronic supplementary material for details). A meta-analysis was not appropriate given the range of stimulus and bleach parameters employed by different studies. Data were analysed based upon reported DA and/or photostress procedure (Table S4; see electronic supplementary materials), diagnostic accuracy and repeatability measures reported (Table S5; see electronic supplementary materials) and reported vision and structural outcome measures compared to parameters of DA (Table S6; see electronic supplementary materials). Data on reported outcome measures compared to parameters of DA, unrelated to vision, were also recorded but not analysed (Table S7; see electronic supplementary materials). This article is based on previously conducted studies and does not contain any studies with human participants or animals performed by any of the authors.

Results

The search of bibliographic databases performed on 27th January 2020 identified 512 publications. During title and abstract screening, most studies (n = 397) were excluded, principally for not reporting measures of dynamic DA or not including a study population with people who had AMD. Sixty-seven of the full texts screened were removed, primarily because their format was ineligible for this systematic review (e.g. conference abstract), resulting in 48 papers deemed appropriate for the final review process (Fig. 1).
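The counts reported above can be cross-checked with simple arithmetic. The following is a minimal Python sketch, assuming the three reported figures account for every record; the function and variable names are illustrative, and the number of full texts assessed (115) is implied by the text rather than stated in it.

```python
def screening_flow(identified, excluded_at_screening, excluded_at_full_text):
    """Return (full texts assessed, studies included) for a simple two-stage screen."""
    # Records surviving title/abstract screening go forward to full-text review.
    full_texts = identified - excluded_at_screening
    # Records surviving full-text review are included in the synthesis.
    included = full_texts - excluded_at_full_text
    return full_texts, included


# Figures reported in the text: 512 identified, 397 and 67 excluded.
full_texts, included = screening_flow(512, 397, 67)
print(full_texts, included)  # 115 48
```

The result agrees with the 48 papers reported as included.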

Fig. 1 Study selection process

The 48 included publications were subjected to quality appraisal. The CASP tool was used for cohort (n = 4), case–control (n = 20) and randomised controlled trial (n = 4) study designs. The JBI tool was used for cross-sectional (n = 17) and case-series (n = 3) study designs. The grading of all papers can be found in Table S2 (see electronic supplementary material for details). The main source of bias was selection bias, whereby most observational studies did not explicitly report their recruitment strategies. Moreover, some studies were conducted on small sample sizes that could lead to systematic over- or under-estimation of effects. In addition, a large percentage of the studies did not report standard deviations (SD) or confidence intervals (CI) for the DA parameter, which made it difficult to judge the precision of the results. These factors were considered when analysing the results.

What follows is a short narrative summary of the main findings of the 48 studies included in this systematic review, with a focus on the 21 studies that used the AdaptDx and AdaptRx devices (Apeliotus Technologies, Atlanta, GA, USA). The other 27 studies used a variety of lab-based dark adaptometers, dynamic photostress test devices and the occasional commercially available instrument like the Roland Consult Dark Adaptometer (RCDA; Roland Consult GmbH, Germany) [22]. Full details about each study are in Tables S3–7 (see electronic supplementary materials for details).

AdaptDx: Testing Procedures

AdaptDx and AdaptRx were used to measure rod-mediated DA in 21 studies [10, 12,13,14, 16, 23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38]. The AdaptRx methodology appeared identical to the AdaptDx and was reported once in the reviewed literature by Flamendorf et al. [23].

The most frequently reported DA parameter was rod-intercept time (RIT) (20 out of 21 studies) [10, 12,13,14, 16, 23,24,25,26,27,28,29,30,31,32,33,34,35,36, 38], an estimate of the time taken for the rods to recover to an established criterion sensitivity (i.e. 5 × 10⁻³ cd/m²) after focal bleaching [13]. Clark et al. [37] was the only study which fitted AdaptDx threshold data with a biological model of rod-mediated DA and analysed the slopes of the second and third components of recovery, the final threshold recorded during the 20 min test period, and ‘mean threshold’ (the average of all thresholds after 300 s post-bleach).
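To make the RIT definition concrete, the sketch below estimates the time at which a series of thresholds first reaches the criterion, using linear interpolation between successive measurements. This is only an illustration: the source does not describe how the AdaptDx derives RIT, so the interpolation approach, the function name and the example data are all assumptions.

```python
import math

# Criterion from the text (5 x 10^-3 cd/m^2), expressed in log10 units.
CRITERION_LOG = math.log10(5e-3)


def rod_intercept_time(times_min, log_thresholds, criterion=CRITERION_LOG):
    """Return the first time (min) at which threshold falls to the criterion.

    Thresholds are log10 luminance values measured at the given times.
    Returns None if the criterion is never reached (a censored test).
    """
    points = list(zip(times_min, log_thresholds))
    if not points:
        return None
    if points[0][1] <= criterion:
        return points[0][0]
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        if y0 > criterion >= y1:
            # Linear interpolation between the two bracketing measurements.
            fraction = (y0 - criterion) / (y0 - y1)
            return t0 + fraction * (t1 - t0)
    return None


# Hypothetical recovery data, not taken from any reviewed study.
times = [0, 2, 4, 6, 8]
thresholds = [-0.5, -1.0, -1.8, -2.6, -3.0]
rit = rod_intercept_time(times, thresholds)
print(round(rit, 2))  # about 5.25 min for this made-up series
```

A return value of `None` corresponds to a participant who did not recover within the test duration, which is the situation that the cut-off rules described later in this section are meant to handle.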

There were different approaches to bleaching magnitude and target position reported in the literature that varied depending upon the aim of the study. For example, the most commonly reported bleaching procedure in these studies was an 82–83% photoreceptor bleach using a flash of 5.8 × 10⁴ scotopic cd·s/m² intensity or equivalent for 0.25 ms [10, 13, 14, 23, 27,28,29,30,31, 33, 36, 37] (12 out of 21 studies). This bleaching light has been reported by Jackson and Edwards [13] to be sufficient in magnitude to demonstrate impaired DA in people with early AMD when using the 20 min duration AdaptDx protocol. Another reported bleaching procedure in the literature was a 76% photoreceptor bleach using a flash of 1.8 × 10⁴ scotopic cd·s/m² intensity for 0.8 ms [12, 16, 24,25,26, 32, 34,35,36, 38] (10 out of 21 studies), which has also been reported as sufficient to highlight the AMD-related DA impairment [12]. Chen et al. [36] reportedly used both an 82% and a modified 76% bleaching procedure. Only a handful of studies explicitly reported the wavelength of the bleach as 505 nm (7 out of 21 studies) [12, 24,25,26, 32, 34, 38].

The most commonly reported testing location of the bleaching procedures and subsequent location of threshold measurements was centred at 5° on the inferior vertical meridian (19 out of 21 studies) [10, 12,13,14, 23,24,25,26,27,28,29,30,31,32,33,34, 36,37,38], which is consistent with evidence of preferential damage to the parafoveal retina in the earliest stages of AMD [39, 40]. Three studies reported a test location of 11/12° eccentricity [16, 35, 36]. In two of these cases, both 5° and 12° locations were evaluated [35, 36]. Binns et al. [35] recently compared different bleaching protocols (65%, 70% and 76% photoreceptor bleach at either 5° or 12° inferior field). It was concluded that 76% bleach at 12° eccentricity allowed for separation between the groups in the sample and reduced recovery time to under 20 min, compared to more extended recovery times in the 5° location [35].

The AdaptDx has a threshold stimulus size of 1.7°–2.0° with a wavelength of 500–505 nm [10, 12,13,14, 16, 23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38], centred on a bleached area subtending 4°. Eight studies reported that the stimulus was first presented 15 s after bleaching onset [10, 16, 27,28,29,30, 35, 37] while others reported that it started immediately after bleaching [24, 34]. The most frequently reported maximum test time (or cut-off time) was 20 min [10, 12, 13, 24,25,26, 28,29,30,31,32, 37, 38] (13 out of 21 studies) followed by 40 min [16, 23, 27, 33, 36] (5 out of 21 studies), 45 min [14, 34] (2 out of 21 studies) and 30 min [35] (1 out of 21 studies). For the rapid procedure, the DA functions were truncated to 6.5 min [12].

Twelve studies assigned the participant an RIT value (in most cases, the cut-off time) if they failed to recover within the duration of the test [13, 14, 23,24,25,26,27, 33,34,35,36, 38]. It was not clearly reported by six studies what this value was set to if the participant surpassed the cut-off time [10, 28,29,30,31, 37]. Two studies appeared to use an estimated recovery time based on the observed trend up to the maximum recording time, but when the parameter was unable to be extrapolated, it was again set to the experimental cut-off time [12, 32]. Conversely, Owsley et al. [16] set the value to ‘indeterminate’ rather than assign an estimated RIT.
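The different ways of handling participants who did not recover within the test duration can be sketched as follows. This is a hedged illustration of two of the policies described above (assigning the cut-off time, or recording the result as indeterminate); the class, function and policy names are hypothetical and do not come from any reviewed study or device.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RitResult:
    value_min: Optional[float]  # RIT in minutes, or None if indeterminate
    censored: bool              # True if recovery was not observed in the test


def assign_rit(measured_min, cutoff_min, policy="cutoff"):
    """Apply a cut-off policy to a measured RIT (None means not recovered)."""
    if measured_min is not None and measured_min <= cutoff_min:
        return RitResult(measured_min, False)
    if policy == "cutoff":
        # Assign the maximum test time as a censored value.
        return RitResult(float(cutoff_min), True)
    if policy == "indeterminate":
        # Record no value rather than an estimate.
        return RitResult(None, True)
    raise ValueError(f"unknown policy: {policy}")


print(assign_rit(9.4, 20))                           # recovered within the test
print(assign_rit(None, 20))                          # censored at 20 min
print(assign_rit(None, 40, policy="indeterminate"))  # no value assigned
```

Keeping the `censored` flag alongside the value matters because a mean RIT computed over cut-off values underestimates the true recovery time in the censored participants.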

AdaptDx: Diagnostic Precision, Repeatability and Longitudinal Studies

Three studies specifically reported estimates of diagnostic performance (sensitivity and specificity) of AdaptDx to separate people with AMD from visually healthy controls [12, 13, 35]. Only one of these studies had this as its primary aim [12]. A fourth reported the prognostic performance of the test at identifying healthy individuals who would develop AMD within 3 years of baseline testing [29].

Jackson and Edwards [13] introduced the AdaptDx 20 min protocol in a study of controls [n = 17, eight young participants (mean age 32.6 years) and nine old participants (mean age 73.1 years)] and participants with AMD (n = 17). The threshold for being test negative was ≤ 12.5 min; this yielded a sensitivity of 88% and specificity of 100%. Notably, nine participants with AMD [incidentally, the whole of the intermediate (iAMD) and late AMD cohort] were unable to complete the test so were assigned an RIT of 20 min. The small sample size yielded wide CIs and is considered insufficient to allow analysis of the true diagnostic precision of the 20 min procedure [13].

Jackson et al. [12] reported results from a prospectively planned cross-sectional study (at three centres) of people with AMD (n = 127) and visually healthy people (n = 21). The reference standard was clinical examination and grading of colour fundus photographs (CFP). The AdaptDx rapid protocol was the index test with RIT of ≤ 6.5 min as the threshold for being test negative; this yielded a sensitivity of 91% and a specificity of 91%. Fourteen people reportedly could not do the examination; this is a noteworthy source of bias, because the authors did not include these participants in their estimates of diagnostic precision. Moreover, the small sample of controls meant that the lower bound of the 95% CI for specificity was ~ 70%, but this was not reported in the abstract. Further examination revealed that sensitivity was reduced (81%) in people whom investigators classified as having early AMD. The groups were not age-matched (controls were on average ~ 8 years younger) and this is another limitation of the results [12].
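The effect of a small control group on the precision of a specificity estimate can be illustrated with a confidence interval for a proportion. The sketch below uses the Wilson score interval and assumes that 19 of the 21 controls tested negative; that count is inferred from the reported 91% specificity and is not stated in the text, and the interval method used in the original analysis is not known.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Assumed counts: 19 true negatives among 21 controls (about 91%).
lower, upper = wilson_interval(19, 21)
print(f"specificity 95% CI: {lower:.2f} to {upper:.2f}")
```

Under these assumptions the lower bound comes out at roughly 0.71, in line with the ~ 70% noted above.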

Binns et al. [35] primarily aimed to determine optimal test conditions for evaluating DA in iAMD using the AdaptDx. The authors used estimates of diagnostic precision (area under the receiver operating characteristics [AuROC] curves) to conclude that a 76% bleach at 5° eccentricity provided ‘optimal’ separation between AMD and controls; however, this was at the expense of a long recording duration. An alternative test location was suggested at 12° to provide adequate diagnostic accuracy whilst reducing recording time by more than 50%. The authors reported estimates of diagnostic precision at different cut-offs (sensitivity and specificity of 89% and 63%, respectively, for the optimal condition, for example), but the small sample size (n = 16 people with AMD and n = 10 controls) was insufficient to allow evaluation of the true diagnostic precision of the procedure [35].

In a well-designed cohort study, albeit at a single centre, Owsley et al. [29] primarily aimed to examine whether slowed RIT (measured with an extended AdaptDx protocol) in adults with normal macular health (n = 325) at baseline was associated with the incidence of AMD 3 years later. After adjustment for age and smoking, those with abnormal DA (defined as RIT ≥ 12.3 min) at baseline were approximately twice as likely to have AMD in that eye by the time of the follow-up visit as those who had normal DA at baseline. The purpose of this study was not to evaluate the diagnostic precision, but the authors reported sensitivity and specificity of 33% and 83%, respectively, for incident AMD in patients in normal macular health at baseline [29].

Only one paper explicitly provided data about the repeatability of DA measurement with the AdaptDx. Flamendorf et al. [23], using the AdaptRx, conducted a cross-sectional, single-centre study of participants (> 50 years) with and without AMD (n = 116). Patients were stratified by fundus features, with 36% having ‘no large drusen’. The authors primarily demonstrated that RIT was associated with age, AMD severity and subfoveal choroidal thickness. A subsample of participants (n = 87) underwent repeat examination ~ 1 week later, and the authors reported a mean (± SD) RIT difference of 0.02 ± 2.26 min; this translates to 95% limits of agreement of −4.41 to 4.46 min, which is quite wide given the average RIT. The authors claimed that repeatability did not differ significantly between AMD groups but it is not explicitly clear how they analysed this. The authors did provide a Bland–Altman graph showing the differences in test–retest variability which did not seem to increase with worse RIT. Yet on inspection, it appeared those participants recording a RIT below 10 min had better RIT repeatability [23].
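The 95% limits of agreement quoted above follow the usual Bland–Altman construction, mean difference ± 1.96 × SD of the differences. A minimal sketch is given below; the function names are illustrative, and the only inputs taken from the text are the rounded summary values (0.02 and 2.26 min).

```python
import statistics


def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Bland-Altman 95% limits of agreement from summary statistics."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


def limits_from_pairs(test, retest, z=1.96):
    """Same limits computed from paired test and retest measurements."""
    diffs = [a - b for a, b in zip(test, retest)]
    return limits_of_agreement(statistics.mean(diffs), statistics.stdev(diffs), z)


lower, upper = limits_of_agreement(0.02, 2.26)
print(f"{lower:.2f} to {upper:.2f} min")
```

With the rounded inputs this gives approximately −4.41 to 4.45 min, which is close to the reported limits; the small difference in the upper limit is presumably due to rounding of the published mean and SD.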

There is clear evidence for an association between delayed DA and presence of AMD. However, we only identified five longitudinal studies assessing DA measurements from AdaptDx over time [14, 16, 29, 30, 36].

Jackson et al. [14] prospectively collected DA, visual acuity (VA) and CFP at baseline and at 6 months and 12 months in people with AMD (n = 26). The investigators observed worsening of DA in five participants over 12 months of observation, despite seemingly stable VA and fundus appearance. The study was limited by the small sample size and had a small control group (n = 6) who were not age-matched. Four AMD participants exhibited large changes in DA at 6 months, which was inconsistent with their DA at 12 months [14].

The previously mentioned cohort study reported by Owsley et al. [29] focused on elderly people without AMD and showed that delayed DA at baseline was associated with development of AMD at 3 years. Owsley et al. [30] reiterated this finding in another paper comparing the association of impaired visual function with AMD incidence 3 years later (using the previously reported DA results for 363 eyes) [29]. Impaired mesopic acuity was found to have a weaker association with AMD incidence than DA [30]. The same research group was responsible for another 2-year longitudinal study reporting on eyes with iAMD (n = 30) and measurable RITs [16]. This paper reported that rod-mediated DA slowed in iAMD over 2 years in most eyes and gave useful data on the expected average RIT change over time, but concluded that there was wide variability both in RIT at baseline and in the extent to which it increased over 24 months.

Chen et al. [36] conducted a notable longitudinal study of changes in RIT in people with a range of AMD severities (n = 77) over 4 years; they indicated that decline in DA accelerated in eyes with greater AMD severity and especially in eyes that had subretinal drusenoid deposits (SDD) both at baseline and at 4 years (see below).

AdaptDx: Relationship with Other Measures

The secondary aim of this systematic review was to assess the relationship of DA measures with other factors. The latter include measures of visual function, structural measures from photographs or images, PROMs and potentially performance-based metrics within the literature. For this review, all factors directly (and statistically) compared to DA parameters were recorded for each study in their entirety (Tables S3–7; see electronic supplementary material for details). These were then categorised into relevant themes and are summarised briefly below.

The most frequent factor directly compared to RIT in the literature was the presence and severity of AMD (19 out of 21 studies) [10, 12,13,14, 16, 23, 24, 26, 27, 29,30,31, 33,34,35,36,37,38]. Authors generally reported that the presence of AMD was associated with slowed or ‘abnormal’ RIT [10, 12,13,14, 16, 23, 24, 26, 27, 29, 30, 33,34,35,36,37,38] when compared to visually healthy controls [10, 12,13,14, 23, 24, 26, 27, 29, 30, 33,34,35,36,37,38], although one cross-sectional study found no significant difference in average RIT between people with AMD and controls once data were age-adjusted [31]. Most of these studies reported a difference in RIT between controls and the AMD group as a whole [12,13,14, 37] or between controls and people with iAMD/advanced AMD [23, 24, 26, 33,34,35,36, 38]. While Cocce et al. [38] found a significant difference in RIT between early AMD and iAMD, only a few studies found significant differences between controls and people with early AMD [10] or an association with AMD incidence [29, 30]. For example, Owsley et al. [29] reported that visually healthy elderly people with abnormal DA were almost twice as likely to have AMD in the tested eye 3 years later. Jackson and Edwards [13] reported that the mean RIT for people with early AMD (n = 6) was nearly twice as long as the mean RIT in the older control group (n = 9), but no p values were offered.

The effect of the presence of reticular drusen, also known as pseudodrusen or SDD, on RIT was considered in a number of studies [23, 24, 28, 31, 33, 36]. Overall, the results indicated that SDD presence in people with AMD correlated with slowed RIT [23, 24, 28, 31, 33, 36], regardless of whether the SDDs were in the testing location [24]. Neely et al. [28] reported that SDD presence in controls did not significantly impact RIT, while eyes with both early AMD and SDD presence did have markedly slower RIT. However, this association was lost when data were adjusted for age. Flamendorf et al. [23] reported significantly slowed RIT in participants with SDD presence (n = 15), with 80% reaching the test ceiling (40 min); although it is noteworthy that the SDD group was significantly older than the controls. Chen et al. [36] described a key association between SDD presence and accelerated worsening of DA in their longitudinal study. Conversely, peripheral classic drusen (both presence and number) have not been found to be associated with slowed RIT [25].

The literature indicates that other structural factors also affect RIT. For example, delayed RIT may be more likely in pseudophakic eyes, which may have implications for the routine clinical use of the instrument [23, 26, 29]. Chen et al. [36] observed large changes in RIT between the study visits immediately before and after cataract extraction. However, this association appears to weaken when age and AMD stage are included in multivariate analysis [26, 29], and some studies report no significant differences for lens status [24].

Others have reported an association between impaired DA and changes in choroidal thickness [23], changes in retinal thickness (adjusted for age and VA) [37], reticular pigmentary changes and presence of a mottled decreased fundus autofluorescence (FAF) pattern in the midperipheral zone [25]. Laíns et al. [24] found that structural abnormalities (such as classic drusen, ellipsoid zone disruption and serous pigment epithelium detachment) affected DA when found in the testing location, even after controlling for AMD stage and age. The level of macular pigment was not found to correlate with rod-mediated DA in people with AMD [34]. Sevilla et al. [31] reported that hyper-reflective foci, lower RPE-drusen-complex volume and greater RPE-drusen-complex abnormal thinning volume were associated with slowed RIT. Contrarily, no significant differences were found between groups (early AMD, iAMD and controls) in RPE-drusen-complex, retinal volumes, or inner and outer retinal volumes, and when the data were age-adjusted, group differences in RIT were also lost [31].

Genotyping of well-defined populations is another route to discover what might happen in early disease stages in AMD. Mullins et al. [27] reported a cross-sectional analysis of RIT and genetic risk factors in people aged ≥ 60 years with evidence of either normal visual ageing or AMD in one or both eyes (n = 543). A novel association was found wherein older adults with no evidence of AMD but who had the high-risk age-related maculopathy susceptibility-2 (ARMS2) genotype were more likely to demonstrate delayed RIT, but not those who had the complement factor H (CFH) genotype, while the presence of the CFH genotype was associated with delayed RIT in people with AMD. Further research into understanding ARMS2 function has been suggested to be a research priority [27].

Few studies directly compared visual function measurements with RIT (4 out of 21 studies), and results differed [23, 26, 34, 36]. Flamendorf et al. [23] and Laíns et al. [26] found that best-corrected VA (BCVA) and worse eye BCVA, respectively, correlated with delayed RIT. In the Chen et al. [36] longitudinal study, changes in RIT occurred over 4 years while BCVA remained largely unchanged, with a mean of only 1.8 letters lost. The authors suggested that RIT appears to show functional loss that BCVA cannot [36], although the study could not exclude the effects of lens removal, which may affect longitudinal measures of BCVA. Beirne and McConnell [34] did not find a relationship between RIT and VA in people with iAMD, although RIT was significantly associated with contrast sensitivity (CS). This was the only study in the featured AdaptDx literature that compared CS to RIT.

Few PROMs have been directly compared to RIT. One example includes the Low Luminance Questionnaire (LLQ) [32, 33, 36]. Yazdanie et al. [33] found a correlation between lower scores on the LLQ and RIT, with the strongest association found for the driving-related subscale. Despite BCVA yielding a statistically significant association with LLQ, the correlations found were marginally weaker than those found for RIT. Yazdanie et al. [33] specifically reported that problems with night vision and low luminance may not be explained by traditional metrics of visual function measured in clinic. This correlation between the LLQ and RIT has also been found in another study [32] and when compared to RIT progression over time [36]. However, Thompson et al. [32] found that the correlation between the LLQ and RIT in people with early and iAMD was not significant once the data were controlled for AMD severity, speculated to be due to the lack of late-stage AMD participants.

In addition to a medical history questionnaire, Laíns et al. [26] also used a food frequency questionnaire and the Rapid Assessment of Physical Activity test. After adjusting data for age and AMD stage, body mass index, taking AREDS supplements, and family history of AMD were significantly associated with delayed RIT, and alcohol intake was significantly associated with RIT ≥ 6.5 min.

There were no studies comparing measurement of RIT with performance-based measures.

Other Adaptometers and Photostress Tests

A total of 27 studies presented data collected using devices other than the AdaptDx [15, 22, 41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65]. This included studies evaluating parameters of either cone adaptation [22, 43,44,45, 49, 50,51,52,53,54, 61, 62, 64], rod adaptation [55,56,57,58,59,60], or both [15, 41, 42, 46,47,48, 63, 65]. Nineteen of these studies reported DA data collected using some form of dark adaptometer [22, 42, 44,45,46,47,48,49,50, 53,54,55,56,57,58,59,60, 63, 65]. Six studies reported data collected using a photostress test protocol, whereby time is recorded for cone function (e.g. cone threshold or VA) to return to a specified level [43, 51, 52, 61, 62, 64]. Dimitrov et al. [15, 41] assessed using both techniques. Eight studies used an adapted Humphrey Visual Field Perimeter (HVFP; Carl Zeiss Meditec, Inc, Dublin, CA, USA) [43,44,45,46,47,48,49,50] and eight utilised an ‘in-house’ adaptometer, with methods developed by investigators [15, 41, 42, 61,62,63,64,65]. Other adaptometers, featured in more than one study, included the Medmont Dark Adapted Chromatic perimeter (MDAC; Medmont Pty Ltd, Melbourne, Australia) (5 out of 27 studies) [55,56,57,58,59] and the Macular Disease Detection MDD-2® device (Health Research Science, LLC, Lighthouse Pt, FL, USA) (2 out of 27 studies) [53, 54]. The Scotopic Sensitivity Tester-1 (SST-1; LKC Technologies, Gaithersburg, MD, USA) [60], RCDA [22], KOWA AS14B NightVision Tester (KOWA Optimed, Tokyo, Japan) [51], and Eger macular stressometer (EMS; Gulden Ophthalmics, PA, USA) [52] were used in one study each. What follows is a brief narrative of these papers, with more detail in Tables S3–S7 (see electronic supplementary material for details).

The most frequently reported parameters of DA in studies that did not use the AdaptDx were RCB (10 out of 27 studies) [41, 42, 44,45,46,47,48, 60, 63, 65], cone absolute threshold (9 out of 27 studies) [41, 42, 44,45,46,47,48, 63, 65], cone time constant (8 out of 27 studies) [44,45,46,47,48,49,50, 65] and RIT/rod criterion time (7 out of 27 studies) [41, 55,56,57,58,59, 63]. Other parameters in the reviewed papers included rod absolute threshold (5 out of 27 studies) [41, 46,47,48, 63] and ‘second slope’ or rod slope (5 out of 27 studies) [42, 46,47,48, 65], which refers to the second phase of rod recovery when threshold data are fitted with a model based on the physiological process of DA [7].

There was wide variation in the procedures used by these studies (see Table S3; see electronic supplementary material for details). However, some similarities exist, such as the stimulus used. The most frequently reported was a 1.7°–2° circular stimulus [15, 41, 46,47,48, 50, 55,56,57,58,59] with a wavelength of ~ 500 nm [46,47,48, 55,56,57,58,59], similar to the threshold stimulus used in the AdaptDx. For cone-mediated DA, the most frequently reported stimulus wavelengths were ~ 620 nm [55, 56] and 650 nm [47, 48]. Eight studies used a threshold location of 12° eccentricity [46,47,48, 55,56,57,58,59], seven studies used a 5.5–6° location [42, 55,56,57,58,59, 65], seven studies used a 3.5–4° location [15, 41, 55,56,57,58,59], six studies used a foveal location [41, 44, 45, 49, 50, 63], five studies used an 8° location [55,56,57,58,59] and two studies used a 3° location [42, 65].

No cut-off times signifying the end of the DA assessment were reported for the photostress tests, as the nature of the examination is for the patient to return to baseline sensitivity, which occurs relatively quickly [43, 52, 64]. The most frequently reported cut-off time for the DA procedure was 30 min (7 out of 27 studies) [15, 41, 45, 55, 57, 59, 63], followed by 60 min (3 out of 27 studies) [47, 48, 65]. Five of these studies reported that the recovery parameter was set to the maximum test time as a censored data point [44, 45, 58, 59, 65], similar to the procedure followed by studies that used the AdaptDx.
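
The censoring rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure of any specific study: the names `RecoveryTime` and `censor_at_cutoff` are hypothetical, and treating a measurement equal to the cut-off as censored is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryTime:
    minutes: float   # recorded recovery time in minutes
    censored: bool   # True if the criterion was not reached before the cut-off


def censor_at_cutoff(measured_minutes: Optional[float],
                     cutoff_minutes: float) -> RecoveryTime:
    """Record a recovery time, censoring at the maximum test time.

    measured_minutes is None when the participant never reached the
    criterion sensitivity during the test (hypothetical convention).
    """
    if measured_minutes is None or measured_minutes >= cutoff_minutes:
        # Assumption: the recovery parameter is set to the cut-off time
        # and flagged as a censored observation.
        return RecoveryTime(minutes=cutoff_minutes, censored=True)
    return RecoveryTime(minutes=measured_minutes, censored=False)


# Hypothetical example with a 30-min cut-off.
print(censor_at_cutoff(12.5, 30.0))
print(censor_at_cutoff(None, 30.0))
```

Keeping the censoring flag alongside the value allows later analysis to treat censored times differently from observed times (for example, with survival-type methods) rather than as exact measurements.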

Most studies that did not use AdaptDx demonstrated DA and photostress measurements to be altered in AMD [15, 22, 41,42,43,44,45, 48, 50, 55,56,57,58,59, 63,64,65]. However, use of small cohorts and/or comparative control groups (≤ 10 people) [22, 43,44,45, 55,56,57,58, 65] hampered many of these studies. Two studies suggested the techniques were not useful for AMD [52, 60]. The following narrative focuses on notable results of the remaining studies.

Newsome et al. [64] assessed photorecovery in controls (n = 144) and people with ‘dry’ (non-neovascular) (n = 118) and ‘wet’ (neovascular) AMD (n = 36) using an in-house photostress test. Prolonged photorecovery was evident in people with non-neovascular AMD with geographic atrophy compared to controls, but not if only drusen were present. Worsening non-neovascular AMD was shown to be accompanied by prolonged photostress recovery [64].

Using the adapted HVFP, Owsley et al. [48] reported that people with AMD of severity ranging from early to late (n = 83) exhibited significantly longer average rod-mediated DA than controls (n = 43), while cone-mediated DA did not differ between groups. Conversely, Dimitrov et al. [15] used an in-house adaptometer and reported that people with only hard and/or intermediate drusen demonstrated significantly abnormal cone photostress recovery and rod-mediated DA when compared to controls (n = 64). However, these parameters did not discriminate between people with different severities of AMD [15].

Robinson et al. [50], using an in-house system to assess cone adaptation in controls (n = 19) and people with AMD status varying from early to advanced (n = 81), reported mean differences in cone time constant between groups. The authors speculated that the measurement could be a potential biomarker for AMD [50]. Participants with early AMD (n = 50), iAMD and contralateral neovascular AMD showed a significant delay in mean cone tau when reassessed 12 months later in the ALIGHT trial [66].

Various attempts have been undertaken to quantify the diagnostic accuracy of DA techniques other than AdaptDx in identifying early AMD. Two studies [44, 45] aimed to identify the pre-adapting light intensity and test location that generated the maximum separation in the parameters of cone DA and time to RCB between participants with early AMD and controls in the minimum recording time, using a custom built adaptometer. Sample sizes were prohibitively small, but the authors reported estimates of sensitivity and specificity at different thresholds and bleach conditions.

Dimitrov et al. [63] assessed the diagnostic capacity of rod- and cone-mediated DA parameters using an in-house adaptometer in small samples of people [AMD of varying severity (n = 27) and healthy peers (n = 22)]. Cone-mediated parameters gave smaller AuROC values than rod-mediated parameters [63]. This result was replicated in a larger study by the same team, where rod-mediated DA was found to have the best diagnostic capacity (AuROC, 0.93 ± 0.016), followed by cone photostress recovery (AuROC, 0.85 ± 0.021) [41].
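
For readers less familiar with AuROC, the following minimal Python sketch shows one standard way to compute it: as the proportion of case–control pairs in which the case has the larger (worse) value, with ties counted as one half. The function name `auroc` and the recovery times are hypothetical and are not taken from any study in this review.

```python
from typing import Sequence


def auroc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Assumes larger values (e.g. longer recovery times) indicate disease.
    Equivalent to the Mann-Whitney U statistic divided by the number
    of case-control pairs.
    """
    if not cases or not controls:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5  # ties contribute half a 'win'
    return wins / (len(cases) * len(controls))


# Hypothetical recovery times in minutes (illustrative only).
amd_times = [14.0, 9.5, 20.0]
control_times = [6.0, 9.5, 7.0]
print(round(auroc(amd_times, control_times), 3))
```

An AuROC of 0.5 corresponds to no discrimination between groups and 1.0 to perfect separation, which is why values estimated from small samples should be read alongside their precision.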

Tahir et al. [42] used an in-house experimental adaptometer designed to present stimuli at two inferior field locations (3° and 5.5°), measuring cone- and rod-mediated DA. AuROC was used to justify the optimal testing procedure [42]. A small number of controls (n = 15) were used, so estimates have little value in terms of assessing the diagnostic suitability of this paradigm.

Few longitudinal studies were found. Rodriguez et al. [22] primarily demonstrated that cone-mediated visual function recovery, measured using a RCDA, could separate a small sample of people with early AMD (n = 12) from visually healthy controls (n = 17) with relatively high sensitivity and specificity (> 85%). The repeatability and reproducibility of the RCDA were assessed in early AMD participants (n = 8) and visually healthy participants (n = 4). Both baseline cone threshold and recovery half-life were found to have a high degree of repeatability across all visits (intraclass correlation coefficient (ICC) = 0.88 and ICC = 0.93, respectively). RCB exhibited poor repeatability (ICC = 0.40). A follow-up after 1 year illustrated high reproducibility of the adaptometer (baseline cone threshold, ICC = 0.84; recovery half-life, ICC = 0.84) [22]. However, the very small sample sizes render these data insufficient for judging the true repeatability and reproducibility of the adaptometer.
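
As an illustration of how an ICC summarises repeatability, the sketch below computes a one-way random-effects, single-measurement ICC from repeated measurements per participant. The choice of this ICC form is an assumption made for illustration; the form used in the cited study is not stated here. The function name `icc_oneway` and the data are hypothetical.

```python
from typing import Sequence


def icc_oneway(data: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC for a single measurement.

    data[i] holds the k repeated measurements for participant i;
    every participant must have the same number of measurements.
    """
    n = len(data)
    k = len(data[0])
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need >= 2 participants, each with the same k >= 2")
    subject_means = [sum(row) / k for row in data]
    grand_mean = sum(subject_means) / n
    # Between-participant mean square.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-participant mean square.
    msw = sum((x - m) ** 2
              for row, m in zip(data, subject_means)
              for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical thresholds (log units) from two visits per participant.
visits = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [1.5, 1.4]]
print(round(icc_oneway(visits), 3))
```

With only a handful of participants, the between-participant variance is itself poorly estimated, so an ICC from such data carries wide uncertainty regardless of its point value.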

Tan et al. [59] evaluated longitudinal rod function at 14 different eccentricities using the MDAC perimeter in controls (n = 23), people with iAMD (n = 12) and people with SDDs (n = 13). Over 12 months, a significant decrease in the rod recovery rate was found in the iAMD group (at the 12° test-point alone), while no significant changes were found in RIT across all groups.

In a longitudinal study, Wolffsohn et al. [52] showed that EMS photostress recovery time did not distinguish people whose vision decreased over the following year from those whose vision remained stable. Moreover, this was the only study not using the AdaptDx that compared visual function measurements (near and distance VA and CS) to rates of adaptation. However, no significant relationships were found [52].

DA parameters measured without the AdaptDx, and photostress parameters, have been shown to be altered in the presence of, for example, SDD [55,56,57,58,59], macular oedema [64], serous macular detachment [64], abnormal new vessels [64] and inner segment ellipsoid zone disruption [55]. The level of macular pigment has not been found to correlate with photostress recovery [61]. Furthermore, Rodrigo-Diaz et al. [65] found that parameters of the DA curves such as the rod-mediated second slope and the RCB were associated with FAF changes and CFP grading. However, only a moderate correlation at best was found between cone time constant and measures from FAF and CFP grading [65].

Three studies not using the AdaptDx compared measurement of DA directly to PROM data. Owsley et al. [46] used measures of rod- and cone-mediated DA to validate the newly developed LLQ, using the adapted HVFP. DA was assessed in participants who exhibited normal retinal ageing (n = 41) and participants with AMD of varying severity (n = 84). An association was established between greater difficulties with, or emotional distress linked with, low-luminance activities and worsening rod-mediated DA, including RCB. The highest correlation coefficients were seen between rod-mediated DA parameters (in particular rod threshold, defined as the average of the last three thresholds of the rod second slope) and the driving subscale. However, no LLQ subscales were associated with cone-mediated DA [46]. A significant association between rod-mediated DA and LLQ results was also reported by Owsley et al. [47], who assessed the effect of a 30-day course of retinol on DA in 104 participants. The change from baseline to day 30 in the mobility subscale was significantly associated with changes in the slope of the second component of rod recovery. No such correlations were found for cone parameters. A non-validated questionnaire on self-reported difficulties with glare recovery was used by Wolffsohn et al. [52], but no association was found between self-reported difficulties and EMS photostress recovery time.

The National Eye Institute Visual Function Questionnaire (NEI VFQ) [48, 51, 61, 62], Veterans Affairs Low Vision Visual Function Questionnaire (VFQ-48) [49], the EuroQol-5D Instrument (EQ-5D) [49] and Pittsburgh Sleep Quality Index (PSQI) [49] also featured in the literature but were not compared to DA metrics.

There are no studies comparing measurement of DA with performance-based measures.

Discussion

Our systematic review clearly indicates that a delay in the measured rate of DA is associated with the presence of AMD [10, 12,13,14,15,16, 22,23,24,25,26,27, 29, 30, 34,35,36,37,38, 41,42,43,44,45, 48, 50, 55,56,57,58,59, 63,64,65]; this is our main finding. Yet the evidence for measurement of DA being able to discriminate early AMD cases from elderly controls and differentiate between groups of varying AMD severity is weaker. Selection bias, problems with experimental design, poor reporting of precision of estimates and small sample sizes seem to characterise many of the studies that specifically considered diagnostic precision of the AdaptDx. Still, some of the studies, from the Owsley group for example, point to adequate levels of diagnostic precision [29] and one well-designed cohort study indicates that DA becomes impaired in some eyes with iAMD over time [16]. Even here, there was wide variability in measurement of DA (RIT) at baseline and in the extent to which it increased over 24 months. We conclude that more longitudinal studies are required to test whether a measure of DA is truly a biomarker for changes in AMD severity.

The most common method used to assess DA in people with AMD was measuring RIT using the AdaptDx adaptometer [10, 12,13,14, 16, 23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38]. Since the protocol was first introduced by Jackson and Edwards [13], the AdaptDx has been used frequently in rod-mediated DA research. Despite this, there is no consensus on how RIT should be recorded for people who exceed the experimental cut-off time, or on how such data should then be analysed statistically. To reiterate, this systematic review found the current evidence for the true diagnostic capabilities of this instrument to be quite weak. For example, no study specifically designed to satisfy STARD (Standards for the Reporting of Diagnostic accuracy studies) guidelines turned up in our review [67, 68]. In fact, the original Jackson and Edwards [13] report, which has more than 60 citations at the time of writing, is often cited as evidence for the device being able to ‘sensitively and specifically detect early AMD’, yet the sample sizes included were small (n = 17 per group), and consequently CIs for diagnostic precision were wide.
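
The dependence of CI width on sample size can be illustrated with a minimal Python sketch of the Wilson score interval for a proportion such as sensitivity. The counts used are hypothetical and are not taken from Jackson and Edwards [13] or any other study in this review; the function name `wilson_interval` is likewise illustrative.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical: the same observed sensitivity (about 88%) at two sample sizes.
for detected, total in [(15, 17), (150, 170)]:
    low, high = wilson_interval(detected, total)
    print(f"n = {total}: {low:.2f} to {high:.2f}")
```

With the same observed proportion, the interval for n = 17 is several times wider than that for n = 170, which is the sense in which estimates from small groups leave diagnostic precision uncertain.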

Our systematic review indicates that there is reasonably good evidence that people with drusen and/or atrophic changes have slowed DA, particularly those with SDDs [23, 24, 28, 31, 33, 36, 55,56,57,58,59, 69]. Most published studies on DA characterise AMD using CFP, which remains the gold standard for AMD diagnosis and grading, despite the recognised limitations of the technique [24, 70]. This review showed that recent studies utilised imaging devices (such as optical coherence tomography) to further examine the structure–function relationship that has emerged in DA research [24, 28, 31, 36, 37]. Subsequently, further relationships between DA and macular anatomy in AMD, such as choroidal [23] and retinal thickness [36], have been found, both at and beyond the DA testing location [24].

This systematic review identified a number of studies which evaluated parameters of cone adaptation in addition to or instead of rod adaptation. This was either through direct assessment of parameters of the cone branch of the DA function [22, 41, 42, 44,45,46,47,48,49,50, 53, 54, 60, 63, 65] or by assessment of cone photostress recovery [15, 41, 43, 51, 52, 61, 62, 64]. Whilst Owsley et al. [48] reported a greater deficit in rates of rod than cone adaptation in people with AMD, other studies included in this review did find evidence of significant delays in cone adaptation [15, 22, 41, 42, 44, 45, 50, 63, 65]. This has been suggested to have clinical implications in light of the more rapid recovery of cone than rod sensitivity, and the resultant reduction in clinical testing time [45].

Our review revealed some other findings. Only a small number of the studies directly compared DA to a measure of visual function, and findings were mixed [23, 26, 34, 36, 52]. Whilst lack of concordance with other measures of visual function can be interpreted as offering new information about the functional deficits of the condition, this notion needs to be formally tested in a prospective study. This review highlighted the importance of controlling for age, as it is a confounder in case–control and other observational studies [28, 31]. We noticed that some studies did not use an age-matched or age-related control group, often resulting in large differences in ages between groups [12, 71].

The association between measurement of DA and subjective complaints of visual dysfunction was first reported more than 30 years ago [72]. Yet, this review highlights a surprising lack of studies investigating the relationship between PROMs and DA parameters. A series of notable exceptions led to Owsley et al. [46] developing a 32-item LLQ, which was evaluated in some of the studies included in this review [32, 33, 36, 46, 47]. The LLQ builds on reports from people with AMD that visual function is more impaired under low lighting conditions [30, 46, 73, 74]. Interestingly, Owsley and team found a lack of correlation between the outcomes from the PROM and cone-mediated DA in two separate studies [46, 47], which may be surprising, as cone adaptation is important in the early adjustment of vision when moving from a high to a low level of luminance. Moreover, we did not find a single study looking at how measures of DA could be associated with the performance of people with AMD in real-world visually guided tasks, or surrogates of them (sometimes referred to as performance-based tasks). Such tasks could measure face recognition, search performance, visuomotor control or mobility [75,76,77]. For example, DA would be expected to impact mobility in low lighting, and this could be a promising area of research.

Our review indicates other gaps in the literature as well. Many of the studies used the AdaptDx instrument (21 out of 48), but there were few data on the repeatability and reproducibility of measurements from this device. Apart from the Flamendorf et al. [23] study, there were no data on the practice effect or learning effect with the task; this is surprising given the psychophysical nature of the test.

We consider our systematic review to be timely. The most recent prior assessment of the literature on examination of DA in AMD was published 14 years ago as part of a larger review of the literature on visual function tests in AMD [17]. We found 48 new studies published since then on DA and photostress testing in people with AMD that satisfied the inclusion and exclusion criteria for our review. In addition, new adaptometers have been introduced and are now being used clinically. Our review and the results reported have some limitations. First, this review combined observations on photostress testing and alternative DA procedures to AdaptDx which differ in their experimental design. This was necessary due to the inherent heterogeneity of DA assessment. Second, only studies published in peer-reviewed journals were included, and we excluded protocols and non-English-language manuscripts. Third, we limited our search to studies including an evaluation of the kinetics of DA and, as such, we did not consider studies which evaluated thresholds in the absence of a prior photostress. Furthermore, pachychoroid diseases have been brought to attention as a possible explanation for AMD pathogenesis and are thought to be clinically separate from drusen-driven AMD [78, 79]. As a result, DA measurement may be different in AMD associated with pachychoroid, a factor not considered in this systematic review which requires further analysis.

Conclusion

This systematic review was the first in 14 years to assess the growing literature on DA assessment in AMD. This review highlights the variety of experimental methodology currently used in the field. Assessment of DA is a very promising measure of visual function, and it may play an important role in early detection and monitoring of AMD in clinical practice and in experimental studies.

We have highlighted the need for further evidence of the discriminatory power of DA measurement to better differentiate early-stage AMD and normative macular ageing. Further data on repeatability, reproducibility, practice effects and the true diagnostic precision of RIT as recorded by AdaptDx are needed as well. There is reasonably good evidence on how structural abnormalities such as SDDs negatively impact DA. However, more research on the relationship between DA and VA, CS, measures from microperimetry, PROMs and performance-based measures is required to see how this assessment fits in with the spectrum of measuring visual function in AMD.