Introduction

Globally, the number of people of all ages who are visually impaired is estimated to be 285 million [1]. In urban populations, the main causes of visual impairment are cataracts, age-related macular degeneration (AMD), diabetic retinopathies, and glaucoma [2]. Frequent monitoring of visual parameters is essential to improve care delivery. This is especially important for diseases such as AMD or diabetic retinopathy which show improvement with anti-vascular endothelial growth factor (anti-VEGF) therapy, as studies have shown that a delay between medical indication and treatment constitutes a risk of loss of visual acuity for the patient [3]. However, most countries are facing a shortage of eye specialists. There are currently 8.8 ophthalmologists per 100,000 people in France, six ophthalmologists per 100,000 people in the USA, two ophthalmologists per 100,000 people in China, and one ophthalmologist per one million people in Rwanda [4, 5]. These facts pose a public health challenge: how to optimize the follow-up of patients with chronic eye diseases, who are ever-increasing in number, while physicians have busier clinics and less time to spare.

In the past several years, advances in internet and communication technology have made the proliferation of digital health tools possible, including mobile health applications (apps) [6]. The adoption and use of smartphones and other mobile devices is widespread, with 1.5 billion smartphones sold in the world in 2017 and 77% of Americans owning a smartphone.

Remote health monitoring tools may reduce patient stress, increase patient empowerment, and reduce overall costs, with potentially fewer unnecessary visits. In the field of ophthalmology, remote monitoring between clinic visits could detect progression of high-risk chronic disease before irreversible damage occurs. Gamification techniques can be used to increase patient compliance with self-monitoring [7]. By combining the fun, engaging, and motivational aspects of gaming without sacrificing scientific integrity, such an app can easily blend into the patient’s routine.

More than 318,000 health apps are available worldwide, but few of these have claims based on strong clinical evidence [6]. Physicians and patients need new, validated tools. OdySight is a remote monitoring tool that challenges standard ophthalmic tools currently used in clinical practice.

The purpose of this study was to evaluate the agreement between standard clinical eye tests and OdySight for assessing visual acuity, contrast sensitivity, and the presence of scotoma or metamorphopsia.

OdySight

OdySight is a mobile medical video game available on a smartphone or tablet by prescription only. It contains a puzzle game as well as medical modules to test monocular vision (near visual acuity, contrast sensitivity, and the detection of metamorphopsia and scotoma via a digital Amsler grid). A unique, patented technology using the front camera of the smartphone ensures that the tests are performed at a standardized distance (40 cm/15.7 in. between the participant’s eye and the device) and with adequate ambient light, as measured by the device (note: these features were not used during this study). The results from the visual tests are sent via a secure server to an online dashboard which the physician can access in real time (Fig. 1). Alerts are sent to both the participant and physician when there is a significant decline in vision. A recommendation is made to the participant to schedule a physician visit.

Fig. 1

OdySight. a Visual test, b game, c physician’s dashboard. Patients play the game and perform the tests. Physicians can follow their patients using the online dashboard

Near Visual Acuity Module

This module measures near visual acuity and is based on the standard Early Treatment Diabetic Retinopathy Study (ETDRS) letter chart test used in the clinical setting (Fig. 2). Throughout the test, a participant indicates the direction of the Tumbling E by sliding a finger in the corresponding direction. The orientation and the size of the letter “E” change after each response; while the orientation of the letter “E” is random, its size changes according to a predefined algorithm. If the participant is not able to discern the direction of the letter “E”, he/she can use the “I don’t know” button, which appears after a few seconds of inactivity, to skip to the next one.

Fig. 2

OdySight visual acuity module (the hand and arrow are for comprehension only)

Contrast Sensitivity Module

This module measures contrast sensitivity and is based on the Pelli–Robson contrast sensitivity chart (Pelli–Robson test). Throughout the test, the participant needs to indicate the direction of the Landolt C (appears as the letter “C”) by sliding a finger in the corresponding direction (Fig. 3). The orientation and the contrast of the optotype change after each participant response, but the size remains stable. The orientation of the optotype is random, but its level of contrast changes according to a predefined algorithm. If the participant is not able to see the direction of the letter “C”, he/she can use the “I don’t know” button, which appears after a few seconds of inactivity, in order to skip to the next one.

Fig. 3

OdySight contrast sensitivity module (the hand and arrow are for comprehension only)

Amsler Grid Module

This module is intended to detect the development of new scotoma and metamorphopsia in the central field of vision. It is a digital version of the standard paper Amsler grid, but because of the actual smartphone and tablet screen size, the grid is divided into three parts to cover the entire length of the paper Amsler grid (Fig. 4). For each portion, the participant needs to gaze at the black dot and draw any abnormalities he/she might see on the grid (blurry lines, grid disappearing…) using a finger as a pencil.

Fig. 4

Three screens depicting the digital Amsler grid on OdySight

Methods

Participants

Study participation was offered to people who were visiting the hospital for out-patient visits, as well as to their accompanying friends and family members. Each participant was noted to have one or multiple eye diseases, or none.

Study Design

In order to assess the validity of the medical modules on OdySight, we conducted an open-label, single-arm, prospective, single-site study at the Quinze-Vingts National Ophthalmology Hospital, in Paris, France (ClinicalTrials.gov Identifier: NCT03457441). The research followed the tenets of the Declaration of Helsinki of 1964, as revised in 2013, and approval of the experimental protocol was obtained from an ethics committee (Comité de Protection des Personnes Ile-de-France VII, France, n° 18-005). Informed consent was obtained from each participant prior to any examination.

Participants in the study met the following key inclusion criteria: age ≥ 18 years; best corrected visual acuity (BCVA) between 0.0 and 1.0 logMAR; ability to recognize letters of the alphabet and to read French; ability to correctly distinguish body laterality; and affiliated with or beneficiary of the French healthcare system. Participants were excluded if: they presented with a pathology that was considered by the investigator to be capable of affecting the quality of the main evaluation criteria (e.g., mental illness, Parkinson disease); they were pregnant or breastfeeding; or they were not considered by the investigator or designee to be able to correctly use the OdySight modules after the training session. To ensure a wide range of visual acuity, each participant’s eye was assigned to one of the three cohorts according to the level of near visual acuity, measured at a distance of 40 cm from the Sloan ETDRS letter chart (40-cm ETDRS): cohort 1, with near visual acuity of between + 1.0 and + 0.7 logMAR; cohort 2, with near visual acuity of between + 0.6 and + 0.3 logMAR; or cohort 3, with near visual acuity of between + 0.2 and 0.00 logMAR.

For the purpose of the trial, a scaled-down version of the OdySight app (in which the puzzle and the technical unit checking ambient light and distance to screen were removed) had been installed on two iPhone 6 smartphones. The OdySight medical tests and their algorithms had not been modified. Again for the purpose of the trial, the examiner was responsible for the navigation within the app, and the participants were asked only to perform the tests.

Sample Size

The sample size needed, driven by the number of eyes for the primary endpoint (to assess the agreement between a smartphone-based evaluation of near visual acuity with OdySight and a standardized method [Sloan ETDRS near vision letter chart]) was defined by performing a careful review of the relevant literature. The limits of agreement (LoA) technique was retained to analyze the results and to calibrate the sample size of this study (see Conduct of the Study, below).

The lack of agreement was estimated by the mean difference of the two measurements (d) and the standard deviation of the differences (s). Provided that differences within d ± 1.96 s are not clinically important, the two methods can be considered to agree. Taking into account that these LoA are only estimates of the values which apply to the whole population, we viewed the sample size estimation as an attempt to obtain sufficient precision on these LoA. It can be shown that the standard error of d is √(s²/n), where n is the sample size, and that the standard error of d − 1.96 s and d + 1.96 s is about √(3s²/n) [8, 9]. Bland and Altman recommended a minimum of 100 statistical units for a good sample size because this number leads to a 95% confidence interval (CI) of approximately ± 0.34 s, which appears to be a reasonable and accurate estimate. To ensure that 100 studied eyes were available in the statistical reporting of agreement between the methods of measurements, and to take into account a possible lack of measurements for any reason, the study was planned to include between 60 and 120 patients as a whole.
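The precision argument above can be checked numerically. The following is a minimal sketch, assuming the approximation that the standard error of each limit is √(3s²/n) as stated in the text; the function name `loa_ci_half_width` is illustrative and not part of the study's SAS code.

```python
import math


def loa_ci_half_width(n: int, z: float = 1.96) -> float:
    """Approximate 95% CI half-width of one limit of agreement.

    The result is expressed in units of s (the SD of the differences),
    using SE(limit) ~ sqrt(3 * s**2 / n) with s factored out.
    """
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return z * math.sqrt(3.0 / n)


if __name__ == "__main__":
    # Compare a few candidate sample sizes; n = 100 gives about 0.34 s.
    for n in (50, 100, 120, 200):
        print(f"n = {n:3d}: +/- {loa_ci_half_width(n):.2f} s")
```

Because the half-width shrinks only with the square root of n, doubling the number of eyes from 100 to 200 narrows the interval by less than a third.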

Conduct of the Study

During screening, inclusion criteria were checked, optical coherence tomography (OCT) was performed, and the participant’s BCVA was determined so that each eye was assigned to the corresponding cohort. The three medical modules were then presented to the participant. Because it is known that a learning phenomenon can occur, participants were asked to perform practice tests following instructions given by the examiner. At the end of this training session, the examiner estimated whether or not the participant was sufficiently capable of using the app. Eyes from participants who fulfilled all of the criteria were included, and a participant could be included for one or both of his/her eyes.

During the evaluation, participants performed tests to assess visual acuity, contrast sensitivity, and detection of scotoma and metamorphopsia with both standard tools and digitized tests from OdySight. Participants did not receive dilation treatment and, if relevant, were prompted to remove their contact lenses at least 1 h before any exam. All tests were performed at a distance of 40 cm, with the head of the participant leaning against an ophthalmological chin piece and the smartphone or chart fixed on a desk at an adequate height. Before each evaluation, distance was checked with a tape measure and luminosity was checked with a lux meter. All tests were monocular, the eye tested was equipped with the adequate correction, and the eye not being tested was covered with an occluder. The order of the tests (standard exams first vs. OdySight modules first) as well as the duration of resting breaks varied for each participant based on the discretion and assessment of the examiner.

All statistical outputs were generated using SAS® version 9.4 software (SAS Institute, Cary, NC, USA).

Visual Acuity

Visual acuity was assessed with two standard methods: with the ETDRS chart at a test distance of 4 m (4-m ETDRS)—and at 1 m if needed with + 0.75 D added for correction—and with the Sloan ETDRS chart at 40 cm (40-cm ETDRS). All tests were performed in a dedicated testing room with adequate illumination (≤ 161.5 lx). Visual acuity was determined by asking the participant to read each letter starting from the first line until the smallest letter readable on the chart.

During the same visit, visual acuity was also evaluated with the visual acuity module of OdySight.

Results of the visual acuity tests were recorded in terms of number of letters read.

Contrast Sensitivity

Contrast sensitivity was assessed with the Pelli–Robson chart at 1 m, in a dedicated testing room with adequate illumination (≤ 161.5 lx). Participants removed their own glasses and instead wore a trial frame equipped with the adapted correction for distant vision increased by + 0.75 D. The contrast sensitivity level was determined by asking the participant to indicate each letter, starting from the first letter until the last readable letter. The examiner recorded each letter identified correctly by the participant, and the last triplet of letters was validated if the participant was able to read at least two of them. Results were recorded in logCS.

During the same visit, contrast sensitivity was also evaluated with the contrast sensitivity module of OdySight.

Results of contrast sensitivity were recorded in logCS.

Amsler Grid

The presence of metamorphopsia and/or scotoma was assessed using an empty Amsler grid printed on a white sheet of paper (10 × 10-cm grid) and a felt tip pen. The participant was asked to draw lines (or wavy lines) where distortion was seen, or to fill an area where any type of spot was seen (e.g., black spot, white spot, area where the grid disappears) while fixating on the black dot.

During the same visit, the presence of metamorphopsia and/or scotoma was also evaluated using the digital Amsler Grid module on OdySight.

After the participants performed each test, the clinical team evaluated the presence or absence of metamorphopsia and/or scotoma for each type of grid. Results were recorded as the presence or absence of metamorphopsia and/or scotoma.

Statistical Analysis

For visual acuity and contrast sensitivity, statistical analysis was performed using the LoA technique provided by Bland and Altman (B&A). This method evaluates the agreement between two quantitative measurements by constructing LoA and deducing the resulting bias, which is the mean of differences between the two methods. The closer the bias is to 0, the better the agreement between the methods. LoA are an estimate of the true agreement limits for the entire population, whereby narrower limits reflect better agreement between two techniques. However, this method only defines the interval of agreement; it does not state whether those limits are acceptable or not. Acceptable limits must be defined a priori, based on clinical necessity, biological considerations, or other goals. Results are displayed on the B&A plot with the difference between the two measurements on the Y-axis, and the average of the two measurements on the X-axis [8, 9].
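The B&A computation described above can be sketched in a few lines. This is an illustrative example only: the study's analysis was run in SAS, the function name `bland_altman` is an assumption, and the letter scores below are made-up values, not study data.

```python
import statistics


def bland_altman(method_a, method_b, z=1.96):
    """Return bias, SD of differences, LoA, and the points of a B&A plot."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two paired samples with at least 2 values")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)   # mean of the paired differences
    s = statistics.stdev(diffs)     # sample SD of the differences
    return {
        "bias": bias,
        "sd": s,
        "loa_lower": bias - z * s,
        "loa_upper": bias + z * s,
        "points": list(zip(means, diffs)),  # (x, y) pairs for the plot
    }


if __name__ == "__main__":
    # Hypothetical letter scores for five eyes (illustration only).
    app_letters = [70, 65, 52, 80, 41]
    chart_letters = [68, 66, 50, 79, 45]
    result = bland_altman(app_letters, chart_letters)
    print(f"bias = {result['bias']:.2f} letters")
    print(f"LoA  = {result['loa_lower']:.2f} to {result['loa_upper']:.2f}")
```

Whether the resulting limits are acceptable is a separate clinical judgment; the calculation itself says nothing about it.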

Acceptable Limits

Visual Acuity

The degree of improvement in logMAR or ETDRS letters that is considered to be clinically important is uncertain and has varied among studies. Statistically significant changes in visual acuity may not necessarily be clinically relevant. The minimal clinically important difference reported in the literature is five to ten ETDRS letters (one to two ETDRS lines) [10]. In fact, even in the absence of any clinical change (e.g., no change in room condition, no change in physician performing the exam), consecutive visual acuity measurements for a given person using the same visual chart are subject to a degree of variability. Rosser et al. [11] as well as a number of other authors found a test–retest variability for the ETDRS acuities of ± 0.18 logMAR (nine letters) [12, 13, 14].

Contrast Sensitivity

There is no strict consensus in the literature for an acceptable limit. The limit reported in the literature varies from ± 0.14 logCS (2.8 letters) to ± 0.48 logCS (9.6 letters). The average of these limits is ± 0.23 logCS and the median is ± 0.18 logCS [13, 14]. In current practice, some ophthalmologists may consider that there is a significant difference between two measurements if the difference is greater than ± 0.15 logCS (three letters).
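The letter equivalents quoted in the two subsections above follow from fixed step sizes: 0.02 logMAR per ETDRS letter (five letters per 0.1-logMAR line) and 0.05 logCS per Pelli–Robson letter (three letters per 0.15-logCS triplet). The sketch below reproduces the conversions; the constant and function names are illustrative assumptions.

```python
LOGMAR_PER_ETDRS_LETTER = 0.02        # 5 letters per 0.1-logMAR line
LOGCS_PER_PELLI_ROBSON_LETTER = 0.05  # 3 letters per 0.15-logCS triplet


def logmar_to_letters(delta_logmar: float) -> float:
    """Convert a logMAR difference into ETDRS letters."""
    return delta_logmar / LOGMAR_PER_ETDRS_LETTER


def logcs_to_letters(delta_logcs: float) -> float:
    """Convert a logCS difference into Pelli-Robson letters."""
    return delta_logcs / LOGCS_PER_PELLI_ROBSON_LETTER


if __name__ == "__main__":
    # Test-retest variability quoted for ETDRS acuity.
    print(f"0.18 logMAR = {logmar_to_letters(0.18):.1f} letters")
    # Range of contrast sensitivity limits quoted from the literature.
    for limit in (0.14, 0.15, 0.48):
        print(f"{limit:.2f} logCS = {logcs_to_letters(limit):.1f} letters")
```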

Amsler Grid

For evaluation of the two versions (standard and OdySight module) of the Amsler grid, the presence or absence of metamorphopsia and/or scotoma was noted for each type of grid. A McNemar test was used to assess comparability between the results.
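For paired present/absent outcomes, the McNemar test depends only on the discordant pairs, that is, eyes classified differently by the two grids. The text does not state which variant was used; the sketch below assumes the exact (binomial) two-sided form, and the function name `mcnemar_exact` is illustrative.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p value.

    b: pairs positive on method A only; c: pairs positive on method B only.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    if b < 0 or c < 0:
        raise ValueError("discordant counts must be non-negative")
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # One discordant eye in each direction gives p = 1.
    print(mcnemar_exact(1, 1))
```

Concordant pairs, however many there are, do not enter the calculation, so a large sample with very few discordant eyes will always yield a p value close to 1.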

Results

Eighty-eight participants were considered for enrollment. Ten were assessed to have failed screening for the following reasons: participant did not present adequate level of visual acuity in at least one of his/her eyes; participant was not considered to correctly use the smartphone app after the training session; or both. The remaining 78 participants provided a total of 120 eyes to be evaluated (Fig. 5). Of the 78 participants, 42% were male, and the mean age was 64.7 (range 24–92) years. Mean visual acuity measured with the 40-cm Sloan ETDRS chart was 20/160 (logMAR 0.9) for cohort 1, 20/63 (logMAR 0.5) for cohort 2, and 20/25 (logMAR 0.1) for cohort 3 (Table 1).
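The Snellen equivalents given for the cohort means follow from the definition of logMAR as the base-10 logarithm of the minimum angle of resolution, i.e. log10(denominator/numerator) of the Snellen fraction. A minimal sketch, with an illustrative function name:

```python
import math


def snellen_to_logmar(numerator: float, denominator: float) -> float:
    """Convert a Snellen fraction (e.g. 20/63) to logMAR."""
    if numerator <= 0 or denominator <= 0:
        raise ValueError("Snellen terms must be positive")
    return math.log10(denominator / numerator)


if __name__ == "__main__":
    # Snellen fractions reported as cohort means in the text.
    for denom in (160, 63, 25):
        print(f"20/{denom} -> logMAR {snellen_to_logmar(20, denom):.1f}")
```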

Fig. 5

Participant disposition. ITT Intention to treat

Table 1 Measure of near visual acuity by eye with 40-cm Sloan Early Treatment Diabetic Retinopathy Study chart in the intention-to-treat population

Ophthalmic History

Among the evaluated eyes, 15.8% had no eye condition, 50.8% of eyes had AMD, 8.3% had retinitis pigmentosa, 8.3% had a Stargardt disease, and 4.2% had glaucoma (Table 2).

Table 2 Ocular diagnosis by eye

The OCT examination detected abnormalities in at least one eye for approximately 90.8% (n = 69/78) of participants: approximately 40% of eyes had macular atrophy, 35% had drusen, 14% had either hypo- or hyperpigmentation of retinal pigment epithelium, and 10% had choroidal neo-vascularization. OCT was not performed on four eyes (Table 3).

Table 3 Abnormalities detected by optical coherence tomography by eye

All 78 of the included participants underwent OdySight training for the three medical modules, and all were considered to be able to correctly use the application.

Near Visual Acuity Between 40-cm Sloan ETDRS and OdySight

The B&A analysis showed a mean difference in units of 0.53 letters (95% CI − 0.42, 1.48) and a 95% LoA of between − 9.75 and 10.82 letters between the two methods. According to the LoA, in this study, 95% of the differences were between − 9.75 and 10.82 letters (Fig. 6).
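As a consistency check, the standard deviation of the differences can be recovered from the reported LoA, and the CI of the bias recomputed from it. The sketch below assumes the LoA were calculated as d ± 1.96 s over 120 eyes and that the CI used a t quantile of about 1.98 (119 degrees of freedom); the function names are illustrative.

```python
import math


def sd_from_loa(lower: float, upper: float, z: float = 1.96) -> float:
    """Back-calculate the SD of differences from d - z*s and d + z*s."""
    return (upper - lower) / (2 * z)


def bias_ci(bias: float, s: float, n: int, quantile: float = 1.96):
    """Confidence interval of the mean difference, using SE = s / sqrt(n)."""
    se = s / math.sqrt(n)
    return bias - quantile * se, bias + quantile * se


if __name__ == "__main__":
    s = sd_from_loa(-9.75, 10.82)
    low, high = bias_ci(0.53, s, n=120, quantile=1.98)
    print(f"s  = {s:.2f} letters")
    print(f"CI = {low:.2f} to {high:.2f} letters")
```

Under these assumptions the recomputed interval agrees with the reported 95% CI of − 0.42 to 1.48 letters.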

Fig. 6

Bland and Altman (B&A) plots comparing absolute differences between visual acuity measured (OdySight vs. Sloan Early Treatment Diabetic Retinopathy Study [ETDRS] letter chart at 40 cm testing distance [40-cm Sloan ETDRS]) and the mean visual acuity measured ([40-cm Sloan ETDRS + OdySight]/2). For an explanation of cohorts, please see footnote to Table 1. LoA Limits of agreement, s standard deviation

In Fig. 6, the cloud of points for cohort 3 (good visual acuity) is more compact than those for the other two cohorts. Only cohorts 1 and 2 contain points beyond the + 1.96 s and − 1.96 s LoA, indicating that the agreement between the two methods is better for participants with good visual acuity (cohort 3). This result was confirmed by the analysis performed by cohort (Table 4).

Table 4 Agreement between 40-cm Sloan Early Treatment Diabetic Retinopathy Study letter chart and OdySight, by cohort

Almost 90% of eyes (107 eyes) had a difference less than or equal to nine letters between the two methods. Of note, 94 eyes (78.3%) had a difference less than or equal to five letters.

Visual Acuity Between 4-m ETDRS and OdySight

The B&A analysis showed a mean difference in units of − 1.53 letters (95% CI − 2.78, − 0.27) and a 95% LoA of between − 15.16 and 12.11 letters between the two methods. According to the LoA, in this study, 95% of the differences were between − 15.16 letters and 12.11 letters (Fig. 7).

Fig. 7

B&A plots comparing absolute differences between visual acuity measured (OdySight vs. 4-m ETDRS) and the mean visual acuity measured ([4-m ETDRS + OdySight]/2)

For the 120 evaluated eyes, 99 (82.5%) showed a difference within ± 9 letters in visual acuity between measurements by the 4-m ETDRS method and those by OdySight. Of note, a total of 78 eyes (65.0%) showed a difference within ± 5 letters between measurements by these two methods.

Contrast Sensitivity Between Pelli–Robson Contrast Sensitivity Chart and OdySight

The B&A analysis showed a mean difference in units of − 0.16 logCS (95% CI − 0.20, − 0.13) and a 95% LoA of between − 0.54 and 0.22 between the two methods. According to the LoA, 95% of the differences were between − 0.54 and 0.22 logCS. In other words, the differences were between − 10.8 letters (3.6 lines on Pelli–Robson) and + 4.4 letters (1.46 lines on Pelli–Robson) (Fig. 8).

Fig. 8

B&A plots comparing absolute differences between contrast sensitivity measured (OdySight vs. Pelli–Robson Contrast Sensitivity Chart) and the mean contrast sensitivity measured ([Pelli-Robson + OdySight]/2)

For the 120 evaluated eyes, 74 (61.7%) showed a difference of less than or equal to ± 0.15 logCS (three letters, one line) when contrast sensitivity as measured by the Pelli–Robson method and by OdySight was compared. A total of 105 (87.5%) eyes showed a difference of less than or equal to ± 0.30 logCS (six letters, two lines) between the two methods.

Scotoma and Metamorphopsia Detection with Paper Amsler Grid and OdySight

At least one metamorphopsia was detected in 50 eyes using the paper Amsler grid and in 50 eyes using the OdySight module. At least one scotoma was detected in 50 eyes using the paper Amsler grid and in 50 eyes using the OdySight module. There was only one participant for whom the OdySight app did not detect metamorphopsia while the paper Amsler grid did (McNemar test, p value = 1), and also one participant for whom the paper Amsler grid did not detect metamorphopsia while the OdySight app did (McNemar test, p value = 1) (Fig. 9). For these cases, similar results were noted in the detection of scotoma.

Fig. 9

Example of corresponding results for two versions of the Amsler grid. Left: paper 10 × 10-cm Amsler grid completed with a felt tip pen. Right: digital Amsler grid completed on the OdySight app

Discussion

OdySight is based on standardized visual tests which are well accepted and used in the ophthalmology community world-wide. The purpose of this study was to demonstrate that there is good agreement between the results of tests performed by the patient using the modules of OdySight and those obtained using standardized tests. A B&A analysis was used, not to show a correlation between the different methods, but rather to determine the level of agreement between the results.

The B&A analysis for near visual acuity using the standard 40-cm Sloan ETDRS and the OdySight module demonstrated a low mean difference (bias), with the value very close to zero; these results show there is no systematic disagreement between the two methods. According to these results, the OdySight application overestimated visual acuity by 0.53 letters as compared to the Sloan ETDRS method. The LoA are outside the limits considered to be acceptable, according to the literature, but very close to nine letters (difference of either + 9 or − 9 letters).

The B&A analysis for visual acuity using the standard 4-m ETDRS and the OdySight module demonstrated a low mean difference (bias), with the value very close to zero; these results show there is no systematic disagreement between the two methods. According to the results, the OdySight application underestimated visual acuity by 1.53 letters as compared to the 4-m ETDRS method. The LoA are outside the limits considered to be acceptable. Therefore, caution is advised when comparing results from the near visual acuity module of OdySight and the 4-m ETDRS method.

For visual acuity, in addition to the primary objectives of the study, an analysis of the variation between two standardized tools used to measure visual acuity, namely, the Sloan ETDRS (at 40 cm) and the ETDRS at 4 m, was also performed. The B&A analysis showed a small difference in the measurements between the two standardized methods: a mean difference of − 2.06 letters and LoA of between − 12.60 and 8.48 letters. Yet, both methods are currently used, independently, to monitor visual acuity. Overall, this study reveals that the results from the OdySight modules for visual acuity are close to those obtained by the standardized tests (Sloan ETDRS and 4-m ETDRS).

For the contrast sensitivity module, the bias is non-negligible as it represents more than three letters, which is equivalent to more than one line on a Pelli–Robson chart. The results indicate that the application underestimated contrast sensitivity by 0.16 logCS as compared to the Pelli–Robson method. Furthermore, the interval for the LoA is quite large. Therefore, results using OdySight may not be truly comparable with those using the Pelli–Robson chart.

For the Amsler grid module, the study showed no difference in the detection of scotoma and/or metamorphopsia using OdySight compared to the paper Amsler grid.

A review of the comment section of the study electronic case exit report forms highlighted two comments on particular difficulties encountered by some participants: (1) “the scotoma of the left eye bothered the participant during the contrast test on OdySight. When the opening of the ‘C’ was on the left, he clicked on the button ‘I don’t know’. It was not related to the contrast of the letter but to the location of the opening.” (2) “trembling hands and difficulty to understand the instructions.” Indeed, when addressing an elderly population, we might expect some hand–eye coordination issues (such as tremors) when using a smartphone, and these need to be taken into consideration.

Although the OdySight medical modules were created to be equivalent to the standardized tests (in terms of design and calculation), they are not identical to these tests. The platforms vary and at times the distances also vary (e.g., the Pelli–Robson test is done at 1 m). Any differences observed in this study between the OdySight modules and the standardized tests may be due to measurement errors, misuse of the app, and/or other elements which are participant-dependent.

In addition, the participant’s condition (severely reduced visual fields, tremors, or other difficulties with hand–eye coordination), the novelty of smartphones and tablets, as well as other challenges might explain a slight difference in results. As with all scientific methods, these methods are not 100% reliable and have a degree of error. Agreement between two methods must always be assessed alongside the clinical status of the eye in question.

There are some limitations to this study. First, each patient was asked to perform each test once per eye. More data points would have been collected had the parameters been tested on each patient multiple times. Also, this study is not a real-world evaluation of OdySight, as the distance to screen and ambient light features were disabled. However, those features were tested and validated through an independent Human Factors study. Furthermore, during the study, all tests were performed on one specific type of smartphone only.

Conclusion

The results of this study demonstrate good agreement for the measurements taken by two of the OdySight medical modules (visual acuity and Amsler grid) compared to currently used gold standards (Sloan ETDRS near vision and distance ETDRS charts and paper Amsler grid, respectively) for most people who participated in the study. Regarding the contrast sensitivity module, variability of the results and the absence of consensus on the appropriate limit preclude a definitive conclusion.

OdySight, for remote health monitoring, is a tool that challenges standard ophthalmic tools for measuring visual acuity and detecting scotoma and metamorphopsia.

Real World Use

OdySight received CE Marking as a class I medical device in the EU in May 2018 and, as of this manuscript submission, more than 40 ophthalmologists in France are prescribing it to their patients. Physicians and their clinical staff propose the application to those patients with chronic eye diseases who may benefit from remote monitoring. A starter kit, composed of the basic information to install the game and perform the tests as well as a stand to hold a smartphone, is provided to each patient. Patients then download OdySight from the iOS or Android store onto one of 3000+ compatible smartphones and tablets, play the puzzle game, and perform the eye tests on a regular basis, between clinic visits, from their home or other remote locations. At the end of each test, the data are transmitted via a secure server to a dashboard in the clinic. Significant declines in test results trigger an alert for both the physician and patient. To date, multiple anecdotal reports from physicians have indicated that an alert resulted in an early clinic visit, confirmation of visual decline, and an alteration in the treatment plan, such as an early injection of anti-VEGF medication for patients with AMD. Tilak Healthcare is now registered with the U.S. Food and Drug Administration, and OdySight is listed as a class I medical device in the US.