Introduction

The overarching issue we address in this work is the role that early visual experience plays in the acquisition of robust facial identification, a skill that originates in early development but continues to change through adolescence or later. The broad consensus is that facial identification undergoes extensive tuning early in development, making it particularly (and perhaps critically) sensitive to early visual experience, though some plasticity or learning is retained into early adulthood1,2,3. However, this body of work does not establish how critical early visual experience is, or whether altering the early visual input results in permanent changes to facial identification.

One approach to address this question involves testing visual skill acquisition in individuals who were born with dense bilateral cataracts and provided with surgical sight-correction treatment at different ages (for review see4,5,6,7,8). Multiple studies have found that such individuals are able to detect faces after the onset of sight, though this skill does not emerge immediately after treatment, requiring months to years of visual experience9,10,11. In contrast to facial detection, at least some aspects of facial identification remain permanently impaired following early visual deprivation12,13,14. Individuals treated within the first year of life and tested in adolescence show abnormal face-processing14,15 but normal object discrimination16, consistent with the idea that early visual input is especially important for face-specialization. These individuals show a range of deficits with tasks that typically-sighted adults are considered “experts” in17 (though notable individual differences exist even among neurotypicals18), demonstrating difficulties in memorizing famous and recently-learned faces19, worse performance than controls in detecting configural- or spacing-differences between face parts but normal performance with detecting local- or shape-changes12,14,16,20, and significant impairments in holistic facial-identification tasks, such as lack of susceptibility to the composite face effect8,15. Taken together, these studies suggest that in the absence of early visual experience, some face-specific skills (such as detection) can still be acquired, whereas others (such as sensitivity to configural face changes and holistic processing) remain impaired even years after treatment.

While these studies have enhanced our understanding of the role of early visual experience for facial identification, significant questions remain unanswered. For instance, the performance of children with very brief visual deprivation was found to be quite high on most tasks, albeit still falling short of controls, whose performance on most tasks reached ceiling levels. While this data confirms that very early visual experience is a prerequisite for developing intact facial identification later in life, it also suggests that these children can acquire an impressive level of identification performance—either via compensatory strategies or by taking advantage of the limited plasticity that the system likely possesses later in life. The absence of direct longitudinal data provides only a snapshot in time, making it difficult to make inferences about changes in identification skill as a function of time relative to treatment (a direct proxy of visual experience), how domain specific these persistent deficits are, and to what degree visual deprivation per se, versus a more specifically described altered visual experience, may explain these long-term difficulties. This study builds on previous studies and addresses these limitations, to provide alternative explanations for the effects of altered early visual experience on the acquisition of facial identification skill late in life.

Here, we directly measured changes in face- versus object-identification that result from the introduction of sight following congenital and, in this case, prolonged visual deprivation. Our primary objective is to determine if the delayed introduction of visual input precludes the emergence of facial identification altogether, if any persisting deficits are specific to faces or instead impact visual identification in general, and to longitudinally characterize the relative acquisition of face- versus object-identification. Our second objective is to revisit critical period accounts of face recognition by disentangling the relative contributions of early visual deprivation per se from that of the type of visual information that cataract-treated children receive right at the outset of vision.

We are motivated by our previously published work theorizing that the facial identification deficits observed in cataract-treated children may result from the abnormally high initial visual acuity that these children experience, relative to the typically sighted newborn21. According to this hypothesis, the initial period of low-resolution vision that characterizes normal visual development forces the visual system to develop mechanisms that support extended spatial integration, which is necessary for some aspects of visual processing later in development. Skipping this initial period of low-resolution vision would preclude the establishment of these mechanisms, and the visual system would therefore be ill equipped to support configural processing, which is required for facial- but not object-identification. Indeed, computational tests of this hypothesis, performed with convolutional neural networks, show that commencing training with blurred images followed by high-resolution images leads to enlarged receptive fields that integrate information over larger areas of the image and, in turn, improve facial identification performance and generalization. In contrast, commencing training with high-resolution images followed by low-resolution images precludes the emergence of these large receptive fields, resulting in poorer facial-identification performance.

This framework provides a foundation for the empirical predictions that we test in the current study. We do this by working with two groups of children, blind since birth because of congenital bilateral cataracts, and identified and treated by our outreach team late in life (http://www.projectprakash.org). To determine the impact of visual acuity trajectory on the development of face- and object identification, we sort the children into two groups, based on their pretreatment acuity: little to no pattern vision versus low-resolution vision comparable to a typically developing newborn. After treatment, both groups show a rapid improvement in vision, with higher-than-newborn visual acuity. As Fig. 3A demonstrates, we predict that (1) children who experienced little to no pattern vision prior to treatment (defined as pre-treatment acuity worse than that of a normally sighted newborn), and thus commence their post-treatment visual journey with higher-than-normal visual acuity, will show persistent impairments in face- but not object-identification, whereas (2) children with pretreatment visual acuity comparable to that of a newborn (resulting in an initial period of low visual acuity), followed by post-treatment high-resolution acuity will improve on both the face- and object-identification tasks. Such a result would support the possibility that in typical visual development, initially low visual acuity may play an adaptive role for setting up visual processes that bootstrap the system for configural processing that underlies tasks such as facial identification.

Methods

The study was approved by COUHES, MIT’s Institutional Review Board, and the Ethics Committee of the Shroff Charity Eye Hospital in New Delhi, and all experiments were performed in accordance with the relevant guidelines and regulations put forth by these institutions. Informed consent was obtained from all participants.

Participants

Participants were recruited as part of Project Prakash (http://www.projectprakash.org), a dual humanitarian/scientific initiative to identify and treat children born blind (typically due to bilateral congenital cataracts) and subsequently study their visual development, or how their brain ‘learns to see’22.

Our outreach team recruited children who presented with dense bilateral cataracts prior to age one and had not received treatment until several years later7,23. The participants’ ages at treatment ranged from 5 years 8 months to 23 years 10 months (mean age = 11.874 years, SD = 4.067; N = 36, although only 35 participants entered the age calculation because the age of one participant was unknown). Congenital onset of the cataracts was assessed based on a combination of parental reports, ophthalmic examination, the presence and type of nystagmus and microcornea24, and available medical records. Inclusion criteria were kept strict, such that when information was insufficient to establish cataract onset in infancy, children were treated but excluded from the research study. Importantly, Project Prakash treatment is never contingent on participation in our research studies. Corrective surgery was provided by our medical team at the Shroff Charity Eye Hospital in New Delhi and involved surgical cataract extraction and intraocular lens implantation.

Prior to treatment, children underwent extensive ophthalmic assessments. The anterior segment was evaluated by slit lamp, and the type of cataract and any associated ocular pathology were noted. Given the patients’ dense bilateral cataracts, which precluded fundus viewing via ophthalmoscopes, B-scan ultrasound was performed preoperatively in all cases to check for any posterior segment pathology. Children’s pre-operative and longitudinal postoperative visual status was assessed by our research team. Visual acuities were measured using the Freiburg Visual Acuity Test (FrACT)25, shown to reliably capture extremely low vision traditionally referred to as light perception (LP), hand motion (HM) and finger counting (FC). The patients were then divided into two groups based on their preoperative visual acuity: ‘group PNB−’ (Patient Newborn Minus) consisted of patients whose pre-treatment visual acuity was worse than that of a typically sighted newborn (i.e., worse than 20/850 Snellen acuity), and ‘group PNB+’ (Patient Newborn Plus) consisted of patients whose pre-treatment visual acuity was equal to or better than that of a newborn (i.e., equal to or better than 20/850 Snellen acuity). We note that the threshold distinguishing the two patient groups is set by the approximate acuity of the typically sighted newborn, because our empirical predictions are biologically motivated by the typically developing visual acuity trajectory (detailed explanation below). Despite having two patient groups with different pretreatment visual acuities, we emphasize that all the patients who participated in this study had been profoundly visually impaired since infancy, with vision worse than the criteria for legal blindness; the best pre-operative acuity, recorded in only two individuals from group PNB+ (SS14 and SS18), was ~20/400 Snellen acuity. See Supplementary Information tables T1 and T2 for the complete acuity, clinical, and demographic data of the patients in each group at each time point.
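The grouping rule described above can be made concrete with a short sketch. It converts a Snellen fraction to logMAR using the standard relation logMAR = log10(denominator / numerator) and applies the 20/850 newborn threshold stated in the text. The function names (`snellen_to_logmar`, `assign_group`) are hypothetical and introduced only for illustration; this is not the authors' analysis code.

```python
import math

NEWBORN_SNELLEN_DENOMINATOR = 850  # 20/850 threshold taken from the text


def snellen_to_logmar(denominator, numerator=20):
    """Convert a Snellen fraction (numerator/denominator) to logMAR."""
    return math.log10(denominator / numerator)


def assign_group(pretreatment_denominator):
    """Hypothetical helper: label a patient from pre-treatment Snellen acuity.

    A larger Snellen denominator means worse acuity, so anything beyond
    20/850 falls in PNB-; 20/850 or better falls in PNB+.
    """
    if pretreatment_denominator > NEWBORN_SNELLEN_DENOMINATOR:
        return "PNB-"
    return "PNB+"


if __name__ == "__main__":
    for d in (200, 400, 500, 850):
        print(f"20/{d}: logMAR = {snellen_to_logmar(d):.2f}, group = {assign_group(d)}")
```

On this scale, larger logMAR values indicate worse acuity, so the 20/200 and 20/500 blur conditions used for the controls sit well below the newborn threshold.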

Based on our previous research, we know that despite significant acuity improvements, all children treated for early-onset bilateral congenital cataracts continue to show some degree of visual impairment26. Therefore, before determining the impact of early visual deprivation on task performance, we needed to quantify the effect of low visual resolution on each of the tasks, effectively allowing us to disentangle the relative roles of early visual experience versus image quality at test time. To do this, we recruited a group of 29 age-matched controls from a local residential school. All children were screened for vision problems as part of our volunteer school-based screening programs. Only children with normal or corrected to normal visual acuity were enrolled in our studies. These children were then randomly assigned to two mutually exclusive participant groups and tested on the same tasks as the patient groups while wearing blur goggles simulating visual acuity of either 20/200 Snellen Acuity ('group C200', N = 10 who performed all four tasks) or 20/500 Snellen Acuity ('group C500', N = 19, 9 of whom performed the Old/New tasks, and a different 10 children performed the Cambridge Memory tasks). Simulated acuity was achieved by attaching Bangerter Occlusion foils to clear safety goggles. Full details of the participants enrolled in the control groups are provided in the supplementary section, table T3.

Procedure

The protocol included two types of tasks: Old/New Recall and Cambridge Memory Test. These tasks have been widely used by researchers to quantify facial identification performance, as well as to identify prosopagnosia, or the inability to identify faces. We chose these tasks over other available tasks because (a) when this longitudinal study began, these tests were considered the “gold standard” in the community, (b) each task has both a ‘face’ and ‘object’ (or nonface) version that have been extensively standardized to have comparable difficulty at baseline (in the general population), and (c) these tasks have been adapted to be used with children, and extensive data with minors with typical vision and prosopagnosia is thus available from previous studies (for a comprehensive review see27,28,29).

Figure 1A shows sample trials from each of the two task types used in this study: an old/new recall task (including the Old/New Faces, or ‘O/N-Faces’, and its corresponding non-face task, the Old/New Flowers, or ‘O/N-Flowers’) and the Cambridge Memory Task (including the Cambridge Face Memory Test, or ‘CFMT’, which is a modified version of the CFMT-Boys, and its corresponding non-face task, the Cambridge Bike Memory Test, or ‘CBMT’, modified from the CBMT-Boys). Both task types measure the identification of unfamiliar faces/objects after a brief delay. These four tasks were selected because each task type has a face- and an object-version that are matched in format and difficulty for young children, and they are therefore most appropriate for testing face and object perception, which depend on dissociable processes in early childhood30. For a detailed summary of each task and data on typically developing children, see27,29,30,31. For a detailed discussion of how these tasks were selected and possible caveats, see Supplementary Information section S1.

Figure 1

Late sight onset, coupled with multiple years of post-treatment visual experience, leads to gains in object- but not face-identification. (A) Sample trials of the Old/New Recall task (left) and Cambridge Memory Task (right). (B) Average performance of group PNB− patients only, shown pre-treatment versus 3+ years following treatment, together with the performance of acuity-matched controls who performed the tasks with an imposed Snellen acuity of 20/200 (group C200) or 20/500 (group C500). Performance is shown for the face tasks (dark colors) and object tasks (light colors) separately. Error bars represent SEM, N indicates sample size, and significant group differences are indicated by * for p < 0.05, ** for p < 0.01, and *** for p < 0.001.

For our study, these tasks were slightly modified from their original version to accommodate the specific needs of our clinical cohort. Images were presented on a 15.6-inch monitor and participants were encouraged to sit approximately 40 cm away from the monitor. We used Testable for all stimulus presentation and data collection32. Participants performed the four tasks in the following order: Old/New Recall tasks first (Flowers followed by Faces) followed by the Cambridge Memory Tasks (Bikes followed by Faces). This way participants performed in the easier tasks first, and only if cooperative and able, continued to the more challenging tasks. By administering each of these tests at different time points following surgical treatment for congenital cataracts, we were able to track the development of face and object identification performance in the patients who experienced prolonged periods of early visual deprivation, and as a function of late-onset visual experience.

The Old/New recall tasks consisted of an encoding block and a test block. In the encoding block, participants were shown eight target images in sequence and asked to memorize them. The sequence was presented twice to ensure that the participant saw and attended to the images. Each image was presented at a large size of 5.5 square-inches, to accommodate patients' low vision, for 6000 ms to allow sufficient scan time, with a 750 ms inter-trial interval (ITI) fixation cross. This was followed by a test block with a total of 24 randomly ordered trials. In each trial, participants saw two images (one target, or old, image and one distractor, or new, image) presented side by side for 3000 ms, with a 750 ms ITI fixation cross. Participants were asked to point to the image (left vs. right) corresponding to the target (or old) image that they had seen in the encoding block, and the experimenter entered the response. Reaction time was not measured because our participants could not enter their responses independently. Figure 1A left shows sample trials of the face task.

The original Cambridge Face Memory Test (CFMT) is a standardized episodic memory test that involves studying six male faces, followed by a three-alternative forced-choice test. The test consists of three conditions of increasing difficulty: same view, novel view, and novel view with noise. We found that even control participants struggled with the last two conditions, so in this study we analyze and report the results for condition 1 only. Results for conditions 2 and 3 are included in the supplementary material, figure F2. The CFMT has been validated for detecting acquired prosopagnosia in adults33,34 and has been shown to produce much lower scores for inverted faces than for upright faces, confirming it is a reliable test of configural face processing35. Additionally, previous studies have found that performance on the CFMT correlates with performance on a face perception test without a memory component31,36. In contrast, performance on the CFMT shows only modest correlations with measures of nonface visual memory37,38 and even weaker correlations with measures of verbal memory39, suggesting it is a reliable test for isolating deficits in facial identification that are separate from other visual recognition skills. Finally, although recent evidence points to potential learning confounds with repeated testing40, we note that our study finds a lack of improvement (or learning) despite repeated testing, and we therefore do not consider this to be a critical confound.

The original version of the CFMT was adapted to study facial identification and prosopagnosia in children across development, leading to two standardized tests to be used with minors: the CFMT-Kids and its non-face (Bike) version, the CBMT-Kids. We used these two kids’ versions, with slight adaptation to the presentation times and image sizes, given the specific needs of our cohorts. All adaptations were equally implemented to the Face- and Bike-versions of the Cambridge Memory Tasks (CMT). Briefly, the CMT tasks in our study consisted of a practice block with cartoon examples, to make sure the participants understand the task. The participants were then tested on six target identities. For each identity, participants performed a memorization phase, immediately followed by a test phase. In the memorization phase, three images of a specific identity (e.g. Bob or bike A) were presented sequentially from three viewpoints: 30° left, frontal, and 30° right, for 6000 ms each image, and with a 500 ms ITI. This was followed by three sequentially presented test trials of that individual (e.g. Bob). Each trial corresponded to one of the memorized viewpoints and consisted of three face images from the same viewpoint: the target (Bob’s face) and two distractors. All three images were presented simultaneously for unlimited time, until the participant responded which image corresponds to the learned face. Order of the viewpoints in the three test trials and location of the target face within the distractors were randomized. The CMT was performed once with non-face bike images (i.e. CBMT) and once with faces (i.e. CFMT). Figure 1A right shows sample trials of the face task.

Results

Early visual deprivation leads to impairments in face- but not object-identification

Our first goal was to determine if the ability to identify individual faces can develop following early visual deprivation. We predicted that visual deprivation in the first few years of life will lead to persistent impairments in face- but not object-identification, even after children gain extensive visual experience (measured as time following treatment). To do this, we assessed how well patients with essentially no pattern vision prior to treatment (i.e. group PNB− only) perform on each of the face- and object-tasks prior to treatment and multiple years after treatment. In Fig. 1B we present average group performance (purple) on each of the four tasks, directly comparing performance pre-treatment to three or more years following treatment. Note that given the complex nature of the cohort that participated in this study, the composition of the patients participating in each task and at each time point is not completely overlapping (see Supplementary Information table T2 for full details on group composition). We account for missing data in our analyses using mixed effects models. Data is also shown for two non-overlapping control groups (green) who participated in the experiments while wearing blurring filters simulating Snellen Acuity of either 20/500 (group ‘C500’) or 20/200 (group ‘C200’).

Our data reveal several patterns. First, prior to treatment the patients are unable to perform any of the four tasks, with performance at or near chance across all four tasks. This contrasts with both control groups, who show better-than-chance performance regardless of task type (Old/New versus Cambridge Memory), category (Face versus Nonface) and acuity status at test time (C200 versus C500). To determine whether participants’ performance is significantly different from chance, we computed binomial probabilities for each task type separately. The Old/New task consists of 24 trials, with a 0.5 probability of success on each trial. From the binomial distribution, chance behavior corresponds to a performance score between 0.375 and 0.625 (correct responses on between 9 and 15 of the 24 trials), whereas a score of 8 or of 16 correct has a point probability of p = 0.044 under chance and is therefore treated as significantly different from chance. Similarly, the CMT task consists of 18 trials, with a 0.333 probability of success on each trial. From the binomial distribution, chance behavior corresponds to a performance score between 0.1667 and 0.5 (correct responses on between 3 and 9 of the 18 trials), whereas a score of 2 or of 10 correct has a point probability of approximately p = 0.03 under chance and is therefore treated as significantly different from chance. These data confirm that (1) prior to treatment, the patients’ visual status and experience were so profoundly impaired that they precluded the acquisition of both face- and object-identification, (2) these pre-treatment impairments cannot be attributed to the tasks themselves being too difficult, given that age-matched controls were able to perform all tasks reasonably well, even under extremely degraded visual resolution, and (3) despite the task modifications and blurred conditions implemented in our study, the face and nonface tasks remain well matched at baseline, as is evident from the controls’ comparable performance on the face- versus object-tasks.
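The chance ranges quoted above can be reproduced with a short calculation. The sketch below assumes, as the reported values suggest, that a score counts as consistent with chance when its binomial point probability is at least 0.05. The function names and the `cutoff` parameter are assumptions made for illustration rather than details stated by the authors.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def chance_band(n, p, cutoff=0.05):
    """Return (lowest, highest) score whose point probability is >= cutoff."""
    inside = [k for k in range(n + 1) if binom_pmf(k, n, p) >= cutoff]
    return min(inside), max(inside)


if __name__ == "__main__":
    # Old/New tasks: 24 two-alternative trials
    print(chance_band(24, 1 / 2), round(binom_pmf(8, 24, 1 / 2), 3))
    # Cambridge Memory tasks: 18 three-alternative trials
    print(chance_band(18, 1 / 3), round(binom_pmf(2, 18, 1 / 3), 3))
```

Under these assumptions the bands come out as 9 to 15 correct of 24 and 3 to 9 correct of 18, matching the ranges given in the text.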

Second, visual inspection of the data in Fig. 1 suggests that, following a prolonged period of visual deprivation, commencing sight late in life (treatment), coupled with extensive post-treatment visual experience (time after treatment), leads to the acquisition of object- but not face-identification skill, regardless of task type. To account for the missing data points, we ran a mixed-effects repeated measures analysis on each experiment type separately. For the old/new experiment, our analysis reveals a main effect of time relative to treatment (F(1,22) = 7.474, p = 0.0121) but no effect of object category (F(1,19) = 2.523, p = 0.1287) and no interaction (F(1,19) = 3.058, p = 0.0965). The lack of category and interaction effects was somewhat surprising, given that the data clearly show that the effect of time relative to treatment is not equal across the face and object tasks, and that Bonferroni pairwise comparisons of performance at 3+ years relative to pretreatment are significant for the object old/new task (p = 0.0035) but not for the face old/new task (p = 0.7491). To investigate this inconsistency, we looked back at individual participant data (see supplementary section, figure F1) and noticed that a single participant (SS #81) showed below-chance performance on the face task, likely due to a reversal of response (confusing the ‘old’ and ‘new’ responses). Excluding this data point, we find a more expected statistical pattern: a significant main effect of time relative to treatment (F(1,22) = 4.868, p = 0.0381), no effect of object category (F(1,18) = 1.25, p = 0.2782) and a significant interaction (F(1,18) = 6.974, p = 0.0166), with Bonferroni pairwise comparisons of performance at 3+ years relative to pretreatment remaining significant for the object old/new task (p = 0.0019) but not for the face old/new task (p > 0.9999). We emphasize that all results reported in the remainder of this paper retain the SS #81 outlier, taking the more conservative approach in our data analysis. Finally, performing a comparable mixed-effects analysis on the Cambridge Memory data, we find a main effect of time relative to treatment (F(1,20) = 10.24, p = 0.0045), a main effect of object category (F(1,17) = 9.981, p = 0.0057) and a significant interaction (F(1,17) = 5.44, p = 0.0322). Bonferroni pairwise comparisons of performance at 3+ years relative to pretreatment are significant for the object CMT task (p = 0.0006) but not for the face CMT task (p = 0.6211). Overall, our data reveal that when comparing performance pretreatment to three or more years after treatment, patients in group PNB− show significant gains on both object tasks, but no gains on either of the face tasks.

In addition to assessing performance gains as a function of time relative to treatment, Welch’s t-tests comparing performance on the face versus nonface tasks within each time point reveal that while the patients perform equally poorly on the face and nonface tasks prior to treatment (Preop: Old/New p = 0.9227, CMT p = 0.623), three or more years after treatment their performance is significantly better on the object tasks than on the face tasks (3+ years: Old/New p = 0.0197, CMT p = 0.0018). Taken together, these data support our primary prediction that PNB− children, with early-onset visual deprivation that lasts beyond the first few years of life, experience persistent face-specific identification impairments, despite both vision onset (treatment) and extensive visual experience (time post-treatment).
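For readers unfamiliar with Welch's correction, the sketch below computes the t statistic and the Welch–Satterthwaite degrees of freedom for two independent samples with unequal variances. The input values are arbitrary placeholders, not study data, and `welch_t` is a hypothetical helper name. Obtaining a p-value would additionally require the t distribution (for example from a statistics package), which is omitted here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; generally not an integer
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Placeholder proportions correct, NOT data from the study
    faces = [0.50, 0.54, 0.46, 0.58, 0.42]
    objects = [0.75, 0.83, 0.67, 0.92, 0.58]
    t, df = welch_t(faces, objects)
    print(f"t = {t:.3f}, df = {df:.2f}")
```

Because the approximation weights each sample by its own variance, the resulting degrees of freedom are usually fractional, which is why non-integer df values appear in reports of Welch-corrected comparisons.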

Finally, to disentangle the role of early visual deprivation from the effect of the degraded vision that patients inevitably have even after treatment, we ran all the tasks on two groups of age-matched control participants with typical visual development, who performed the tasks while wearing blur goggles that simulate sub-optimal visual acuity. Because the patient group’s average acuity 3+ years following treatment was 20/178 Snellen acuity, we predicted that if task performance depends only on image resolution at test time, and not on early visual experience, then the patients’ performance should be approximately comparable to that of the control group performing the tasks under 20/200 Snellen acuity. One-tailed unpaired t-tests with Welch’s correction comparing the patient group’s performance 3+ years after treatment to the performance of the control group tested with 20/200 Snellen acuity (C200) reveal significant differences on both of the face tasks (Old/New p = 0.0024, CMT p = 0.0048) but no significant differences on either of the object tasks (Old/New p = 0.3323, CMT p = 0.0867). Overall, we find that PNB− cataract-treated children can acquire object identification ability at levels comparable to acuity-matched controls, pointing to robust visual learning that is not bound by early critical periods. In contrast to this plasticity, children who begin to see late in life show persistent face-specific identification impairments, failing to show any performance gains even many years after treatment. Importantly, all data patterns found for objects compared to faces were consistent across both task types (Old/New Recall and Cambridge Memory Tasks).

Despite different pretreatment acuities, the two patient groups have comparable posttreatment acuities

To determine the impact of an initial period of low-resolution visual input on acquiring face- versus object-identification skill, we recruited a second group of patients with bilateral congenital cataracts, whose individual visual acuities prior to treatment were equal to, or better than, that of a typically developing newborn (group ‘PNB+’). Thus, we were able to compare the task performance of two patient groups, PNB− and PNB+, who differed in their pretreatment acuity. To make sure that any posttreatment performance differences found between PNB− and PNB+ cannot be attributed to group differences in posttreatment visual acuity, we compared the acuity profiles of the two patient groups at each timepoint. Acuity was measured using the Freiburg Visual Acuity Test (FrACT), which has been validated for reliably measuring profoundly low visual acuity25.

Figure 2A shows the acuities of individual participants per patient group and at each timepoint. Kolmogorov–Smirnov tests comparing the acuity distributions of the two patient groups were conducted at each timepoint. Our data reveal that although the visual acuity profiles of the two groups are statistically different prior to treatment (Preop: p < 0.00001), their acuities do not differ at any timepoint following treatment, and are completely overlapping at the final time point, 3+ years after treatment (see Fig. 2A for the full set of statistical results). Thus, despite our two patient groups having different pretreatment visual acuities, the two groups’ post-treatment acuity profiles were comparable, and any post-treatment performance differences that the two groups may show on the face- and/or object-identification tasks cannot be explained by differences in the clarity of their vision at task time.
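The two-sample Kolmogorov–Smirnov comparison used above rests on the largest vertical gap between two empirical cumulative distribution functions. A minimal sketch of that statistic is given below. The logMAR values are arbitrary placeholders rather than patient data, `ks_statistic` is a hypothetical name, and the p-value computation is left out.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The empirical CDFs only change at observed values, so check those
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


if __name__ == "__main__":
    # Placeholder logMAR acuities, NOT data from the study
    group_1 = [2.1, 2.3, 2.0, 2.4]
    group_2 = [1.4, 1.6, 1.5, 1.3]
    print(ks_statistic(group_1, group_2))
```

A statistic near 1 indicates almost no overlap between the two distributions, while a statistic near 0 indicates that they largely coincide.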

Figure 2

Acuity and pretreatment baseline performance. (A) LogMAR acuities of the two patient groups at each timepoint relative to treatment. Individual points correspond to individual patients who participated in the experiment at each corresponding time point. Lines indicate average logMAR acuities for each patient group at each time point. For comparison, the acuities of the control groups are shown in green. Kolmogorov–Smirnov tests comparing the distribution of acuities within each time point confirm that the two patient groups had significantly different pretreatment acuities, but comparable post-treatment acuities at all time points following treatment. Thus, group differences in task performance after treatment cannot be attributed to differences in acuity at the time of task. (B) Baseline performance prior to treatment only. Statistical comparisons confirm that prior to treatment, group PNB+ performance exceeded that of group PNB− on the object- but not on the face-identification tasks. Thus, coarse pretreatment visual acuity that is comparable to that of a typically sighted newborn is sufficient to allow some object-identification skill to emerge, but not sufficient for developing face-identification performance. Individual points correspond to individual participants, and error bars correspond to group SEM.

Newborn-like acuity is sufficient for acquiring some object-, but not face-, identification

Prior to assessing performance gained after treatment, an important question is whether one patient group had a pretreatment advantage on the tasks compared to the other group. Figure 2B shows task performance of the two patient groups at baseline (i.e., preoperatively only), to determine whether group PNB+ was able to perform the tasks better than group PNB− before treatment. Two-tailed unpaired t-tests with Welch’s correction comparing pretreatment performance of group PNB− to PNB+ confirm that prior to treatment group PNB+ outperformed group PNB− on the object-identification tasks (Old/New-Flowers: t = 3.496, df = 19.46, p = 0.0023; CBMT: t = 3.034, df = 17.21, p = 0.0076), but not on the face-identification tasks, where the two patient groups performed comparably prior to treatment (Old/New-Faces: t = 0.7691, df = 14.00, p = 0.4546; CFMT: t = 0.6335, df = 15.09, p = 0.5359). These findings have several important implications. First, they mean that even the extremely coarse visual resolution of group PNB+ is sufficient to resolve some aspects of object- but not face-identification prior to treatment. This is not the case for group PNB−. Second, because the two groups have comparable pretreatment performance on both face tasks, any post-treatment gains that group PNB+ may show relative to group PNB− on these tasks cannot be attributed to better baseline performance. In contrast, if group PNB+ outperforms group PNB− on the object tasks post-treatment, this may be partially due to their superior pretreatment performance. In the next section we report on the relative performance gains of each patient group following cataract-removal treatment.
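The fractional degrees of freedom reported above are characteristic of Welch’s correction. The following is a minimal sketch, in plain Python, of how such a test can be computed and how a two-tailed p-value follows from a given t and df. The function names (`welch_t`, `t_two_tailed_p`) and the accuracy values are hypothetical, and the p-value is obtained by simple numerical integration rather than by any particular statistics package.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's unequal-variance t statistic and Welch-Satterthwaite df."""
    va, vb = variance(a) / len(a), variance(b) / len(b)  # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value of Student's t via Simpson integration of its density."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


# Hypothetical pretreatment accuracies (proportion correct); NOT the study's data
pnb_plus = [0.70, 0.65, 0.75, 0.60, 0.72, 0.68]
pnb_minus = [0.52, 0.48, 0.55, 0.50, 0.47, 0.58]

t_stat, df = welch_t(pnb_plus, pnb_minus)
print(f"t = {t_stat:.3f}, df = {df:.2f}, p = {t_two_tailed_p(t_stat, df):.4f}")

# The same helper can be applied to a reported (t, df) pair, e.g. Old/New-Flowers
print(f"p for t = 3.496, df = 19.46: {t_two_tailed_p(3.496, 19.46):.4f}")
```

Passing a reported t and df to `t_two_tailed_p` gives a quick consistency check against the p-values listed in the text.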

HIA hypothesis: an initial period of low visual acuity is beneficial for later learning of face-identification

Figure 3A summarizes the predictions set forth by our proposal on the downsides of high initial acuity21,41. Briefly, the theory predicts that, like a typically developing neonate, children in group PNB+, who experience an initial period of low-resolution vision followed by the introduction of higher-resolution vision, will improve on both the object- and face-identification tasks. In contrast, children in group PNB−, whose preoperative vision is worse than that of a typically sighted neonate, effectively skip this initial period of low-resolution input and begin their visual journey with high-resolution vision. As a result, these children will not develop large receptive fields effective for extended spatial integration and will fail to improve on configural facial identification. They should, however, improve on object-identification tasks, as these rely on local processing and thus do not require the integration of information over large areas of the image. Indeed, data presented in Fig. 1B confirmed this second prediction, and showed that in both the Old/New Recall and Cambridge Memory tasks, group PNB− improved on the object- but not the face-identification tasks.

Figure 3

High Initial Acuity. (A) Schematic of predictions set forth by the ‘downsides of high initial acuity’ by task type and patient group. (B) Longitudinal performance on all tasks divided by patient group. Significance testing was performed within each patient group for each task to compare pretreatment performance to performance at each post-treatment timepoint. Analysis reveals that, consistent with the predictions shown in panel (A), group PNB− improves on object- but not face-identification tasks, even multiple years following treatment, whereas group PNB+ improves on both the object- and face-identification tasks. Note that the above analysis excludes S59’s 9–24 month data on the Old/New-Flowers task because this data point showed an abnormal sudden drop, making it an outlier; to the best of our understanding, the drop resulted from the experimenter switching the ‘old’ and ‘new’ response buttons. To be conservative, we removed this timepoint from the analysis, despite being reasonably confident that the participant’s 20% performance reflects an 80% performance. The full set of analysis results for all comparisons is included in the supplementary section, figure F2.

In Fig. 3B we compared task performance of the two patient groups (PNB− in purple; PNB+ in orange) as a function of time since treatment (where treatment is the proxy for the point at which high-resolution input is introduced). A mixed-effects three-way ANOVA was performed on each task type separately, given the different accuracy levels, with group (PNB+ vs. PNB−), time since treatment, and condition (face vs. object) as factors, and accuracy as the dependent variable. For both the Old/New task and the Cambridge Memory task we found significant main effects but no interaction (see Supplementary Information table T4 for full statistical analysis). While the lack of a three-way interaction was somewhat surprising given our hypothesis, we note that this analysis should be treated with extreme caution: our sample sizes are small and not consistent across time or condition (face vs. object), and the power is likely too low to reveal any three-way interactions. To better understand the pattern of performance gains that occur after treatment for each participant group, we performed pairwise comparisons using uncorrected Fisher’s LSD tests between each ‘pretreatment’ performance and each of its corresponding post-treatment performances, treating each participant group and condition separately. For clarity, the full set of statistical comparisons is included in the Supplementary Information table T5.
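To make the pairwise procedure concrete, the following is a simplified sketch of uncorrected Fisher’s LSD comparisons of each post-treatment timepoint against the pretreatment baseline. It is not the study’s analysis: it assumes a one-way layout that treats timepoints as independent groups (ignoring the repeated-measures, mixed-effects structure described above), and the function names, timepoint labels, and accuracy values are all hypothetical.

```python
import math
from statistics import mean


def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value of Student's t via Simpson integration of its density."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def fisher_lsd_vs_baseline(groups, baseline):
    """Uncorrected LSD tests of every group against `baseline`.

    Uses the pooled within-group mean square (MSE) and its error df from a
    one-way layout; no multiple-comparison correction is applied.
    """
    n_total = sum(len(v) for v in groups.values())
    df_error = n_total - len(groups)
    sse = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    mse = sse / df_error
    base = groups[baseline]
    results = {}
    for label, vals in groups.items():
        if label == baseline:
            continue
        se = math.sqrt(mse * (1 / len(base) + 1 / len(vals)))
        t = (mean(vals) - mean(base)) / se
        results[label] = (t, df_error, t_two_tailed_p(t, df_error))
    return results


# Hypothetical accuracies for one group on one task; NOT the study's data
accuracy = {
    "pre": [0.50, 0.55, 0.48, 0.52],
    "0-9 months": [0.54, 0.58, 0.50, 0.56],
    "3+ years": [0.70, 0.75, 0.68, 0.72],
}

for label, (t, df, p) in fisher_lsd_vs_baseline(accuracy, "pre").items():
    print(f"pre vs {label}: t = {t:.3f}, df = {df}, p = {p:.4f}")
```

Because the comparisons are uncorrected, each p-value is interpreted on its own, which is why the text frames these tests as a way to describe the pattern of gains rather than as confirmatory evidence.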

Our data show two primary findings. First, we replicate the finding from Fig. 1B, and show that, following treatment, group PNB− (purple) exhibited statistically significant improvements on both of the object tasks but did not improve on either of the face tasks, regardless of time after treatment. Second, we find that group PNB− improvements on the object-identification tasks do not occur as soon as high-resolution information is introduced, but rather require years of visual experience to emerge, with significant gains observed primarily at the ‘3+ years’ timepoint.

Overall, our data show that group PNB− children, whose pretreatment vision is worse than that of a typically sighted newborn, (a) are unable to identify faces and objects prior to treatment, (b) are able to learn object- but not face-identification following treatment, and (c) acquire object identification slowly, indicating that this learning depends not only on the introduction of high-resolution information but also on gaining visual experience.

In contrast to PNB−, children in group PNB+ experience an initial period of low-resolution visual input prior to treatment, followed by the introduction of higher resolution input after treatment. According to the predictions of the HIA hypothesis, this group’s acuity trajectory of low-to-high visual acuity should lead to later improvements on both face- and object-identification. Indeed, as Fig. 3B shows, group PNB+ (orange) improved significantly on both the face- and object-tasks following treatment. Additionally, this group improves on the object-tasks immediately upon the introduction of high-resolution input, whereas their face-identification performance takes longer to emerge. Importantly, inferences about time-to-improvement should be made cautiously: this pattern is less consistent on the Cambridge Memory task, likely because this task tends to generate more variable patterns across participants30.

Overall, our data suggest that group PNB+ children, whose pretreatment visual acuity is comparable to that of a newborn, (a) are able to use their extremely coarse pretreatment vision to minimally identify objects but not faces, (b) acquire both face- and object-identification upon the introduction of high-resolution visual information (treatment), and (c) show almost immediate improvements in object identification, suggesting this skill requires little visual experience, but a more protracted acquisition of face identification, which depends on both high-resolution visual input and visual experience.

General discussion

We have examined how early visual experience impacts later development of face- versus object-identification. The finding that cataract-treated children with profound early visual deprivation fail to improve on the face-identification tasks strongly suggests that this skill is highly vulnerable to early critical periods, a finding consistent with previous reports of cataract-treated children that showed impairments on configural face tasks11,42. Our longitudinal data extend these previous reports to show that despite both the onset of sight and extensive visual experience (or time after treatment), group PNB− patients did not show any recovery of standardized facial identification skill but did show gains on object-identification. We note that these face-specific vulnerabilities do not hold for all face-related tasks: children with late sight-onset do improve on face-detection tasks (‘is this a face?’)10, some aspects of facial expression recognition43, and localized facial recognition based on features14, though these face recognition improvements may rely on neural representations that are different from those supporting recognition in typically sighted children. Taken together, our data point to persistent face-specific impairments that uniquely affect identification of individual faces, a process that, despite not reaching full maturity until adolescence, seems to be highly dependent on early visual experience.

We hypothesized that commencing visual experience with abnormally high acuity compared to typically-sighted newborns may result in an over-reliance on localized processing, leading to persistent impairments in perceptual tasks that rely on extended spatial integration (such as facial-identification, detecting configural changes of face parts, and the face composite effect15) but improvements in other tasks that can be performed more locally (such as face-detection, detecting featural changes of face parts, and object-identification). Indeed, group PNB− data confirm this prediction: children with extended visual deprivation begin their visual journey with unusually high visual acuity (Fig. 2A), effectively skipping an initial period of newborn-like visual acuity, and despite years of visual experience, acquire object- but not face-identification skill (Figs. 1B and 3B, purple). A retrospective look at past studies with cataract-treated children suggests that this idea holds across other task domains as well. For example, individuals with brief early visual deprivation, tested in adulthood years after treatment, showed little to no deficit in motion processing using only local cues (i.e. motion direction detection in sine wave gratings) but significant deficits on tasks that required the global integration of moving dots (i.e. motion coherence tested with random dot kinematograms)44.

A second prediction is that if a Prakash child were to experience an initial period of low visual acuity before the onset of high visual resolution (a visual trajectory characteristic of typically-sighted infants), then they would develop mechanisms that support extended spatial integration, and in turn acquire both face- and object-identification skill (Fig. 3A, second row). Consistent with this prediction, children in group PNB+, whose pretreatment acuity was comparable to that of a typically sighted newborn, improved on both the face- and object-identification tasks following treatment.

While our data in this study are consistent with the predictions put forth by the ‘downsides of high initial acuity’21, we note that we still cannot rule out alternative explanations such as critical periods for face processing. For example, the “sleeper effects” account of visual development suggests that very early visual experience that occurs when the infant’s vision is still poor (e.g. prior to 2 months old) plays a critical role in setting up the neural architecture for processes, such as face perception, that come online much later in the visual development trajectory5,45. Both the ‘downsides of high initial acuity’ and the “sleeper effects” accounts emphasize the critical timing of typical early or initial visual input for modulating or bootstrapping visual processing mechanisms that will become useful later in life, and therefore predict that introducing this period of low-resolution input later in development will not reverse the damage done by the abnormally high visual acuity that was experienced initially. Building on the “sleeper effects” proposal, the ‘downsides of high initial acuity’ account attempts to provide a more parsimonious, non-domain-specific explanation. Rather than focusing on the absence of visual information early in development, it places emphasis on deviations from the typical acuity trajectory. Thus, this model accounts for the range of findings with cataract-treated children, including why some processes within face perception are impacted (e.g. configural face recognition) while others are not (e.g. featural face recognition), why some perceptual skills can fully recover (e.g. object recognition) and others remain permanently impaired (e.g. global motion perception), and finally, with the current findings, why even minimal pattern vision prior to treatment leads to improvements on specific tasks (group PNB+) whereas the exclusion of this low-resolution visual input does not (group PNB−).

Finally, our proposed hypothesis raises interesting paths for thinking about potential rehabilitation and targeted interventions. For a child treated for congenital cataracts, can we provide an initial period of low-resolution visual input to facilitate the establishment of extended spatial integration? If so, will this lead to larger receptive fields, as tested with measures of population receptive fields46? An affirmative answer would suggest that any observed improvements would be long-lasting and may have significant implications both for understanding the mechanistic changes that are impacted by early experience, and for designing effective interventions.