Introduction and background

Homonymous hemianopia (HH) is a visual field defect that is defined by complete or partial blindness in the visual fields to the right or left side of both eyes, commonly caused by cerebral infarction [1]. Visual field defects are estimated to affect 20 to 57% of people who have had a stroke [2] with a recent UK reported incidence of 30% [3]. Approximately 8 to 10% of stroke patients with visual field defects have permanent HH [4].

Functional ability in activities of daily living like mobility, reading, driving and general quality of life can be affected by visual field defects following stroke. Additionally, visual field loss may influence the patients’ ability to participate in rehabilitation and has been linked to depression, anxiety and social isolation [2, 5]. There are many interventions for visual rehabilitation: restoring the visual field (restitution therapy), changing behaviour to compensate for lost visual function (compensation training) and substituting for the visual field defect by using a device or extraneous modification (substitution therapy) [6].

Visual scanning training is one of the compensation approaches for hemianopia rehabilitation that has been shown to lead to significant improvements in vision‐related quality of life judgements by affected patients [7,8,9]. Moreover, previous research shows that rehabilitation using visual scanning training leads to neural changes in addition to compensatory behavioural improvement. A study undertaken by Nelles et al. [10] investigated eye movement training–induced plasticity with functional magnetic resonance imaging (fMRI) in patients with HH and showed increased neural activity in the contralesional extrastriate cortex, suggesting that training of exploratory eye movements induces changes in the cortical representation of hemifields.

It has been hypothesized that audio-visual (AV) stimuli can further enhance saccade precision into the missing hemifield [11]. Furthermore, systematic AV stimulation of the blind hemifield has been shown to improve accuracy and search times in visual exploration, probably due to the stimulation of the superior colliculus (SC), an important multisensory structure involved in both the initiation and execution of saccades [12]. SC areas receive converging projections from different senses and consist mainly of multisensory neurons. As a result, the unimodal processing of visual information in the blind hemifield seems to be affected by the interaction between different sensory inputs occurring within the SC [13]. The process of synthesizing the information provided by our different senses is known as “multisensory integration”, which can improve perceptual and behavioural performance more than the sum of their individual action, leading to a profound impact on our daily lives [14].

Tinelli et al. [15] found that audio cues, spatially and temporally coincident with a visual stimulus, improve visual perception in the blind hemifield of patients with HH. This observation can be explained by populations of neurons in the SC that respond ‘super-additively’ to bimodal signals that are temporally and spatially coincident. When Wallace et al. [16] examined neuronal responses to multisensory cues in the SC of the rhesus monkey, they showed that unimodal and multisensory neurons were clustered by modality, that these modalities were represented in a map-like fashion and that the different representations were aligned with one another. Each multisensory SC neuron has multiple receptive fields, one for each type of stimulation to which it responds. The combination of stimuli may result in response enhancement, response reduction or no interaction, depending on the location of the stimuli relative to one another and to their respective receptive fields [16]. Consequently, maximal response interactions were seen when visual and auditory inputs originated from the same source position and were thus likely to be caused by the same event. In contrast, when stimuli were spatially separated, one of them would fall outside the excitatory borders of its receptive field, so that either no interaction was produced or one stimulus diminished the effectiveness of the other [16].
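The distinction between response enhancement, response reduction and ‘super-additive’ responses can be made concrete with a small calculation. The sketch below is illustrative only. It assumes that a neuron's response is summarised as a mean spike count per trial, and it uses a commonly cited enhancement index (the bimodal response relative to the best unimodal response). The function names and the example values are hypothetical and are not taken from the studies reviewed here.

```python
def enhancement_index(visual, auditory, bimodal):
    """Percent change of the bimodal response relative to the best unimodal response."""
    best_unimodal = max(visual, auditory)
    if best_unimodal <= 0:
        raise ValueError("at least one unimodal response must be positive")
    return 100.0 * (bimodal - best_unimodal) / best_unimodal


def classify_interaction(visual, auditory, bimodal):
    """Label the bimodal response relative to the two unimodal responses."""
    best_unimodal = max(visual, auditory)
    if bimodal > visual + auditory:
        return "super-additive enhancement"  # exceeds the sum of the unimodal responses
    if bimodal > best_unimodal:
        return "enhancement"
    if bimodal < best_unimodal:
        return "reduction"
    return "no interaction"


# Hypothetical mean spike counts per trial (illustrative values, not study data).
v, a, av = 4.0, 3.0, 12.0
index = enhancement_index(v, a, av)
label = classify_interaction(v, a, av)
print(f"{index:.0f}% -> {label}")
```

With these made-up values, the bimodal response exceeds the sum of the two unimodal responses, which is the pattern described as super-additive.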

The expectation of an enhancement of visual processing is also based on an important and adaptive property of multisensory integration [13]. Repeated experience with the same AV stimulus can increase the neuron’s sensitivity to auditory and visual individual inputs [14], leading to the conclusion that visual and spatial compensatory functions can be reinforced by audio-visual training (AVT) in adult patients with chronic visual field defects following a stroke.

The factors that influence how a person adapts to visual field loss, the interventions that are available to aid the adaptation process as well as the effects of interventions on people with visual field defects after stroke are covered well in previous reviews [2, 6, 17]. AVT-induced plasticity in patients with hemianopia, however, is less well described [10]. Consequently, further research is required to identify the optimal training paradigm and the neural mechanisms underlying the effect of bimodal stimulation (AVT) [18]. This review focuses on the effectiveness of interventions that use AV multisensory training as rehabilitation for stroke survivors with visual field defects and its impact on the quality of their daily life and the underlying change in brain function and structure.

Methodology

The review was conducted and is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [19, 20] (Table 3 in the Appendix). We used the Preferred Reporting Items for Systematic review and Meta-Analysis Protocols (PRISMA-P) 2020 checklist with recommended items to address this systematic review protocol [19, 20] (Table 4 in the Appendix).

Eligibility criteria

The database searches covered randomized trials, controlled trials, prospective and retrospective cohort studies, observational studies and case–control studies in adults after stroke in which the intervention focused on multisensory integration, especially AV training, to improve visual and spatial compensatory functions or the participant’s ability to cope with visual field loss. Case reports and letters were excluded. Articles that discussed other visual impairments alongside HH but discussed visual field loss separately were included. As most of the articles were in English, we searched without specifying a language, and no translations were needed. It has been reported that AVT can induce activation of visual responsiveness of the oculomotor system in children and adolescents with acquired lesions as effectively as in adults [21]. Therefore, we included studies of adult and child participants reporting on visual field loss. The outcome measures we included were comparisons between unimodal and multimodal stimulation in compensatory recovery, functional performance in activities of daily living, visual impairment measures for HH, dyslexia, visual scanning and search tasks, effects on multisensory integration in the brain assessed by neuroimaging, and AV localization and space perception in hemianopia. Studies reporting on mixed populations had to have 50% or more of subjects diagnosed with hemianopia and data available for this category.

Information sources and keywords

We conducted a full systematic review of the literature in the Scopus, PsycINFO, ScienceDirect, Web of Science and PubMed databases dating from the start of recorded databases for each information source to June 2020 (Table 1).

Table 1 Search terms

Selection process

The titles and abstracts identified from the search were screened against the predetermined inclusion criteria at each phase of the review. Screening was carried out independently by one author (KW), and at least 10% of the records were double-checked by a second author (FJR). When further information was required for this process, the full paper was obtained and the selection criteria were applied. A subsequent review of the full papers was undertaken to determine which studies should be included. We resolved disagreements at each step by discussion between the two review authors. If a disagreement remained, we sought the opinion of a third reviewer (GM).

Data extraction for the included studies

A purpose-designed form was used for the data extraction process. The form included all the factors identified by the researchers (KW, FJR) as potentially important for analysing the effect of AVT on visual compensatory functions: extent of visual field loss; age and gender; research design, sample size, AV training paradigm and training dose; primary and individual assessments; and primary and individual results.

Quality assessment

The term “quality” refers to “the degree to which a study employs measures to minimize bias and errors in its design, conduct, and analysis” [22]. The quality of included studies was reviewed using the following checklists:

  • An adapted version of PRISMA was used to assess evidence in review articles [19].

  • An adapted version of Consolidated Standards of Reporting Trials (CONSORT) was used for evaluation of the quality of evidence in randomized controlled trials and controlled trials [23].

  • Quality assessment using the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklist was performed for the observational studies (available on STROBE statement: available checklists; strobe-statement.org).

Dealing with missing data

Our strategy was to contact authors of included studies if important data were not available. The reviewed papers contained the relevant data so that no authors had to be contacted.

Data analysis

We provided a narrative synthesis of the findings from the included studies, structured around the type of intervention, target population characteristics, type of outcome and intervention content. We found substantial heterogeneity between the included studies; therefore, a meta-analysis could not be undertaken. The results are supported by a summary of findings (table and figure highlights). The findings are grouped into three categories:

  1. First is AV stimulation training outcome measures on HH adults/children by a comparison between unimodal and multimodal stimulations in compensatory recovery. The outcome measures involve functional ability in activities of daily living, visual scanning and search tasks, hemianopic dyslexia and the electroencephalographic (EEG) measures on spatial attention for HH. In addition, we included the summary of findings regarding the maintenance of functional ability post training.

  2. Second is AV localization and space perception in HH.

  3. Third is AV integration in patients with visual field defects.

Results

The flowchart for this systematic review is illustrated in Fig. 1. Sixteen studies were included (fourteen articles (188 participants) and two literature reviews). The key data extracted from the included studies are presented in Table 2.

Fig. 1 PRISMA flow diagram. Schematic of the literature search and article selection used to identify studies on using AVT as a rehabilitation strategy for stroke survivors with HH

Table 2 Key data extracted from the studies of audio-visual training for patients with hemianopia

Quality assessment

The risk of bias was assessed for each of the included articles (Online Resources 1, 2 and 3; see Supplementary Files 1, 2 and 3). Overall, no article scored 100% on quality assessment. The articles comprised 12 observational studies, two randomized controlled studies and two reviews. Twelve of the 16 articles scored between 76 and 87% on quality assessment and were deemed to be of good quality. Four studies scored between 52 and 73% on the relevant quality checklists. All articles were included in this review.

Adult populations

AVT outcome measures on patients with hemianopia

Visual training vs. audio-visual training

Two reviews discussed visual rehabilitation using multisensory stimulation to compensate for the visual loss after stroke [24, 25]. Seven original research articles recruited patients with HH and trained them on AVT to study the effects of multisensory training on oculomotor scanning behaviour in comparison to the visual training (VT). A total of 71 patients (and 36 controls) were recruited for the AVT and tested on the same apparatus before and after the training period, either on unimodal visual detection task or on both unimodal and multimodal visual detection tasks. These included the following study types: two randomized controlled studies (n = 32) [18, 26], two cohort studies (n = 40) [27, 28], one case–control study (n = 24) [12] and two uncontrolled longitudinal studies (n = 11) [15, 29]. The training duration varied between 2 weeks and 2 months, except in one study where one practice and one experimental session were undertaken, and the evaluation was performed during the experiment [29].
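As a quick consistency check on the sample sizes quoted above, the per-design totals can be summed and compared with the overall number of patients and controls. This is only an arithmetic sketch. It assumes that the n values given for each study design count patients and controls together, and the dictionary keys are illustrative labels.

```python
# Sample sizes as reported in the text, grouped by study design.
sample_sizes = {
    "randomized controlled": 32,
    "cohort": 40,
    "case-control": 24,
    "uncontrolled longitudinal": 11,
}

# Overall numbers of patients and controls as reported in the text.
patients, controls = 71, 36

total_by_design = sum(sample_sizes.values())
print(total_by_design, patients + controls)
```

Under that assumption, the two totals agree.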

During the training, visual and auditory stimuli were presented in four different ways:

  1. Spatially and temporally congruent only, by presenting acoustic and visual stimuli of the same duration (100 ms) at the same time and in the same spatial position, in two studies [18, 27]

  2. Spatially congruent and disparate with temporally congruent only, in which the acoustic and visual stimuli were presented either at the same time and location or at the same time but in different locations (16° and 32° of disparity on either side), in two studies [15, 29]

  3. Spatially and temporally congruent and disparate, in which the spatial disparity between the acoustic and visual stimuli was systematically varied (0°, 16° and 32° of disparity) and the temporal interval between the acoustic stimulus and the visual target was gradually reduced from 500 to 300 to 0 ms, in two studies [12, 26]

  4. Passive auditory stimulation, which rests on the hypothesis that sensory input from an intact modality (auditory) may improve processing of information by spared structures of a damaged sensory system (visual) through synchronous neural activity induced by repetitive sensory stimulation, without requiring any active task from the patient; this protocol is referred to as coactivation or unattended activation-based learning, in one study [28]
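The third presentation scheme above combines three spatial disparities with three temporal intervals, and a short sketch may help to picture the resulting set of conditions. It assumes, for illustration only, that the two factors are fully crossed; the reviewed studies describe the temporal interval as being reduced gradually over training, so the actual schedules may differ. The variable and function names are hypothetical.

```python
from itertools import product

# Values taken from the description above; names are illustrative.
spatial_disparities_deg = [0, 16, 32]
temporal_intervals_ms = [500, 300, 0]

# Assumption: every temporal interval is paired with every spatial disparity.
conditions = [
    {"disparity_deg": d, "interval_ms": t}
    for t, d in product(temporal_intervals_ms, spatial_disparities_deg)
]


def is_fully_congruent(condition):
    """True when sound and light share both location and onset."""
    return condition["disparity_deg"] == 0 and condition["interval_ms"] == 0


for condition in conditions:
    print(condition, "congruent" if is_fully_congruent(condition) else "disparate")
```

Only one of the nine combinations is congruent in both space and time, which corresponds to the single condition used in the first presentation scheme.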

All studies reported an improvement in ocular exploration after AVT, which allowed patients to efficiently compensate for the loss of vision with a clear advantage of the AVT in comparison to the visual-only exploration training. It has been found that a sound, spatially and temporally coincident to a visual stimulus, can improve visual perception in the blind hemifield of patients with HH [24, 26].

Imaging studies in humans have confirmed the involvement of the SC and posterior cortical areas, including the temporo-parietal and posterior parietal cortices, in mediating AV multisensory integration [24]. The studies suggested that, because most of the patients with damage to the visual cortex have an intact SC, it might be possible to train the use of retinotectal functions by AV stimulation in both chronic and acute phases after the stroke [12, 18, 24]. Nevertheless, Ten Brink et al. [29] indicated that, for all patients with HH participating in their study, saccade accuracy was affected by visual stimuli only in the intact, and not in the blind, visual field. An enhancement in eye movements after the spatially coincident visual stimulus was observed in only one participant, who had a more limited quadrantanopia. They concluded that multisensory integration is infrequent in the blind field of patients with hemianopia.

Conversely, in the study of Jay and Sparks [30], trained monkeys made saccadic eye movements to auditory or visual targets whilst the activity of visual-motor (VM) cells and saccade-related burst (SRB) cells was monitored. The authors stated that “the SC is a site where sensory signals (either auditory or visual signals), originally encoded in different coordinates, converge and are translated into a common motor command: a command to correct for saccadic motor error” [30]. This was based on the largely accepted basic hypothesis that sensory input from an intact modality (audition) can enhance processing of information by spared structures of a damaged one (vision). Lewald et al. [28] assumed that AV bimodal neurons not only react to stimulus combinations and integrate information from different sensory inputs, but can also respond to unimodal stimuli and provide a substrate for signalling in two separate modalities. It has been shown that one-time passive auditory stimulation on the side of the blind, but not of the intact, hemifield of patients with hemianopia induced an improvement in visual detections by almost 100% within 30 min after stimulation [28]. The authors assumed that an activation of the surviving parts of the primary pathway and/or the colliculo-pulvinar-extrastriate pathway in HH may lead to an improvement of the related residual visual abilities in the blind field, either by more effective sensory processing of unimodal visual information within the residual pathway or by an increase of spatial attentional functions.

Head fixation and eye movement were monitored in all the studies. Five studies used an optic eye tracker (Eye-Track ASL-6000) and/or infrared video camera where the position of the subject’s eye in the visual scene was monitored on-line by the experimenter [12, 15, 18, 27, 29]. Two studies analysed the eye movement using electro-oculography (EOG) [18, 28]. Fixation was monitored visually by the experimenter standing behind the apparatus in one study [26]. Improvement in oculomotor exploration characterized by fewer fixations and refixations, quicker and larger saccades, reduction in scan path length and the mean exploration time was reported in five studies [12, 15, 18, 27, 28].

All studies demonstrated that the improvement found can be ascribed to compensatory behaviour as there was no significant difference observed in perceptual sensitivity when patients were not allowed to move the eyes (fixed-eyes condition), emphasizing that the treatment did not improve the scotoma in the visual field. This means that AVT is not a restorative treatment in nature.

Activities of daily living

Three articles evaluated the activities of daily living (ADL) for patients with HH in the chronic stage, more than 3 months after stroke (total n = 30; study samples of 8, 10 and 12) [12, 26, 27]. A self-evaluation questionnaire containing 10 items rated on a 5-point Likert scale was used, covering (1) seeing obstacles, (2) bumping into objects/obstacles, (3) losing the way, (4) finding objects on the table, (5) finding objects in the room, (6) finding objects in the supermarket, (7) walking in a crowd, (8) reading, (9) going upstairs/downstairs on the staircase and (10) crossing the streets. Evaluation of ADL for stroke survivors with HH in the subacute stage, between 3 and 24 weeks after stroke, has been reported in one study (n = 20) [18]. The questionnaire in that study comprised only items that can be observed in an inpatient rehabilitation setting, including finding objects on the table, avoiding bumping into objects/persons, eye contact, seeing obstacles and reading.

AVT promoted a significant reduction in subjective perceived disability according to the analysis of ADL for patients in both chronic and acute stages after brain damage, confirming a transfer of training effects to ecological environments. It has been indicated that whilst there was a significant improvement in the ADL results after the AVT, no difference was observed after a control VT, consisting of systematic visual stimulation of the visual field on the same apparatus as the AVT. The control VT was performed for 2 weeks before starting the AVT and for a similar amount of training time (4 h/day) [12].

Visual scanning/searching

Visual scanning or visual search tests consisted mainly of three subtests: (1) the ‘E–F test’, where patients search for the letter F embedded among distractors, the letter E; (2) the ‘triangle test’ where patients reported the number of triangles embedded within square distractors with the same size and colour; and (3) the ‘number test’ containing 15 numbers (from 1 to 15) randomly distributed over a black background from which the patient is asked to point to the numbers in an ascending order.

Four articles (n = 50, 32 males and 18 females) reported a significant improvement in visual search performance in terms of both accuracy and search times. Comparison of pre- and post-AVT tests showed that patients’ visual scanning behaviour became more efficient and faster, giving evidence that stimulating the SC may induce a more organized pattern of visual exploration due to an implementation of efficient oculomotor strategies [12, 18, 26, 27]. Passamonti et al. [12] observed significantly fewer fixations, reduced saccadic duration and significantly increased mean saccadic amplitude in all comparisons before and after AVT. Additionally, in a study conducted by Keller and Lefin-Rank [18], the detection rate of target stimuli improved by about 46% in patients of the AVT group, whereas in patients of the VT group, it only improved by 16%. This may suggest that the amelioration in visual perception induced by training is mostly mediated by the oculomotor system where patients can actively scan via eye movement [26], supporting the hypothesis suggested by Jay and Sparks [30] that auditory and visual signals are translated into common coordinates at the level of the SC and share a motor circuit involved in the generation of saccadic eye movements.

Hemianopic dyslexia/reading

Hemianopic dyslexia was assessed before and after the AVT in four articles (n = 50, 32 males and 18 females). In one study, the reading task was for single words only, which were presented in upper-case Italian letters [26]. The other three studies examined the reading time and accuracy by using longer texts (8 to 20 lines) [12, 18, 27]. Comparisons between VT and AVT revealed statistically significant differences in favour of bimodal training. The reading time for patients in the AVT group reduced from 177 s before training to 75 s after training. By contrast, only a slight reduction in the reading time was shown for patients in the VT group (from 195 s initially to 175 s post training) [18]. In addition, Grasso et al. [27] revealed a significant effect of the AVT on the reading speed of patients with HH. Generally, according to a review on visual rehabilitation comparing multisensory stimulation and visual scanning, the reading performance improved in all patients after the AVT treatment period, reducing the ocular reading parameters for both progressive and regressive saccades [24]. Only one study examined a lateralization effect of the affected hemifield on reading impairment in patients with HH, by measuring five variables: the number of saccades in the reading direction, the number of regressions (backward saccades), the number of saccades during the return sweep (additional to the one necessary to start a new line), the mean duration of fixation and the amplitude of reading saccades [12]. For right HH, the saccadic amplitude increased and the fixation duration reduced during reading, with fewer errors, fewer progressive saccades and fewer regressions, whilst only the number of saccades during return sweep decreased in left HH [12]. Patients with left HH obtained an almost complete normalization of defective ocular responses; however, patients with right HH still showed an impairment of the ocular responses, despite the clear benefit gained [12].
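To put the reading times reported for the AVT and VT groups [18] on a common scale, the change can be expressed as a percentage of the pre-training time. The short sketch below does only this arithmetic on the group values quoted in the text. The function name is illustrative, and the calculation is not an analysis performed in the original study.

```python
def relative_reduction(before_s, after_s):
    """Reduction in reading time as a percentage of the pre-training time."""
    return 100.0 * (before_s - after_s) / before_s


# Group reading times (seconds) before and after training, as quoted in the text.
avt = relative_reduction(177, 75)
vt = relative_reduction(195, 175)
print(f"AVT: {avt:.1f}%  VT: {vt:.1f}%")
```

On these figures, reading time fell by roughly 58% after AVT compared with roughly 10% after VT.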

Electroencephalographic measures on spatial attention for HH

A similar study paradigm was used in two studies for the EEG assessment, in which patients performed a simple visual detection task. EEG data were recorded in both studies (n = 18, 15 males and 3 females) [27, 31] at three time points: baseline (B1); control baseline (B2), 2 weeks after B1 and immediately before the AVT; and after the AVT treatment (P). In addition, Grasso et al. [27] included a follow-up session (F) 8 months after the treatment ceased (Fig. 2).

Fig. 2 The timeline for the EEG measurements in two studies [27, 31]. Clarification of the sessions where the EEG measurements were obtained over time

The data were recorded from 27 electrode sites and the right mastoid. The left mastoid was used as the reference, whilst the ground electrode was positioned on the right cheek [27, 31]. P3 components were measured as the mean amplitude in a time window between 200 and 600 ms after the presentation of the stimulus. In the chosen time window, scalp topography at B1 indicated a maximal positive inflection over electrodes CP1, P3 and Pz [27, 31]. Therefore, data from these electrodes were used for statistical analysis. Dundon et al. [31] computed the P3 amplitudes separately for the left and the right HH groups. The average value of electrodes that fell within each group’s zone of maximal P3 amplitude from the individual group topographies was used, i.e. Pz, P3 and CP1 for the right lesion group, and P4, CP2, C4 and CP6 for the left lesion group. A reduction in P3 amplitude in response to stimuli presented in the intact field was reported in both studies, indicating reallocation of spatial attention resources after AVT (Fig. 3). The EEG results obtained by Grasso et al. [27] and Dundon et al. [31] showed that the mean P3 amplitude at session P (7.38 µV) was significantly lower compared to the mean P3 amplitude at B1 (9.62 µV; p < 0.05) and at B2 (9.435 µV; p < 0.05). No significant difference in P3 amplitude, however, was recorded between B1 and B2 (Fig. 3). In the follow-up session, Grasso et al. [27] reported that the mean P3 amplitude at session F (7.99 µV) was also significantly lower than the mean P3 amplitudes at B1 and B2. A reduction in the intensity of cortical processing in the contralesional hemisphere manifests an improvement in the dynamic visual performance, specifically in the hemianopic field, which indicates attenuation of the allocation of attention towards the intact hemifield. Dundon et al. [31] concluded that multisensory stimulation may significantly reduce the ipsilesional attentional bias in patients with HH.
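The P3 measure described above (mean amplitude between 200 and 600 ms, averaged over a small set of electrodes) can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline of the cited studies. It assumes an averaged waveform per electrode with matching time stamps, treats both window limits as inclusive, and uses made-up amplitude values; all function names are hypothetical.

```python
def mean_amplitude(times_ms, amplitudes_uv, start_ms=200, end_ms=600):
    """Mean of the samples whose time stamps fall within [start_ms, end_ms]."""
    window = [a for t, a in zip(times_ms, amplitudes_uv) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples inside the requested window")
    return sum(window) / len(window)


def roi_mean_amplitude(times_ms, erp_by_electrode, electrodes):
    """Average the windowed mean amplitude across the chosen electrodes."""
    per_electrode = [mean_amplitude(times_ms, erp_by_electrode[e]) for e in electrodes]
    return sum(per_electrode) / len(per_electrode)


# Toy waveform sampled every 100 ms (hypothetical values in microvolts, not study data).
times_ms = [0, 100, 200, 300, 400, 500, 600, 700]
erp = {
    "CP1": [0, 1, 4, 8, 10, 8, 5, 2],
    "P3": [0, 1, 5, 9, 11, 9, 6, 2],
    "Pz": [0, 2, 6, 10, 12, 10, 7, 3],
}

p3_amplitude = roi_mean_amplitude(times_ms, erp, ["CP1", "P3", "Pz"])
print(p3_amplitude)
```

The same function could be applied per session (B1, B2, P) to obtain the values that are then compared statistically; a different electrode list would be passed for a group with a different zone of maximal amplitude.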

Fig. 3 Mean P3 amplitudes. Calculated as a mean value of the P3 amplitude results of Grasso et al. [27] and Dundon et al. [31], measured as a function of testing session (B1, B2, P). Asterisks connected with lines indicate significant comparisons (p < 0.05)

To test for possible different contributions of the left and right hemispheres to the resulting P3 reduction in the study of Dundon et al. [31], only the more lateralized electrodes CP1 and P3 were considered, with group as a between-subjects factor (left lesion patients vs. right lesion patients) and with electrode (CP1, P3), session (B1, B2, P) and position (upper, middle, lower) as within-subjects factors. A significant effect of session (p = 0.029) was found, with no significant effect of group and no significant interaction between group and the other factors. These results suggested that the P3 amplitude reduction after AVT did not differ considerably between left and right HH patients. Thus, the observed reduction in attention towards the intact hemifield, which might co-occur with a shift of spatial attention towards the blind field, happens similarly in both hemispheres [31].

Change over time

Assessing intervention effects over a prolonged period of time is important when considering treatment effectiveness [25]. Four articles (n = 45, 28 males and 17 females) incorporated a follow-up test after AVT in their design, at between 1 month and 1 year [12, 15, 26, 27]. The AVT treatment gains transferred to functional measures assessing visual field exploration and to daily life activities, and were found to be stable at follow-up control sessions in all the studies, indicating a long-term persistence of treatment effects on the oculomotor system. These long-lasting effects according to Grasso et al. [27] are most probably subserved by the activation of the spared retino-colliculo-dorsal pathway, which boosts orienting responses towards the blind field, increasing the ability to both compensate for the visual field loss and concurrently attenuate visual attention towards the intact field.

Nevertheless, in the study conducted by Lewald et al. [28], ten patients with pure HH received 1 h of passive auditory stimulation by application of repetitive trains of sound pulses. Immediately before and after the auditory stimulation as well as after a recovery period of 2 h, they completed a simple visual task (see visual training vs. audio-visual training section for more details). Whilst the visual detection improved immediately post auditory activation, after the recovery period, the enhancement in performance had returned to baseline, showing that the improvement in performance is not long-lasting when passive auditory stimuli were used.

Audio-visual localization and space perception in hemianopia

Four observational studies (n = 43, 29 males and 14 females) investigated AV localization [32, 33] and the geometry of the visual space in HH using multisensory stimuli [34, 35]. The former studies predicted that visual stimuli in the intact visual field would bias the auditory localization, so that sounds would be mislocated towards their visual source. In contrast, they expected that in the blind field, where the occipital cortex damage had disrupted the underlying neural circuit, this bias would not occur [32]. Both studies examined the cross-modal condition in which the auditory stimuli were presented with visual stimuli in either spatially coincident or spatially disparate positions. The results in the intact visual field were in line with the phenomenon known as the ventriloquism effect. In this effect, a presentation of auditory and visual stimuli that are temporally coincident and spatially disparate might lead to mislocation of sounds towards their visual source [36]. In the hemianopic field, however, no visual bias occurred when the two stimuli were spatially separated, which supports the key role of the visual cortex in such an effect: when the visual cortex has been damaged, no visual bias is observed [32].

This is because the enhancement of auditory localization is expected via SC neurons, depending on the multisensory activation. It has been shown that visual stimuli affected auditory localization only when stimuli were spatially and temporally coincident, meaning that covert visual processes remain active in hemianopia [32]. The authors explained the difference between the multisensory enhancement and the visual bias by the dependence of these two results on different neural pathways. The multisensory enhancement depends on circuits that involve the SC, which facilitate orientation and localization of cues from multiple senses, whereas the visual bias depends on geniculo-striate circuits that provide analysis of the visual scene [32]. A similar result was obtained by Passamonti et al. [33], by comparing patients with HH and patients with neglect (n = 9 and n = 6, respectively). A consistent shift in sound localization towards the visual attractor was still evident in patients with neglect but not in patients with HH, supporting the role of the geniculo-striate circuits, which are damaged in HH, in such an effect.

The latter studies investigated the concept of how unilateral brain damage in HH can affect the perception of body orientation in space, leading to an attentional bias towards the contralesional field. Lewald et al. [35] indicated that auditory spatial orientation in HH, without spatial neglect, was almost normal compared with healthy subjects. Thus, it was suggested that in multimodal space, visual brain areas, as are damaged in HH, are not directly involved in relating body position to the external space [35]. Additionally, subjects were asked to match the location of a single visual target with an auditory marker or vice versa to estimate the potential distortions in the representation of visual space accompanied by HH [34]. It has been highlighted that patients with HH may exhibit distortion in both the visual and auditory spaces. However, in the bimodal approach, they would cancel each other out, and as a consequence, the cross-modal abilities might be preserved [34].

Audio-visual integration in patients with visual field defects

The anatomical correlation of audio-visual integration was investigated by a comparison between patients with hemianopia and patients with spatial neglect in two studies (n = 36, 22 males and 14 females) [13, 33]. Both studies showed that after adaptation to spatially coincident AV stimuli, both HH patients and neglect patients exhibited a significant reduction in auditory and visual localization errors. A possible explanation for these effects is the function of multisensory neurons in the SC, which can be activated when stimuli from different sensory modalities interact in close spatial proximity [13]. Thus, the results indicated that the damaged brain areas (striate and parieto-temporal areas) in HH and neglect patients did not contribute to this specific form of perceptual learning [33]. In other words, visual information is capable of calibrating auditory space, even without the involvement of those brain areas, as long as visual information and auditory information are spatially coincident. Passamonti et al. [33] found that adaptation to spatially disparate stimuli invokes the geniculo-striate circuit to correct and reduce the discrepancy registration. However, adaptation to spatially aligned stimuli invokes the collicular-extrastriate circuit to reduce the localization errors. Therefore, the multisensory enhancement should be observed in both neglect and HH patients, as the collicular-extrastriate circuit is spared in both patient groups [33].

By contrast, in patients suffering from both hemianopia and neglect, multisensory integration did not occur [13]. It has been reported that the integrative multisensory effect depends on the extent and/or the location of the lesion. Lesions causing neglect are mainly confined to the frontotemporal and parietal lobes (visuospatial attentional system) whilst lesions causing HH are mainly confined to the occipital lobe (the primary sensory visual system), and for patients with both neglect and HH, the lesion could involve both areas [13]. A possible explanation provided was that the influence of these cortical areas modulates the ability of the SC to synthesize cross-modal inputs, preventing cross-modal integration in patients with both deficits.

Childhood population

The possibility of inducing long-lasting amelioration after AVT in children with chronic HH due to acquired brain lesions was investigated by only one study [21]. The study included three children (one male and two females aged between 9 and 17 years). The training duration was one and a half hours daily for 4 weeks. Outcome measures consisted of the number of correct visual detections, visual search ability and reading speed. The visual search test consisted of six different subsets: the apple, frog, smile, E–F, triangle and number tests [21]. Each subject was tested before and after the training period and after a follow-up period of 1 month, and in one case, further follow-up was obtained after 12 months. The authors found a marked improvement in detections and response times only when subjects used explorative eye movements, but not with eyes fixed on a central point [21]. This suggests that the enhancement in visual perception induced by training is mediated by the oculomotor system, reinforcing orientation towards the blind hemifield. For all the tests, the main factor session was significant when response times were considered [21].

Improvement in reading speed after training was observed in single word reading performance for all subjects. The results of this study confirmed that AVT can also induce activation of visual responsiveness of the oculomotor system in children and adolescents with visual field deficits, as the visual search behaviour became more efficient and faster after treatment. Tinelli et al. [21] argued that this demonstrates the important role of multisensory integration, especially in the SC, in this type of ocular compensation and in the plasticity of the visual system in the presence of 'blindsight' (i.e. residual visual capacity but without acknowledged perceptual awareness after lesions of the striate cortex), even when the occipital cortex is completely damaged. A long-lasting effect of the treatment was reported at both the 1-month and 12-month follow-up tests [21].

Discussion

This review assessed the effectiveness of AV stimulation as a rehabilitation option for stroke survivors with HH in adulthood and childhood. The included studies suggested that behavioural responses to visual stimuli in the lost visual field may be retained despite unilateral damage to the visual cortex of the occipital lobe, with the existence of a blindsight effect, especially when visual stimuli are combined with acoustic stimuli. Neurophysiological and neuropsychological studies have discussed that the spared striate cortex, the extra-striate visual cortex and neural pathways involving subcortical nuclei such as the SC could mediate the blindsight effect [37, 38]. Although blindsight could be considered an implicit process occurring without explicit knowledge, Frassinetti et al. [13] argued that even when patients with HH were aware of the presentation of visual stimuli in the blind hemifield, there was an improvement in visual detection under multimodal stimulation, suggesting another interpretation of this finding. They suggested that the responses are probably mediated by cortical areas (poly-sensory and/or sensory-specific cortices) that might be multimodal and are involved in cross-modal integration.

Several studies provide evidence that multisensory integration through the SC enhances the responsiveness of the oculomotor system by reinforcing orientation towards the affected hemifield. The SC is a midbrain structure that receives visual, auditory and somatosensory inputs, and is involved in detecting, locating and orienting to external events [14]. Thus, the sensory information from an unimpaired modality (auditory) might improve the processing of information from an impaired sensory system (visual) [25]. Over the last few decades, the functional properties of multisensory neurons have been most extensively studied in the SC of cats and monkeys [14, 37], yet the organization of the SC follows a general mammalian scheme. Stein and Rowland [14] explained the multisensory activation in the SC, where the inputs from cross-modal stimuli that are spatially and temporally coincident converge onto common multisensory neurons, which transform these unisensory signals into a synthesized multisensory product in which physiological responses are faster, more reliable and more robust than those elicited by either individual stimulus.

This effect is in agreement with the results of human studies of patients with HH, revealing that the spared retinotectal functions of SC neurons can be trained by AVT to enhance the multisensory integration responses and increase the visual detection of stimuli presented in the blind field through the cross-modal blindsight phenomenon [12, 18, 24, 26, 27, 31]. Interestingly, it has been shown that in the multisensory mechanism, unseen visual stimuli in HH can influence perception in other sensory modalities, such as by improving auditory localization [32]. In addition, this result was observed for both HH and neglect patients after exposure to spatially coincident visual and auditory stimuli, with a significant reduction in auditory localization errors [33]. On the other hand, Frassinetti et al. [13] found direct and short-term effects of multisensory stimulation, by adding a coincident sound, on the enhancement of detection of visual targets in the affected hemifield. In a study performed on HH cats that provided spatially or temporally noncongruent AVT, rehabilitation failed in all cases even when the number of training trials was twice the number usually required for recovery [39]. In contrast, the authors demonstrated resolution of HH when the test was repeated with spatially and temporally congruent AVT. It has been assumed that rehabilitation required the neural signals from different modalities to converge onto their target neurons in the SC within a short time window in which they would be able to interact [39].

Whilst most studies detected the short-term beneficial effects of bimodal stimulation, some studies provided evidence that these effects can persist for up to 1 year after treatment [12, 15, 26, 27]. The exact mechanism underlying these effects is not totally clear; however, either restoration of vision or compensation is suggested. The reviewed studies indicated that, with multisensory stimulation, a direct compensatory effect on oculomotor function enhanced attention to the stimulated location. Grasso et al. [27] emphasized that the stability of the enhancement at the clinical, oculomotor and electrophysiological levels at the follow-up session is extremely relevant to the neural plasticity of the visual system and the multisensory integration through the SC neurons (retino-colliculo-dorsal pathway).

It is worth noting that in humans, compensatory AVT has a clear advantage in improving oculomotor exploration and visual perception, suggesting that the improvement was not due to restoration of the visual field, but rather to a compensatory activation of the oculomotor system [12, 24]. On the contrary, after several weeks of training HH cats on an AVT paradigm, visual responsiveness was restored in SC neurons and behavioural responses were elicited by visual stimuli in the blind hemifield [14]. A possible explanation for this difference between humans and animals is that procedural differences resulted in different functional outcomes, e.g. differences in the contralesional exposure locations (fixed in the animal vs. variable in the human), variations in the stimuli used and/or the frequency and duration of the trial sessions, as well as the site and extent of the lesions. In humans, when central fixation was maintained and with the absence of visual cortices to supplement ipsilateral visual function, this may have been sufficient to inhibit visual responses of the caudal SC (reacting to peripheral space) to an extent that they were insufficient to generate detection of the peripheral visual stimulus [14]. However, this remains an open research question.

As HH can have a great impact on functional abilities of daily life, it was important to focus on measures of activities of daily living (ADL). Although only a limited number of studies measured the effect of multisensory stimulation on ADL, these studies consistently report that patients in the acute and chronic stages after stroke showed significant improvement on the ADL scale [12, 18, 26, 27]. Additionally, only patients who had received AVT showed near-normal daily living activities in relation to visual impairment after 3 weeks of training, compared to patients trained on visual stimulation only [18]. Yet this study included no patients who received no training at all, which leaves the contribution of training to recovery unclear, as the selected groups were in the acute stage after stroke.

In everyday life, patients with HH experience asymmetric visual inputs, leading to an imbalance of attentional bias towards the intact hemifield [32, 33]. This attentional imbalance as well as the clinical signs of HH has been found to be diminished by the bimodal training. Stimulating the multisensory integration of the SC which has a crucial role in controlling both overt and covert spatial attention might explain this effect of improvement. Schneider and Kastner [40] used high-resolution functional magnetic resonance imaging to measure responses in the human lateral geniculate nucleus (LGN) and SC during sustained spatial attention. The study results indicated that activity in the LGN and SC can be modulated by sustained spatial attention. Moreover, the attentional modulation in the SC was especially prominent, demonstrating its importance in spatially directing and sustaining attention.

EEG recordings showed that behavioural improvement elicited by multisensory stimulation coincided with reduced hyperactivation within the contralesional hemisphere. The amplitude of the P3 EEG component, which has been demonstrated to be modulated by attention, showed a reduction after AVT, reflecting less visual spatial attention allocated to the intact hemifield [31]. In addition, by assessing the auditory and visual straight-ahead directions in patients with HH, it was found that visual space perception, which is known to be distorted in patients with HH, was shifted towards the anopic side [35]. However, patients’ auditory perceived ‘straight ahead’ was approximately veridical, indicating that HH did not influence the supra-modal processing of body orientation in space but that its effect was restricted to the visual modality [35]. Thus, it seems that straight-ahead perception in relation to visual space is affected by the damaged visual areas in HH, and these areas were proposed not to contribute to multimodal or supra-modal straight-ahead perception. Lewald et al. [34] stated that there is a high possibility of exploiting the auditory system in the rehabilitation of visual field deficits, as patients with HH retain an undistorted representation of auditory space.

The possibility of utilizing multisensory integration to compensate for visual field defects in children and adolescents has rarely been investigated. Tinelli et al. [21] demonstrated amelioration of performance after AVT with improvement in visual search behaviour, as documented by a reduction in visual scanning times, which was compatible with the results published on adults. The training period required to show an improvement in children was longer than that in adults, as fewer daily sessions were used. The authors explained that children were not able to maintain attention for as long as adults and, in addition, that the motivation to achieve results was less well understood by children. It has been suggested that it would be useful to apply an fMRI paradigm before and after the training, to verify what happens at the cortical level, in the damaged and intact hemispheres, after multisensory activation in patients with visual field defects during childhood [21].

There is evidence to suggest that combining different sensory modalities can provide an effective rehabilitation method for patients with visual field deficits. Nevertheless, the neural underpinnings of the compensation for visual field loss after AVT need further study. Systematic AVT may have improved the processing of visual information by recruiting subcortical pathways, and because most patients with visual cortex damage have an intact SC, it might be possible to train the use of retinotectal functions with bimodal AVT [18]. Using functional and structural MRI techniques may help in providing more evidence for such a hypothesis. A wide range of evidence converges on the presence of two pathways for conscious and unconscious perception, which diverge at early processing stages. In the pathway underlying conscious visual processing, the visual information is projected from the retina to the lateral geniculate nucleus, then to the occipital lobe. Unconscious visual processing, on the other hand, relies on a second pathway, in which the visual information is projected from the retina to the SC and the pulvinar, then to the dorsal parietal cortices [27]. Tamietto et al. [41] demonstrated anatomical connections between the SC and amygdala via the pulvinar by using diffusion tensor imaging (DTI) in humans. The phenomenon of cross-modal blindsight could be explained by this alternative connection, including the SC and its dorsal parietal projections, which improves perception when multisensory cues are used.

Additionally, Hertz and Amedi [42] performed an fMRI study on healthy subjects that involved a multisensory experimental condition. They found a network of areas showing AV interaction responses, including the parietal lobe, the temporal lobe, the frontal lobe and the insula [42]. In line with this concept, neuroanatomical and neurophysiological studies in animals have illustrated strong connections between the SC, posterior parietal cortex and frontal eye fields for the control of eye movements [43]. Furthermore, neuroimaging data showed that the preparation and execution of eye movements are enhanced by the activation of the frontal eye field, which involves descending input pathways to the SC [44].

In conclusion, this review supports the concept that compensatory AVT can be useful as a rehabilitation method for stroke survivors with HH. However, there is a considerable lack of studies using AVT in human HH patients with concurrent functional and structural MRI to identify the optimal training paradigm and the neural mechanisms underlying its effect. Further research is warranted to explore these aspects.