Aversive reinforcement improves visual discrimination learning in free-flying wasps (Vespula vulgaris)

Understanding and assessing the capacity for learning, memory, and cognition in non-model organisms is a growing field. In invertebrate cognition, eusocial hymenopteran species such as honeybees, bumblebees, and ants are well-studied for their learning and memory abilities due to decades of research providing well-tested methods of training and assessing cognition. In the current study, we assess the use of different conditioning methods on visual learning in a non-model hymenopteran species that is increasingly used in learning and memory tasks, the European wasp (Vespula vulgaris). We trained individual wasps to discriminate between perceptually similar colours using absolute conditioning (reward on target stimulus in the absence of distractors), appetitive differential conditioning (reward on target stimulus and no outcome for incorrect stimulus), or appetitive-aversive differential conditioning (reward on target stimulus and aversive outcome for incorrect stimulus). When trained with absolute conditioning, wasps were unable to learn to discriminate between perceptually similar colours. However, when trained with appetitive differential conditioning or appetitive-aversive differential conditioning, wasps were able to learn to discriminate between two similar colours, although they performed best when an aversive reinforcement was provided during training. Our results show similarities to learning behaviour in honeybees and bumblebees, and provide insight into the learning and cognition of a non-model invertebrate. Our findings provide important comparative data to aid in understanding the evolution of learning and memory in hymenopterans.


Introduction
Different methods of training and testing animals can reveal variations in learning and memory capacity (Agrillo and Bisazza 2014; Avarguès-Weber and Giurfa 2014). Methodological differences in conditioning experiments such as training length (Stach and Giurfa 2005; Bisazza et al. 2014), apparatus type (Howard et al. 2017), or conditioning procedure (Dyer and Chittka 2004; Avarguès-Weber et al. 2010a; Agrillo and Bisazza 2014; Howard et al. 2019b) significantly impact performance in discrimination and/or learning tasks. For example, different bee species can be conditioned with absolute conditioning (Bombus terrestris: Dyer and Chittka 2004; Apis mellifera: Giurfa 2004; Melipona eburnea: Amaya-Márquez et al. 2019; Tetragonula carbonaria: Koethe et al. 2022), appetitive differential conditioning (B. terrestris: Dyer and Chittka 2004; A. mellifera: Giurfa 2004; Howard et al. 2019b; Osmia cornuta: Collado et al. 2021; T. carbonaria: Dyer et al. 2016; Spaethe et al. 2014; Trigona cf. fuscipennis: Spaethe et al. 2014), aversive differential conditioning (A. mellifera: Vergoz et al. 2007; Marchal et al. 2019; Nouvian and Galizia 2019; Lasioglossum lanarium: Howard 2021), or appetitive-aversive differential conditioning (A. mellifera: Avarguès-Weber et al. 2010a; Howard et al. 2019b; B. terrestris: Chittka et al. 2003), with the conditioning method often resulting in different discrimination outcomes (A. mellifera: Avarguès-Weber et al. 2010a; Howard et al. 2019b). Absolute conditioning involves rewarding a target stimulus in the absence of any distractor stimuli. Appetitive differential conditioning involves providing a reward for a correct choice of a target stimulus and no outcome for a choice of a distractor stimulus. Aversive differential conditioning involves no outcome for a correct choice and an aversive outcome/punishment for the choice of an incorrect stimulus. Appetitive-aversive differential conditioning involves providing a reward for a correct choice and an aversive outcome/punishment for an incorrect choice. The type of conditioning used may often depend on experimental access for the motivational state of an animal and thus what is logistically possible. For example, the capacity for an animal to move freely to make choices can also impact how conditioning may affect learning outcomes (de Brito Sanchez et al. 2015). In some model invertebrate species, such as honeybees (A. mellifera) and bumblebees (B. terrestris), introducing differential conditioning and/or aversive outcomes for incorrect choices is known to significantly improve visual discrimination in colour (Chittka et al. 2003; Dyer and Chittka 2004; Avarguès-Weber et al. 2010a) and spatial tasks (Howard et al. 2019b). Bumblebees can demonstrate impressive cross-modal object recognition between visual and tactile senses when trained with differential conditioning (Solvi et al. 2020).
Over the past two decades, there have been several advancements in our understanding of comparative neuroscience informed by the capacity of free-flying honeybees (Giurfa et al. 2001; Buatois et al. 2017) and bumblebees (Chittka et al. 2003; Brown and Sayde 2013) to learn complex visual tasks when provided with appetitive-aversive differential conditioning. As noted above, in bees, this form of conditioning involves training free-flying individuals to associate the choice of a correct stimulus option with a reward like sucrose (sugar water), whilst a perceptually similar visual stimulus that is designated as incorrect is associated with a bitter tasting substance, such as quinine solution (Chittka et al. 2003; Avarguès-Weber et al. 2010a). It is proposed that this form of conditioning may promote attention to observe the relatively small perceptual differences between respective stimuli (Avarguès-Weber et al. 2010a), and the conditioning technique has enabled researchers to show learning and discrimination of complex stimuli. For example, honeybees learnt to recognise human face stimuli via configural-type mechanisms (Dyer et al. 2005; Avarguès-Weber et al. 2010b), as well as perceptually difficult numerical tasks such as quantity discrimination (Howard et al. 2018; Bortot et al. 2019; Howard et al. 2019b), arithmetic (Howard et al. 2019a), and quantity categorisation (Howard et al. 2022).
In parallel with these discoveries in free-flying bees, there has also recently been emerging evidence that other hymenopteran species, including wasps, may acquire the capacity to improve visual learning when provided with appetitive-aversive differential conditioning (Avarguès-Weber et al. 2017). In a comparative study of honeybees and wasps (Vespula vulgaris), each trained with appetitive-aversive differential conditioning, it was shown that both species could learn to discriminate between human face stimuli and showed some similarities in holistic processing of the spatial information, although there were also some behavioural differences (Avarguès-Weber et al. 2018). Polistes fuscatus, a wasp species in which several queens cohabit, shows evidence of specialised face recognition when trained with an electric shock as an aversive outcome. A closely related species (Polistes metricus) that typically nests alone does not show the same ability to learn to recognise faces (Sheehan and Tibbetts 2011). Such specialisation still requires P. fuscatus wasps to learn the discrimination between perceptually similar face stimuli, which they do significantly better and more quickly than for other stimuli like prey items or simple patterns. Thus, learning appears to also be important to the lives of wasps, and recent testing on two hornet species (Vespa velutina nigrithorax and Vespa crabro) suggests that both spatial and colour stimuli can be learnt and reverse learnt in flexible ways (Lacombrade et al. 2023).
The ecological relevance of why hymenopterans may exhibit a capacity to learn differently depending upon conditioning is evident when considering the complexity of decision-making that might be required in the natural world. For example, many Hymenoptera species forage on nectar rewarding flowers, which may be frequent and require little need to learn perceptual differences at certain times of the year when nearly all flowers contain abundant rewards, whilst at other times in the year, mimic flowers providing no rewards can require individuals to perform fine discrimination of either colour (Garcia et al. 2020) or spatial (Howard et al. 2021) signals to maximise nutrition collection. Foraging insects choosing between flowers may also have to manage risks like avoiding predation (Heiling et al. 2003; Ings and Chittka 2008; Howard 2021). Similar foraging complexities likely exist for generalist wasps that forage on a wide range of food sources that may or may not be optimal at certain times in the year (Richter 2000) and likely require flexible learning (D'Adamo and Lozada 2011) for food types as diverse as flowers, insect prey, and rotting fruit (Balamurali et al. 2021).
The effects of appetitive-aversive differential conditioning on improving learning of perceptually similar stimuli have been demonstrated for bumblebees with colour stimuli (Chittka et al. 2003) and honeybees with spatial (Howard et al. 2019b) and colour stimuli (Avarguès-Weber et al. 2010a). However, currently, no such direct empirical evidence is available for any wasp species. This information is important to understand how results from different experiments should be interpreted, especially in ecological or comparative neuroscience frameworks and in the design of optimal future experiments. Colour stimuli are ideal for evaluating how appetitive-aversive differential conditioning may improve learning compared to classical conditioning, as colour stimuli are biologically relevant and avoid potential confounds that may emerge with spatial stimuli (Zanon et al. 2021). Whilst relatively little is currently known about wasp colour vision and learning, it is possible to employ the principles outlined in Kemp et al. (2015) of inferring biologically and phylogenetically relevant information. In this regard, wasps and bees are both hymenopterans, bees are all trichromatic with established true colour vision (Briscoe and Chittka 2001; Kemp et al. 2015), and wasps have multiple photoreceptors to potentially enable colour perception (Peitsch et al. 1992); we can therefore predict that wasps have colour vision comparable to that of bees. In the current study, we use similar colour stimuli that honeybees are known to learn to discriminate between only when trained with differential conditioning techniques, but not with absolute or classical conditioning (Dyer et al. 2014; Dyer and Garcia 2014; Sommerlandt et al. 2016). We aimed to determine (i) if differential conditioning improves learning compared to absolute conditioning and (ii) if appetitive-aversive differential conditioning enables significantly better discrimination than appetitive differential conditioning. The null expectation in each case is a non-significant difference in how the wasps discriminate between perceptually similar colour stimuli. Because we used colour stimuli to address these primary research questions, the study also enables us to determine if wasps may demonstrate a capacity to learn colours differently depending upon conditioning, although this is not the primary focus of this study.

Study species and recruitment
Experiments were conducted with individual free-flying wasps (Vespula vulgaris) during August and September 2022 at the Johannes Gutenberg University Mainz's biological garden in Mainz, Germany. Previous studies have established V. vulgaris as an active forager for sucrose and have shown that these wasps act as central place foragers (Avarguès-Weber et al. 2017; Avarguès-Weber et al. 2018). Wasps were recruited to the experiment from a von Frisch-style gravity feeder containing 5-8% sucrose solution by volume to a nearby training and testing apparatus. To recruit wasps from the feeder, we collected them onto a transparent plexiglass spoon containing 20% sucrose solution by volume and placed them onto a hanger platform on the rotating screen (Fig. 1) also containing 20% sucrose solution. Each wasp was individually marked on the thorax to identify individuals following standard methods (Avarguès-Weber et al. 2017; Avarguès-Weber et al. 2018). One individual wasp was trained and tested at a time. The experiment generally took 2-3 h to complete for each individual including the training and testing phases. We trained and tested 20 wasps overall. No individuals were excluded from the analysis.

Apparatus
The apparatus was a standard vertically rotating circular screen (50 cm in diameter) that allowed stimuli to be placed at random positions to exclude spatial cue factors (Fig. 1; Dyer et al. 2005). It was made of grey plexiglass and was able to be rotated to randomly change the hanger and stimuli positions. The screen consisted of pegs which were used to hang 6 × 8 cm grey plexiglass hangers (Fig. 1). The hangers had a small landing platform where wasps could land and drink the solution (e.g. sucrose or quinine). The stimuli could be presented on the hangers, and thus, wasps could be trained to associate a reward of sucrose with a correct stimulus option or learn to associate an incorrect choice of stimulus with an aversive outcome of quinine solution. Four hangers were presented to each wasp at a time. When initially training wasps to land on and return to the hangers, no stimuli were present. After wasps had learnt to land on the hanger platforms, training began and stimuli were placed on the hangers.

Stimuli
The stimuli were 6 × 6 cm pieces of coloured paper presented on vertical hangers. The two colours used were of either blue (130 GSM Tonpapier No. 37, Baehr, Germany) or turquoise (Tonpapier No. 32) appearance to a human observer. Previously reported colorimetry shows a colour distance of 0.06 hexagon units between these stimuli (Dyer and Garcia 2014). Psychophysics testing of free-flying honeybees has demonstrated that these respective stimuli are perceptually similar for a hymenopteran trichromat (Dyer et al. 2014; Dyer and Garcia 2014; Sommerlandt et al. 2016; Garcia et al. 2017).
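For readers unfamiliar with hexagon units, the following is a minimal sketch of how a colour distance is commonly computed in the colour hexagon model developed for bee vision: photoreceptor excitations are mapped to two coordinates and the distance is Euclidean. The function names and the excitation triplets are hypothetical and for illustration only; they are not the measured values behind the 0.06 figure cited above, and whether the same receptor model applies to V. vulgaris is an assumption.

```python
import math

def hexagon_coordinates(e_uv, e_blue, e_green):
    """Map UV, blue and green receptor excitations (each 0..1) to hexagon (x, y)."""
    x = (math.sqrt(3) / 2.0) * (e_green - e_uv)
    y = e_blue - 0.5 * (e_uv + e_green)
    return x, y

def hexagon_distance(stim_a, stim_b):
    """Euclidean distance between two stimuli in colour hexagon space."""
    xa, ya = hexagon_coordinates(*stim_a)
    xb, yb = hexagon_coordinates(*stim_b)
    return math.hypot(xa - xb, ya - yb)

# Hypothetical excitation triplets (UV, blue, green); NOT measured values.
blue_paper = (0.20, 0.60, 0.40)
turquoise_paper = (0.20, 0.56, 0.44)

print(round(hexagon_distance(blue_paper, turquoise_paper), 3))
```

Small distances of this order indicate stimuli that are close together in receptor space, which is why such colour pairs are used as a perceptually difficult discrimination.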

Absolute conditioning phase and testing phase
The designated rewarding training colour, blue or turquoise, was associated with 20% sucrose. The target colour (blue vs. turquoise) for each wasp was assigned by pseudo-randomising (coin toss) the colour in a counterbalanced way, so that each colour had an equal number of wasps trained to associate it with sucrose. Thus, 10 wasps were trained to associate blue with a reward and 10 wasps were trained to associate turquoise with a reward. In this initial absolute conditioning phase, each wasp received sucrose solution for 10 rewarded landings where only the target colour was present on the four hangers (Fig. 2). Each time a wasp landed on the hanger platform, it was collected onto the plexiglass spoon and placed behind an opaque barrier to drink sucrose whilst the rotating screen, hangers, and stimuli were cleaned with 20% ethanol solution and then water to remove any scent cues. The screen was then rotated to randomise stimuli position and the wasps were allowed to make another choice or return to the nest.
Following the 10 trials of absolute conditioning, wasps were tested for 10 unrewarded choices (no sucrose present on the hangers) when presented with blue vs. turquoise, to determine if, after absolute conditioning to one rewarding target colour, they could discriminate between the correct rewarding colour and the novel distractor colour stimulus. This test consisted of 10 touches or landings on the hanger platform or stimuli. These choices were scored as 'correct' or 'incorrect' depending on the target colour assigned during absolute conditioning.
This stage permitted us to collect baseline test data of wasp colour discrimination using absolute conditioning between the blue and turquoise stimuli prior to differential conditioning and would allow for a comparison between these methods of training.

Appetitive differential conditioning and appetitive-aversive differential conditioning phase
Following the absolute conditioning training and test phase, wasps were pseudo-randomly divided into two experimental groups. One group received appetitive differential conditioning (n = 10), whilst the second group received appetitive-aversive differential conditioning (n = 10; Fig. 2). Each group included either blue (n = 5) or turquoise (n = 5) being selected as the rewarding target colour, following on from the process used to designate target stimulus colour for absolute conditioning. Thus, each wasp always experienced the same target colour as rewarding during the absolute and differential learning phases of the experiments. For appetitive differential conditioning, wasps received 20% sucrose solution for a correct choice or plain water acting as a neutral substance for an incorrect choice, whilst wasps in the appetitive-aversive differential conditioning group received 20% sucrose solution for correct choices and 6-mM quinine solution for incorrect decisions.
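The counterbalanced design described above (two target colours crossed with two conditioning groups, five wasps per cell) can be sketched as follows. This is only an illustration under stated assumptions: the study reports a coin toss, whereas the sketch shuffles a balanced list of slots, and the function name, wasp labels, and seed are hypothetical.

```python
import random
from collections import Counter

def assign_wasps(wasp_ids, seed=None):
    """Assign each wasp a (target colour, conditioning group) with equal cell sizes."""
    rng = random.Random(seed)
    cells = [(colour, group)
             for colour in ("blue", "turquoise")
             for group in ("appetitive", "appetitive-aversive")]
    if len(wasp_ids) % len(cells) != 0:
        raise ValueError("number of wasps must be a multiple of the number of cells")
    # Build a balanced list of slots, then shuffle so the order is unpredictable.
    slots = cells * (len(wasp_ids) // len(cells))
    rng.shuffle(slots)
    return dict(zip(wasp_ids, slots))

# Hypothetical wasp labels W01..W20; the seed only makes the example repeatable.
assignment = assign_wasps([f"W{i:02d}" for i in range(1, 21)], seed=1)
counts = Counter(assignment.values())
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Shuffling a pre-balanced list guarantees equal cell sizes, which independent coin tosses alone would not.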

Testing phase
Following 10 trials of either appetitive differential conditioning or appetitive-aversive differential conditioning for respective groups, each wasp underwent 10 unconditioned choices (no sucrose or quinine present) in a learning test with new training stimuli. This stage consisted of counting 10 touches or landings on the hanger platforms or stimuli in the absence of sucrose and quinine to determine whether wasps had learnt the colour discrimination task using the same stimuli as presented during the training phases. These choices were scored as 'correct' or 'incorrect' depending on the target colour assigned during conditioning.

Training phase
We analysed the appetitive differential conditioning and appetitive-aversive differential conditioning phases to determine whether wasps demonstrated significant learning (proportion of correct choices) over the 10 trials. Data for both groups of wasps were analysed with a generalised linear mixed-effects model (GLMM) with a binomial distribution using the 'glmer' function (lme4 package) within the R environment for statistical analysis (R Core Team 2020). The full model included choice as the categorical response variable with two levels (correct; incorrect), individual trial number as a continuous predictor (1-10), colour (blue; turquoise) as a categorical predictor, and an interaction term between these two predictors. Subject (wasp ID) was included as a random factor to account for repeated choices of individual insects.
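Written out, the full training-phase model described above may be expressed as follows. The notation is ours rather than the authors', and the logit link is assumed because it is the default for binomial models in glmer; y_ij = 1 denotes a correct choice by wasp i on trial j, and u_i is the random intercept for wasp ID.

```latex
\operatorname{logit}\Pr(y_{ij} = 1)
  = \beta_0
  + \beta_1\,\mathrm{trial}_{j}
  + \beta_2\,\mathrm{colour}_{i}
  + \beta_3\,(\mathrm{trial}_{j} \times \mathrm{colour}_{i})
  + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2)
```

Under this parameterisation, a positive \beta_1 corresponds to an increasing probability of correct choices across trials, i.e. learning during training.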
To compare whether learning differed significantly between wasps trained with appetitive differential conditioning and appetitive-aversive differential conditioning, we analysed the data using a GLMM with a binomial distribution using choice as the categorical response variable with two levels (correct; incorrect). Conditioning type (appetitive or appetitive-aversive), trial number, colour, and interaction terms between these predictors were included in the model. Subject (wasp ID) was included as a random factor to account for repeated choices of individual insects.
In order to determine what combination of predictors best explained the proportion of correct choices made during the learning phase, we compared the corrected Akaike information criterion (AICc) values from the different models (Burnham and Anderson 2004). The same analysis was employed for respective training phases (appetitive differential conditioning; appetitive-aversive differential conditioning) and tests (absolute conditioning learning test; appetitive differential conditioning learning test; appetitive-aversive conditioning learning test). We compared the possible models using the 'dredge' function in the MuMIn package written for the R statistical language, run in R version 4.0.3 (Barton and Barton 2015).
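The model-selection step can be illustrated with a minimal sketch, written in Python rather than R. It enumerates candidate fixed-effect structures (an interaction is allowed only when both of its main effects are present) and ranks fitted models by AICc using the standard small-sample correction. This is not the MuMIn implementation; the function names are hypothetical, and log-likelihoods would have to come from fitted models, so none are supplied here.

```python
from itertools import combinations

def candidate_models(main_effects):
    """List fixed-effect term sets; two-way interactions need both parent terms."""
    models = []
    for r in range(len(main_effects) + 1):
        for mains in combinations(main_effects, r):
            pairs = [f"{a}:{b}" for a, b in combinations(mains, 2)]
            for k in range(len(pairs) + 1):
                for inter in combinations(pairs, k):
                    models.append(tuple(mains) + tuple(inter))
    return models

def aicc(log_likelihood, n_params, n_obs):
    """AICc = AIC + 2k(k+1)/(n-k-1), with AIC = 2k - 2*logLik."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("too few observations for the AICc correction")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)

def rank_by_aicc(fits, n_obs):
    """fits maps model name -> (logLik, n_params); returns (name, AICc, delta)."""
    scored = sorted(((name, aicc(ll, k, n_obs)) for name, (ll, k) in fits.items()),
                    key=lambda item: item[1])
    best = scored[0][1]
    return [(name, value, value - best) for name, value in scored]

# Candidate structures for the training-phase model (trial, colour, interaction).
for terms in candidate_models(["trial", "colour"]):
    print(terms if terms else "(intercept only)")
```

In a mixed model, the parameter count passed to `aicc` would also include the random-effect variance; how n is counted for repeated measures is a modelling choice that the sketch leaves to the caller.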

Testing phase
To determine whether the insects learnt to choose the correct target colour (blue or turquoise) in learning tests, we employed a GLMM with a binomial distribution with choice as the categorical response variable with two levels (correct; incorrect) for all three learning tests. We included colour as a categorical predictor with two levels (blue; turquoise). Subject (wasp ID) was included as a random factor to account for repeated choices of individual insects. The mean proportion of choices for the correct colour (MPCC) recorded from the tests was used as the response variable in the model. The Wald statistic (z) tested if the mean proportion of correct choices recorded from the test, represented by the coefficient of the intercept term, was significantly different from chance expectation, i.e. H0: mean proportion of the correct choice (MPCC) = 0.5.
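The logic of the Wald test against chance can be sketched with a simplified calculation. On the logit scale, a proportion of 0.5 corresponds to an intercept of zero, so the test asks whether the intercept differs from zero. The sketch below ignores the random effect of wasp ID (that is, it assumes negligible between-wasp variance) and uses an ordinary intercept-only logistic model, so it is an approximation rather than the GLMM itself. The function name is hypothetical, and the example counts (96 correct of 200) are an assumption derived from the reported 48% correct across 20 wasps with 10 test choices each in the absolute conditioning test.

```python
import math

def wald_z_vs_chance(n_correct, n_total):
    """Wald z and two-sided P for H0: proportion correct = 0.5 (intercept = 0)."""
    if not 0 < n_correct < n_total:
        raise ValueError("need at least one correct and one incorrect choice")
    p_hat = n_correct / n_total
    intercept = math.log(p_hat / (1.0 - p_hat))           # logit of observed proportion
    std_err = math.sqrt(1.0 / (n_total * p_hat * (1.0 - p_hat)))
    z = intercept / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal tail
    return z, p_value

# Assumed counts: 48% of 200 pooled choices (20 wasps x 10 choices).
z, p = wald_z_vs_chance(96, 200)
print(f"z = {z:.3f}, P = {p:.3f}")  # z is approximately -0.566 under these counts
```

When the estimated random-effect variance is close to zero, a GLMM intercept test and this simplified calculation should give very similar values; with substantial between-wasp variation they would diverge.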
To compare whether results in the learning tests differed significantly between wasps trained with appetitive differential conditioning and appetitive-aversive differential conditioning, we analysed the data using a GLMM with a binomial distribution using choice as the categorical response variable with two levels (correct; incorrect). Conditioning type (appetitive or appetitive-aversive), colour (blue or turquoise), and an interaction term between these predictors were included in the model. Subject (wasp ID) was included as a random factor to account for repeated choices of individual insects.

Absolute conditioning test phase
Following 10 absolute conditioning trials to the target colour stimuli (blue or turquoise), wasps underwent 10 choices between the target colour and the alternative colour stimulus (blue vs. turquoise) to determine if wasps could differentiate between perceptually similar colours after receiving absolute conditioning. All wasps were pooled for this analysis as they had received similar conditioning (except for the different target colour) before this test.
We compared AICc values and determined that the best model was one excluding colour as a predictor of stimulus choice (Table S1). Thus, we found that in the absolute conditioning test, wasps were unable to differentiate between blue vs. turquoise following absolute conditioning (z = −0.566; P = 0.572; n = 20), choosing the assigned target colour correctly in 48% of choices (Fig. 3, bottom panel).

Appetitive differential conditioning
When comparing AICc values for models describing data from the appetitive differential conditioning training phase, we found that the best model was one including choice (correct; incorrect) as the response variable and trial as the predictor (Table S2). Our analysis showed that during the 10 appetitive differential conditioning trials, wasps did not improve their performance at a level significantly different from chance expectation (z = 1.609; P = 0.108; n = 10; Fig. 3, top panel) throughout the trials, although the graphical plot (Fig. 3, top panel) suggests some learning may have occurred by the end of training as the 95% CI bars are above the chance expectation line (see test results below).

Appetitive-aversive differential conditioning
For analysing the appetitive-aversive differential conditioning training, the best model was one including choice (correct; incorrect) as the response variable and trial (1-10), colour (blue vs. turquoise), and an interaction between trial and colour as predictors (Table S3).
Wasps significantly increased the number of correct choices made during training, showing learning of the target vs. distractor colour (z = 2.765; P = 0.0057; n = 10; Fig. 3, top panel), which was also significantly influenced by the colour of the target (z = 1.996; P = 0.0459), and an interaction between trial and colour (z = −1.987; P = 0.0469). Wasps trained to associate blue stimuli with a reward and turquoise with aversion appeared to learn significantly more quickly than wasps trained to associate turquoise with a reward and blue with aversion.

Comparison between appetitive differential conditioning vs. appetitive-aversive differential conditioning
When comparing wasps trained with appetitive differential conditioning or appetitive-aversive differential conditioning during training, the model that best fit the data was one that included choice (correct; incorrect) as the response variable and training type (appetitive vs. appetitive-aversive), trial (1-10), and an interaction between trial and training type as the predictors (Table S4). The model showed that there was a significant interaction between trial and training type (z = 2.051; P = 0.040; n = 20), but not trial (z = −0.742; P = 0.458) nor training type (z = −0.960; P = 0.337) as individual predictors. The results show that wasps trained with appetitive-aversive differential conditioning learnt the task significantly better than wasps trained with appetitive differential conditioning (Fig. 3, top panel).

Appetitive differential conditioning
For wasps trained with appetitive differential conditioning, we found that the best model was one including choice (correct; incorrect) as the response variable (Table S5).
Despite not showing evidence of significant learning during the training phase, wasps trained with appetitive differential conditioning demonstrated the ability to significantly discriminate between the respective target and distractor stimuli colours in the test phase (z = 2.182; P = 0.0291; n = 10; Fig. 3, bottom panel), choosing the correct stimulus colour at a level of 61%.

Appetitive-aversive differential conditioning
The model which best explained the data from the appetitive-aversive differential conditioning test included choice as the response variable with two levels (correct; incorrect) and stimulus colour as the main predictor mediating behavioural choices (Table S6). Wasp choices showed a trend to be impacted by colour (z = 1.888; P = 0.059; n = 10) during the test, and colour of the rewarding target was found to influence choices during training (see above). Therefore, we decided to separate the groups for analysis for wasps trained to blue (n = 5) and wasps trained to turquoise (n = 5) in the appetitive-aversive differential conditioning test. Wasps trained to choose the target colour of blue chose correctly in 78% of choices, which was significantly different from chance level (z = 3.707; P < 0.001; Fig. 3, bottom panel). Wasps trained to choose the target colour of turquoise selected the correct stimulus in the test in 92% of choices (z = 4.685; P < 0.001; Fig. 3, bottom panel). Thus, wasps trained with appetitive-aversive differential conditioning were able to learn either target colour as a reliable predictor of an appetitive reward.

Comparison between appetitive differential conditioning vs. appetitive-aversive differential conditioning
When comparing wasps trained with appetitive or appetitive-aversive differential conditioning during the learning test, the model that best fit the data was one that included choice (correct; incorrect) as the response variable and training type (appetitive vs. appetitive-aversive) as the predictor (Table S7). The model showed that training type significantly influenced the results (z = 3.709; P < 0.001). Wasps trained using appetitive-aversive differential conditioning performed better than those trained with appetitive differential conditioning (Fig. 3).

Fig. 3
Learning and test performance of wasps. The top panel shows the performance of wasps during training with appetitive differential conditioning (violet, broken line; n = 10) and independent wasps that received appetitive-aversive differential conditioning (green, solid line; n = 10). Wasps trained with appetitive-aversive differential conditioning demonstrated a significant improvement in making correct choices, whilst wasps trained with appetitive differential conditioning did not demonstrate significant learning during the training phase (see main text for statistics). Violet crosses show mean data from each trial when wasps were trained using appetitive differential conditioning. The green stars show mean data from each trial when wasps were trained using appetitive-aversive differential conditioning. Dotted black line at 0.5 represents chance level performance. Shaded area shows the 95% confidence intervals. The bottom panel shows performance in the testing phases. Wasps were initially trained with absolute conditioning (grey; n = 20) and did not discriminate the target colour from a perceptually similar distractor stimulus in the test. The 10 wasps trained with appetitive differential conditioning (violet) significantly selected the correct colour stimulus compared to chance level. The 10 independent wasps trained with appetitive-aversive differential conditioning using blue (n = 5) or turquoise (n = 5) as target stimuli were significantly better at discriminations between stimuli. Open circles show the mean performance in tests of individual wasps. Dotted black line at 0.5 represents chance level performance. Wasps performed significantly above chance level in tests when trained with either appetitive or appetitive-aversive differential conditioning, but not absolute conditioning. Error bars show 95% confidence intervals. NS non-significant; *P < 0.05; **P < 0.01; ***P < 0.001

Discussion
In the current study, we evaluated if appetitive-aversive differential conditioning enabled superior learning of perceptually similar visual stimuli compared to appetitive differential conditioning and absolute conditioning. Consistent with previous reports on honeybees (Avarguès-Weber et al. 2010a) and bumblebees (Chittka et al. 2003), we observed that wasps have flexible learning that is significantly improved when the incorrect stimulus is associated with aversive bitter tasting quinine (Fig. 3). Our research further shows that for perceptually similar colour stimuli, wasps do not make reliable discriminations following absolute conditioning, whilst differential conditioning where the target colour is learnt relative to a perceptually similar distractor does enable the target to be selected at a level significantly different from chance expectation (Fig. 3). This finding is consistent with how honeybees (Giurfa 2004) and bumblebees (Dyer and Chittka 2004) learn colour information in different ways depending upon experience, which likely explains how some non-rewarding flowers achieve pollination via mimicry of rewarding species (Garcia et al. 2020).
Wasps of several species are starting to become important comparative models for how learning occurs in animals with miniaturised brains of less than a million neurons (Balamurali et al. 2021). The current study on wasps (V. vulgaris) shows that when appetitive-aversive differential conditioning is used, there is a significant improvement in learning performance. This suggests that plasticity in learning extends beyond the known insect models of eusocial bees including honeybees and bumblebees, and suggests that studying conditioning effects on more insects will be valuable to understand the true limits of learning capacity in insect brains. One unexpected finding from the analyses was that for wasps trained with appetitive-aversive differential conditioning to the turquoise target, performance in tests was significantly higher than the wasps trained with appetitive-aversive conditioning to the blue stimulus (Fig. 3, bottom panel). However, there was no evidence of an innate preference for either stimulus (Fig. 3; absolute conditioning experiment). A possible explanation for the observed difference might be that the turquoise stimulus is easier to learn as it contains more long-wavelength rich information based on the spectral data reported in Dyer and Garcia (2014), although the results for the learning phase do not support this as wasps learnt to associate the blue stimulus with a reward more quickly than the turquoise. Currently, too little is known about wasp colour visual processing to formally test how the different contributing components of colour vision may contribute to learning.
The results of the current study provide important initial evidence that wasps can learn fine colour differences, which may be important for future research to understand colour vision in wasps. However, this current study specifically tested the role of conditioning to compare the most effective conditioning method to promote learning, and so the question of colour perception in wasps will require further research with appropriate controls for potential intensity differences to fully test colour perception (Kemp et al. 2015). For example, it would be of high value to understand how wasps might learn saliently different colours like yellow vs. blue with absolute conditioning, given that honeybees easily learn saliently different colours with absolute conditioning (Giurfa 2004; Dyer and Garcia 2014). In honeybees, initial studies into what brain areas may enable colour memory suggest that the input calyx to the mushroom body undergoes structural changes with colour conditioning (Sommerlandt et al. 2016), and in ants, accurate colour discrimination following differential conditioning was observed for up to 7 days dependent upon different memory phases (Yilmaz et al. 2017). Evidence that selective attention in hymenopteran insects can be recorded in brain regions like the optic lobes within the bee brain (Paulk et al. 2014) suggests a plausible neurobiological basis for how conditioning procedures can deliver different behavioural outcomes as was observed in the discrimination tasks presented to wasps (Fig. 3). An interesting result from the colour learning by wasps showed that following the absolute conditioning, choice frequency for the target colour was initially at random (Fig. 3). This result suggests that some experience with distractors over several trials is necessary to promote learning with differential conditioning, which is consistent with the need to develop attention to specific stimuli differences to improve accuracy (Garcia et al. 2020). A similar observation has also been made for how bumblebees learn colour differences (Dyer and Chittka 2004). Thus, as more hymenopteran insect models are established for colour learning and testing, it will be of value to further explore the neurobiological basis enabling plasticity in colour perception of animals.
The findings of the current study may also have important implications for spatial vision and for understanding both the limits of perception and how insect brains enable individuals to operate in complex environments. Free-flying wasps appear to learn visual behaviours, such as inspecting nearby landmarks, in a similar way during orientation flights from the nest (Collett and Lehrer 1993; Zeil et al. 1996; Stürzl et al. 2016; Collett and Hempel de Ibarra 2023) and show evidence of attending to different features, which might link to the attention-type evidence observed in bees. In honeybees, psychophysical evidence for modulation of attention can be seen in how free-flying honeybees preferentially use global information in a complex pattern, although behaviour can also be driven by local features if conditioning primes the bees to that type of feature (Avarguès-Weber et al. 2015). Recent work on Polistes fuscatus wasps, which use facial patterns to individually identify conspecifics, and Polistes dominula, which lack a capacity to recognise individuals, reveals evidence that some wasps can evolve holistic face processing in the right circumstances (Tibbetts et al. 2021), a seemingly complex process that was assumed to require a large mammalian brain (Maurer et al. 2002). These wasp experiments were enabled by testing aversion to stimuli with an electric field, showing that investigating cognitive capacities may require a variety of conditioning techniques. The evidence from the current study that visual performance in V. vulgaris wasps can be significantly improved by the conditioning procedure provides a methodological framework enabling other wasp species to be tested in a similar way. Examining the effect of conditioning on visual learning and discrimination in other non-model insect species will allow us to better understand the evolution of cognition and vision in hymenopterans.
Communicated by M. Lihoreau. This article is a contribution to the Topical Collection Toward a Cognitive Ecology of Invertebrates. Guest Editors: Aurore Avarguès-Weber and Mathieu Lihoreau.

Fig. 1 A front (A) and side (B) view of the rotating screen apparatus and parts with stimuli shown

Fig. 2 Examples of the different conditioning procedures presented to wasps during the conditioning phase and learning tests