Introduction

Animals are able to evaluate the contingencies and coincidental order of events in their environment, and they readily infer predictable patterns (Mery, 2013). The ability to adjust to surrounding conditions is widely conserved across taxa (Shettleworth, 2009), and represents a major driving force in evolution. As learning and memory organization appear to have emerged in the distant evolutionary past, they utilize vital and deeply rooted neural mechanisms and extend across broad phylogenetic divisions (Giurfa & Sandoz, 2012; Menzel & Benjamin, 2013). Despite the phenomena's broad taxonomic distribution, much of our knowledge regarding learning comes from only a small number of vertebrate species (Mackintosh, 1974; Papini, 2008). Unfortunately, the neural complexities associated with the vertebrate brain present significant challenges for research into the biological mechanisms of learning, and argue for the inclusion of alternate, more accessible model organisms. Due to their relatively simple, modularly organized nervous systems (Lukowiak et al., 1996), research in invertebrates has contributed greatly to our understanding of the synaptic plasticity associated with habituation and sensitization (Carew et al., 1971; Jennings, 1906; Kandel & Schwartz, 1982; Thompson, 2009), and the ability to associate co-occurring events in both classical (Carew et al., 1981) and operant conditioning scenarios (Cook & Carew, 1986).

The ability to connect color cues with food reward was demonstrated in honey bees (Masuhr & Menzel, 1972; Takeda, 1961; Menzel, 1967; Vareschi, 1971; von Frisch, 1914), blowflies (Frings, 1941), and Drosophila (DeJianne et al., 1985). In Aplysia a light shock paired with a tactile stimulus is sufficient to elicit conditioned withdrawal reflexes of gill and siphon (Carew et al., 1981), and the phenomenon has been mapped onto underlying changes in synaptic plasticity (Castellucci et al., 1970; Carew et al., 1971; Hawkins, 1984). The conditioned place preference (CPP) paradigm offers an experimental approach in which a specific environmental cue is paired with a novel stimulus through Pavlovian labelling. A subsequent testing trial examines changes in the trained animal's preference from baseline values (Søvik & Baron, 2013). Rewarding properties are indicated by enhanced contact with the paired environment, whereas reduced contact implies conditioned aversion (Brandes & Menzel, 1990; Sitaraman et al., 2008; Tzschentke, 1998). Although CPP provides robust measures of the animal's subjective perception of a stimulus, it tells us little about specifically what, and how, it has learned. The paradigm is thus limited in what it can measure and is often confounded by novelty seeking (Alcaro et al., 2011; Bardo & Bevins, 2000). Moreover, as the measurement of place preference occurs in the absence of the previously established stimulus pairing, the testing phase itself interferes with conditioning (Bardo et al., 1984). As CPP paradigms assess learning only after the fact, they ultimately lack many of the metrics that can be obtained through instrumental approaches (e.g., number of operant behaviors, space use, or movement patterns). Research models of learning have thus benefited greatly from the development of instrumental approaches and the experimental paradigms which these made possible. 
A number of studies have been successful in implementing operant paradigms for invertebrate species (Abramson & Feinman, 1990; Makous, 1969; Tomina & Takahata, 2010). Reward learning was demonstrated in honey bees (Núñez, 1970; Pessotti, 1972), Aplysia (Brembs et al., 2002; Carew & Sahlely, 1986), and lobsters (Tomina & Takahata, 2010), while punishment has been shown to shape behavior in cockroaches (Horridge, 1962), locusts (Hoyle, 1982), Aplysia (Cook & Carew, 1986), and pond snails (Lukowiak et al., 1996). By adding components of instrumental learning to CPP paradigms, such as walking into a specific part of the arena, operant place-conditioning (OPC) provides real-time measures of learning through changes in acquired place preference (Crowder & Hutto, 1992). Moreover, inferred motivational states evoked by paired cues are reflected in the amount of locomotion. OPC thus depicts to what degree a paired stimulus is perceived as rewarding, neutral, or aversive (Feduccia, Kongovi, & Duvauchelle, 2010), and indicates the extent of appetitive behaviors expressed for obtaining it. Several invertebrates have successfully completed OPC tasks, including Drosophila, which will limit their exposure to a noxious stimulus (Putz & Heisenberg, 2002).

Crustaceans offer a compelling model system for studies of memory and learning because complex behaviors emerge from a modular, experimentally accessible set of segmental ganglia (Derby & Thiel, 2014). A rich history of behavioral neuroscience in crustaceans has revealed insights into the neuromuscular junction for the study of synaptic transmission (Furshpan & Potter, 1959), the role of glutamate and GABA as excitatory and inhibitory neurotransmitters (Iversen, Otsuka, Hall, & Kravitz, 1966; Taraskevich, 1971), the neural orchestration of escape behavior (Edwards et al., 1999), the complex coordination of motor networks (Nusbaum et al., 2001), and emergent properties that arise from direct neuron-to-neuron interactions within networks (Selverston, 1999). Crustaceans exhibit many forms of learning, including habituation (Applewhite & Morrowitz, 1966; Krasne & Woodsmall, 1969), classical conditioning (Abramson & Feinman, 1988; Orlosk et al., 2011), food aversion (Finn-Levy et al., 1988; Wight et al., 1990), conditioned place preference (Panksepp & Huber, 2004), spatial learning (Tierney & Lee, 2011), and operant conditioning (Abramson & Feinman, 1990; Tomina & Takahata, 2010). Studies of operant learning via punishment, however, have remained experimentally intractable. In a previous study of shock avoidance, very few crayfish were capable of learning the contingencies within the paradigm (Kawai et al., 2004).

The present study introduces an efficient system of avoidance learning in unrestrained crayfish, wherein mild electric shocks generate reliable substrate avoidance. Using fixed interval punishment for completion of a place-conditioning task, we examine whether (1) electric shocks yield unconditioned effects, (2) the use of punishment alters place preference, (3) unconditioned and conditioned effects of learning can be distinguished through the use of yoked controls, and (4) learning curves for completing the paradigm can be fitted. The present work of shock-induced place aversion also provides a case study for how complex learning paradigms can be implemented with tools emerging from the field of computational ethology. Here we introduce a public domain computer framework, in which a real-time video-tracking application is designed to automatically deliver electroshock punishment conditional on a focal individual's behavior.

Materials and methods

Study animals and surgery

Male crayfish (Orconectes rusticus) were wild-caught from the Portage River (near Pemberville, OH, USA), and housed in a large, aerated community tank (2,500 L, 20 °C, pH 7.0, 16/8 light/dark cycle). Animals were fed twice weekly a combination of fish, earthworms, and rabbit chow. Prior to testing, intermolt individuals (carapace length = 25.8 mm, SE = 0.55; mass = 8.25 g, SE = 3.85) were isolated for 3 days in individual plastic containers (160 mm diameter, 95 mm depth) and maintained on holding trays with a continuous flow of filtered, aerated water. Following cold anesthesia (20 min in ice), a 28-gauge needle was used to puncture the carapace immediately adjacent to the pericardium. A 31.5-gauge insulated copper wire, with insulation removed at the tip, was inserted 3 mm into the pericardial cavity, and sealed with super glue. After surgery, implanted crayfish were allowed to recover for 24 h in their individual containers. No mortalities were observed during this study and, following the experiment's completion, electrode implants were removed and crayfish were returned to the wild.

Experimental design

Crayfish were randomly assigned to 13 size-matched pairs in a “master/yoked” design. While the individual designated as “master” controlled the delivery of shock with its own behavior, the “yoked” animal received identical shocks regardless of its own actions. Yoked controls thus assess pure, unconditioned responses to electric shock and provide a means to distinguish these from learned associations between punishment and behavior/arena location. A circular experimental arena (polyethylene, diameter = 502 mm, height = 270 mm) featured two soft and two hard substrate quadrants, arranged diagonally. Soft quadrants featured five stacked layers of beige, PVC-coated polyester mesh (Non-adhesive Easy Shelf Liner, Duck Brand, OH, USA, combined depth = 10 mm), while hard quadrants were lined with beige ceramic floor tile (Model #8646, Mono Serra, Montreal, Canada, depth = 10 mm). Radially arranged sets of in- and out-flow tubes supplied the arena with a continuous flow of water. The experimental arena was radially uniform except for the different substrates, and was rotated between trials to reduce confounds from surrounding cues. In addition, the type of substrate was stratified in a balanced design, whereby half of the treatment pairs received shock punishment on the hard substrate, while the remainder received theirs on quadrants with soft substrates. The master and yoked subjects were generally run concurrently in two identical arenas positioned side by side. In some instances, yoked individuals were run at a later date, with the timing of shocks determined by the recorded delivery schedule of their masters.
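The arena geometry described above lends itself to a simple mapping from tracked coordinates to quadrant and substrate. The following is a minimal sketch in Python, not the study's actual software: it assumes coordinates in millimeters measured from the arena center, quadrants numbered counter-clockwise from the positive x-axis, and soft substrate in quadrants 0 and 2 (the function names and this numbering are hypothetical; only the 502 mm diameter and the diagonal arrangement come from the text).

```python
import math

ARENA_DIAMETER_MM = 502.0  # arena diameter reported in the text


def quadrant_of(x_mm, y_mm):
    """Return a quadrant index 0..3, counted counter-clockwise from the
    positive x-axis. Coordinates are assumed to be relative to the arena
    center (an assumption of this sketch)."""
    angle = math.atan2(y_mm, x_mm) % (2 * math.pi)
    # Guard against the angle rounding up to exactly 2*pi.
    return min(int(angle // (math.pi / 2)), 3)


def substrate_of(x_mm, y_mm, soft_quadrants=(0, 2)):
    """Map a position to 'soft' or 'hard'. Diagonally opposite quadrants
    share a substrate; which pair is soft is a hypothetical choice here."""
    return "soft" if quadrant_of(x_mm, y_mm) in soft_quadrants else "hard"


def in_arena(x_mm, y_mm):
    """True if the position lies within the circular arena."""
    return math.hypot(x_mm, y_mm) <= ARENA_DIAMETER_MM / 2
```

Because the arena was rotated between trials, the set passed as `soft_quadrants` would need to be set per trial in any real use of such a mapping.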

Tracking commenced when the focal animal was placed into the experimental arena, and continued for 3 h. Time stamps, x and y cartesian coordinates, body orientation, and instances of punishment were obtained and saved into a text file for subsequent analysis. Temporal resolution was limited to two frames per second (fps), which provided sufficiently detailed records for this study's relatively slow-moving subjects. Maximum frame rates can be scaled up considerably for scenarios requiring increased temporal resolution. Utilizing routines from the OpenCV library <http://opencv.org/> the system is highly efficient in its use of resources and processing cycles. With maximum performance depending on hardware capabilities, an Apple Mac laptop (2008) for instance is able to process 30+ fps of standard digital video (720 × 480), 25 fps at HD (1,280 × 720), and 10 fps at fullHD resolutions (1,920 × 1,080).

No shocks were delivered during an initial 10-min pre-trial period. The power supply was then connected and the master animal (along with the control animal yoked to it) earned a mild electric shock (6 V DC, 300-ms duration) whenever it entered a punished substrate. Shocks repeated every nine seconds until the individual exited this substrate. Electric current for the shock was provided by a 6 V DC power supply (~10 mA) and conditionally applied to the indwelling electrode via a computer-controlled relay (Model 1017-0, 0/0/8, Phidgets Inc., Alberta, Canada) using 14-gauge speaker wire (Model AH1450SR, RCA, New York, NY, USA). Preliminary trials identified these electric shock settings as effective punishment without inducing long-lasting motor deficits. Initial responses to the punishment included enhanced motor activity, an occasional tail flip, or a defensive/threat display with claws raised in a meral spread posture (Bruski & Dunham, 1987). Responses to shock decreased with repeated exposure. Four ground wires, spaced equidistant around the arena, were connected to the reference terminal of the relay. The magnitude of the electric shock was uniform throughout the arena. An analog video camera (Sony Bullet, Model 800TVL, Sony, Tokyo, Japan) was placed centrally, 1 m above the tank. The analog signal was converted via A/D hardware (Canopus ADVC-300, Grass Valley, Montreal, Canada) and interfaced with an Apple Macintosh computer (iMac, 2.5 GHz Intel i5, OSX 10.6.8). Alternatively, digital webcams and multiple cameras are supported by the software. Real-time animal tracking and shock delivery used custom software developed using the JavaGrinders library, an extensible, java-based, public-domain set of programming functions for the analysis and control of behavioral experiments (available at <http://iEthology.com>). Minimal code needed to implement an automated learning setup is included in Appendix A.
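The shock contingency described above can be summarized as a small scheduling rule. The sketch below is an illustration in Python rather than the JavaGrinders code referenced in Appendix A; the function name and the input format (a chronological list of time stamps paired with a flag for presence on the punished substrate) are hypothetical. It also assumes that an animal already on the punished substrate when the pre-trial period ends is shocked at once, a detail the text does not specify.

```python
PRETRIAL_S = 600.0       # 10-min pre-trial period without shocks (from the text)
SHOCK_INTERVAL_S = 9.0   # shocks repeat every nine seconds (from the text)


def shock_times(samples, pretrial_s=PRETRIAL_S, interval_s=SHOCK_INTERVAL_S):
    """Return the times at which the shock condition is met.

    samples: chronological iterable of (time_s, on_punished) pairs, e.g.
    one pair per tracked video frame (hypothetical input format).
    """
    shocks = []
    last_shock = None  # time of the last shock in the current visit
    for t, on_punished in samples:
        if t < pretrial_s or not on_punished:
            last_shock = None  # leaving the substrate resets the schedule
            continue
        if last_shock is None or t - last_shock >= interval_s:
            shocks.append(t)
            last_shock = t
    return shocks
```

In a master/yoked pair, the times returned for the master's track would drive the relay for both animals, while the same function applied to the yoked animal's own track would only count punishable instances without delivering anything.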

Statistical analysis

Time-stamped records of animal locations and instances of shock delivery were used to characterize changes in locomotion, spatial use, and earned punishment throughout each trial. Descriptive statistics for time spent, distance traveled, mean speed, and shocks delivered were binned into 10-min time segments (N = 18) and parsed by quadrant and substrate. An individual's initial substrate preference was assessed during the initial, unpunished 10-min time segment. The application of electroshocks alone may bring about unconditioned changes in locomotion and space use, effects which are accessible through an analysis of the behavior of non-contingently punished yoked controls. The effect of contingently earned shocks on substrate preference was obtained, in contrast, by comparing individuals of the master group to their yoked counterparts. While individuals of the master group earned punishments when they entered or remained in a designated quadrant, meeting the condition for shock in yoked subjects was recorded without applying the consequence.
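The binning step described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the analysis code used in the study: the record layout `(time_s, x_mm, y_mm)`, the function name, and the `on_punished` callback are hypothetical, and each inter-sample interval is attributed to the bin and substrate of its starting sample.

```python
import math

BIN_S = 600.0  # 10-min time segments, as in the text


def bin_track(records, on_punished, bin_s=BIN_S):
    """Summarize a track by time segment.

    records: chronological list of (time_s, x_mm, y_mm) tuples.
    on_punished: callable (x_mm, y_mm) -> bool (hypothetical helper).
    Returns {bin_index: {"distance_mm", "time_s", "punished_s"}}.
    """
    bins = {}
    for (t0, x0, y0), (t1, x1, y1) in zip(records, records[1:]):
        b = int(t0 // bin_s)
        stats = bins.setdefault(
            b, {"distance_mm": 0.0, "time_s": 0.0, "punished_s": 0.0}
        )
        dt = t1 - t0
        stats["distance_mm"] += math.hypot(x1 - x0, y1 - y0)
        stats["time_s"] += dt
        if on_punished(x0, y0):
            stats["punished_s"] += dt
    return bins
```

Mean speed for a segment then follows as `distance_mm / time_s`, and proportional residence on the punished substrate as `punished_s / time_s`.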

All statistical analyses were conducted in R (Version 3.0.3, <http://www.R-project.org>) with additionally installed packages: 'ggplot2', 'ez', 'Deducer', 'DeducerANOVA', 'stats', and 'lme4'. Repeated measures ANOVA was conducted on time sequence data for distance travelled using packages ez (Function: ezANOVA) and stats (Function: t.test).

Results

Changes in preference in response to shock

When a crayfish was initially placed into the testing arena, it quickly approached the closest arena wall and followed its curve. Progress often slowed or stopped as it approached the transition between quadrants, followed by what appeared to be tactile exploration of the surfaces on both sides of the border. After it entered the next quadrant, it resumed walking along the tank wall until another border was encountered. During the preconditioning period crayfish exhibited no preference for substrate (difference soft - hard, t [25] = 0.671, p = 0.509, Cohen's d = 0.268), and descriptive statistics for treatment group, substrate type, and time segment are reported in detail (Table 1). Following the introduction of punishment, individuals acquired significant place aversion to the paired substrate. Initial tests confirmed that effects of shock treatment were consistent regardless of whether punishment was paired with hard or soft substrates, and the two subsets were subsequently pooled for the overall analysis of shock effects.

Table 1 Space use and locomotion in subsets stratified for substrate type. Descriptive statistics (mean ± SD) are reported for different time segments (10-min pre-conditioning, and subsequent 1-h bins), and shock paired with a particular substrate. The effect was similar regardless of whether punishment occurred on hard or soft substrate quadrants. The number of shocks listed in brackets (during pre-conditioning and in yoked individuals) refers to the number of instances in which conditions for punishment would have been met, although no shock was actually applied

Figure 1 depicts tracks for the entire 3-h test period for representative individuals from different treatment groups. As conditioned crayfish earned an average of 8.88 shocks (SE = 2.38) during the first 10 min of punishment, their residence in punished quadrants decreased from 50.4 % (SE = 4.1) to 22.8 % (SE = 5.8). In contrast, non-contingent shocks applied to yoked controls did not affect substrate preference and these individuals continued to utilize the particular quadrants (52.0 % of time, SE = 3.7) their masters had learned to avoid. Crayfish of the master group continued to decrease their residence in punished quadrants, averaging only 1.1 % (SE = 0.4) of time during the last time segment (Fig. 2a). Because the test of sphericity for the repeated measures ANOVA was significant, we relied on results from multivariate analyses only. These confirmed significant differences between master and yoked groups in the time spent in punished quadrants (F [1, 24] = 57.052, p < 0.001, Cohen's d = −1.473), as the latter still spent on average 44.2 % (SE = 12.0) of time in quadrants associated with punishment for the master group (Fig. 2b). The master group earned only 0.87 shocks (SE = 0.24) during the final time segment, while their yoked counterparts exhibited 14.93 punishable instances (SE = 4.14). Moreover, the analysis demonstrated significant effects for time (F [17, 8] = 106.290, p < 0.001), as well as its interaction with treatment (F [17, 8] = 276.797, p < 0.001).

Fig. 1

Two-dimensional plot of captured coordinates for individual crayfish. Treatment groups displayed a variety of qualitative and quantitative differences in locomotion. 180 min of coordinates are plotted across soft (dark) and hard (light) quadrants for single representative individuals, which (a) did not receive electric shocks, (b) had shock paired with soft substrate quadrants, (c) received shocks on hard substrate, or (d) had its shocks yoked to another individual's behavior

Fig. 2

Probability of quadrant and substrate use in treatment (a) and yoked controls (b). Mean proportional residence in the four quadrants is depicted by size for 10-min segments of the 3-h experiment. The two quadrants with punished substrates are indicated by a dark border with standard errors plotted on their combined means. During the first 10-min time segment (Pre) no punishment was applied and the crayfish spent similar amounts of time across the four quadrants and types of substrates. Following that, place conditioning emerged when Master crayfish increasingly avoided quadrants that earned them shocks (i.e., content shaded gray). Yoked controls, in contrast, continued to spend similar amounts of time in the two substrates without evidence for avoidance (F [1, 24] = 62.48, p < 0.001, Cohen's d = −1.473)

Movement patterns

Individuals in both master and yoked groups displayed their highest levels of locomotion when they were initially placed into the arena. As they actively explored the arena, the treatment groups did not differ in either distance (Master = 0.63 m, SE = 0.12; Yoke = 0.89 m, SE = 0.15, F [1,24] = 2.03, p = 0.168) or speed of movement (Master = 15.4 mm/s, SE = 8.3; Yoke = 12.0 mm/s, SE = 4.2, F [1,24] = 3.721, p = 0.066). Regardless of treatment, crayfish gradually decreased locomotion over time (F [1,24] = 6.39, p = 0.018, r = −0.340, η² = 0.157). Following the onset of shock, the master group quickly learned to confine their movements to the safe substrates. In contrast, yoked controls continued to utilize the entire arena (Fig. 3) with a much smaller drop in levels of activity compared to pre-shock levels. During the last 10-min segment contingently punished crayfish moved an average of 2.7 mm/s (SE = 1.6) while yoked controls moved at 6.0 mm/s (SE = 3.6). When crayfish in the master group earned a shock in any of the later time segments, they quickly returned to the safe substrate and generally remained there until the end of the trial. Yoked controls in contrast showed a spike in locomotor activity following each (unearned) shock. The combination of both effects (i.e., reduced locomotion in masters and enhanced levels in yokes) was reflected in significant differences between the treatment groups (F [1, 24] = 60.97, p < 0.001, Cohen's d = −0.273). By the conclusion of the experiment, master animals had moved an average of 1.633 m (SE = 0.271), while yoked individuals had travelled more than double the distance at 3.651 m (SE = 0.613).

Fig. 3

Mean distance travelled (±SE) by treatment and yoked individuals over time. Over the duration of the experiment crayfish of both groups exhibited a decrease in locomotion; however, this decrease was greater in treatment animals, which learned to confine their travel to safe quadrants (Repeated Measures, F [1, 24] = 6.398, p = 0.018)

Discussion

The ability to learn from adverse consequences discourages actions that are damaging to an individual's well-being (Skinner, 1974; Thorndike, 1912). Punishment via electric shock is perceived as aversive in a wide range of taxa (Glotzbach et al., 2012; Iwata & LeDoux, 1988; Vergoz et al., 2007) and has seen many applications in behavior modification. Uses range widely, from keeping sharks away from swimmers (Huveneers et al., 2013) to restricting livestock to particular spatial confines (Fay et al., 1989) and controlling barking in dogs (Juarbe-Diaz & Houpt, 1996). Impacts on human psychological phenomena have included effects on learned helplessness (Overmier & Seligman, 1967), obedience (Milgram, 1963), and the control of fetishes (Bond & Evans, 1967), self-injurious behavior in autistic children (Lichstein & Schreibman, 1976), or alcohol addiction (Cannon & Baker, 1981).

Crayfish are tuned to distinguishing tactile cues (Bouwma & Hazlett, 2001) and several crustaceans learn to avoid environmental cues that are paired with electric shock (Denti et al., 1988; Magee & Elwood, 2013). Although widely used as a punishment in learning studies, the application of electric shock may also constitute a confound where electromagnetic stimulation may, in itself, impact neuronal/synaptic function (Kling et al., 1990; Misanin et al., 1984). Moreover, evidence suggests that the degree to which shock is perceived as a punishment depends on the interaction with a variety of motivational factors, such as hunger (Gillette et al., 2000). A possible alternate explanation may arise when master individuals simply slow their movements in safe quadrants as a case of negative electrostimulation taxis. Assessing the validity of this explanation will require further characterization of movement patterns for trained individuals utilizing a rotated arena and the absence of shock.

Studies of learning are increasingly turning to invertebrate models. A wide array of mutant phenotypes for learning and memory in Drosophila (Kahsai & Zars, 2011) offer valuable insights into genetic and molecular substrates of acquired behavioral plasticity. Research in mollusks (e.g., Aplysia, Hermisenda) has helped connect our understanding of learned behaviors with their neuronal physiology (Brembs, 2003). Crustaceans possess a variety of amenable traits for studies of learning, including complex behavioral patterns that are modulated by experience, and which emerge from a relatively simple nervous system composed of large and accessible neurons. Extending previous work on learning in crayfish (Krasne & Woodsmall, 1969; Krasne, 1973; Tierney & Lee, 2011), the present work introduces an avoidance-learning paradigm with instrumental attributes whereby crayfish associate spatial and substrate cues with punishment. This framework offers a suitable paradigm for future research in which to explore cellular changes in crayfish neural circuitry for learning.

Development of a reliable conditioned avoidance paradigm in crayfish may also find use in other experimental contexts of crustacean behavioral neuroscience. With confirmed vulnerabilities to amphetamine, cocaine, and morphine (Huber et al., 2011; Nathaniel et al., 2012a, b, c; Panksepp et al., 2004), a range of behavioral phenomena indicative of addiction have been modeled. When treated with addictive substances, crayfish exhibit strong psychostimulant activation of exploratory behaviors (Alcaro et al., 2011), which sensitize with repeated application (Nathaniel et al., 2010). Studies of crayfish CPP confirmed that drugs are associated with powerful perceptions of reward (Panksepp & Huber, 2004) and are accompanied by activation of accessory lobes in the super-esophageal ganglion (Nathaniel et al., 2012a, b, c). Cessation of drug use induces withdrawal effects and the eventual extinction of drug-associated learning (Nathaniel et al., 2009). Drug-sensitive reward in crayfish proves a relentless phenomenon as a small priming dose quickly reinstates the preference for drug-associated cues (Nathaniel et al., 2009). The development of a spatial conditioning paradigm offers a powerful new tool for such work as it allows us to directly measure the strength of drug reward, to assess the emergence of compulsive drug taking, to explore changes in operant responding during withdrawal, and to further investigate how drugs are able to pharmacologically hijack relevant brain circuitries (Gardner, 2011). When drug self-administration was challenged with increasing levels of punishment (Vanderschuren & Everitt, 2004), rats were willing to tolerate much greater levels of electric shock after they had entered an addiction cycle. The paradigm presented here allows us to explore drug-sensitive reward in crayfish in two ways. It can be used to develop an operant self-administration paradigm within an addiction framework by pairing substrate quadrants with a rewarding bolus of psychostimulant drugs, instead of the shock punishment used here. Moreover, instrumental responses of drug-seeking can be challenged with variable doses of electroshock punishment in order to estimate a crayfish's motivation for obtaining access to the drug and its paired cues.

This study also illustrates how recent advances in computational ethology provide new tools for customizable, automated learning paradigms. Although automated systems for the collection of behavioral data have been used for some time, current developments in video analysis and effector control are significantly expanding their utility in terms of effort, ethics, efficiency, and cost. For instance, reliable assessment of appetitive and consummatory components of behavior in real-time greatly simplifies the development of high-throughput screens for learning, and allows us to move past the usual constraints of limited behavioral data in the search for genetic, neural, and neurochemical correlates (Anderson & Perona, 2014; Donelson et al., 2012). At the core of this application, a time-keeping thread both exerts executive control over routines for computer vision, and integrates with functions for robotic control. The present study illustrates how such a system can efficiently implement an automated learning setup, optimize it for the behavior and organism under study, and provide a range of fine-scale metrics that would be difficult to obtain by a human observer. The integration of computer science with studies of ethology is rapidly gaining acceptance, simplifying experimental approaches, generating large bodies of behavioral data, and allowing for many phenomena to be studied at increasingly finer spatial and temporal resolutions (Gomez-Marin et al., 2014).