Introduction

Female mating preferences have the potential to alter the evolution of male traits. For understanding the mechanisms of female preference within the context of sexual selection (Ryan and Rand 1993; Ritchie 1996) it is essential to describe variation in female response within populations (Gerhardt and Huber 2002). Female preference is, however, difficult to quantify because females often do not have graded and easily interpreted responses. In order to understand the evolution of sexual selection by female choice it is necessary to quantify the repeatability of female preference, the shape of individual female preference, as well as preference variation between females (Jennions and Petrie 1997; Wagner 1998). Due to the amenability of acoustic communication systems to experimentation (Gerhardt and Huber 2002) many studies investigating female preference have focused on amphibians and insects. The majority of these studies conducted experiments where only a qualitative response was required, i.e. a yes/no answer to the question: “Did the female track the sound?” (e.g. Doherty 1985a; Loher et al. 1992; Murphy and Gerhardt 2000; Grace and Shaw 2004; but see Wagner et al. 1995). This results in binomial preference data that are often complex to interpret biologically (Kime et al. 1998; Wagner 1998) and that provide less accurate information about the strength of preference than a quantitative measure of preference would (Murphy and Gerhardt 2000). The phonotactic responses of certain insects, e.g. crickets, appear to be sufficiently quantitative to characterise female preference efficiently, as is evident from intracellular recordings of identified auditory neurons that closely correspond to phonotactic response (Schildberger et al. 1989).
For example, this quantitative correspondence between neuronal and phonotactic response mirrors the steeper slope of the response function at faster syllable periods compared with the gradual slope of the response function at slower syllable periods. A large body of information on animal communication has been generated by phonotactic measurements on insects using locomotor compensators, specifically the Kramer spherical treadmill (Kramer 1976; Weber et al. 1981; Thorson et al. 1982). Briefly, this equipment allows the free, untethered movement of an insect towards a sound source, while remaining at a fixed distance from that source, assuring constant sound pressure level during the experiment.

Experiments attempting to quantify female preference on locomotor compensators have used several measures of phonotactic response, e.g. % time tracked (Thorson et al. 1982; Doherty 1985a), relative vector length (Loher et al. 1992), “vector score” (Wagner et al. 1995), relative distance run (Hedrick and Weber 1998), “net vector scores” (Gray and Cade 1999). These diverse methods impede comparison between studies on female preference. To date, no study on a locomotor compensator has investigated the sources of variation affecting measures of phonotactic precision, a crucial first step in developing a reliable measure of phonotactic response. For example, although several studies have suggested the occurrence of auditory asymmetry (Boyan 1979; Schul et al. 1998; Schul 1998), the relevance thereof to sound localization in insects has only recently become topical (Faure and Hoy 2000; Bailey and Yang 2002). Moreover, as far as we are aware, only a single study has attempted to compensate for auditory asymmetry (Schul 1998). Only once a reliable measure of phonotactic response has been developed can individual female preference be characterised quantitatively using phonotactic response.

Recently, it has been shown that female field crickets (Gryllus bimaculatus) make small steering movements toward the individual syllables of the male call (Hedwig and Poulet 2004; Hedwig and Poulet 2005; Poulet and Hedwig 2005). However, since female crickets respond to several properties of the song such as the syllable period, chirp period, call frequency and frequency bandwidth (Huber et al. 1989), additional neurological mechanisms for pattern recognition must exist, functioning on a timescale of seconds and modulating sound localization through simple reactive steering by increasing the gain of auditory steering when a pattern is recognised (Poulet and Hedwig 2005). Phonotactic response therefore quantifies the overall directional precision with which a female tracks a signal source. Her response function describes her phonotactic response across a range of different male signals (Brooks and Endler 2001). Female preference (see Wagner 1998 for a discussion on the definition of female preference) is then defined as the male signal (within the response function) that elicits the greatest phonotactic precision (see also Reinhold et al. 2002). The response magnitude of a female is her degree of phonotactic precision at her preference. Female selectivity (synonymous to female choosiness; Jennions and Petrie 1997; Gray and Cade 1999; Brooks and Endler 2001; Reinhold et al. 2002) is the degree to which female response decays with a departure of a signal from her preference.

Commonly used methods for describing the shape of female response functions can be inadequate to evaluate within- and between-female variation in preferences (Wagner 1998), as well as repeatability of response. Repeatability (Falconer and Mackay 1996) provides a measure of the consistency of a trait within an individual and sets an upper limit to the heritability of this trait (Boake 1989): it is crucial to our understanding of how female response and male signals co-evolve in a population (Kime et al. 1998; Wagner 1998; Widemo and Saether 1999; Murphy and Gerhardt 2000). It is therefore important to determine the repeatability of female phonotactic response as well as that of female response functions. To date, this aspect of the cricket communication system has been neglected (but see Wagner et al. 1995), probably because the resolution of measurement equipment has, until recently, been relatively coarse. Appropriate methodology would represent an advance towards a better understanding of the evolution of communication systems.

Few studies have successfully described female response functions at the level of the individual. This is important since studies of female response at the population level can mask individual variation in response (Kime et al. 1998). Reinhold et al. (2002) proposed an approach for describing individual female response functions using non-linear regression. They measured the response calls of female grasshoppers to artificial courtship signals varying in a temporal characteristic and then estimated a Gaussian function that best fitted the female response. In addition, they separated female choice into female preference (B, the specific stimulus that elicits the greatest response), female response rate (R, the magnitude of response at the preference value B) and female selectivity (C, the parameter determining the width of the Gaussian function). Their methodology (hereafter RRJ-method) has not been applied in other studies of female preference, nor has its applicability been critically evaluated in other communication systems. Some studies have used cubic splines (Schluter 1988) to describe female response functions at the population level (Ritchie 1996; Brooks and Endler 2001; Ritchie et al. 2001; Simmons et al. 2001). As far as we are aware, the efficacy of this approach has to date not been evaluated for describing the shape of an individual female’s response function.

Female phonotactic response can be affected by factors such as developmental environment (Grace and Shaw 2004), resource acquisition (Hunt et al. 2005) and age (Prosser et al. 1997; Gray 1999; Reinhold et al. 2002; Olvido and Wagner 2004). This could have important implications for understanding sexual selection. Reinhold et al. (2002) found that female age had no significant effect on female preference or selectivity but did affect female response magnitude significantly. Similarly, Olvido and Wagner (2004) demonstrated an age-related decline in female responsiveness to chirp duration in Allonemobius socius. Gray (1999) argued that female age, fecundity, reproductive investment and nutritional condition may affect the acoustic preferences of female crickets (Acheta domesticus) but found that only age significantly affected female selectivity, older females being less selective. Consequently, selection on mating behaviour at older ages is thought to be weak. Studies on A. domesticus have shown that juvenile hormone III levels affect the sensitivity of auditory neurons (Stout et al. 1989a, b, 1991; Walikonis et al. 1991; Henley et al. 1992), causing older females to respond to a wider range of stimuli than young females. Since these studies showed no common effect of age on female response between species, it is necessary to quantify the effect of age on phonotactic response and preference for a particular species. In this paper, we calculate individual female response functions for four male calling song parameters in the chirping field cricket, Gryllus bimaculatus, De Geer (Orthoptera: Gryllidae), using “no-choice” sequential-stimulus phonotaxis experiments. The specific aims of this study were:

  • To develop methodology for the quantification and statistical interpretation of individual female phonotactic response functions, suitable for population-level analyses of G. bimaculatus females;

  • To quantify the effects of auditory asymmetry and fatigue on phonotactic precision;

  • To quantify the different levels of within-individual and between-individual variation in phonotactic response magnitude to an identical stimulus;

  • To quantify the repeatability and the effect of age on female phonotactic response and preference.

Materials and methods

Collection and captive care

We collected wild-living penultimate instar female field crickets from seven locations across South Africa as well as from seven captive colonies (F1) originating from these wild-caught animals and allowed them to molt in captivity. Since female G. bimaculatus do not exhibit phonotaxis after being inseminated (Loher et al. 1992), we only used virgin females. Individuals were kept in a climate-controlled chamber (25 ± 1°C; 12:12 LD) in individual containers (500 ml) and provided ad libitum food (high protein cereal and fish flakes) and water (cotton-plugged vial filled with water). Individuals were randomly selected for each of the four experiments described below.

Measurement of female movement

We quantified female preference through untethered phonotactic response in total darkness at a temperature of 25 ± 1°C using a Kramer spherical treadmill (Kramer 1976) in an anechoic chamber (>2 kHz). We conducted three experiments, each consisting of four trials. For each trial, we presented a female with a series of stimuli by manipulating a single call parameter (Table 1). Trials began with one minute of silence allowing females to become accustomed to the movement of the sphere. Each stimulus thereafter was presented twice, played back alternately from two different loudspeakers for a minute at a time respectively. Speakers were situated at 210° (speaker 1) and 90° (speaker 2) respectively from a predefined zero point. This forced females showing phonotaxis to switch direction when a stimulus changed from one speaker to the other and allowed for the quantification of within-individual variation in phonotactic response to the same stimulus. Measurement of phonotaxis only started when a female approached a speaker with an accuracy of at least 30°. Stimuli followed consecutively and were not separated by a silent pause (e.g. Doherty 1985a). Female crickets continue phonotactic response for some 300 ms after presentation of an attractive stimulus (Poulet and Hedwig 2005). This is a time scale far below that of our measurements during which each stimulus was presented for 60 s. While performing phonotaxis, our crickets reoriented themselves towards the active speaker within 3 s (Fig. 1).

Table 1 Description of the four different phonotaxis trials that female crickets (G. bimaculatus) were exposed to
Fig. 1

Trace diagram of a female’s movement during quantification of her preference response to different syllable periods. Each stimulus was played back for a duration of one minute per speaker from two different speakers; speaker 1 was situated at 210° and speaker 2 at 90°. Horizontal grey lines show the active speaker. Notice how the female does not respond to syllable periods >48 ms but resumes phonotaxis at the final stimulus (STD2 stimulus; syllable period = 43 ms)

Acoustic stimuli used

Male G. bimaculatus produce a calling song by stridulating their structurally modified tegmina. Each chirp comprises three to five syllables, each resulting from the closure of the tegmina. We generated synthetic acoustic stimuli played back at a sound intensity (i.e. maximum of the carrier envelope) of 70 dB SPL (measured at the top-center of the treadmill; re. 2 × 10⁻⁵ Pa), which is well above the thresholds of sound detection by both high-frequency and low-frequency auditory neurons in the prothoracic ganglia of G. bimaculatus for the frequency range tested in our experiments (Schildberger et al. 1989). All syllables had 2 ms linear rise-fall times. We designed four trials where we manipulated either call frequency (hereafter FQ), spectral bandwidth (hereafter BW), duty cycle (hereafter DC) or syllable period (hereafter SP) (Table 1). Pre-experimental trials as well as previous experiments on G. bimaculatus (Doherty 1985b) revealed a species-specific mean preference for stimuli conforming to 5 kHz frequency, 43 ms syllable period, 50% duty cycle, 4 syllables/chirp and 2 chirps/s (250 ms chirp duration). We maintained this standard, which served as the predicted preferred signal for this species [hereafter referred to as the standard (STD1) stimulus], for each trial (except for SP; see below) and manipulated only one acoustic property (e.g. frequency). Three of the trials (BW, DC and FQ) had an identical STD1 stimulus (Table 1). Following Thorson et al. (1982) and Doherty (1985a), we maintained a duty cycle of 50% (i.e. constant sound energy intensity per chirp) while keeping the chirp duration approximately constant across all SPs tested (23–81 ms; Table 1) by varying the number of syllables. For the syllable period of 43 ms in the SP trial, stimulus characteristics for STD1 were as follows: six syllables with a chirp duration of 237 ms and an inter-chirp-interval of 349 ms, identical to that of Thorson et al. (1982).
We repeated the standard stimulus (hereafter STD2) at the end of each trial in order to quantify the magnitude of any tiring effect and to determine whether the female was still responsive to the acoustic stimuli.

Previous experiments (Doherty 1985b; Hedrick and Weber 1998) and pilot trials revealed no effect of stimulus order on female response and therefore we presented stimuli in the same order for each trial (Table 1). Nevertheless, we did investigate the effect of sequence order on female preference by reversing the order of stimuli for the syllable period trial, the trial with the longest duration [hereafter SP(Rev)]. We randomised the order of the trials presented to individual females. Before quantifying either phonotactic response or preference we visually inspected a female’s phonotaxis by creating a trace diagram to indicate her movements throughout the duration of a trial (Fig. 1).

Measuring phonotactic response

Although the Kramer treadmill has a movement resolution of 0.25 mm, we sampled the mean female movement for every 1.0 cm moved. Females generally ran >250 cm min⁻¹, which provided excellent resolution to quantify phonotactic response for each stimulus presented. We quantified phonotactic response for each stimulus (both speakers individually) of every trial, using a measure of phonotactic precision which relies on the calculation of a relative vector length (e.g. Loher et al. 1992), where

$$ \text{relative vector length } (r) = \frac{\text{displacement}}{\text{total distance run}} \tag{1} $$

Displacement represents the straight-line distance between the female’s starting and end positions for a particular stimulus whereas total distance run includes the intervening meandering movements. We calculated the angular variance of phonotactic response (hereafter Batschelet deviation, BatD; Batschelet 1981), a measure of dispersion for a circular distribution used in navigation and orientation studies (e.g. homing pigeons: Gagliardo et al. 1999; macaques: Ringach et al. 2002) as follows:

$$ \text{BatD} = \sqrt{2(1 - r)} \tag{2} $$
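
Eqs. 1 and 2 can be illustrated with a short sketch. It is not the analysis code used in the study: the function names and the example paths are hypothetical, and the conversion from radians to degrees is an assumption made here because Eq. 2 as written yields radians whereas BatD is reported in degrees elsewhere in the text.

```python
import math

def relative_vector_length(path):
    """Eq. 1: displacement divided by total distance run.

    path is a list of (x, y) positions in cm, in the order they were sampled.
    """
    total = sum(math.dist(p, q) for p, q in zip(path, path[1:]))
    if total == 0:
        raise ValueError("no movement recorded for this stimulus")
    displacement = math.dist(path[0], path[-1])
    return displacement / total

def batschelet_deviation_deg(r):
    """Eq. 2, then converted to degrees (the conversion is an assumption)."""
    # max() guards against r exceeding 1 by floating-point rounding.
    return math.degrees(math.sqrt(max(0.0, 2.0 * (1.0 - r))))

# Hypothetical paths: a perfectly straight run and a meandering run.
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
zigzag = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]

r_straight = relative_vector_length(straight)
r_zigzag = relative_vector_length(zigzag)
print(r_straight, batschelet_deviation_deg(r_straight))
print(r_zigzag, batschelet_deviation_deg(r_zigzag))
```

The straight run gives r = 1 and therefore a BatD of zero, while the meandering run gives a lower r and a larger BatD, which matches the interpretation that lower BatD means more precise phonotaxis.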

Female crickets characteristically move with an angular error of up to 60° towards a sound source and the more precisely a female moves towards a sound source, the lower the BatD. To demonstrate the effect of phonotactic asymmetry on measures of phonotactic precision incorporating a term for angular deviation from the sound source we calculated the “sound directed component” (Schmitz 1985) or “vector score” (Wagner et al. 1995) for the females of experiment 1, and the FQ trial. These two measures are computationally identical (hereafter CosV):

$$ \text{Cos}V = \cos(\alpha - S) \times r \tag{3} $$

where α = the mean vector angle, S = angular direction of the speaker and r = relative vector length (Eq. 1).
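
The following sketch shows how Eq. 3 penalises a female that runs straight but at a constant angle to the speaker. It assumes that headings are sampled at equal step lengths, so that the mean vector length of the headings equals r from Eq. 1; the function names and the example headings are hypothetical.

```python
import math

def mean_vector(headings_deg):
    """Return (mean angle in degrees, mean vector length r) of the headings."""
    c = sum(math.cos(math.radians(h)) for h in headings_deg)
    s = sum(math.sin(math.radians(h)) for h in headings_deg)
    alpha = math.degrees(math.atan2(s, c)) % 360.0
    r = math.hypot(c, s) / len(headings_deg)
    return alpha, r

def cos_v(headings_deg, speaker_deg):
    """Eq. 3: CosV = cos(alpha - S) * r."""
    alpha, r = mean_vector(headings_deg)
    return math.cos(math.radians(alpha - speaker_deg)) * r

# Hypothetical female running perfectly straight at 60 degrees
# while the active speaker is at 90 degrees (angular offset of -30 degrees).
headings = [60.0] * 50
true_score = cos_v(headings, 90.0)       # scored against the true speaker direction
perceived_score = cos_v(headings, 60.0)  # scored against her own mean direction
print(true_score, perceived_score)
```

Although the run is perfectly straight (r = 1), the score against the true speaker direction is reduced by cos(30°), whereas the score against the female's own mean direction is not.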

Different approaches for determining phonotactic response function

Polynomial regression

We generated phonotactic response functions for each female and each trial by obtaining a high order regression equation from the phonotactic data (BatD) (Fig. 2a). Tests for the efficiency of the regression analysis for deriving the response function indicated that, for BW and DC, third order polynomials should be used and for FQ and SP, sixth order polynomials.

Fig. 2

Three methods used in this study to calculate female response function for the field cricket, G. bimaculatus. Phonotactic response (open circles) to two different speakers per stimulus tested are shown for the frequency trial. a The polynomial regression generated from the original data is presented as the solid grey line and the polynomial regression generated from the ŷ values (filled triangles), obtained from cubic spline analysis, is presented as the solid black line. b A Gaussian function (solid black line) was used to calculate preference (B), response magnitude (R) and selectivity (C) from inverted phonotactic response data (inverted open triangles) (RRJ-method; Reinhold et al. 2002). Analogous measures of female preference to those calculated from the Gaussian function (B′, R′, C′) were calculated for the two polynomial regression equations a. Here, B′ and C′ are shown for the equation generated from spline data only

Non-linear regression (RRJ method)

Following Reinhold et al. (2002), we used non-linear regression and fitted a Gaussian function to describe the response functions for each female and each trial. Since the phonotactic data were initially not suitable for normal distribution fitting (lower BatD denotes greater preference), we transformed the data (BatD′) as follows:

$$ \text{BatD}' = \text{maxBatD} + \text{minBatD} - \text{BatD} \tag{4} $$

where BatD = phonotactic precision for a particular stimulus, maxBatD = the largest BatD value of a trial and minBatD = the lowest BatD value of a trial. This meant that the lowest original BatD value would now have the largest numerical value (and vice-versa) and that the original scaling of the data was maintained (Fig. 2b). Non-linear regression (NLREG DLL version 5.2; Copyright Phillip H. Sherrod 1992–2001) employing the Levenberg–Marquardt algorithm, was used to estimate the Gaussian function,

$$ f(x) = R \times e^{-0.5 \times [(x - B)/C]^2} \tag{5} $$

that best fitted the transformed phonotactic data. Following Reinhold et al. (2002), we interpreted B and C as estimates of female preference and selectivity respectively. Our measure of phonotactic response (BatD) was not directly comparable to “response rate” but served as a quantitative measure of female response magnitude (R), since females were expected to have a small BatD when showing strong phonotaxis towards their preferred stimulus. For the non-linear regression the start value for R was always 90 while B and C were respectively set as follows: BW 0.4, 0.3; DC 50, 20; FQ 4, 1.5; SP and SP(Rev): 43, 20. If we obtained a value for B that was smaller than the starting value for the trial (Table 1) we set B to the starting value of the trial and recalculated the values for C and R (this happened four times in the BW and once in the SP trial). We never obtained a value for B that was greater than the largest value for a trial.
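
The transformation of Eq. 4 and the Gaussian of Eq. 5 can be sketched as follows. Note that this sketch swaps the Levenberg–Marquardt algorithm of NLREG for a plain grid search over B and C (with R solved in closed form by least squares), purely to stay within the Python standard library. The function names, the grids and the BatD values are hypothetical.

```python
import math

def invert_batd(batd):
    """Eq. 4: reflect BatD so the best (lowest) value becomes the largest."""
    hi, lo = max(batd), min(batd)
    return [hi + lo - v for v in batd]

def gaussian(x, R, B, C):
    """Eq. 5: response of height R centred on B with width parameter C."""
    return R * math.exp(-0.5 * ((x - B) / C) ** 2)

def fit_gaussian_grid(xs, ys, b_values, c_values):
    """Grid-search stand-in for non-linear regression; returns (R, B, C)."""
    best = None
    for B in b_values:
        for C in c_values:
            g = [math.exp(-0.5 * ((x - B) / C) ** 2) for x in xs]
            # For fixed B and C the model is linear in R.
            R = sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)
            sse = sum((y - R * gi) ** 2 for y, gi in zip(ys, g))
            if best is None or sse < best[0]:
                best = (sse, R, B, C)
    return best[1:]

# Hypothetical BatD values (degrees) for a frequency trial (kHz).
freqs = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
batd = [70.0, 55.0, 30.0, 20.0, 28.0, 50.0, 68.0]
batd_inv = invert_batd(batd)

b_grid = [3.0 + 0.1 * i for i in range(31)]  # candidate preferences, 3 to 6 kHz
c_grid = [0.2 + 0.1 * j for j in range(29)]  # candidate widths, 0.2 to 3.0 kHz
R, B, C = fit_gaussian_grid(freqs, batd_inv, b_grid, c_grid)
print(R, B, C)
```

A grid search is slower and coarser than a gradient-based fit, but it makes the roles of B (location of the peak), C (width) and R (height) easy to inspect.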

Cubic spline method

We generated response functions for each female and each trial by using cubic splines (Schluter 1988) with a smoothing parameter (λ) of −5 (Fig. 2a). Schluter (1988) cautions against the use of extreme λ values as they can either cause the resulting curve to be too smooth (large λ) or cause the curve simply to fit each data point (small λ). We tested a large range of smoothing parameters and chose the smoothing parameter that, across all four of the trials (BW, DC, FQ and SP), minimised the generalised cross-validation (GCV) score (Schluter 1988) and maximised between-individual differences in preference. Two hundred bootstrap replications allowed each response function to be fitted with error estimates (±1 SE), although these were always very small due to the relatively small number of unique stimuli presented within a trial (D. Schluter, personal communication). We did not require the addition of a random error to break ties (e.g. Ritchie 1996; Ritchie et al. 2001; Simmons et al. 2001) as no female yielded identical phonotactic response (BatD) for both speakers of a stimulus. The generation of a cubic spline does not provide an equation of predictive value. We therefore obtained a high order regression equation (same order as for the polynomial regressions above) from the predicted values (ŷ) from the spline analysis and used this equation to calculate B′, C′ and R′ (see below). These equations fitted the ŷ values extremely well (mean r² = 0.98 ± 0.03).

In order to compare the quantification of female response from the polynomial regression and spline methods with the RRJ-method, we required analogous measures for estimating female preference (B), selectivity (C) and response rate (R) (Reinhold et al. 2002). We achieved this by taking female response magnitude (R′) as the best phonotactic response (lowest BatD) during a trial; female preference (B′) as the stimulus value (on the x-axis) corresponding to R′, and female selectivity (C′) was taken as the width of the regression equation at 10° BatD above R′ (the lowest BatD; Fig. 2a).
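
The definitions of R′, B′ and C′ above can be sketched as a search along a fitted regression curve. The sketch assumes that the curve is evaluated on a fine grid over the tested stimulus range and that the width is taken as the contiguous interval around the minimum in which the curve stays within 10° of R′; the function names and the example polynomial are hypothetical.

```python
def evaluate(coeffs, x):
    """Horner evaluation; coeffs are ordered from highest to lowest power."""
    y = 0.0
    for c in coeffs:
        y = y * x + c
    return y

def preference_measures(coeffs, x_min, x_max, rise=10.0, steps=6000):
    """Return (B', R', C') for a fitted BatD curve on [x_min, x_max]."""
    xs = [x_min + (x_max - x_min) * i / steps for i in range(steps + 1)]
    ys = [evaluate(coeffs, x) for x in xs]
    i_best = min(range(len(ys)), key=ys.__getitem__)
    r_prime = ys[i_best]   # best response: lowest BatD on the curve
    b_prime = xs[i_best]   # stimulus value at which it occurs
    threshold = r_prime + rise
    lo = i_best
    while lo > 0 and ys[lo - 1] <= threshold:
        lo -= 1
    hi = i_best
    while hi < steps and ys[hi + 1] <= threshold:
        hi += 1
    c_prime = xs[hi] - xs[lo]  # width of the curve at `rise` degrees above R'
    return b_prime, r_prime, c_prime

# Hypothetical response curve: BatD = 2.5 * (x - 5)^2 + 20 over 2 to 8 kHz.
coeffs = [2.5, -25.0, 82.5]
b_prime, r_prime, c_prime = preference_measures(coeffs, 2.0, 8.0)
print(b_prime, r_prime, c_prime)
```

For this curve the minimum lies at 5 kHz with a BatD of 20°, and the curve reaches 30° at 3 and 7 kHz, so the width is 4 kHz. If the curve never rises by 10° within the tested range, this sketch truncates the width at the range limits.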

Body size

After successfully completing an experiment, females were killed and digital images of the pronotum were generated with the Creative Laboratories VideoBlaster FS200 utility program. Using this software, we measured the pronotum length and width (resolution = 15.8 μm). The surface area of the pronotum (mm²) was taken as a mass-independent measure of body size.

We conducted the following four experiments:

Experiment 1: Phonotactic asymmetry, fatigue and levels of variation in phonotactic experiments

Females of equal age (10 days post adult ecdysis) were measured in groups of 20 with each individual (n = 130) subjected to three trials (BW, DC and FQ) in random order over a period of no longer than 2 days with a minimum of 10 h of rest between trials. This allowed, firstly, for the quantification of female phonotactic asymmetry and its effect on the calculation of phonotactic precision. Secondly, we were able to quantify the effect of tiring on phonotactic response since the duration between the STD1 and STD2 stimuli varied over a large range across the three trials (1–19 min, Table 1). We quantified the following hierarchical levels of variation in phonotactic response for these identical stimuli (STD1 and STD2): between-female; within-female between-trials; within-trials between-stimuli and within-stimulus between-speaker. We performed a nested random-effects ANOVA in order to quantify these different levels of variation using the PROC NESTED procedure of the SAS V8.02 statistics package (SAS Institute, Cary, NC, USA).

Experiment 2: Repeatability of phonotactic response and response function

Four groups of females of equal age (10 days post adult ecdysis) were selected, with each group subjected to either BW, DC, FQ or SP twice. This meant that each female within a group was subjected to two identical trials with at least 10 h of rest between trials. This allowed, firstly, for the calculation of repeatability of phonotactic response to each individual experimental stimulus. Repeatabilities of phonotactic response for both the STD1 and STD2 stimuli were calculated separately. Secondly, the repeatability (and standard error; Becker 1984) of female preference (B and B′), female selectivity (C and C′) and female response magnitude (R and R′) was calculated using the three different methods described above (polynomial regression, RRJ and cubic spline). A population-level response function for each of the four trials was generated for the females of experiment 1 (n = 130) using the cubic spline method. These functions, together with a number of response functions for different individuals, were graphed in order to demonstrate the extent of between-female differences within the context of population-level response characteristics.
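
The text does not spell out the repeatability formula. A minimal sketch, assuming the standard intraclass-correlation estimator from a one-way ANOVA with a balanced design (the same number of trials per female), is given below; the function name and the BatD values are hypothetical, and the standard error is not computed.

```python
def repeatability(groups):
    """Intraclass correlation from a balanced one-way ANOVA.

    groups is a list with one entry per female, each entry holding her
    repeated measurements (e.g. BatD in trial 1 and trial 2).
    """
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    grand = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]
    ss_among = n * sum((m - grand) ** 2 for m in means)
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    var_among = (ms_among - ms_within) / n  # among-female variance component
    return var_among / (var_among + ms_within)

# Hypothetical BatD values (degrees) for four females measured in two trials.
batd_by_female = [[20.0, 22.0], [30.0, 28.0], [40.0, 42.0], [50.0, 48.0]]
rep = repeatability(batd_by_female)
print(round(rep, 3))
```

Here the females differ strongly from one another but each female is consistent across her two trials, so the estimate is close to 1.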

Experiment 3: The effect of stimulus sequence on SP preference

Females of equal age (10 days post adult ecdysis) were each subjected to the SP trial (Table 1) and an additional SP trial where we reversed the sequence of stimuli so that the syllable period was varied from long to short (81–23 ms). This experiment was similar in design to that of Doherty (1985b) and allowed for the calculation of repeatability of preference by comparing preference between the different trials. Low repeatability of preference in this experiment when compared to that of experiment 2 (above) would indicate an effect of stimulus sequence on phonotactic response.

Experiment 4: The effect of age on phonotactic response and preference

Females of equal age (10 days post adult ecdysis) were each subjected to four trials (BW, DC, FQ and SP) presented in random order within two days, with at least 10 h rest between trials. These females were subjected to the four trials every 10 days until death, allowing for the quantification of the effect of age on phonotactic response.

Results

Experiment 1a: Phonotactic asymmetry

Nearly all females had an angular offset, i.e. a constant difference between the female’s direction of movement and the true direction of the speaker. Figure 3 shows phonotaxis for a female with an angular offset of approximately −30°. Because phonotaxing females adjust their movement direction in order to receive a similar sound pressure level (SPL) on both tympani over time (meandering about the sound direction; Bailey and Thomson 1977; Thorson et al. 1982; Schmitz 1985), a female with an auditory deficiency on her right side has an angular offset to the left (negative offset in Figs. 4, 5). By moving the more functional tympanum away from the sound source (left tympanum of female in Fig. 3), the net SPL on both tympani will be more similar during phonotaxis. The magnitude of the angular offset is determined by the degree of phonotactic asymmetry and affects measures of phonotactic precision that incorporate a term for angular deviation from the sound source. The CosV around the true speaker direction (0.86 ± 0.11) was significantly smaller than the CosV around the female’s perceived speaker direction (0.89 ± 0.09) (paired t test, t₂₅₉ = 6.05, P < 0.001) and the former measure therefore indicated poorer phonotactic precision.

Fig. 3

A segment of a female’s trace diagram showing perfect phonotaxis at an erroneous angle. Grey bars indicate the position of the active speakers. The angular offset is calculated as the mean difference between the female’s perceived direction of the speaker and the actual direction of the speaker. In this case the angular offset was −31.4° for the 5 kHz stimulus (last 2 min)

Fig. 4

The effect of an angular offset on the time taken to locate a speaker for 130 G. bimaculatus females. Angular offset is calculated as the mean difference between the female’s perceived direction of the speaker and the actual direction of the speaker (see text for more detail). Delay difference is calculated as the difference between the time taken to locate speaker 2 and the time taken to locate speaker 1. Values are presented as mean ± SE for three trials (BW, DC and FQ) and two stimuli (STD1 and STD2) per trial

Fig. 5

Female response functions for four different calling song parameters for the field cricket, G. bimaculatus. Population level phonotactic response (mean Batschelet deviation ± SD) for 130 females (Experiment 1) is shown for the bandwidth (BW), duty cycle (DC), frequency (FQ) and syllable period (SP) trials in the left figures. The polynomial regression generated from the population ŷ values, obtained from cubic spline analysis is presented as the solid black line. For each trial, figures on the right show the individual phonotactic response of three different females demonstrating significant between-individual differences. Values are ŷ values, obtained from cubic spline analysis with the corresponding polynomial regression. Symbol and line legends are the same for each trial where each female’s individual response is represented with a square, circle or triangle symbol and her corresponding polynomial regression is indicated respectively with a line that is solid, dashed or dotted

The magnitude and sign of a female’s angular offset affect the time taken for a female to switch direction between speakers. In Fig. 3, time taken to switch between speakers differs as the female switches almost instantly from speaker 1 (210°) to speaker 2 (90°) by following the shortest angular difference between speakers. However, when switching from speaker 2 to speaker 1 this female took more time and the greater angular difference between speakers was followed when switching from 4.5 to 5 kHz. For the females in experiment 1 (n = 130), we calculated a female’s mean angular offset for the STD1 and STD2 stimuli of each of the three trials she was subjected to. We also calculated the time in seconds from the onset of the stimulus until a female moved to within 30° of her perceived speaker direction for that stimulus. We then calculated the mean difference in duration for locating speaker 1 and speaker 2 respectively, for each female. The signed difference between these two durations was plotted against a female’s mean angular offset (Fig. 4), revealing a highly significant relationship (F₁,₁₂₈ = 63.86, r² = 0.33, P < 0.001).
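
The angular offset and the time taken to locate a speaker can be sketched as follows. The sketch assumes headings sampled at a fixed interval of one second, which differs from the distance-based sampling of the treadmill, and takes the offset as perceived minus actual speaker direction; the function names and heading sequences are hypothetical.

```python
def signed_diff_deg(a, b):
    """Smallest signed difference a - b in degrees, in the range [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def time_to_locate(headings_deg, perceived_deg, dt=1.0, tolerance=30.0):
    """Seconds from stimulus onset until the heading is within tolerance."""
    for i, heading in enumerate(headings_deg):
        if abs(signed_diff_deg(heading, perceived_deg)) <= tolerance:
            return i * dt
    return None  # the female never located the speaker

# Hypothetical female with an offset of -30 degrees: she tracks speaker 1
# (210 degrees) at 180 degrees and speaker 2 (90 degrees) at 60 degrees.
offset = signed_diff_deg(180.0, 210.0)

# Hypothetical headings after the stimulus switches to each speaker.
to_speaker2 = [180.0, 140.0, 100.0, 85.0, 60.0]
to_speaker1 = [60.0, 20.0, 340.0, 300.0, 260.0, 220.0, 200.0, 180.0]

t2 = time_to_locate(to_speaker2, 60.0)
t1 = time_to_locate(to_speaker1, 180.0)
delay_difference = t2 - t1  # speaker 2 minus speaker 1, as in Fig. 4
print(offset, t1, t2, delay_difference)
```

In this example the female turns the long way round to reach speaker 1, so locating speaker 1 takes longer than locating speaker 2 and the delay difference is negative.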

Experiment 1b: Effect of tiring on phonotactic precision

For each female and each of the three trials in experiment 1, we calculated the difference between the distance run while tracking the STD1 and the STD2 stimuli. This difference served as a measure of tiring since a female is expected to run similar distances toward the same stimulus. Similarly, for each trial, the signed difference between the phonotactic precision (BatD) toward the STD1 and STD2 stimulus served as a measure of degradation of phonotactic precision. No significant correlation between the difference in phonotactic precision and the difference in distance run was found for any of the three trials. The strongest effect was found for the BW trial where the mean difference in phonotactic precision (BatD) was 2.04 ± 4.69° and the mean difference in distance run was 102.7 ± 89.94 cm (26% reduction) (Δ BatD vs. Δ distance run: Pearson r = −0.06, P = 0.47). This suggests that phonotactic precision is not affected by fatigue, even though there was a large mean difference in distance run between the STD1 and STD2 stimuli.

Experiment 1c: Sources of variation in phonotactic precision for an identical stimulus

We performed two nested random-effects ANOVAs on phonotactic precision. The variance hierarchy for the first ANOVA was nested as stimulus within trial within individual. In order to quantify the amount of variation resulting from phonotaxis towards different speakers, we required an error term and therefore performed a second ANOVA that was nested as speaker within trial within individual. This was justified since the estimates of variation due to ID and TRIAL did not differ between the two analyses and the variance component due to different stimuli was not significant (F 390,780 = 0.94, P = 0.77, 0% variation explained). Table 2 therefore presents the results from the second ANOVA. A large proportion of variation in phonotactic precision was attributable to the significant between-individual differences, suggesting that different females have different innate abilities to track a sound source well. A female’s mean phonotactic precision at the STD1 stimulus was not affected by her body size (F 1,128 = 3.80, r 2 = 0.02, P > 0.05) or the absence of a hind leg (n 1 = 109; n 2 = 21; t 128 = 1.14, P = 0.39). The proportions of variation in phonotactic precision due to the different trials and the different stimuli (STD1 or STD2), respectively, were non-significant (Table 2). A large and highly significant proportion of variation in phonotactic precision was due to between-speaker differences, revealing that females always tracked speaker 1 better than speaker 2. However, the magnitude of this difference in BatD was only 3.3° (Table 2).

Table 2 Hierarchical levels of variation in phonotactic response to an identical (STD1) stimulus for 130 G. bimaculatus females

Experiment 2a: Repeatability of phonotactic response to a standard stimulus

The first four columns of Table 3 show the repeatability estimates of phonotactic response for the STD1 and STD2 stimuli respectively for the females of experiment 2 that were presented with two identical trials (either BW, DC or FQ). The repeatabilities calculated are high and significant, indicating that a female’s phonotactic response to a specific stimulus (near the predicted preference for this species) is similar between trials. We pooled the phonotactic response data for the STD1 and STD2 stimuli respectively across the three trials to estimate the repeatability of phonotactic response for an identical stimulus, independent of the trial. Repeatabilities of phonotactic response calculated in this manner were both high and significant (STD1 0.78 ± 0.7, P < 0.001; STD2 0.72 ± 0.09, P < 0.001).
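Repeatability is commonly estimated as the intraclass correlation from a one-way ANOVA with female as the grouping factor. The sketch below assumes that estimator; this section does not state which formula was used, so it should be read as an illustration rather than a reproduction of the analysis. The function name and the example values are hypothetical.

```python
def repeatability(groups):
    """Intraclass correlation from a one-way ANOVA.

    groups: one list of repeated measurements (e.g. BatD) per female.
    Returns s2_among / (s2_among + s2_within), where s2_among is derived
    from the among- and within-female mean squares.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    ss_among = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (n_total - k)

    # n0 corrects for unequal numbers of measurements per female.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    s2_among = (ms_among - ms_within) / n0
    return s2_among / (s2_among + ms_within)


# Hypothetical example: three females, two repeats each.
print(repeatability([[10.0, 12.0], [20.0, 22.0], [30.0, 32.0]]))
```

In this toy example the females differ far more from each other than each female differs from herself, so the estimate is close to 1.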

Table 3 Repeatability estimates (±SE) of phonotactic response in G. bimaculatus

Experiment 2b: Repeatability of response function

For experiments 2 and 3, the polynomial regression and the spline methods yielded high and significant female preference (B′) repeatabilities which were very similar (Table 3). Since the response functions for each trial were unimodal, their shape could potentially be approximated by the shape of a normal distribution (a prerequisite for the RRJ-method). The repeatabilities calculated from polynomial regression and the spline method were similar except for the repeatability estimates of R′. Repeatability of response magnitude (R′) was low due to lack of between-individual variation, e.g. mean response magnitude derived from the spline method for the FQ trial was 21.3 ± 2.9° (range = 17.3–26.3°; n = 12). For each trial in Table 3 the R′ calculated using the spline method was significantly greater (following a Bonferroni correction; Rice 1989) than the R′ calculated from the polynomial regression (paired t tests; smallest t = 3.33 for the BW trial; P < 0.01 for all trials). The RRJ-method did not yield comparable results and does not appear to be suitable for this type of data as the Gaussian function fitted to the phonotactic response data explained significantly less of the variation than the polynomial regression did (Table 4). The population level splines (Fig. 5 a, c, e, g) indicate the different tuning curves with respect to the four acoustic parameters that were measured. The tuning curves for individuals (Fig. 5 b, d, f, h) show three different types of response. The response functions for different individual females showed large degrees of variation and, in many cases, were not similar to the population level response function.

Table 4 Comparison between the r 2 (mean ± SD) of the polynomial regression and the non-linear regression (Gaussian function, RRJ-method) indicating that for each trial the polynomial regression fitted the data significantly better (paired t tests)

Experiment 3: The effect of stimulus sequence on SP preference

For the females subjected to the SP trial and the SP(Rev) trial (reversed sequence of stimuli), SP preference (B′) and selectivity (C′) were highly repeatable and were indeed slightly greater than those calculated for the females of experiment 2 (Table 3), suggesting that the sequence of SP stimulus presentation has a negligible effect on female response.

Experiment 4: The effect of age on phonotactic precision and preference

Phonotactic precision

Each female was subjected to six standard stimuli at every age category (STD1 and STD2 stimuli for BW, DC and FQ trials, respectively), allowing us to quantify the effect of age on phonotactic precision. We performed a repeated measures ANOVA on the phonotactic precision (BatD) for the STD1 and STD2 stimuli respectively for all females surviving to 30 days of age (Table 5). We found no difference in phonotactic precision attributable to age, trial or their interaction. We performed an additional repeated measures ANOVA, this time with only two age classes namely at 10 days old and the last set of measurements performed before the female died (“final age”). This “final age” ranged from 20 to 40 days of age, depending on the individual female’s longevity. Table 5 shows that for this comparison, a significant effect of age was found for both the STD1 and STD2 stimulus, suggesting that females track a particular stimulus consistently throughout their lives until a few days before they die.

Table 5 Effect of age on phonotactic response to an identical stimulus

Preference

Using the spline methodology described above, we determined individual female preference (B′) for each trial completed at each age class. We performed several repeated measures ANOVAs in order to determine whether preference changed at ages of 10, 20 and 30 days (Table 6). Preference during a specific trial did not differ between the age classes. We repeated these analyses, albeit with a much reduced sample size, with the 40-day age class included in order to determine if these results were consistent across a greater age range. Again, we found no effect of age. In order to determine if preference changes and becomes unreliable just before a female dies we performed paired t tests with two age classes, 10 days old and the last set of measurements performed before the female died. Preference did not differ between these two age classes (Table 6).

Table 6 The effect of age on female G. bimaculatus preference for four male song traits

Limitations of phonotactic experiments conducted on the spherical treadmill

To determine whether female field crickets are consistent in their response to stimuli outside of their preference range, we calculated the absolute difference in phonotactic precision (BatD) between identical stimuli of the FQ trial that was repeated by females of experiment 2. Duplicate trials for each female meant that for each stimulus of a trial we had a measure of the consistency of phonotactic precision, which differed between stimuli within a trial (e.g. FQ trial repeated measures ANOVA, F 8,88 = 4.23, P < 0.001). We also calculated the repeatability of the phonotactic response for each stimulus. Figure 6 shows the mean BatD difference between repeat trials for each stimulus for the FQ trial (n = 12) and the repeatability estimates of phonotactic response for each stimulus. Clearly the smallest mean difference between repeats, the smallest amount of variation around the mean difference and also the greatest repeatability is close to the preferred stimuli (4.5 or 5 kHz). This pattern was similar for the other trials. Differences in phonotactic response between identical stimuli at the lower and upper extremes of the signal range suggest that females do not respond consistently to stimuli outside of their preference range. This means that the degree of female indifference to a male signal cannot be quantified accurately through phonotactic response. Furthermore, the phonotactic response at the lower and upper extremes of the male trait affect the position of the mode of the Gaussian function fitted to the data (RRJ-method) and therefore the estimate of female preference. To demonstrate this we applied the RRJ-method to the FQ trial data of experiment 2 but this time we excluded the phonotactic response data for the two lower and upper extreme stimuli such that the range of frequencies now spanned from 4 to 6 kHz, which approximates the range that the male trait varies in the natural population (L. Veburgt; unpublished data). 
We did this since the shape of the response function in this range resembled a Gaussian function (Figs. 2a, 5e). We compared the estimates of female preference (B) obtained in this manner to the estimates of female preference obtained previously from all three methods (B and B′; Table 3) using a repeated-measures ANOVA. There was a significant difference between the preference estimates determined by the four different methods (F 3,69 = 22.65, P < 0.001) due to the preference estimates from the original RRJ-method differing from the preference estimates of every other method (Tukey HSD post hoc comparison; P < 0.001). In fact, the correlations between the preference estimates obtained from all methods except the original RRJ-method were high and significant (Table 7).

Fig. 6

Inconsistency of female phonotactic response (BatD) for male trait values at upper and lower extremes from the predicted preferred signal (4.5–5 kHz) for G. bimaculatus females. Bars are the mean ± SD BatD difference between the first and second repeat of the trial frequency for the females of experiment 2. Values at each bar show the repeatability ± SE of the phonotactic response (* P < 0.05; *** P < 0.001)

Table 7 Correlation matrix (Pearson r) of female preference (B and B′) estimates obtained from four different methodologies for the FQ trial of experiment 2 (n = 12)

Discussion

Measuring phonotaxis in females with an angular offset

Previous studies on female crickets reported strong phonotaxis but at an erroneous angle (e.g. Fig. 3) to the sound source (Thorson et al. 1982; Doherty 1985a; Schul et al. 1998; Schul 1998), yet only a single study has attempted to correct for this phenomenon (Schul 1998). Measures of phonotactic response that include a term for angular deviation from the sound source underestimate the performance of a female with an asymmetrical auditory system. Our results show that Batschelet deviation, a measure independent of the angle of the mean vector, is an efficient tool for population level analysis of phonotaxis, since we were not interested in a female’s absolute ability to locate a speaker but in her tracking precision of where she perceived the speaker to be located. While this ensured that a female’s angular offset would not affect the calculation of her phonotactic response, a potential problem arises if a female walks constantly in an arbitrary fixed direction for the entire duration of a stimulus, yielding low angular deviation and falsely indicating accurate tracking. However, this problem is a feature of any measure of phonotactic response that does not include a term for angular deviation from the sound source e.g. “relative vector length” (Eq. 1, Loher et al. 1992) or “relative distance run” (Hedrick and Weber 1998) and is not restricted to our measure. Even relative phonotaxis (PR; Schul 1998), which is one of the only phonotaxis measures that controls for asymmetrical auditory systems, can indicate good phonotaxis if a female walks in a similar incorrect direction for both the control and the test stimulus. However, after viewing female phonotactic response for more than 10,000 stimuli, we could not identify a single occurrence where a responsive female walked in an incorrect direction and maintained that direction throughout the presentation of a stimulus. Nevertheless, we recommend that phonotactic response during trials is visualised (as in Fig. 1) and screened for abnormal phonotactic behaviour before it is quantified.
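The independence of this measure from the direction of the mean vector can be shown with a short sketch. It assumes that BatD corresponds to the standard angular deviation of circular statistics, s = (180/π)·√(2(1 − r)), where r is the mean vector length; the function name and the example headings are hypothetical.

```python
import math


def batschelet_deviation(angles_deg):
    """Angular deviation in degrees, computed from the mean vector length r.

    Only the length of the mean vector enters the formula, so adding a
    constant angular offset to every heading leaves the result unchanged.
    """
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    r = math.hypot(c, s)
    # max() guards against tiny negative values caused by rounding.
    return math.degrees(math.sqrt(max(0.0, 2.0 * (1.0 - r))))


# Hypothetical headings around speaker 1 (210 degrees), then the same
# track shifted by a 40 degree angular offset.
track = [200.0, 210.0, 220.0, 215.0, 205.0]
offset_track = [a + 40.0 for a in track]
print(batschelet_deviation(track), batschelet_deviation(offset_track))
```

The same property explains the caveat raised above: a female walking steadily in any fixed direction also yields a deviation near zero, which is why visual screening of the tracks is recommended.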

Several females shown in Fig. 4 took up to 10 s longer on average to locate one of the speakers due to their angular offset. The significant effect of angular offset on the time taken to switch between speakers (Fig. 4) affected quantification of within-stimulus between-speaker variation in phonotaxis. Furthermore, calculation of phonotactic response over the entire duration of the stimulus (1 min) underestimates the phonotactic precision to one of the speakers for a female with an angular offset. Since the delay in orientation to a speaker is not the same for both speakers, between-speaker differences in phonotaxis can arise. We corrected for this problem by quantifying phonotactic response (BatD) only after the female had orientated to within 30° of her perceived speaker direction. However, if a female did not orientate to within 30° of her perceived speaker direction at all during a particular stimulus then the BatD was calculated for the entire duration of the stimulus (1 min). Variation in the magnitude of angular offset between repeated trials for the same female could arise if tiny objects (e.g. dust) impair the function of one of the tympani, causing auditory asymmetry (e.g. horizontal error bars in Fig. 4) for one of the trials. It is unlikely that such temporary auditory asymmetry can be completely avoided and therefore, where applicable, experiments should control for phonotactic asymmetry to ensure accurate measurement of phonotaxis (e.g. Schul 1998).

Finally, the distinction needs to be made between auditory asymmetry and motor bias (turning bias to one side), the former occurring when the tympanum and associated neurons on one side function less effectively than on the other and the latter occurring where individuals tend to veer to a particular side irrespective of auditory input. Motor bias may account for the asymmetrical phonotactic behaviour observed during switches between speakers where the angular pathway chosen is greater to one side. However, Boyan (1979) found that 83% of Teleogryllus commodus had inherent left/right asymmetric spiking responses in the S and L auditory neurons indicating an asymmetry before motor control commences. Temporary auditory asymmetry caused by small particles on the tympanum (discussed above) will result in variation in the duration to locate a speaker (error bars in Fig. 4) because the particles are not always present or their effect on hearing is not constant. Motor bias should be more repeatable between trials because it is hard-coded by the CNS. The occurrence of an angular offset where females tracked the speaker at an erroneous angle (Figs. 4, 5) cannot be explained by motor bias alone. The typical meandering about the speaker direction would occur across the speaker direction but with greater translation angles to one side if motor bias were solely responsible for the behaviour. Although it is possible that motor bias could override auditory asymmetry or cause similar phonotactic observations, our methodology controls for both, whether they act alone or in concert.

Constraints of the Kramer treadmill

A small but significant proportion of variation in BatD attributable to between-speaker differences in phonotaxis (Table 2) was brought about by the mechanical arrangement of the experimental equipment, specifically the position of the speakers relative to the two motors driving the sphere and the acceleration of each motor. Motor A was positioned at 45° and motor B at 135°. This arrangement meant that when a female was moving towards speaker 1 (situated at 210°), motor A was mainly responsible for compensating for the forward acceleration of the female and motor B was mainly responsible for the angular correction of the female’s course. However, when a female was moving toward speaker 2, both motors were simultaneously responsible for forward acceleration compensation and angular corrections. Since no two motors have identical acceleration, failure of one motor to accelerate equally when a female moved directly toward 90° would result in a small angular error towards the direction of the slower of the two motors. Females would then have to correct for this angular error and therefore the resulting BatD would be greater for speaker 2, as is evident in Table 2 and visually in the trace diagram of Figs. 1 and 4. When calculating female response for a male trait, however, the phonotactic precision toward both speakers was used in the generation of the response function (Fig. 2) and therefore conclusions about female response are not affected by these small yet significant differences.

Female indifference to a male signal cannot be quantified to the same level of accuracy as female preference. Doherty (1985b) showed that the tracking scores (a measure of phonotactic response) calculated for G. bimaculatus females in “no-choice” sequential experiments on syllable period had minimal variability in the effective range (40–45 ms) while stimuli with syllable periods at the margins of the effective range (30–35 and 50–60 ms) yielded variable scores within and between individuals. Also, Fig. 2 in Gray and Cade (1999) shows increased standard deviations for female net vector scores (their measure of phonotactic response) at both the upper and lower extreme values for syllables per trill that G. integer females were exposed to. Our findings on female indifference are therefore not new, but the significance thereof has not been discussed to date. For example, Gray and Cade (1999) report a lack of heritable variation in female selectivity for syllables per trill in G. integer but do not discuss the effect of unreliable female response at the extreme male trait values on their measure of female selectivity. Consequently, their failure to detect heritable variation in female selectivity for pulses per trill should be regarded as tentative.

Phonotaxis towards a standard stimulus

Females from experiment 1 did not show reduced phonotactic precision or become unresponsive after the presentation of many stimuli (first nested ANOVA). We could also not detect any effect of reduced phonotactic precision due to fatigue per se. While this indicated reliability of phonotactic response within a trial, the small proportion of variation in phonotactic response to a standard stimulus between different trials (Table 2) showed that females tracked this stimulus with similar precision, irrespective of the stimulus setup of the trial. This was not because all females tracked the stimulus with the same accuracy since significant between-individual variation in phonotactic precision was found (Table 2), which was independent of body size or the absence of a hind leg. Furthermore, individual phonotactic response to a standard stimulus is highly repeatable between replicates of the same trial (Table 3). Given the reliability of phonotactic response in this species it is therefore valid to infer preference from phonotactic walking behaviour.

Methodology comparison

We do not believe the RRJ-method is appropriate for estimating parameters of female response from our data since the response function derived from this method explained significantly less of the variation in BatD than did the other methods (Table 4). The reason for this is twofold. Firstly, imposing a Gaussian shape on the female response function reduces the efficacy of that function to explain the variation in the data, even if the distribution of data deviates only slightly from the imposed shape. There is no a priori reason to believe that our phonotactic data should yield a response function that can be approximated by the shape of a Gaussian curve. In fact, the shape of the response function for frequency derived from the polynomial regression and spline methods (Fig. 2a) remarkably resembles the shape of auditory tuning curves derived from auditory thresholds of low frequency neurons in the prothoracic ganglia (AN1) which function as narrow band-pass filters (Schildberger et al. 1989). Secondly, the stimulus range over which we tested female response affects the efficacy of the RRJ-method. The unreliable phonotactic response at the extreme male trait values affects the position of the maximum of the Gaussian function and therefore the estimate of female preference for the RRJ-method, whereas the estimates of female preference from spline and polynomial regression methods are not greatly affected as evident from their significant correlation (Table 7). Since female phonotactic response is reliable close to the predicted species preference for field crickets and for some stimuli has a shape similar to that of an inverted Gaussian function, the RRJ-method may be useful if applied to data comprising phonotactic response to many stimuli situated close to the predicted species preference for that trait. For example, there is strong congruence between the preference estimates for FQ derived from the spline, polynomial regression and RRJ-method where we excluded the phonotactic response data for the two lower and upper extreme stimuli (min2; Table 7).

Female preference (B) derived from the RRJ-method was unlikely to reflect true female preference of our crickets (see discussion above), and therefore the estimates of repeatability for B are not biologically relevant. Measurements of response magnitude (R) at B are consequently also problematic in the case of our data. This can be visualised by the poor correspondence between R and the raw phonotactic response data (Fig. 2b). The two significant repeatability estimates of R from the RRJ-method (BW and FQ in Table 3) are therefore interpreted as spurious. Furthermore, the single significant repeatability estimate for FQ selectivity (C) derived from the RRJ-method is also likely to be spurious since both the spline and polynomial regression methods failed to detect repeatable selectivity for FQ (Table 3). We therefore do not discuss repeatability estimates derived from the RRJ-method.

Repeatability of response functions

Response magnitude (R′)

The phonotactic response magnitude (R′) at the peak preference (B and B′) was not always repeatable (Table 3). Due to high phonotactic precision by all females there was little observable variation in R′ between females when they were tracking the STD1 signal (low standard deviations in Table 2). The smoothing effect of the spline further reduced between-individual variation in R′ so that it was not significant and hence repeatability was non-significant. Conversely, the waviness of the polynomial regression (see Schluter 1988) inflated between-individual variation in several cases and yielded significant repeatabilities for three of the trials. We do not believe that, for our phonotactic data, R′ is a reliable surrogate for the response rate measured by Reinhold et al. (2002) who did not use phonotactic data. Although variation in responsiveness can mask variation in preference in some systems (Brooks and Endler 2001), it is not likely to affect the calculation of female preference here as we quantified preference from phonotactic response data independently of the response magnitude.

Female preference (B′)

For the repeatability of female preference (B′) (calculated from the polynomial regression and spline methods for the four male song traits; Table 3) there was strong congruence between preference estimates derived from three different methods (Table 7). For all of the trials, repeatability estimates for B′ were significant, suggesting possible quantitative genetic variation in preference for these male traits (see Brooks and Endler 2001). The very broad preference for duty cycle decreased the between-individual variation in preference and the repeatability is therefore only marginally significant. However, the other estimates of preference repeatability (BW, FQ and SP) are far greater than the preference repeatabilities for number of pulses per trill (0.50) and the inter-trill interval (0.59) reported for G. integer by Wagner et al. (1995). The large degree of between-individual variation shown by the individual response functions in Fig. 5 therefore reflects true and repeatable sensory differences.

Female selectivity (C′)

There was almost no difference in the repeatability estimates of C′ calculated from the polynomial regression and the spline method (Table 3). The absence of repeatable selectivity for DC arises since females responded well to all duty cycles except the upper and lower extremes (10 and 90%). Consequently, almost no between-individual variation in selectivity for duty cycle was detected and therefore duty cycle is unlikely to be an important signal trait for sexual selection in G. bimaculatus. Conversely, females were highly selective for frequency and no between-individual variation (and repeatability) in selectivity could be detected. This does not rule out the possibility of heritable variation in selectivity for this species (see Brooks and Endler 2001). The significant repeatability of selectivity for BW probably reflects a combination of repeatable frequency preference and similar frequency selectivity between females, since frequency preference is likely to determine bandwidth preference in crickets (Simmons and Ritchie 1996). Females showed significant repeatability of selectivity for the SP and SP(Rev) trials (Table 3). Doherty (1985b), using similar equipment to ours but a crude measure of phonotactic response (% time tracked), showed that the syllable period range tracked by G. bimaculatus females differed between to and fro sequential trials. He did not calculate individual response functions but used population-level data to arrive at this conclusion. Our results using BatD show that SP selectivity is repeatable, even if the sequence of stimuli is reversed and further illustrate how population-level analyses can mask relevant individual variation in response (Kime et al. 1998). To our knowledge, the only other evidence for repeatability of female selectivity comes from guppies (Brooks and Endler 2001) and grasshoppers (Reinhold et al. 2002).

Effect of age on female preference

Female preference was not affected by age (Table 6). Despite the significant degradation of phonotactic precision just before a female dies (Table 5), her preference is still measurable, similar to the findings of Gray (1999), Reinhold et al. (2002) and Olvido and Wagner (2004). The findings of these authors together with those of this study reveal no effect of age on female preference while responsiveness appears to degrade with age. Although our measure of phonotactic response is not directly comparable to a measure of responsiveness, the decline in accurate tracking ability is likely to reflect both neurological changes as well as changes in motivational level.

Shape of response functions in G. bimaculatus

Phonotactic response is a complex interaction between environmental, neurological and behavioural components and is therefore not easily quantified. The quantification of the shape of female response is therefore not simple and requires complex techniques such as cubic regressions (Olvido and Wagner 2004), cubic splines (Ritchie 1996; Brooks and Endler 2001; Ritchie et al. 2001; Simmons et al. 2001) or non-linear regression (Reinhold et al. 2002). Whenever possible, experiments to determine female preference for a particular male trait should test an abnormally wide range of stimuli, i.e. extreme trait values that do not occur in the natural population on either side of the trait distribution (e.g. Shaw and Herlihy 2000; Grace and Shaw 2004). Female response functions based on such tests will mostly be unimodal (except in the case of strong directional selection) and therefore the shape of the response function is relatively easy to approximate with a high order polynomial equation or a nonlinear technique such as the RRJ-method (Reinhold et al. 2002). However, habituation or fatigue due to multiple testing may mask variation in some response functions (Brooks and Endler 2001). A way to avoid this problem is to limit the number of stimuli presented to a female by subjecting females to stimuli differing at a relatively coarse scale. Fine-scale preference can then be estimated by interpolation using mathematical tools. The applicability of these methods for inferring fine-scale preference from coarse-scale experiments is not limited to the field of bioacoustics.
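The idea of recovering a fine-scale preference from coarse-scale stimuli can be illustrated with a deliberately simple stand-in for the spline method: three-point parabolic interpolation around the stimulus with the lowest BatD. This is not the cubic spline procedure used in this study; it only shows the principle. It assumes that lower BatD indicates stronger response, and the function name and example values are hypothetical.

```python
def interpolated_preference(stimuli, batd):
    """Estimate the stimulus value at which BatD is minimal.

    Fits a parabola through the coarse-scale minimum and its two
    neighbours and returns the position of the vertex. If the minimum
    lies at either end of the tested range, or the three points are
    collinear, the tested stimulus value itself is returned.
    """
    i = min(range(len(batd)), key=lambda j: batd[j])
    if i == 0 or i == len(batd) - 1:
        return stimuli[i]
    x0, x1, x2 = stimuli[i - 1], stimuli[i], stimuli[i + 1]
    y0, y1, y2 = batd[i - 1], batd[i], batd[i + 1]
    num = (x1 - x0) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y1 - y0)
    den = (x1 - x0) * (y1 - y2) - (x1 - x2) * (y1 - y0)
    if den == 0:
        return x1
    return x1 - 0.5 * num / den


# Hypothetical coarse-scale frequency trial (kHz) with a true optimum at 4.7 kHz.
freqs = [4.0, 4.5, 5.0, 5.5]
batd = [(f - 4.7) ** 2 + 10.0 for f in freqs]
print(interpolated_preference(freqs, batd))
```

Although no stimulus was presented at 4.7 kHz in this example, the interpolated estimate falls between the two tested frequencies that bracket the optimum.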

Our experimental procedure to determine female preference has several advantages over other methods. Firstly, the use of the Kramer treadmill has numerous advantages over binomial choice tests in an arena, constant SPL of the sound source being the most crucial. Also, the exact path of the untethered female can be investigated after completion of the experiment. Sequential presentation of stimuli has many advantages over choice tests (Wagner 1998; Murphy and Gerhardt 2000; Poulet and Hedwig 2005) although carry-over effects may sometimes affect phonotactic response (Doherty 1985b; Wagner 1998). However, the high and significant preference (B′) and selectivity (C′) repeatability of the SP(Rev) trial, where we reversed the sequence of stimuli (Table 3), suggests that, at least for temporal stimuli, it is unlikely that the sequence of stimuli affects the response of female G. bimaculatus.

Conclusion

Phonotactic response in G. bimaculatus varies between individuals, is repeatable within individuals if the signal is close to the predicted species preference, and is a reliable measure for inferring individual preference at the population level. We have developed methodology to quantify female preference and selectivity from phonotactic response that is independent of phonotactic asymmetry and the effects of fatigue. The spline method for estimating female preference may be directly applied to the study of female preference in other taxa where a quantitative measure of female response is generated. We have shown that female preference for certain male traits is highly repeatable and that although a female’s phonotactic precision declines within a few days of death, this decline does not affect her preference. Since we randomly selected females for our experiments, the observed repeatable variation in female preference may be due to either environmental effects during development or heritable differences between females. Low estimates of female preference repeatability reported in the past may have been a result of the lack of appropriate tools and methodology for quantifying female preference accurately (Wagner 1998). The combination of significant between-individual variation and the high repeatability of female preference for BW, FQ and SP in G. bimaculatus creates opportunities for new experiments in the field of sexual selection.