Introduction

Mapping genes for natural variation in behavior

The need to understand within- and between-population variation in behavior, especially social behavior and its evolution, remains a central theme in biology. Honeybee societies have facultatively sterile female workers and a specialized workforce. Honeybee social behavior is dramatically demonstrated by the efforts of worker bees collecting and hoarding pollen and nectar, and mass stinging responses in defense of the nest. Division of labor within a honeybee nest is a consequence of age-related changes in physiology plus genetic variation within the colony for tendency to perform specific tasks (Page and Robinson 1991). A number of studies have demonstrated genetic variation within and between honeybee populations in specialization for water, nectar, and pollen collecting (reviewed in Page et al. 2000) and variation in defensive responses (reviewed in Breed et al. 2004). The genetic architectures of these two task sets are more thoroughly characterized than other behavioral traits in the honeybee. For these reasons, we sought to determine whether it is possible to define, within a manageable number of candidate genes, regions influencing these behaviors.

Quantitative trait locus (QTL)-mapping studies provide information about the genetic architecture of a trait that cannot be determined by other methods. This information includes estimates for the number and location of loci influencing population variation in the trait, the mode of inheritance of these loci (dominance, epistasis, or imprinting effects), and the amount of phenotypic variance each locus accounts for (Lander and Botstein 1989). But QTL studies do not have a good track record for isolation of causal genes. For example, genes for only 20 of 2,000 rodent QTLs have been cloned (Flint et al. 2005). Identifying genes responsible for naturally occurring phenotypic variation is especially challenging for behavioral traits where complex interactions of genes and the environment are expected (Flint 2003; Plomin and McGuffin 2003; Arnholdt and Mackay 2004; Goldman et al. 2005).

The endgame in determining nucleotide sequence variation responsible for natural variation in honeybee behavior will require identification of QTLs, confirmation in independent crosses, fine-scale mapping, expression assays, and finally, experimental modulation of gene expression or complementation tests (Mackay 2004). The biggest problem in taking a map-based approach to isolation of genes is that confidence intervals (CIs) for QTL location are usually quite large. The size of the CI primarily depends on the number of individuals in the mapping population, the number of genetic markers scored, the magnitude of effect that the QTL has on the phenotype, and the recombination rate (numbers of crossovers per unit of chromosomal distance). The first two factors, sample size and marker density, are important for reducing CIs, but they are subject to diminishing returns as more individuals or more markers are added to the experiment. The effect of the QTL is a major concern because QTLs often account for only a small proportion (often <5%) of phenotypic variance. The QTL effect can only be increased up to a certain level by improving the phenotypic assay and using the most appropriate cross (Darvasi 1998; Arnholdt and Mackay 2004). On the other hand, recombination rate can have a large effect on the size of chromosomal regions covered by CIs because the physical size scales inversely with recombination rate. For this reason, fine-scale mapping studies are often designed to effectively increase recombination by taking advantage of historical recombination events through linkage-association studies in populations or by developing multigeneration, recombinant inbred lines (Darvasi 1998). In the honeybee, map resolution is enhanced by a meiotic recombination rate that currently ranks highest among metazoans (Hunt and Page 1995; Solignac et al. 2004), so that large genetic distances correspond to relatively small physical regions containing few genes.

In this study, we review what is currently known about the genetics of honeybee foraging and defensive behaviors, and how a map-based approach leads us to a manageable number of candidate genes that seem to fit what is known of the behavioral patterns. Information from the draft honeybee genome sequence (Honeybee Genome Sequencing Consortium, HGSC 2006) was used to delimit sets of candidate genes flanked by marker sequences that were identified in prior QTL studies, and expression data from candidate genes provided additional information on likely candidates.

Behavioral genetics of foraging

Foraging specialization involves two components: the onset of foraging that establishes division of labor between foragers and bees that perform tasks inside the hive and the subsequent bias in foraging for either pollen or nectar. These components are related. Bees from lines selected for storing more surplus pollen (hoarding) are more likely to initiate foraging earlier in life and to specialize in pollen collection than bees from lines selected for low pollen hoarding. These associations are also linked to sensory response in that bees selected for higher levels of pollen hoarding are more responsive to low concentrations of sugar (less discriminatory) when tested with the proboscis extension assay (Fig. 1 and Page et al. 1998). This link between sucrose responsiveness and the task of pollen foraging is also present in “wild-type” bees that have not been selected for pollen hoarding (Scheiner et al. 2004).

Fig. 1

Genetic and phenotypic associations involved in foraging division of labor. Arrows indicate significant correlations between phenotypic traits at the levels of behavior, hormonal signaling, and development and associations between traits and genotypes at specific QTLs (pln 1–4). Colored lines indicate relative titers of specific hormones. The picture illustrates a method for determining the threshold concentration of sugar that a bee will respond to by extending its proboscis or tongue

QTLs were mapped based on whole-colony behavioral traits, and QTL effects were subsequently confirmed based on the behavior of individual bees. Three “pollen” QTLs, designated as pln-1, pln-2, and pln-3, were detected (Hunt et al. 1995; Page et al. 2000) based on the quantity of pollen in colonies from a backcross population derived from high and low pollen-hoarding strains. Association of marker alleles near the QTLs with individual foraging traits within single backcross families of bees confirmed the effects of pln-1, pln-2, and pln-3 on behavior. Pln-1 and pln-2 were associated with the size of the pollen loads collected by workers (Fig. 2a and b). Pln-2 and pln-3 were shown to influence the discrimination for the sugar concentration of the nectar collected (Hunt et al. 1995; Page et al. 2000; Fig. 2b,c). Subsequent association studies using a candidate gene, AmFOR, as a marker have mapped and confirmed an additional QTL region designated as pln-4 (Rueppell et al. 2004). This QTL maps to about 50 cM from pln-1. Allelic variation and pleiotropic effects of these QTLs have been associated with sucrose responsiveness and age at onset of foraging (Fig. 1 and Rueppell et al. 2004, 2006).

Fig. 2

QTLs were mapped based on the amount of pollen stored in combs of colonies and confirmed based on individual behavior. Solid bars represent linkage groups with markers. Markers used in confirmation studies are shown. Orthologs of fly genes are indicated with arrows. Dashed lines indicate 97% CIs. Sequenced markers are underlined. No figure is shown for pln-4 because it was mapped by association to one marker (AmFOR) rather than by interval mapping. a Pln-1 Map on the left is the localization of the QTL based on colony pollen stores. Map on the right is based on response thresholds to sucrose of individual worker bees. b Pln-2 QTL map on the left is based on colony pollen stores in a cross between European strains and the map on the right is based on a European by African strain cross. c Pln-3 QTL map is based on colony pollen stores

Preferential foraging for nectar and for a protein source such as pollen are sequential parts of the gonotrophic cycle of many insect females. When nonreproductive, females tend to forage for nectar as a carbohydrate source for maintenance. When reproductively active, insects such as solitary bees and mosquitoes seek protein that is incorporated into eggs (Amdam et al. 2004). Although worker honeybees are facultatively sterile, they can produce eggs if their ovaries develop sufficiently, as happens in the absence of a queen. Amdam et al. (2004, 2006) hypothesized that remnants of the ancestral gonotrophic cycle and the correlated foraging behavior remain and “drive” foraging behavior (Fig. 1). In support of this hypothesis, it was established that workers from the high pollen-hoarding strain are characterized by elevated titers of the conserved yolk precursor vitellogenin and have larger and more active ovaries than low strain bees. It was also found that bees that were unselected for pollen hoarding but had enlarged ovaries foraged earlier in life, showed a preference for pollen foraging, and collected nectar of lower concentration than bees with fewer ovarioles. Thereby, “wild-type” bees show the same correlated phenotypes that differ between high and low pollen-hoarding strains (Amdam et al. 2006).

In honeybees, ovariole number is determined during the third larval instar through a nutrient-dependent endocrine signaling cascade. The endocrine factors, juvenile hormone and ecdysteroids, are involved in the initiation of vitellogenin expression at adult emergence, and vitellogenin and juvenile hormone interact during adult life to affect sensory responsiveness and onset of foraging behavior (Guidugli et al. 2005). In solitary insects, endocrine cascades involving juvenile hormone and ecdysteroids have pleiotropic effects on sensory tuning, yolk protein production, ovarian physiology, and life span (Amdam et al. 2004; Flatt et al. 2005; Guidugli et al. 2005). This hormonal pleiotropy appears to be regulated by upstream signaling through the insulin/insulin-like signaling (IIS) pathway (Claeys et al. 2001; Flatt et al. 2005; Tu et al. 2005). The association between traits in high and low pollen-hoarding bees, therefore, suggests that honeybee foraging division of labor has evolved from an ancestral reproductive regulatory network involving IIS. With knowledge of the association between components of physiology and foraging behavior, we expected an overrepresentation of genes involved in IIS and ovarian development within the CIs for “pollen” QTLs.

Behavioral genetics of defensive behavior

Honeybee defensive behavior is not as thoroughly characterized as foraging behavior in terms of correlated physiological and sensory traits. Honeybees exhibit defensive behavior near the nest, but highly defensive bees may pursue intruders for considerable distances away from the nest. Defensive behavior involves at least two tasks: guarding behavior at the hive entrance and flying out and stinging. Guards specialize in exploratory behavior in the nest entrance. They learn to recognize the hydrocarbon blend in the cuticles of their nestmates by olfaction, and they reject non-nestmates by biting or stinging. Only 10–15% of workers have been observed to guard the entrance during their lifetime (Moore et al. 1987). Both the number of days that individuals in a colony persist at guarding and the number of bees guarding the nest entrance correlate with the intensity of the stinging response (Arechavaleta-Velasco and Hunt 2003; Breed et al. 2004).

Multiple sensory modalities influence stinging behavior. A moving visual stimulus usually is necessary to release stinging behavior (Free 1961). Substrate vibrations also increase the chance of a mass stinging response. Alarm pheromone also is an important component of colony defense. This pheromone blend is released from the sting apparatus during the act of stinging and as guards extrude their stings at the colony entrance in response to relevant stimuli. A transient increase in metabolic rate occurs after exposure of bees to alarm pheromone, and this increased rate genetically correlates with the defensiveness of colonies (Southwick and Moritz 1985; Moritz and Southwick 1987; Andere et al. 2002). Although the alarm pheromone components vary with strains of bees, QTLs influencing this variation were distinct from QTLs influencing stinging behavior (Hunt et al. 1999, 2003). Defensive strains of bees respond more quickly to all of these stimuli.

Crosses involving highly defensive African-derived honeybees and low-defensive European races were used to map putative “sting” QTLs based on colony-level stinging assays at hive entrances (Hunt et al. 1998). Subsequent crosses with stocks unrelated to the first studies confirmed that three “sting” QTLs affect individual guarding behavior because guards from a backcross family were more likely to have the allele from the defensive parent of the F1 queen mother than were sisters chosen at random (Fig. 3; Arechavaleta-Velasco et al. 2003). These sting QTLs also were associated with higher activity levels of colonies, which was assessed as their tendency to fly up or to sting when colonies were opened (Hunt et al. 1998). The QTL that had the largest effect on the phenotypic variance of colony stinging responses, sting-1, was shown to influence individual stinging behavior in two independent studies (Guzmán-Novoa et al. 2002; Arechavaleta-Velasco et al. 2003). Higher activity levels, faster stinging responses, and greater sensitivity to stimuli exhibited by high-defensive strains suggest that sensory signaling pathways and heightened neuronal activity in the central nervous system (CNS) are involved in the defensive response, so we searched the QTL CIs for conserved genes with neuronal functions.

Fig. 3

Genetic and phenotypic correlations in defensive responses and specific QTLs. Arrows indicate significant associations between QTL genotypes and behavioral traits and between individual guarding behavior and colony stinging response

Materials and methods

QTL mapping and confirmation

QTLs influencing traits related to foraging behavior or defensive behavior were mapped and confirmed previously (Hunt et al. 1995, 1998; Page et al. 2000; Rueppell et al. 2004, 2006), but new analyses were performed on the data to include additional markers, and in addition, “sting” QTLs were reanalyzed by combining two traits as stated below. Maps were based on crosses that are appropriate for a haplodiploid, colonial species. For QTL detection, haploid drone progeny of an F1 queen were each backcrossed to sister queens (that were daughters of a single haploid drone). Colony phenotypes were correlated with inheritance of paternal marker alleles. Whole-colony phenotypes resulted from the behavior of many individuals, such as the number of stings in a leather patch in 1 min or the area of wax combs containing stored pollen. Confirmation studies all involved the use of F1 queens backcrossed to single drones and analyses of worker progeny for individual behavior and genotypes. Linkage maps that were used to identify candidate genes were constructed with JoinMap (3.0) and MapManager QTX software using the Kosambi mapping function (Van Ooijen and Voorrips 2001; Manly et al. 2001), and interval mapping was performed with MapQTL (v. 4.0; Van Ooijen et al. 2002). Interval mapping was performed as previously described, except that new analyses were used to map defensive-behavior QTLs that combined two sets of phenotypic data. Individual z scores for the two correlated traits, the number of stings in 1 min and ratings for the degree to which bees flew up at the beekeeper during colony manipulations, were averaged to produce composite z scores used for interval mapping. The z scores were calculated using the formula z = (y − μ)/δ, in which y is the phenotypic value, μ is the mean, and δ is the standard deviation. This provides a new trait value with a mean of zero and standard deviation of one. The colonies were rated on a relative scale of one to five based on the researchers’ experience for the tendency of bees to fly up during colony manipulations. Ratings from two observation periods were averaged. This analysis resulted in somewhat reduced CIs and higher LOD scores for the three QTLs (3.84, 2.25, and 2.39, respectively, for sting-1, sting-2, and sting-3). A map with 1,154 markers was used to locate candidate genes for “sting” QTLs, and two maps, each with about 400 markers, were used to locate “pln” QTLs. Markers in common between the dense map and other maps were used to interpolate the location of pln QTLs and align QTL CIs with the physical map because more cloned markers were available from the dense-map population.
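The composite trait described above can be illustrated with a minimal sketch. The colony values below are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say whether a population or sample estimate was used.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize a trait: z = (y - mu) / sd for each phenotypic value y."""
    mu = mean(values)
    sd = pstdev(values)  # assumption: population SD; the paper does not specify
    if sd == 0:
        raise ValueError("standard deviation is zero; z scores are undefined")
    return [(y - mu) / sd for y in values]

def composite_z(stings_per_min, flight_ratings):
    """Average the z scores of two traits, colony by colony."""
    if len(stings_per_min) != len(flight_ratings):
        raise ValueError("both traits must be scored on the same colonies")
    z_sting = z_scores(stings_per_min)
    z_flight = z_scores(flight_ratings)
    return [(a + b) / 2 for a, b in zip(z_sting, z_flight)]

# Hypothetical phenotypes for five colonies (not data from this study).
stings = [0, 12, 35, 60, 110]          # stings in a leather patch in 1 min
ratings = [1.0, 2.0, 2.5, 4.0, 5.0]    # mean of two 1-to-5 flight ratings
composite = composite_z(stings, ratings)
```

Each standardized trait has a mean of zero and a standard deviation of one. The averaged composite keeps a mean of zero, but its standard deviation is generally below one unless the two traits are perfectly correlated.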

Simple interval mapping was used rather than multiple factor interval mapping to determine CIs to have a more conservative (inclusive) search for candidate genes. The genomic region within 1.5 LOD units of the LOD-score peak was used to define each CI for QTL location. This corresponds to an approximate 97% CI for QTL location in a dense-marker map (Dupuis and Siegmund 1999). The LOD score is a log-likelihood estimator for the probability that a QTL influencing the trait is present at a given map position (Lander and Botstein 1989). The honeybee genome assembly (v. 3.0) and positions of sequenced markers were used to identify predicted peptides that were evaluated for likely gene function.
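The 1.5-LOD rule can be sketched as follows. This is an illustration only and not the MapQTL implementation: the function name, the grid of map positions, and the LOD profile are hypothetical, and the interval is reported at the resolution of the grid rather than interpolated between positions.

```python
def lod_drop_interval(positions_cm, lods, drop=1.5):
    """Return (left, peak, right) map positions of a LOD-drop support interval.

    positions_cm: map positions in cM, sorted in increasing order.
    lods: LOD score evaluated at each position.
    The interval is the contiguous run of positions around the peak whose
    LOD score stays within `drop` units of the peak LOD.
    """
    if not positions_cm or len(positions_cm) != len(lods):
        raise ValueError("positions and LOD scores must be non-empty and aligned")
    peak = max(range(len(lods)), key=lods.__getitem__)
    threshold = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions_cm[left], positions_cm[peak], positions_cm[right]

# Hypothetical LOD profile along one linkage group (not data from this study).
positions = [0, 5, 10, 15, 20, 25, 30]
profile = [0.5, 1.2, 2.6, 3.84, 3.0, 2.0, 0.9]
interval = lod_drop_interval(positions, profile)  # (10, 15, 20)
```

A more inclusive variant would extend each boundary to the first position that falls below the threshold, which may be preferable when the goal is to avoid excluding candidate genes.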

Cloning and sequencing marker fragments

Sequences derived from DNA markers were obtained to integrate physical and genetic maps. More than 300 marker fragments were cloned and sequenced, primarily from amplified fragment length polymorphism (AFLP) markers but also from random amplified polymorphic DNAs (RAPDs) and microsatellites; these markers were linked to behavioral QTLs or distributed throughout the genome. The first step in cloning was the excision of fragments from gels. For AFLPs from polyacrylamide gels, products were re-amplified and resolved on agarose gels to verify correct size before cloning with the TOPO-TA cloning kit and the pCR4-TOPO vector (Invitrogen, Carlsbad CA). RAPD marker fragments were excised from agarose gels and cloned into the same vector. Multiple sequences were obtained from each clone, and the consensus sequence was aligned with the genomic sequence scaffolds (HGSC 2006) using the nucleotide–nucleotide basic local alignment search tool (blastn) algorithm with Pymood BLAST software (Allometra, Davis CA).

RACE and cDNA cloning

Before expression analyses by quantitative real-time polymerase chain reaction (qRT-PCR), cDNA cloning was performed to confirm sequences and the gene prediction (location of introns and exons) and to provide information on sequence variation within some of the candidate genes. In the case of the serotonin receptor, this process provided the complete sequence of the gene by making primers based on the sequence of a putative G protein-coupled receptor (GPCR). Rapid amplification of cDNA ends (RACE) and cloning were performed using kits and manufacturers’ instructions. Total RNA was extracted from individual bees using the RNAqueous kit (Ambion, Austin TX). The cDNA was synthesized using the SMART PCR cDNA Synthesis kit (BD Biosciences, Palo Alto CA). The cDNA clones were obtained using the TOPO-TA kit and the pCR4-TOPO vector (Invitrogen). Clones were sequenced from multiple worker bees. Several sequence reads were obtained from each clone.

Comparative bioinformatics

This study is based on predicted peptides from the draft sequence (HGSC 2006). However, gene ontology (GO) terms and functional annotation of many of the homologous genes are incomplete. For this reason, further analyses were performed. Predicted peptides from the HGSC “Glean3” dataset were located within QTL CIs by first using blastn to determine which sequence scaffolds from the genome assembly contained sequences corresponding to markers within the CI. Then, scaffolds were searched for the presence of predicted peptides using protein–nucleotide 6-frame translation (tblastn). Predicted peptides from the QTL CIs were used to search the nonredundant database (using protein–protein BLAST [blastp]) at the National Center for Biotechnology Information website (http://www.ncbi.nlm.nih.gov). Protein domain information, gene similarities, and GO terms of homologs or orthologs were recorded. Literature and website searches were performed to assign putative functions based on reports from homologous genes of various species.

Quantitative real-time PCR

Some candidate genes for defensive behavior QTLs were tested for differential expression to better evaluate their potential for influencing behavior. These were chosen from a list of genes we were initially interested in based on functional studies in other species. Two families of worker bees were used: a low-defensive source and high-defensive source, each having a queen naturally mated to about 12–17 drones. The source families differed in the number of stings per minute deposited in a leather target (0 for low line and more than 100 for the high-defensive source) using a standard assay (Hunt et al. 1998). Although it is not known at which life stage a gene might exert its influence on defensive behavior, bees that are 10- to 20-days old are much more likely to sting than younger bees. Workers were collected within 12 h of emergence from brood combs placed in an incubator or marked and co-fostered in an unrelated hive and then collected 20 days later. Co-fostering was performed to eliminate potential environmental effects between hives. All bees were frozen in liquid nitrogen and kept at −80°C before RNA extraction. Heads of eight to nine bees from each of the two families were removed, and RNA was extracted using the RNAqueous® kit (Ambion). RNA yield was quantified using Ribogreen™ (Molecular Probes) dye on a fluorometer (Turner Biosystems, Sunnyvale CA). An aliquot of RNA was treated with DNAse (DNAfree® kit from Ambion) to remove any genomic DNA contamination before cDNA synthesis.

The cDNA template for qRT-PCR was generated according to Puthoff et al. (2005). For first-strand cDNA synthesis, the SuperScript First Strand cDNA Synthesis kit (Invitrogen) was used as per the manufacturer’s protocol. The cDNA synthesis was monitored in a parallel tracer reaction as follows. A 5-μl aliquot was removed from each sample and mixed with 1 μl of a 1:5 dilution of 32P–dCTP (dCTP, 2′-deoxycytidine 5′-triphosphate, Amersham, Piscataway, NJ) in water. The remaining 15 μl of each reverse transcriptase reaction and the corresponding 5-μl tracer reaction were incubated at 42°C for 2 h. Reactions were terminated at 70°C for 15 min and then chilled on ice. The 32P-tracer reactions were used to quantify the amount of cDNA synthesized in the larger experimental samples. The 32P-tracer reactions were spotted onto DE-81 filters (Fisher, Fairlawn, NJ), dried for 10 min, and then washed four times for 4 min each in 0.5 M sodium phosphate buffer (1 M of monobasic and 1 M of dibasic in 4 l of water). After two 1-min rinses in water, filters were washed in 95% EtOH and allowed to dry. Each filter was placed in a scintillation vial containing 5 ml of ScintiVerse (Fisher), and radiation from the newly synthesized cDNA was quantified in a scintillation counter. Resulting counts were used to normalize the cDNA from the corresponding reverse transcriptase reactions to a final concentration of 10 ng per μl of sample.

qRT-PCR was conducted on an ABI 7000 using the following mixture: 2 μl of normalized cDNA, 10 μl of 2X SYBR Green Mix (ABI, Foster City CA), and 0.25 μM of each primer in a 20-μl reaction. Reactions were carried out using the following cycling parameters: 50°C for 2 min, 95°C for 10 min, and 40 cycles of 95°C for 10 s and 60°C for 1 min. At the end of each run, a melt curve analysis was conducted to ensure primer specificity and purity of the PCR product. Relative mRNA levels were calculated by the standard curve method (User Bulletin 2: ABI PRISM 7700 Sequence Detection System) as described here. An aliquot was taken from each cDNA to construct a pooled sample. This pooled sample was serially diluted and subjected to qRT-PCR. The threshold cycle (Ct) for each dilution was plotted against its cDNA concentration (with an arbitrary starting quantity for the undiluted pooled sample assigned the value of 1) and used as the standard curve regression equation to generate the arbitrary expression values (AEVs). A standard curve was generated for each target gene on the same PCR plate that held the experimental samples. Linear standard curves, with a slope between −3.5 and −3.2 and an R² value of at least 0.98, were required for all primers used in this study. The AEVs were then normalized to the expression values of the eukaryotic initiation factor EIF-S8 for each bee. Two technical replicates for each bee were used, and results averaged before obtaining the family average. Normalized, average AEVs for each bee were analyzed by two-factor analysis of variance (ANOVA) without replication to compare transcript levels of the high-defensive family to the low-defensive family.
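The standard curve calculation can be sketched as below. The dilution series and Ct values are hypothetical, the function names are illustrative, and regressing Ct on the log10 of the relative quantity is an assumption based on how the standard curve method is usually applied; the text above does not state the transformation explicitly.

```python
import math

def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct against log10(relative quantity).

    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in cts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

def curve_is_acceptable(slope, r_squared):
    """Acceptance criteria stated in the text: slope in [-3.5, -3.2], R^2 >= 0.98."""
    return -3.5 <= slope <= -3.2 and r_squared >= 0.98

def expression_value(ct, slope, intercept):
    """Invert the standard curve to get an arbitrary expression value (AEV)."""
    return 10 ** ((ct - intercept) / slope)

def normalized_aev(target_ct, reference_ct, target_curve, reference_curve):
    """AEV of the target gene divided by the AEV of the reference gene."""
    target = expression_value(target_ct, target_curve[0], target_curve[1])
    reference = expression_value(reference_ct, reference_curve[0], reference_curve[1])
    return target / reference

# Hypothetical tenfold dilution series of the pooled cDNA (undiluted pool = 1).
dilutions = [1.0, 0.1, 0.01, 0.001]
target_cts = [20.0, 23.3, 26.6, 29.9]
reference_cts = [18.0, 21.4, 24.8, 28.2]
target_curve = fit_standard_curve(dilutions, target_cts)
reference_curve = fit_standard_curve(dilutions, reference_cts)
ratio = normalized_aev(23.3, 21.4, target_curve, reference_curve)
```

A slope of about −3.32 corresponds to a doubling of product in every cycle, which is why the accepted slope range brackets that value.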

Results and discussion

Bioinformatic analyses of putative gene functions and results of qRT-PCR allowed us to form hypotheses concerning gene networks involved in either foraging or defensive behaviors. Results of analyses and hypotheses concerning genes with potential to influence behavior are presented in the following two sections.

Candidate genes for honeybee foraging behavior

Inspection of the predicted peptides (HGSC 2006) in genome sequence surrounding the mapped QTLs lends support to the hypothesis of the involvement of the IIS pathway in pollen foraging (Table 1; Fig. 4). Within the pln-1 CI is the bee ortholog of the Drosophila gene bazooka, a gene involved in oocyte fate determination and influencing IIS through modulation of PI3K activity, as discussed below. Closer to the center of the CI is the honeybee ortholog of the fly gene midway, which is involved in lipid metabolism and oocyte development. This gene is particularly interesting because it encodes diacylglycerol acyltransferase (DGAT), and changes in DGAT activity have been shown to alter sensitivity to IIS (Yu and Ginsberg 2004). Near the center of the CI of pln-1 is a gene encoding a protein with homology to class W phosphoinositolglycan-peptide (PIG-P), involved in the production of glycosylphosphatidylinositol (GPI) anchors to attach receptors to plasma membrane in various species. Lipids/GPI anchors have insulin mimetic properties in some systems, where they modulate the IIS pathway by stimulating phosphoinositide-3-kinase (PI3K) activity (Müller and Frick 1999; Müller et al. 2002).

Table 1 Annotation of peptides within 97% confidence intervals for foraging-behavior QTLs
Fig. 4

Hypothetical regulatory network influencing honeybee pollen foraging behavior modulated by insulin-like signaling and its effects on ovarian development. Inhibitory blue arrows bridging IIS with the reproductive physiology and hormonal dynamics of the honeybee denote the unique and mutually suppressive feedback interaction between vitellogenin and JH. This interaction is mediated via the allatoregulatory system (Guidugli et al. 2005, and references therein), which includes the IIS pathway (Flatt et al. 2005). ILPs Insulin-like peptides; PI phosphoinositol; PIP phosphoinositol phosphate; IRS insulin receptor substrate gene; PI3K phosphoinositide-3 kinase (class I or II); PIP5K 1-phosphatydylinositol-4-phosphate 5-kinase; PIG-P phosphatidylinositolglycan-peptide; PDK1 3-phosphoinositide-dependent kinase 1; PKB protein kinase B; HR46 honeybee ortholog of Dmel/HR46; PTEN phosphatidylinositol-3,4,5-trisphosphate 3-phosphatase; JH juvenile hormone

The pln-2 region contains two LOD-score peaks that were resolved into two CIs, suggesting separate but linked QTLs (Fig. 2b). The CI with the higher LOD score contains a bee ortholog of a nuclear hormone receptor (Dmel/HR46). Nuclear hormone receptors bind ligands (such as ecdysteroids) and enter the nucleus as transcription factors (Simonet et al. 2004). The bee ortholog of HR46 was differentially regulated in microarray experiments that compared bees with and without application of queen mandibular pheromone treatment, which inhibits worker ovarian development and delays onset of foraging behavior (Grozinger et al. 2003). The pln-2 region also contains the tyramine receptor AmTyr1, which has a higher expression level in young bees selected for high levels of pollen collection and storage (M. H. Humphries, unpublished data). A Drosophila mutant for the tyramine receptor (hono) is deficient in its behavioral response to olfactory stimuli, an observation that is relevant because honeybee foragers respond to floral scents and brood pheromone. In addition, tyramine increases responsiveness to sucrose and is found at elevated levels in the brains of egg-laying worker bees, a pattern that fits the behavioral and reproductive state of pollen foragers (Scheiner et al. 2004). The second CI associated with pln-2 contains an ortholog of Dmel/skittles, encoding 1-phosphatydylinositol-4-phosphate 5-kinase (PIP5K), one of two phospholipid kinases known to produce phosphoinositol 4,5 phosphate (PI4,5P2), which is a key metabolite involved in IIS and the substrate of class I PI3K activity (Fig. 4; Carricaburu et al. 2003). The Drosophila skittles mutant is deficient in oocyte polarity and nurse cell development, and the gene is essential for germ line development (Hassan et al. 1998; Table 1).

At the most likely position for pln-3 is the honeybee ortholog of the fly gene for a class II PI3K (PI3K 68D). Class II PI3Ks have been shown to respond to insulin signals in mammals and use phosphoinositol to produce phosphoinositol phosphate, which influences glucose transport, a common effect of IIS (MacDougall et al. 2004; Shepherd 2005). In addition, three predicted peptides with glucose transport domains were found in the pln-3 CI. It is unknown whether class II PI3Ks can act directly on PI4,5P2 to produce PI3,4,5P3, thereby stimulating the primary downstream kinase in the IIS pathway, 3-phosphoinositide-dependent kinase 1 (PDK1). However, the honeybee gene encoding PDK1 also lies within the pln-3 CI. PDK1 is a positive regulator of cell growth and size through its action on downstream protein kinase B (PKB; Fig. 4; Rintelen et al. 2001), which, in Drosophila, is required for egg chamber development and influences egg follicle cell size (Cavaleire et al. 2005). The presence of these genes at pln QTLs suggests a possible network of genes influencing ovarian development and foraging (Fig. 4). Linkage of a PI3K and PDK1 at pln-3 is intriguing, especially given the presence of PIG-P and the bazooka ortholog at pln-1 and the PIP5K ortholog near pln-2. PIG-P has the potential to activate PI3K, and thus PDK1, whereas bazooka has been shown to bind the protein tyrosine phosphatase PTEN. This phosphatase is a negative regulator of PI3K because it dephosphorylates PI3,4,5P3, and binding of dPTEN by the product of bazooka modulates this process (Von Stein et al. 2005). The convergence of these pathways also suggests the potential for interaction between pln-3 and both the pln-1 and pln-2 regions, which has been observed experimentally (Rueppell et al. 2004, 2006).

A search of the 10 cM window surrounding AmFOR (pln-4) revealed only three predicted peptides, one of which was the insulin receptor substrate (IRS). Although AmFOR was chosen as a candidate because of its influence on foraging behavior in Drosophila and association with foraging-related behavioral states in bees (Ben-Shahar 2005), the IRS could be partly or wholly responsible for the behavioral effects of this QTL. In Drosophila, ovarian expression of this gene is necessary for vitellogenesis, independent of the action of juvenile hormone and ecdysteroids (Richard et al. 2006).

When the phenotypic architecture of foraging behavior is taken into consideration (Fig. 1), the identification of genes encoding class II PI3K and PDK1 at pln-3 and other key components of IIS at pln-1, pln-2, and pln-4 supports the hypothesis that IIS is the upstream mediator of foraging division of labor (Fig. 4). These IIS components do not appear to be randomly distributed across the genome. We obtained a rough estimate of the likelihood that a gene lies within the CIs for foraging behavior QTLs by comparing the genetic size of these four QTL regions with that of the genome (about 145 of 4,600 cM, or 0.03). There are 12 Drosophila genes in Flybase with the GO term for insulin receptor signaling pathway (GO:0008286), so the expected number of these genes within the CIs is 0.36. However, bee orthologs for two of these genes occur within foraging behavior QTL CIs, at least 5.5 times the expected number assuming independence of gene distributions. In addition, orthologs of four of these 12 genes lie in genome sequence not yet assigned to chromosomes and so could not be sampled in our analyses. We also found at least four other genes known to interact with IIS, but not annotated as such with GO terms, and one nuclear hormone receptor within the QTL regions, which are characterized by epistatic interactions suggesting the presence of components of a common pathway. In contrast, we could not find any genes influencing IIS within CIs for QTLs that influence defensive behavior, although they represent a region of comparable size.
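The rough enrichment estimate in the preceding paragraph can be reproduced in a few lines. The sketch below uses only the figures quoted in the text (145 of about 4,600 cM, 12 annotated genes, two observed orthologs). The binomial tail probability at the end is an illustrative addition rather than part of the original analysis, and it assumes that the 12 genes are placed independently and uniformly along the genetic map; the variable and function names are hypothetical.

```python
from math import comb

# Figures quoted in the text
qtl_cm = 145.0        # combined genetic size of the four foraging QTL CIs (cM)
genome_cm = 4600.0    # approximate size of the honeybee linkage map (cM)
n_genes = 12          # Drosophila genes annotated with GO:0008286
observed = 2          # bee orthologs found inside the CIs

# The text rounds 145/4600 (about 0.0315) to 0.03; the same rounding is used here.
fraction = round(qtl_cm / genome_cm, 2)
expected = n_genes * fraction
fold = observed / expected


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Illustrative assumption: independent, uniform placement of genes on the map
p_at_least = binomial_tail(observed, n_genes, fraction)

print(f"fraction of map in CIs: {fraction:.2f}")
print(f"expected genes in CIs:  {expected:.2f}")
print(f"observed / expected:    {fold:.1f}")
print(f"P(X >= {observed}) under independence: {p_at_least:.3f}")
```

Because four of the 12 orthologs could not be sampled, rerunning the sketch with `n_genes = 8` gives a lower expected count and therefore a larger observed-to-expected ratio.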

Candidate genes for defensive behavior

Sting-1 had the highest LOD score for colony stinging response and was also the only QTL associated with initiation of stinging at the individual-bee level (Guzmán-Novoa et al. 2002; Arechavaleta-Velasco et al. 2003; Fig. 5a). Among the 50 predicted peptides in this 1.2-Mb region, at least nine are orthologs or homologs of genes reportedly involved in neuronal development and CNS activity (Table 2). The interval includes the gene encoding 14-3-3 epsilon, a protein abundantly expressed in the CNS that modulates the activity of a number of kinases and ion channels (Berg et al. 2003). A Drosophila mutant for the ortholog (FBgn0020238) shows a failure to habituate to stimuli during nonassociative (unrewarded) learning trials (Skoulakis and Acevedo 2003), which is the type of learning that guard bees engage in when distinguishing nestmates from non-nestmates. The sting-1 CI contains six orthologs of Drosophila genes involved in CNS or antennal development. Of particular interest is the ortholog of the Dmel/tango gene, which encodes an aryl hydrocarbon receptor nuclear translocator (ARNT)-like transcription factor. Tango is a basic helix-loop-helix-PER-ARNT-SIM (bHLH-PAS) transcription factor that responds to hypoxia and is critical for the development of the fly neural midline and antennae. Other bHLH-PAS transcription factors act as heterodimers to sense light, temperature, oxygen, or endogenous hormones, and some have roles in regulating circadian rhythm (Roenneberg and Merrow 2003). The honeybee ortholog shares just 56% amino acid identity with tango and is diverged in the region important for activation of target genes, making it impossible to infer function (Sonnenfeld et al. 2005). The region also contains the gene for Huntingtin protein (htt), a large and unique protein with a complex structure that is conserved among metazoans. Expansions of a polyglutamine tract in human htt cause Huntington’s chorea. Htt interacts with many proteins and has roles in modulating neuronal transcription, intracellular neuronal transport, synaptic transmission, and morphology of dendrites (Harjes and Wanker 2003; Li and Li 2004). Finally, mRNA for a carboxylesterase of unknown function was more abundant in high-defensive bees than in low-defensive bees (see Table 3), and RACE sequences from seven workers revealed six amino acid substitutions and five alleles. Allelic variation for this gene was found in the population used to map sting-1, which is a necessary condition for a gene conferring variability in the behavior.

Fig. 5

Defensive-behavior QTLs were mapped based on the stinging response of colonies derived from crosses involving haploid drones of an F1 queen (European × African), each backcrossed to a European queen. Markers used for confirmation studies are indicated. Letters and numbers next to the vertical bar represent linked markers. Sequenced markers are underlined. Dashed lines indicate approximate 97% CIs. Approximate positions of honeybee orthologs to Drosophila genes are indicated. (a) Sting-1. (b) Sting-2. (c) Sting-3.

Table 2 Annotation of peptides within 97% confidence intervals for defensive-behavior QTLs
Table 3 Expression of candidate genes in high-defensive bees relative to low-defensive worker bees

The primary stimuli that elicit stinging behavior are moving visual targets and alarm pheromone. The sting-2 region contains two obvious candidates for modulation of response to these stimuli. At the most likely position of the QTL is the bee ortholog of Drosophila arr1 (AmArr4, Fig. 5b), an arrestin that binds metarhodopsin, the light-activated form of rhodopsin in the eye. Arrestins are involved in the desensitization of specific GPCRs and their recycling through clathrin-mediated endocytosis. However, the so-called visual arrestins are also expressed in the antennae of Drosophila and are involved in olfaction. Fly arr1 mutants are insensitive to class I and class II odorants (Merrill et al. 2005). Near the edge of the CI is an ortholog of the fly gene encoding the metabotropic gamma-aminobutyric acid receptor (GABA-B-R1). GABA serves as the primary inhibitor of neuronal excitability in the CNS of both insects and mammals (Bettler et al. 2004).

The sting-3 CI contains only 17 predicted peptides in 0.96 Mb of DNA. Like sting-2, it has genes with the potential to modulate sensitivity to visual and olfactory stimuli. Results of cDNA sequencing of a putative GPCR revealed a 5-HT7 serotonin receptor (Am5HT7, Fig. 5c). Serotonin influences associative learning and circadian rhythm in mollusks and insects, and both serotonin and GABA influence mood disorders in mammals (Hayley et al. 2005). Application of serotonin to the optic lobe of the bee brain reduced behavioral and neural responses to moving visual stimuli (Erber and Kloppenburg 1995). However, pharmacological experiments likely target all types of serotonergic neurons. The 5-HT7 receptor is just one of four serotonin receptor types known in insects and is unique in that it activates adenylate cyclase, resulting in increased levels of cyclic adenosine monophosphate (cAMP) and activation of PKA. Therefore, activation of 5-HT7 receptors would likely cause a stimulatory response. Total serotonin levels have sometimes been reported to influence mammalian, insect, and crustacean aggressive interactions (Nelson and Chiavegatto 2001; Panksepp et al. 2003). Recent evidence suggests that 5-HT7 receptors modulate exploratory behavior and anxiety in mice (Takeda et al. 2005). The sting-3 region also contains one of the three catalytic subunits of cAMP-dependent PKA, AmPKA-C1, a gene known to affect behavior in flies and mammals, including behavioral responses to alcohol, learning, and locomotor rhythm (Eisenhardt et al. 2001). Finally, the ortholog of homer, a dendritic gene involved in calcium signaling and synaptic plasticity, is closely linked to AmPKA-C1 (Diagana et al. 2002; Szumlinski et al. 2004).

Measurement of gene expression by qRT-PCR for defensive-behavior candidate genes showed several interesting trends. There were few statistically significant differences because, in this first screen, families were chosen that had divergent behavior but were genotypically diverse. Variation in expression between individuals was high, but numerical differences indicated a trend towards higher gene expression in older defensive bees. Only 14-3-3 epsilon showed significantly higher transcript levels in the defensive family of bees. In contrast, the CG8165 ortholog, a putative jumonji-domain transcription factor, had numerically higher levels in newly emerged defensive adults, yet significantly lower levels in older defensive bees (Table 3). This suggests an earlier peak in expression of this transcription factor in defensive bees. It was interesting that the mRNA levels of the GABA-B-R1 receptor, part of the major inhibitory pathway of neural signaling, were significantly lower in newly emerged defensive bees. Transcript levels of seven other genes showed trends for higher expression in defensive bees at the >1.3-fold level. At sting-1, an unknown carboxylesterase showed the greatest numerical expression difference of all, but its levels also showed the highest interindividual variability, resulting in no significant difference between families. As previously stated, this gene also showed the highest allelic variability in cDNA sequence. At sting-2, both the ortholog of Dmel/discs lost and AmArr4 (arrestin) trended towards higher expression in defensive bees. Three other genes showed numerically higher but not significantly different levels in defensive bees: oxysterol binding protein, the Dmel/tango ortholog, and the Am5HT7 serotonin receptor (1.7-fold higher). Eight of the 19 genes tested appear to be more highly expressed in older defensive bees at 1.3-fold or higher, although only one is significantly so. The fact that so many of these linked genes trended towards higher mRNA levels in defensive bees may mean that they are regionally regulated. More sampling of other bees and genomic regions is necessary. Some genes may be expressed at higher levels in high-defensive bees because of elevated metabolic rates (Harrison et al. 2005). Presumably, genes that influence behavior will show differences in expression levels of protein at some stage between low- and high-defensive alleles. However, demonstration of differences in transcript levels does not prove a causal connection to the phenotype. Conversely, failure to find a difference in transcript level does not disprove a causal connection to behavior; it could result from not sampling the most relevant developmental stage, or small differences in expression could still be relevant to the phenotype.

The advantage of high recombination rates

Using genome sequence and linkage maps, we reduced the list of candidate genes for honeybee foraging and defensive behavior from 10,157 (the current number of predicted peptides) to just 17 to 61 per QTL. As a consequence of high recombination rates, this level of resolution within 97% CIs was achieved in relatively large genetic distances, averaging 40 cM per CI. Table 4 shows a comparison of our results to a study in which QTLs for ovariole number were mapped in Drosophila. Because of the low recombination rate in Drosophila, CIs totaling 158 cM represented half the genome in this cross and contained 9,100 genes (Orgogozo et al. 2006). But regions of similar size influencing foraging and defensive specialization in the bee contained only 113 and 128 genes, respectively. Other model organisms would provide less drastic comparisons. For example, a 40-cM window in mouse would be expected to contain about ten times as many genes as in the bee, or about 510 genes (assuming a total of 20,000 genes evenly distributed over 2,800 Mb and a recombination rate of 0.56 cM/Mb). Of course, results will depend on local gene distributions and recombination rates. In practice, it is usually necessary to map a QTL to within 1 cM in mammalian species to reduce the list to five to ten candidate genes (e.g., Talbot et al. 2003; Flint et al. 2005). The genetic size of our QTL intervals compared to the size of the bee linkage map suggests that we should find 320 and 236 genes for the pln QTLs and sting QTLs, respectively, but instead, we found fewer than half this number (Table 4). The expected number of genes based on physical distance is closer to the observed value. The discrepancy between these predictions can be explained by higher than average recombination rates in the QTL regions. The average recombination rates for the pln and sting QTL intervals are 28.2 and 32.8 cM/Mb, respectively, but the genome average is 19 cM/Mb. Consequently, there is a little less than one gene per centimorgan in the QTL regions. Our analyses do involve some degree of uncertainty. First, the estimate of recombination rate in QTL regions does not take into account sequence groups that have not yet been assigned to chromosomes (21% of the genome; HGSC 2006) and that may lie between assigned groups. Therefore, recombination rates actually may be somewhat lower in QTL regions than we estimated. Sequence groups missing within CIs also may contain additional genes. In addition, the annotated set of 10,157 genes is a high-confidence set and may eventually increase by several thousand as genes that are more novel in sequence are added (e.g., Drosophila has a gene count of about 13,000). Finally, our analyses focus on conserved genes of known function. A previous study of 81 kb of sequence linked to sting-2 revealed 13 expressed transcripts, none of which showed homology to known genes (Lobo et al. 2003). It cannot be ruled out that these behaviors may be at least partly influenced by completely novel genes.
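The gene-count comparisons in this section follow from simple arithmetic on map sizes and gene densities. The sketch below reproduces the mouse estimate and the linkage-map expectation for the pln intervals from figures quoted in the text. The 145-cM and 4,600-cM values are taken from the foraging-QTL estimate given earlier, the function names are hypothetical, and both functions assume genes and recombination are spread evenly over the genome.

```python
def genes_in_window(window_cm, total_genes, genome_mb, cm_per_mb):
    """Expected genes in a genetic window, assuming genes and
    recombination are spread evenly over the genome."""
    window_mb = window_cm / cm_per_mb        # physical size of the window
    genes_per_mb = total_genes / genome_mb   # average gene density
    return window_mb * genes_per_mb


def genes_from_map_fraction(window_cm, map_cm, total_genes):
    """Expected genes if gene number scales with the share of the linkage map."""
    return total_genes * window_cm / map_cm


# Mouse: 40-cM window, 20,000 genes, 2,800 Mb, 0.56 cM/Mb (values from the text)
mouse_genes = genes_in_window(40, 20_000, 2_800, 0.56)

# Bee pln intervals: about 145 cM of a roughly 4,600-cM map, 10,157 predicted peptides
pln_expected = genes_from_map_fraction(145, 4_600, 10_157)
pln_observed = 113  # genes found in the foraging QTL intervals

print(f"mouse, 40-cM window: about {mouse_genes:.0f} genes")
print(f"pln CIs, expected from map fraction: about {pln_expected:.0f} genes")
print(f"pln CIs, observed / expected: {pln_observed / pln_expected:.2f}")
```

The observed-to-expected ratio for the pln intervals comes out well below one half, which is the shortfall the text attributes to above-average recombination rates in the QTL regions.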

Table 4 Comparison of number of candidate genes in QTL confidence intervals from Apis and Drosophila a

Conclusions

Our findings lead us to propose that foraging division of labor (Fig. 1) is influenced by a gene network involving IIS (Fig. 4). This is just a hypothesis, but a testable one. We also suggest that the genetically variable defensive responses of bees may be explained by allelic differences in neuronal transcription factors and genes involved in G protein-coupled signaling pathways. The potential involvement of the Am5HT7 serotonin receptor in defensive/aggressive behavior implies that the bee may be used to elucidate a role of serotonin in novelty-seeking (guarding) behavior and that this behavior could be modified by specific agonists and antagonists of this receptor subtype. Linkage mapping at a finer scale using many single nucleotide polymorphisms, combined with genome-wide expression assays, could be the next step in finding the sequences responsible for behavioral variation. The bee is likely to become an important species in this process and may become the first invertebrate model for understanding how gene regulation of life histories is remodeled by social evolution.