Background

Lepidoptera sex pheromones are primarily C10-C18 long straight-chain unsaturated alcohols, aldehydes or acetate esters [1], biosynthesised and released mainly from pheromone glands located between the 8th and 9th abdominal segments of female moths. Females usually use a mixture of compounds in a unique ratio to attract conspecific males [2]. The extremely high specificity and sensitivity of species-specific pheromones make them potential biological control agents for population monitoring, mass trapping and reducing pesticide use in integrated pest management (IPM) programs [3–5]. Further use of pheromones in such strategies would be aided by an understanding of the pathways involved in pheromone biosynthesis and transport.

Most sex pheromone blends of Lepidoptera are synthesised de novo via modified fatty acid biosynthesis pathways [2, 6, 7], and gland-specific enzymes are involved in desaturation, chain shortening, reduction and acetylation [1, 2]. Different species use different combinations of these reactions to produce unique species-specific pheromone blends. The first step is the synthesis of saturated fatty acid precursors from acetyl-CoA, via malonyl-CoA, by acetyl-CoA carboxylase (ACC) and fatty acid synthase (FAS) [8, 9]. Labeling studies conducted with acetate indicated that malonyl-CoA and NADPH are used by FAS to produce mainly the saturated precursors stearic acid (18:0) and palmitic acid (16:0), with 18 and 16 carbon atoms, respectively, and no double bonds [10–12]. Modification of the fatty acid chain includes the introduction of a double bond by desaturases specific to pheromone biosynthesis, followed by chain shortening by specific β-oxidation enzymes [13, 14]. So far, several types of desaturases have been extensively studied through gene characterization and expression analysis, including Δ5 [15], Δ9 [16, 17], Δ10 [18], Δ11 [19, 20] and Δ14 [21] desaturases. Once an unsaturated pheromone precursor with a specific chain length is produced, the carboxyl carbon is modified to form one of the functional groups (aldehyde, alcohol or acetate ester). These modifications require a fatty acyl reductase to produce alcohols from the fatty acyl precursor [22]; in some species the alcohols may be oxidized to aldehydes serving as pheromone components [23] or converted to acetate esters (OAc) by an acetyltransferase [24]. Recently, a few members of the reductase gene family have been discovered and functionally characterized in several Lepidoptera species, including Ostrinia scapulalis [25], Heliothis virescens, Heliothis subflexa, Helicoverpa armigera, Helicoverpa assulta [26], Ostrinia nubilalis [27], Yponomeuta evonymellus (L.), Yponomeuta padellus (L.) and Yponomeuta rorellus (Hübner) [28]. A number of pheromone gland-specific enzymes have been identified and their essential functions in pheromone production demonstrated in vitro as well as in vivo. For example, using RNA interference, Matsumoto and colleagues showed that two pheromone gland-specific enzymes (an acyl-CoA desaturase and a fatty-acyl reductase) are responsible for pheromone production in the silk moth Bombyx mori [29–31].

After production and release of the sex pheromone components by female moths, the males detect the pheromone and respond with mating behaviour. It is commonly accepted that pheromone molecules are captured and transported to the pheromone receptors on the dendrites of pheromone-sensitive neurons by olfactory binding proteins, including odorant binding proteins (OBPs) and chemosensory proteins (CSPs) [32–34]. Pheromone binding proteins (PBPs) bind to sex pheromone components and are classified as a subclass of OBPs [35]. After activation of the pheromone receptors, the olfactory signals must be degraded rapidly to prevent prolonged neuronal excitation [36]. This may involve pheromone degrading enzymes (PDEs) capable of degrading the pheromone molecules [37].

The black cutworm Agrotis ipsilon is a destructive polyphagous insect pest of many crops. For a strain from China, the female sex pheromone blend comprises five main acetate components: (Z)-11-hexadecenyl acetate (Z11-16:OAc), (Z)-9-tetradecenyl acetate (Z9-14:OAc), (Z)-7-dodecenyl acetate (Z7-12:OAc), (Z)-8-dodecenyl acetate (Z8-12:OAc) and (Z)-5-decenyl acetate (Z5-10:OAc) [38]. These components indicate the involvement of different desaturases and β-oxidases during sex pheromone biosynthesis. However, the genes and proteins mediating A. ipsilon pheromone production, transport and degradation, and their specific functions, have not been characterized. Over the last few years, next-generation sequencing techniques such as 454 pyrosequencing have provided an easy and effective method for the discovery of novel genes. In the present study, using the Roche GS FLX Titanium sequencing platform, we report a genetic database of the genes expressed in the pheromone glands of A. ipsilon and the identification of genes with putative roles in pheromone biosynthesis, degradation and transport, as well as their tissue expression profiles.

Results and discussion

454 sequencing and unigene assembly

Sequencing of a cDNA library prepared from mRNAs of the pheromone glands of A. ipsilon gave a total of 631,425 raw reads with an average length of 517 base pairs (bp). After trimming adaptor sequences and removing low quality sequences, 629,273 clean reads remained with an average length of 496 bp. The size distribution of the clean reads is shown in Additional file 1. The sequences of all reads have been deposited in the NCBI SRA database with the accession number SRX189143.

The 629,273 clean reads were assembled into 23,473 unigenes, including 20,541 contigs (87.5%) and 2,932 singletons (12.5%), the largest transcriptome dataset so far from moth sex pheromone glands. An overview of the sequencing and assembly results is presented in Table 1. The length of the assembled unigenes ranged from 100 bp to 21,842 bp with an average length of 770 bp. Among the unigenes, 22,035 (93.9%) are between 200 bp and 2,000 bp long with an average length of 649 bp. These unigenes are in fact transcripts in the A. ipsilon pheromone gland cDNA library, so we refer to them as transcripts. All sequences of the unigenes used in the current study are provided in Additional file 2.

Table 1 Summary of A. ipsilon pheromone gland unigene sequences and assembly

Analysis of the transcripts from the A. ipsilon pheromone gland

BLASTx and BLASTn were used to compare each A. ipsilon transcript against GenBank entries with a cut-off E-value of 1.0E-5. A total of 12,989 transcripts (55%) had BLASTx hits in the non-redundant protein (nr) database and 9,392 (40%) had BLASTn hits in the non-redundant nucleotide sequence (nt) database. This is consistent with a previous report of H. virescens pheromone gland ESTs [39]. Some of the A. ipsilon transcripts were homologous to sequences from more than one species, but Lepidoptera species accounted for the most hits (2,379 of the 9,392 BLASTn hits), including 1,124 (12%) to B. mori entries. The second highest number of hits was to Dipteran species, with 343 hits to D. melanogaster and 279 and 221 hits to the mosquitoes Anopheles gambiae and Aedes aegypti, respectively. The lowest numbers of hits were to the wasp Nasonia vitripennis (190 hits), the beetle Tribolium castaneum (147 hits) and the pea aphid Acyrthosiphon pisum (136 hits). The top 15 insect species with significant BLASTn hits are shown in Figure 1.
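The E-value cut-off described above can be illustrated with a minimal Python sketch. It assumes the BLAST results are available in the common 12-column tab-separated layout (query ID first, E-value in the eleventh column); the text does not state which output format was used, and the function name and example rows are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1.0e-5  # cut-off E-value stated in the text


def queries_with_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return the set of query IDs with at least one hit at or below the cut-off.

    Assumes 12-column tab-separated BLAST output (an assumption, not
    confirmed by the source): column 1 is the query ID, column 11 the E-value.
    """
    hits = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= cutoff:
            hits.add(query_id)
    return hits


# Hypothetical example rows, not taken from the study's data
example = "\n".join([
    "Unigene_A\tsubj1\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-50\t300",
    "Unigene_B\tsubj2\t70.0\t50\t15\t0\t1\t50\t1\t50\t0.5\t30",
    "Unigene_A\tsubj3\t80.0\t150\t30\t0\t1\t150\t1\t150\t1e-20\t150",
])
print(sorted(queries_with_hits(example)))
```

Counting distinct query IDs rather than rows matters here, because one transcript can hit several database entries while still counting once towards the number of annotated transcripts.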

Figure 1

Top 15 insect species that have significant BLASTn hits. All A. ipsilon pheromone gland unigenes were used in BLASTn searches against the GenBank entries. The significant hits with an E-value ≤ 1.0E-5 for each query were grouped according to species, and the number of unigenes with significant homology is indicated after the species name.

Gene Ontology of the genes expressed in the A. ipsilon pheromone gland

The 23,473 assembled transcripts were annotated into different functional groups according to Gene Ontology (GO) analysis. Some transcripts were annotated into more than one GO category. Of the 23,473 transcripts, 7,546 (32%) could be assigned to a GO category (Additional file 3). Within the “biological process” GO ontology, the “cellular process” and “metabolic process” categories were most abundantly represented, with 4,056 (17.3%) and 3,361 (14.3%) transcripts, respectively. In the “cellular component” GO ontology the transcripts were mainly distributed in cell (4,415 transcripts; 18.8%) and cell part (4,133 transcripts; 17.6%). The GO analysis also showed that in the “molecular function” ontology 3,271 transcripts (13.9%) were annotated as having binding functions and 3,484 (14.8%) as having catalytic activity.

Comparative analysis of transcripts in Lepidoptera pheromone glands

In order to compare the A. ipsilon pheromone gland transcriptome with those from other Lepidoptera and to identify A. ipsilon transcripts with potential involvement in sex pheromone production and transport, we downloaded the pheromone gland ESTs of three other Lepidoptera (A. segetum, B. mori and H. virescens) from the dbEST database of NCBI, together with the previously published pheromone gland transcriptome of H. virescens [39]. After assembling these ESTs we obtained 925 unigenes from A. segetum, 3,943 from B. mori and 8,202 from H. virescens, with average lengths of 384 bp, 692 bp and 474 bp, respectively. These numbers are much lower than that obtained in the current study through 454 sequencing of the A. ipsilon pheromone gland, demonstrating that our pheromone gland transcriptome is currently the largest transcriptome resource for an insect pheromone gland.

When comparing the pheromone gland transcripts pairwise using best bidirectional hits, we found 461 homologous transcripts between A. ipsilon and A. segetum, 1,110 between A. ipsilon and B. mori, and 2,106 between A. ipsilon and H. virescens (Figure 2). A large portion of the A. ipsilon transcripts (20,274 out of 23,473; 86.4%) had no homologous ESTs in the available pheromone gland EST libraries of the other three species. This may be due to the larger dataset (23,473 unigenes) for A. ipsilon and the lower coverage in the other studies. Nevertheless, 309, 5,755 and 2,556 transcripts were found only in A. segetum, H. virescens and B. mori, respectively, in our comparison (Figure 2).
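The best-bidirectional-hit criterion used for this pairwise comparison can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: it assumes that "best" means the highest bit score, applies the E-value threshold given in the Figure 2 legend (less than 1.0E-6), and uses hypothetical transcript IDs and scores.

```python
def best_hits(hits, max_evalue=1.0e-6):
    """Map each query to its best subject among hits passing the E-value filter.

    `hits` is an iterable of (query, subject, evalue, bitscore) tuples.
    "Best" is taken to mean the highest bit score (an assumption).
    """
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue >= max_evalue:
            continue  # keep only hits with E-value below the threshold
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a, max_evalue=1.0e-6):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical BLASTn results in both directions
aips_vs_hvir = [
    ("Aips_1", "Hvir_1", 1e-40, 250.0),
    ("Aips_1", "Hvir_2", 1e-10, 80.0),
    ("Aips_2", "Hvir_2", 1e-30, 200.0),
    ("Aips_3", "Hvir_3", 1e-3, 40.0),   # fails the E-value filter
]
hvir_vs_aips = [
    ("Hvir_1", "Aips_1", 1e-40, 250.0),
    ("Hvir_2", "Aips_1", 1e-12, 85.0),  # Hvir_2 prefers Aips_1, not Aips_2
    ("Hvir_2", "Aips_2", 1e-8, 70.0),
    ("Hvir_3", "Aips_3", 1e-3, 40.0),
]
print(bidirectional_best_hits(aips_vs_hvir, hvir_vs_aips))
```

In the example only one pair survives, because a one-directional best hit (Aips_2 to Hvir_2) is not counted as homologous under this criterion.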

Figure 2

Comparative analysis of A. ipsilon pheromone gland unigenes with other insects. This shows the overlap of BLAST homology among genes expressed in the pheromone glands of four species of Lepidoptera. The comparative analyses of A. ipsilon, H. virescens, B. mori and A. segetum pheromone gland unigenes were performed based on the best bidirectional hits results (reciprocal BLASTn, E-value less than 1.0E-6).

Transcript abundance in the A. ipsilon pheromone gland

The pheromone gland mRNA samples used for constructing the cDNA library were neither normalized nor amplified by PCR, so the reads in the sequencing dataset most likely represent the relative abundance of each assembled transcript in the pheromone gland, as summarized in Table 2. The most abundant transcripts include vitellogenin (2,925 reads per kilobase per million mapped reads (RPKM); 2.2% reads), a major reproductive protein in insects and the precursor of egg yolk proteins for egg production [40]; genes involved in PBAN-stimulated pheromone production, such as lipase 3 [41] (4,731 RPKM; 0.8% reads); genes involved in sex pheromone biosynthesis, such as acyl-CoA desaturase (1,206 RPKM; 0.3% reads); and genes involved in lipid transport, such as apolipophorin III (2,894 RPKM; 0.4% reads). Another highly abundant transcript (Unigene_721), with 1,365 RPKM, encodes a CSP with 76% protein identity to the H. virescens CSP (Protein ID: ACX53806) and 41% to the ejaculatory bulb-specific protein 3 of D. melanogaster (Protein ID: Q9W1C9).
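For readers unfamiliar with the unit, RPKM normalizes a transcript's read count by both transcript length and sequencing depth:

$$\mathrm{RPKM} = \frac{10^{9} \times \text{reads mapped to the transcript}}{\text{transcript length (bp)} \times \text{total mapped reads}}$$

A minimal Python sketch of this standard definition is given below. The function name and the example numbers are hypothetical and are not taken from Table 2.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: two transcripts with the same read count
# but different lengths, in a library of 500,000 mapped reads.
short_transcript = rpkm(read_count=1000, transcript_length_bp=1000,
                        total_mapped_reads=500_000)
long_transcript = rpkm(read_count=1000, transcript_length_bp=2000,
                       total_mapped_reads=500_000)
print(short_transcript, long_transcript)
```

Because of the length term, a short transcript can have a higher RPKM than a longer transcript that attracts a larger share of the reads, which is consistent with lipase 3 showing a higher RPKM than vitellogenin despite a smaller percentage of reads.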

Table 2 The most prevalent mRNAs in A. ipsilon sex pheromone gland

Candidate genes in the A. ipsilon pheromone gland with putative functions in pheromone production, transport and degradation

The overall enzymatic steps during pheromone biosynthesis in A. ipsilon are likely to be similar to those in other moth species, which include fatty acid synthesis, desaturation, chain shortening, reduction and acetylation [1, 2, 6]. By homology searches we identified members of gene subfamilies in the A. ipsilon pheromone gland transcriptome putatively involved in these biosynthetic processes and pheromone production, including transcripts putatively encoding 3 synthases (2 acetyl-CoA carboxylases and 1 fatty acid synthase), 5 desaturases, 13 acyl-CoA reductases, 5 alcohol oxidases, 5 acetyltransferases and 11 aldehyde reductases (Table 3); 17 transcripts encoding putative pheromone degradation enzymes (Table 4); and 8 transcripts encoding putative CSPs and 7 transcripts encoding putative OBPs (Table 5). Their abundances in the pheromone gland transcriptome are shown in Figures 3 and 4. We further validated and characterized the expression levels and tissue distribution of these genes by RT-PCR and qRT-PCR; the results are summarised below. There is clear agreement between the transcript abundance estimated by transcriptome sequencing and the transcript expression level in the pheromone gland as measured by RT-PCR and qRT-PCR.

Table 3 Putative pheromone biosynthesis related genes in the A. ipsilon pheromone gland
Table 4 Candidate esterase genes likely involved in A. ipsilon pheromone degradation
Table 5 Candidate olfactory genes involved in A. ipsilon pheromone reception
Figure 3

The abundance of the unigenes encoding putative sex pheromone biosynthesis enzymes in the A. ipsilon transcriptome dataset, presented as normalized read counts in reads per kilobase per million mapped reads (RPKM). The putative enzyme names are indicated as gene abbreviations followed by GenBank accession numbers. ACC Acetyl-CoA carboxylase, AOX Alcohol oxidase, AR Aldehyde reductase, ATF Acetyltransferase, DES Desaturase, FAR Fatty acyl reductase, FAS Fatty acid synthase.

Figure 4

The abundance of unigenes encoding chemosensory proteins (CSPs), odorant-binding proteins (OBPs) and esterase (EST) in the A. ipsilon transcriptome dataset presented as normalized reads in reads per kilobase per million mapped reads (RPKM).

Receptor for the pheromone biosynthesis activating neuropeptide (PBAN)

PBAN is released from the suboesophageal ganglion in the brain into the hemolymph, where it binds to the PBAN receptor in the membrane of the pheromone gland and triggers pheromone production [42, 43]. Although no PBAN receptor was found in the pheromone gland transcriptome of H. virescens [39], we found one transcript (Unigene_3821) encoding a protein highly homologous to PBAN receptor isoform B. It has very low abundance in the A. ipsilon transcriptome (31 RPKM) but a high amino acid identity of 97% to the H. virescens PBAN receptor in GenBank (Protein ID: ABU93813) [44].

Acetyl-CoA carboxylase (ACC)

Saturated long-chain fatty acids are the precursors of sex pheromones in most moth species. Their biosynthesis starts with ACC catalysing the production of malonyl-CoA from acetyl-CoA in the first committed biosynthesis step [8, 9]. In the A. ipsilon pheromone gland we found two transcripts (ACC-JX989149 and ACC-JX989150) encoding ACCs. ACC-JX989149, with an open reading frame (ORF) of 5,841 bp, encodes an ACC with 67% amino acid identity to the ACC of T. castaneum (Protein ID: XP_969851), and ACC-JX989150 encodes a protein with 56% amino acid identity to the ACC of H. virescens (Protein ID: ACX53705) (Table 3). RT-PCR and qRT-PCR revealed that both ACC-JX989149 and ACC-JX989150 are highly expressed in the pheromone gland compared with the body (Figure 5 and Figure 6). However, they have very low abundance (81 and 21 RPKM) in the transcriptome (Figure 3).

Figure 5

RT-PCR results showing the relative expression of the A. ipsilon pheromone biosynthesis-related genes in the pheromone gland (PG) and the body (BO). The genes that are more highly expressed in the pheromone gland are labeled with a red pentagram. β-actin was used as an internal reference gene to test the integrity of each cDNA template; the similar intensity of the β-actin bands between the pheromone gland and the body indicates the use of equal template concentrations.

Figure 6

qRT-PCR results showing the relative expression levels of the A. ipsilon pheromone biosynthesis-related genes in the pheromone gland (PG) and the body (BO). The putative enzyme names are indicated as gene abbreviations followed by GenBank accession numbers. ACC Acetyl-CoA carboxylase, FAS Fatty acid synthase, DES Desaturase, FAR Fatty acyl reductase, AOX Alcohol oxidase, AR Aldehyde reductase, ATF Acetyltransferase. The internal controls β-actin and ribosomal protein S3 were used to normalize transcript levels in each sample. This figure uses β-actin as the reference gene to normalize target gene expression and correct for sample-to-sample variation; similar results were obtained with ribosomal protein S3 as the reference gene. The standard error is represented by the error bar, and different letters (a, b) above the bars denote significant differences (p < 0.05).
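As an illustration of how a reference gene such as β-actin can be used to normalize target gene expression, the sketch below applies the widely used 2^(-ΔΔCt) calculation. This is an assumption made for illustration only: the text above does not state which quantification method was used, the calculation presumes roughly 100% amplification efficiency, and the Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene in a sample relative to a calibrator tissue.

    Uses the 2^(-ddCt) approach (assumed here, not confirmed by the source):
    each target Ct is first normalized to the reference gene Ct, then the
    sample is compared with the calibrator.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: pheromone gland (PG) as sample, body (BO) as calibrator
fold_pg_vs_bo = relative_expression(
    ct_target_sample=20.0, ct_ref_sample=18.0,          # PG
    ct_target_calibrator=25.0, ct_ref_calibrator=18.0,  # BO
)
print(fold_pg_vs_bo)
```

With these hypothetical values the target reaches threshold five cycles earlier in the gland than in the body after normalization, which corresponds to a 32-fold higher relative expression.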

Fatty acid synthase (FAS)

FAS has been shown to catalyse the conversion of malonyl-CoA and NADPH to produce saturated fatty acids [8]. We identified one putative FAS transcript (FAS-JX989151) in the A. ipsilon pheromone gland (Table 3), containing an ORF of 7176 bp and encoding a FAS with 57% amino acid identity to the FAS of T. castaneum (Protein ID: XP_970417). The RT-PCR and qRT-PCR revealed that FAS-JX989151 is highly expressed in the pheromone gland (40-fold higher than in the body, Figure 5 and Figure 6) and also has a high abundance (343 RPKM) in the transcriptome (Figure 3).

Desaturase (DES)

Pheromone-specific desaturases introduce double bond(s) into fatty acids at specific positions along the chain. The five putative sex pheromone components extracted from the A. ipsilon sex pheromone gland are derived from unsaturated fatty acids, with acetate as the functional group and 16 or fewer carbons [38]. At least three active pheromone components (Z7-12:OAc, Z9-14:OAc and Z11-16:OAc) have been identified in A. ipsilon strains from China [38], North America [45], France [46] and Japan [47]. It is reasonable to propose that the saturated fatty acid precursor of the A. ipsilon sex pheromones is palmitic acid (16:0), which is desaturated by a ∆11-desaturase to form the precursor Z11-16:acyl-CoA for the production of two major (Z7-12:OAc and Z9-14:OAc) and two minor (Z11-16:OAc and Z5-10:OAc) pheromone components (Figure 7). It is not clear how the minor pheromone component Z8-12:OAc is synthesized in A. ipsilon; its synthesis would presumably involve a ∆12-desaturase. Other studies in Lepidoptera species support a ∆11-desaturase acting on palmitic acid and leading to the production of the sex pheromone components [19, 20, 48]. In the A. ipsilon pheromone gland transcriptome, 5 transcripts have high homology to genes encoding desaturases (Table 3). DES-JX989152 is homologous to a gene encoding an acyl-CoA ∆9-desaturase in M. brassicae (Protein ID: ABX90048) with an amino acid identity of 96%. ∆9-desaturase makes oleic acid from stearic acid (18:0) and possibly palmitoleic acid from palmitic acid [16, 17, 49]. It would not participate in the biosynthesis of A. ipsilon sex pheromones. DES-JX989153 encodes a protein with 87% amino acid identity to the acyl-CoA ∆11-desaturase of M. brassicae (Protein ID: ABX90049). DES-JX989154, DES-JX989155 and DES-JX989156 encode proteins with, respectively, 94% amino acid identity to the acyl-CoA desaturase from H. assulta (Protein ID: AF482909), 64% amino acid identity to a S. littoralis desaturase (Protein ID: AAQ74260) and 93% amino acid identity to an acyl-CoA desaturase of S. exigua (Protein ID: AAM28510). These transcripts could possibly encode ∆12-desaturases that, in A. ipsilon, form the minor pheromone component Z8-12:OAc from the precursor Z12-16:acyl-CoA. However, they could also function as ∆9-desaturases. Further study of their enzyme activity could confirm their role in sex pheromone biosynthesis. The RT-PCR and qRT-PCR results indicated that DES-JX989153 and DES-JX989154 are highly expressed in the A. ipsilon pheromone gland compared with the body (85- and 63-fold higher, respectively) (Figure 5 and Figure 6). One of the transcripts (DES-JX989154) is also highly abundant (1,206 RPKM) in the pheromone gland transcriptome (Figure 3), suggesting a possible role in A. ipsilon sex pheromone biosynthesis.
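The reasoning behind the proposed pathway can be made explicit with a small worked example. Each β-oxidation cycle removes two carbons from the carboxyl end of the acyl chain, so both the chain length and the double-bond position (counted from the carboxyl carbon) decrease by two per cycle. The Python sketch below, with hypothetical function names, applies this rule to the two proposed 16-carbon precursors.

```python
def chain_shorten(position, length, cycles):
    """Return (double-bond position, chain length) after β-oxidation cycles.

    Each cycle removes two carbons from the carboxyl end, so the position
    (counted from the carboxyl carbon) and the length both drop by two.
    """
    new_position = position - 2 * cycles
    new_length = length - 2 * cycles
    if cycles < 0 or new_position < 1:
        raise ValueError("invalid number of chain-shortening cycles")
    return new_position, new_length


def shortening_series(position, length, cycles):
    """List the Z-isomer labels produced after 0..cycles rounds of shortening."""
    labels = []
    for n in range(cycles + 1):
        pos, size = chain_shorten(position, length, n)
        labels.append(f"Z{pos}-{size}")
    return labels


# Δ11 desaturation of palmitic acid (16:0), then up to three cycles
print(shortening_series(11, 16, 3))
# A hypothetical Δ12 product of palmitic acid, then two cycles
print(shortening_series(12, 16, 2))
```

The first series reproduces the Z11-16, Z9-14, Z7-12 and Z5-10 backbones, whereas a Z8-12 backbone is only reached from a Z12-16 precursor, which is why a ∆12-desaturase is invoked for Z8-12:OAc.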

Figure 7

Putative biosynthesis pathways of the sex pheromones in Agrotis ipsilon. The saturated fatty acid precursor palmitic acid (16:0) is desaturated by ∆11-desaturase to form the precursor Z11-16:acyl-CoA for the production of three major and one minor pheromone components (adapted from [2, 6, 12, 13, 50]).

Fatty acyl-CoA reductase (FAR)

Once a specific Δ11 and possibly Δ12 double bond is introduced into the fatty acid precursors to form a fatty acyl-CoA precursor, the chain is shortened sequentially by β-oxidation to form different shorter-chain fatty acyl-CoA precursors [6]. These precursors are further reduced individually by fatty acyl reductase (FAR) to form the corresponding fatty alcohols [26, 28, 51]. In the A. ipsilon pheromone gland transcriptome there are 13 transcripts homologous to putative FAR genes (Table 3). Among them, 5 transcripts encode proteins with 59%-80% amino acid identity to the fatty-acyl CoA reductases of Ostrinia nubilalis (Protein IDs: ADI82776, ADI82777, ADI82778 and ADI82779). Other FAR transcripts are homologous to the fatty acyl-CoA reductases from a wide range of insect species, including H. virescens, N. vitripennis, Danaus plexippus, Bombus terrestris and Apis mellifera, with amino acid identities of about 60% (Table 3). The RT-PCR and qRT-PCR results indicated that three transcripts (FAR-JX989157, FAR-JX989162 and FAR-JX989164) are highly expressed in the pheromone gland (Figure 5 and Figure 6). The other ten transcripts seem to be equally expressed in the pheromone gland and the body or more highly expressed in the body. All FAR transcripts except two (FAR-JX989157 and FAR-JX989159) have low abundance (from 81 to 16 RPKM) in the pheromone gland transcriptome (Figure 3).

Alcohol oxidase/dehydrogenase (AOX)

Fatty alcohols are used as pheromone components in many moth species, and they are also intermediates that alcohol oxidases convert to aldehyde pheromones [52, 53]. In the A. ipsilon pheromone gland (PG), 5 genes homologous to alcohol oxidase/dehydrogenase were identified. The BLASTx results revealed that three unigenes (AOX-KC007341, AOX-KC007342 and AOX-KC007344) have amino acid identities of 43%, 55% and 64%, respectively, to a putative alcohol dehydrogenase of D. plexippus (Protein ID: EHJ70611), and one unigene (AOX-KC007345) is homologous to another putative alcohol dehydrogenase of D. plexippus (Protein ID: EHJ73729) with an amino acid identity of 68%. AOX-KC007343 showed 78% amino acid identity to the alcohol dehydrogenase of H. virescens (Protein ID: ACX53694). The RT-PCR and qRT-PCR results indicated that AOX-KC007341 and AOX-KC007343 are expressed at a higher level in the PG than in the body (Figure 5 and Figure 6).

Aldehyde reductase (AR)

Aldehyde reductases are members of the aldo-ketoreductase superfamily and could be used to reduce long-chain acyl-CoA to form alcohol intermediates [13]. In the A. ipsilon pheromone gland we identified 11 transcripts with homology to the aldo-ketoreductases of Papilio dardanus, B. mori, H. armigera, D. plexippus, Culex quinquefasciatus, H. virescens and Papilio xuthus (Table 3). The derived protein sequences of these 11 transcripts show 53%-88% amino acid identity with their homologs in other insects. The RT-PCR and qRT-PCR results indicated that AR-KC007350 and AR-KC007351 are mainly expressed in the pheromone gland, while the other 9 putative aldehyde reductase transcripts have equal expression levels between the pheromone gland and the body or a higher expression level in the body (Figure 5 and Figure 6). All aldehyde reductase transcripts are present at low abundance (from 67 to 10 RPKM) in the pheromone gland transcriptome (Figure 3). The involvement of aldehyde reductase in sex pheromone biosynthesis has not been demonstrated in moth species.

Acetyltransferase (ATF)

The fatty acid alcohols are used as pheromone components in many moth species. In A. ipsilon, whose sex pheromone blend comprises only acetates, they are intermediates that are acetylated by acetyltransferases to form the acetate ester pheromone components [13]. In the A. ipsilon pheromone gland transcriptome, 5 transcripts homologous to acetyltransferases were identified (Table 3). Three of them (ATF-KC007357, ATF-KC007360 and ATF-KC007361) encode proteins that are homologous to the acetyltransferases of D. plexippus (Protein IDs: EHJ65205, EHJ65977 and EHJ68573) with relatively high amino acid identities (>70%), one (ATF-KC007358) encodes a protein with 90% amino acid identity to the H. virescens acetyltransferase (Protein ID: ACX53812) and one (ATF-KC007359) encodes a protein with 86% amino acid identity to the acetyltransferase of B. mori (Protein ID: NP_001182381). RT-PCR and qRT-PCR revealed that three transcripts (ATF-KC007358, ATF-KC007360 and ATF-KC007357) are mainly expressed in the pheromone gland (Figure 5 and Figure 6) and have relatively high abundances of 195, 155 and 71 RPKM, respectively, in the pheromone gland transcriptome (Figure 3).

Genes encoding candidate pheromone degrading enzymes in the A. ipsilon pheromone gland

It would be potentially harmful to insects if pheromone molecules and other odorants remained on the olfactory receptors after they had stimulated the olfactory receptor neurons (ORNs). It is therefore thought that the ORNs are protected by odorant degrading enzymes (ODEs) [37], including esterases [54, 55], aldehyde oxidases [56–58], cytochromes P450 [59–61], carboxyl esterase [62] and glutathione S-transferase (GST) [63]. In this study, we identified 17 transcripts predicted to encode esterases in the A. ipsilon pheromone gland. The BLASTx results showed that all have very high amino acid identities to the antennal esterases of S. littoralis (Table 4), so we named them AipsCXE1-AipsCXE16 and AipsCXE20, following the nomenclature in S. littoralis. Our qRT-PCR results revealed that 7 of the transcripts (AipsCXE3, AipsCXE7, AipsCXE8, AipsCXE9, AipsCXE11, AipsCXE14 and AipsCXE20) are antennal-enriched, 3 (AipsCXE5, AipsCXE10 and AipsCXE15) are both antennal- and pheromone gland-enriched, and the remaining 7 (AipsCXE1, AipsCXE2, AipsCXE4, AipsCXE6, AipsCXE12, AipsCXE13 and AipsCXE16) have similar expression levels in the antennae, body and pheromone gland, suggesting they are not pheromone specific (Figure 8).

Figure 8

qRT-PCR results showing the expression of the A. ipsilon unigenes encoding putative esterases (CXE) identified in the pheromone gland, measured in the male antennae (MA), the female antennae (FA), the body (BO) and the pheromone gland (PG). The standard error is represented by the error bar, and different letters (a, b, c) above the bars denote significant differences (p < 0.05).

Genes encoding candidate pheromone carrier proteins in the A. ipsilon pheromone gland

Moth sex pheromones are synthesised and protected from degradation until they are released from the female pheromone gland, and it has been proposed that OBPs and CSPs could participate in this process. In this study we identified transcripts of 7 OBPs and 8 CSPs from the A. ipsilon pheromone gland (Table 5). All of these have the typical insect OBP sequence motif C1-X15-39-C2-X3-C3-X21-44-C4-X7-12-C5-X8-C6 [35, 64] or CSP sequence motif C1-X6-8-C2-X16-21-C3-X2-C4 [65]. One CSP transcript, AipsCSP2, seems to be gland-specific, with an extremely high expression level (>100-fold) in the pheromone gland compared with the antennae and body and a relatively high abundance in the pheromone gland transcriptome. AipsCSP8 shows a higher expression level in the pheromone gland (10-fold higher than in the body) (Figure 9) and is extremely abundant, with 1,364 RPKM in the pheromone gland transcriptome (Figure 4).
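The conserved-cysteine motifs quoted above translate directly into regular expressions, where each "X" run becomes a bounded repeat. The sketch below is a minimal illustration, not the screening procedure used in the study: it assumes that X stands for any residue written as an uppercase one-letter code, and the test sequences are synthetic placeholders rather than real proteins.

```python
import re

# OBP motif: C1-X15-39-C2-X3-C3-X21-44-C4-X7-12-C5-X8-C6
OBP_MOTIF = re.compile(
    r"C[A-Z]{15,39}C[A-Z]{3}C[A-Z]{21,44}C[A-Z]{7,12}C[A-Z]{8}C"
)
# CSP motif: C1-X6-8-C2-X16-21-C3-X2-C4
CSP_MOTIF = re.compile(r"C[A-Z]{6,8}C[A-Z]{16,21}C[A-Z]{2}C")


def classify(protein_sequence):
    """Label a protein sequence by the first cysteine motif it matches."""
    seq = protein_sequence.upper()
    if OBP_MOTIF.search(seq):
        return "OBP-like"
    if CSP_MOTIF.search(seq):
        return "CSP-like"
    return "no motif"


# Synthetic sequences built only to exercise the spacing rules
synthetic_obp = ("M" + "C" + "A" * 20 + "C" + "A" * 3 + "C" + "A" * 30
                 + "C" + "A" * 8 + "C" + "A" * 8 + "C" + "K")
synthetic_csp = "M" + "C" + "A" * 7 + "C" + "A" * 18 + "C" + "A" * 2 + "C" + "K"

print(classify(synthetic_obp), classify(synthetic_csp), classify("MKAAA"))
```

A motif match of this kind only checks cysteine spacing, so it is best read as a filter applied alongside homology evidence such as the BLASTx results.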

One OBP transcript (AipsOBP6) is highly expressed in the pheromone gland (more than 3-fold higher than in the antennae), and 3 OBPs (AipsOBP1, AipsOBP2 and AipsOBP4) are highly expressed in the antennae (Figure 10). This high expression of OBPs and CSPs in the pheromone gland is interesting because it suggests a possible involvement in carrying and releasing sex pheromones, as demonstrated for the antennal OBPs and CSPs. However, the molecular mechanisms that connect these proteins with pheromone production need further investigation. No ORs, IRs or SNMPs were identified in the A. ipsilon pheromone gland.

Figure 9

qRT-PCR results showing the relative expression of the A. ipsilon unigenes encoding putative chemosensory proteins (CSP) identified in the pheromone gland, measured in the male antennae (MA), the female antennae (FA), the body (BO) and the pheromone gland (PG). The standard error is represented by the error bar, and different letters (a, b) above the bars denote significant differences (p < 0.05).

Figure 10

qRT-PCR results showing the relative expression of the A. ipsilon unigenes encoding putative odorant binding proteins (OBP) identified in the pheromone gland, measured in the male antennae (MA), the female antennae (FA), the body (BO) and the pheromone gland (PG). The standard error is represented by the error bar, and different letters (a, b, c) above the bars denote significant differences (p < 0.05).

Conclusions

The black cutworm A. ipsilon is a destructive pest of many crops [66, 67] and is mainly controlled by chemical pesticides, which has led to the development of resistance to various compounds [68]. Our study provides information and resources to identify, and facilitate functional studies of, genes responsible for pheromone production, transport and degradation at the molecular level both in vivo and in vitro.

By deep sequencing of the A. ipsilon sex pheromone gland transcriptome, we have identified 42 transcripts encoding enzymes putatively involved in pheromone production. This is the first study reporting the key enzyme ∆11-desaturase involved in A. ipsilon sex pheromone biosynthesis. One new transcript (DES-JX989154) encoding a desaturase is highly abundant in the transcriptome and highly expressed in the pheromone gland, suggesting that this desaturase, or those encoded by the other newly identified transcripts (DES-JX989155 and DES-JX989156), may play important roles in A. ipsilon sex pheromone biosynthesis. They may contribute to introducing a double bond at the C11 and C12 positions of the saturated fatty acid precursor palmitic acid for the production of pheromone precursors. Further studies are needed to confirm the substrates and products, and thus the involvement, of these desaturases and of other newly identified genes, such as those encoding aldehyde reductases and acetyltransferases, in A. ipsilon sex pheromone biosynthesis. Two of the CSP transcripts (AipsCSP2 and AipsCSP8) are highly abundant, with 100- and 10-fold higher transcript levels, respectively, in the pheromone gland than in the body. Furthermore, AipsCSP2 and AipsOBP6 are pheromone gland-specific and -enriched, respectively (Figure 9 and Figure 10). This suggests a functional role of the PG-enriched CSPs and OBPs in sex pheromone transport and release. It is clear that, during perireceptor events after pheromones and odorants enter the sensillum lymph, the antennae-specific odorant binding proteins (OBPs) capture these hydrophobic pheromones and odorants and deliver them to the membrane-bound olfactory receptors (ORs) [35]. Further study of these PG-expressed OBPs, especially of their binding to sex pheromone components, is needed to confirm their function.

Methods

Insect material

The A. ipsilon colony has been reared in our laboratory (State Key Laboratory for Biology of Plant Diseases and Insect Pests, Chinese Academy of Agricultural Sciences, Beijing, China) since 2006 with field-collected moths introduced each summer to prevent inbreeding effects. The larvae were reared on an artificial diet comprising wheat germ, casein and sucrose as the main components. The colony was kept at 24°C with 75% relative humidity and a 14h:10h light:dark photoperiod. Pupae were sexed and kept separately in hyaline plastic cups before emergence. Adult moths were given 20% honey solution after emergence.

Pheromone gland dissection

The pheromone gland, together with the associated ovipositor valves and parts of the terminal abdominal segments, was dissected with fine scissors [39] from the rest of the body, referred to as ‘body’, which comprises heads, thoraxes, legs, wings and abdomens (without the pheromone glands). The calling behavior of female A. ipsilon moths begins on the first night after eclosion and increases sharply, peaking on the third night [38]. Therefore, to cover all genes involved in pheromone biosynthesis, four glands of 1-day-old females, four glands of 2-day-old females and ten glands of 3-day-old females were dissected during the second half of the scotophase, which is reported to be the calling period of this moth [69-71]. The eighteen glands were pooled in one RNase-free centrifuge tube for total RNA extraction and frozen in liquid nitrogen until further processing.

RNA extraction and cDNA library construction

Total RNA was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s protocol. The quantity of RNA was determined using a Nanodrop spectrophotometer (Thermo Scientific, Wilmington, DE, USA) and 1.1% agarose gel electrophoresis. About 500 ng mRNA was further purified from 50 μg total RNA using the PolyATtract mRNA Isolation System III (Promega, Madison, WI, USA). The mRNA was then sheared into fragments of about 800 nucleotides using an RNA fragmentation solution (Autolab, Beijing, China) at 70°C for 30 sec, and then cleaned and concentrated using the RNeasy MinElute Cleanup kit (Qiagen, Valencia, CA, USA). The mRNA was used as a template for first-strand cDNA synthesis using N6 random primers and MMLV reverse transcriptase (TaKaRa, Dalian, China), and the second strands were synthesized using Secondary Strand cDNA synthesis enzyme mixtures (Autolab, Beijing, China). cDNAs of appropriate length were purified with the QIAquick PCR Purification kit (Qiagen, Valencia, CA, USA) and eluted with 10 μl Elution Buffer. After blunt ending and the addition of a poly-A tail at the 3’ end according to Roche’s Rapid Library Preparation protocols (Roche, USA), the purified cDNAs were ligated to GS-FLX sequencing adaptors (Roche, USA). Finally, cDNAs shorter than 500 bp were removed using Ampure Beads according to the manufacturer’s instructions (Beckman, USA) before the preparation of the cDNA library.

454 sequencing

Pyrosequencing of the cDNA library was performed by Beijing Autolab Biotechnology Company using a 454 GS-FLX sequencer (Roche, IN, USA). All sequencing reads were deposited into the Short Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) under the accession number SRX189143.

Sequence analysis and assembly

Base calling of the raw 454 reads in SFF files was carried out using the Python script sff_extract.py developed by COMAV (http://bioinf.comav.upv.es). All raw reads were then processed to remove low-quality and adaptor sequences using the programs tagdust [72], LUCY [73] and SeqClean [74] with default parameters. The resulting sequences were then screened against the NCBI UniVec database (http://www.ncbi.nlm.nih.gov/VecScreen/UniVec.html) to remove possible vector sequence contamination. Cleaned reads shorter than 60 bases were discarded because they are likely to be sequencing artifacts [75].
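The final length filter can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (which relied on tagdust, LUCY and SeqClean); the function names `parse_fasta` and `drop_short_reads` are hypothetical, and the input is assumed to be cleaned reads in FASTA format. Only the 60-base threshold comes from the text.

```python
from typing import Iterable, Iterator, List, Tuple

MIN_READ_LENGTH = 60  # bases; threshold stated in the text


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_short_reads(records, min_length=MIN_READ_LENGTH):
    """Keep reads with at least min_length bases (hypothetical helper)."""
    return [(h, s) for h, s in records if len(s) >= min_length]
```

For a file on disk, `drop_short_reads(parse_fasta(open(path)))` would return the retained reads, where `path` is a placeholder for the cleaned read file.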

Two steps were taken to assemble the clean reads. First, MIRA3 [76] was used with a minimum sequence overlap of 30 bp and a minimum percentage overlap identity of 80%. Then, CAP3 was used with an overlap length cutoff of 30 and an overlap percent identity cutoff of 90% [77]. The resulting contigs and singletons of more than 100 bases were retained as unigenes and annotated as described below.

Homology searches and functional classification

Following the assembly, homology searches of all unigenes were performed using the BLASTx and BLASTn programs against the GenBank non-redundant protein (nr) and nucleotide sequence (nt) databases at NCBI [78]. Matches with an E-value less than 1.0E-5 were considered significant [79]. Gene names were assigned to each unigene based on the BLASTx hit with the highest score.
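The hit-selection rule described above (E-value below 1.0E-5, then the highest-scoring hit per unigene) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' procedure: the input is assumed to be BLAST+ tabular output (`-outfmt 6`, whose default 12 columns end with E-value and bit score), the bit score is assumed to be the "score" being maximised, and `best_blast_hits` is a hypothetical name.

```python
from typing import Dict, Iterable, Tuple

MAX_EVALUE = 1e-5  # significance threshold stated in the text


def best_blast_hits(
    lines: Iterable[str], max_evalue: float = MAX_EVALUE
) -> Dict[str, Tuple[str, float, float]]:
    """Return {query: (subject, evalue, bitscore)} for the top hit per query.

    Assumes the default 12-column BLAST+ tabular layout:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # not significant under the stated threshold
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

Unigenes with no hit below the threshold are simply absent from the returned dictionary, which corresponds to leaving them unannotated.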

Gene Ontology terms were assigned by Blast2GO [80] through BLASTx program with an E-value less than 1.0E-5. Then, WEGO [81] software was used for assignment of each GO ID to the related ontology entries. The longest open reading frame (ORF) of each unigene was determined by an ORF finder tool (http://www.ncbi.nlm.nih.gov/gorf/gorf.html).

Pheromone gland ESTs from other insects

The H. virescens pheromone gland ESTs (14112; accession numbers GR958232-GR972305 and GT067784-GT067747) [39], the A. segetum pheromone gland ESTs (2286; accession numbers ES582156-ES584441) [82] and the B. mori pheromone gland ESTs (10501; accession numbers BP184340-BP182009, AV404455-AV403746 and DC552314-DC544856) were downloaded from the dbEST database at NCBI (http://www.ncbi.nlm.nih.gov/nucest) and saved as FASTA files. All the EST sequences were assembled using the CAP3 program with the same parameters as used in the A. ipsilon assembly. The comparative analyses of A. ipsilon, H. virescens, B. mori and A. segetum pheromone gland unigenes were based on Best Bidirectional Hits (reciprocal BLASTn, E-value less than 1.0E-6).
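The Best Bidirectional Hits criterion can be sketched as follows: two unigenes from different species are paired only if each is the other's top-scoring BLASTn hit. The sketch assumes that hits have already been parsed into `(query, subject, evalue, bitscore)` tuples and that the bit score ranks hits; the function names are hypothetical, and only the 1.0E-6 threshold is taken from the text.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float, float]  # (query, subject, evalue, bitscore)


def best_hit_map(hits: Iterable[Hit], max_evalue: float = 1e-6) -> Dict[str, str]:
    """Map each query to its highest-scoring subject below the E-value cutoff."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue, bitscore in hits:
        if evalue >= max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(
    a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit], max_evalue: float = 1e-6
) -> List[Tuple[str, str]]:
    """Return (a, b) pairs in which a's best hit is b and b's best hit is a."""
    a_best = best_hit_map(a_vs_b, max_evalue)
    b_best = best_hit_map(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)
```

In a comparison of four species, this pairwise test would be applied to each pair of unigene sets in turn.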

Identification of candidate genes associated with moth pheromone biosynthesis

Some putative genes and enzymes have been reported previously as being involved in moth sex pheromone production. We focused our research on the target genes: (1) Acetyl-CoA carboxylase; (2) Fatty acid synthase; (3) Desaturase; (4) Fatty acyl reductase; (5) Alcohol oxidase; (6) Aldehyde reductase; (7) Acetyltransferase.

Identification of putative genes involved in pheromone degradation

Since the sex pheromone blend of A. ipsilon is composed of the acetate esters (Z)-7-dodecenyl acetate (Z7-12:Ac) (40.5%), (Z)-9-tetradecenyl acetate (Z9-14:Ac) (13.2%), (Z)-11-hexadecenyl acetate (Z11-16:Ac) (14.9%), (Z)-8-dodecenyl acetate (Z8-12:Ac) (17.2%) and (Z)-5-decenyl acetate (Z5-10:Ac) (14.3%) [38], esterases may play a major role in pheromone degradation. Therefore, we performed BLASTx and BLASTn searches to identify candidate esterase genes in the A. ipsilon pheromone gland NGS dataset.

Identification of putative genes involved in pheromone transport

Genes encoding odorant binding proteins (OBPs) and chemosensory proteins (CSPs) were identified using the “OBP sequence motif” C1-X15-39-C2-X3-C3-X21-44-C4-X7-12-C5-X8-C6 [64] and the “CSP sequence motif” C1-X6-8-C2-X16-21-C3-X2-C4 [65]. Candidate olfactory receptor (OR), ionotropic receptor (IR) and sensory neuron membrane protein (SNMP) genes were identified by BLASTx and BLASTn searches.
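The two cysteine-spacing motifs translate directly into regular expressions. The Python sketch below is one possible encoding and not necessarily the search tool used here: it assumes that X stands for any residue (including cysteine), that sequences are single-letter protein strings, and the function name `classify_by_motif` is hypothetical.

```python
import re
from typing import List

# C1-X15-39-C2-X3-C3-X21-44-C4-X7-12-C5-X8-C6 (OBP motif from the text)
OBP_MOTIF = re.compile(r"C.{15,39}C.{3}C.{21,44}C.{7,12}C.{8}C")
# C1-X6-8-C2-X16-21-C3-X2-C4 (CSP motif from the text)
CSP_MOTIF = re.compile(r"C.{6,8}C.{16,21}C.{2}C")


def classify_by_motif(protein: str) -> List[str]:
    """Return the motif classes ("OBP", "CSP") found in a protein sequence."""
    seq = protein.upper().replace("\n", "")
    found = []
    if OBP_MOTIF.search(seq):
        found.append("OBP")
    if CSP_MOTIF.search(seq):
        found.append("CSP")
    return found
```

A motif match only flags a candidate; confirmation by homology search and signal peptide prediction would still be needed.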

Sequence analyses

The putative N-terminal signal peptides and most likely cleavage sites were predicted by the SignalP V3.0 program [83] (http://www.cbs.dtu.dk/services/SignalP/). Sequence alignments were done with ClustalX 1.83 [84] with default gap penalty parameters of gap opening 10 and extension 0.2.

RT-PCR and qRT-PCR

The cDNAs from female pheromone glands and from other body parts (a mixture of heads, thoraxes, legs, wings and abdomens without the pheromone glands) were synthesized using PrimeScript RT Reagent with gDNA Eraser (TaKaRa, Dalian, China). A total of 200 ng cDNA was used as the template for RT-PCR and qRT-PCR. Specific primer pairs for RT-PCR analysis were designed with Primer 3 (http://frodo.wi.mit.edu/) or Primer Premier 5 (see Additional file 4). To test the integrity of the cDNA templates, a pair of control primers for the β-actin gene (GenBank Acc. JQ822245) of A. ipsilon was used. The PCR cycling profile was: 95°C for 2 min, followed by 35 cycles of 95°C for 30 sec, 60°C for 30 sec and 72°C for 1 min, and a final extension at 72°C for 10 min. PCR products were separated in 1.2% agarose gels and stained with ethidium bromide. Each reaction was done at least six times with three biological replicates.

qRT-PCR analysis was conducted using the ABI 7500 Real-Time PCR System (Applied Biosystems, Carlsbad, CA). The primers were designed with Beacon Designer 7.90 (PREMIER Biosoft International) (see Additional file 5). Two reference genes, β-actin (GenBank Acc. JQ822245) and ribosomal protein S3 (GenBank Acc. JQ822246), were used for normalizing expression of the target gene and correcting for sample-to-sample variation. qRT-PCRs were done in a 25 μl reaction containing 12.5 μl of Platinum SYBR Green qPCR SuperMix-UDG (Invitrogen, Shanghai, China), 0.5 μl of each primer (10 pmol/μl), 0.5 μl of Rox Reference Dye, 1 μl of sample cDNA (200 ng/μl) and 10 μl of sterilized H2O. The cycling parameters were: 50°C for 2 min, 95°C for 2 min, followed by 40 cycles of 95°C for 15 sec and 60°C for 30 sec. Then, to measure the dissociation curves, the PCR products were heated to 95°C for 15 sec, cooled to 60°C for 1 min, heated to 95°C for 30 sec and cooled to 60°C for 15 sec. Negative controls, without either template or reverse transcriptase, were included in each experiment. To check reproducibility, each qRT-PCR reaction for each sample was carried out in three technical replicates and three biological replicates.

qRT-PCR data analysis

Relative quantification was performed using the comparative 2^-ΔΔCt method [85]. All data were normalized to endogenous β-actin or ribosomal protein S3 levels from the same individual samples. In the analysis of the relative fold change in different tissues, the body sample was taken as the calibrator. Thus, the relative fold change in different tissues was assessed by comparing the expression level of each target gene in other tissues to that in the body part. The results are presented as the mean of the fold change in three biological samples. The comparative analyses of each OBP, CSP and CXE gene among different tissues were performed with one-way nested analysis of variance (ANOVA), followed by Tukey’s honestly significant difference (HSD) test using SPSS Statistics 18.0 (SPSS Inc., Chicago, IL, USA). The comparative analyses of each putative pheromone biosynthesis gene between pheromone gland (PG) and body part were performed with a paired t-test. When applicable, values are presented as mean ± SE.
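For clarity, the comparative Ct calculation can be written out: ΔCt = Ct(target) - Ct(reference) for each sample, ΔΔCt = ΔCt(tissue) - ΔCt(calibrator), and fold change = 2^-ΔΔCt, with the body sample as calibrator. The Python sketch below is illustrative only; the function names are hypothetical, the Ct values in the usage note are invented for demonstration and are not data from this study, and SE is assumed to be the sample standard deviation divided by the square root of n.

```python
import math
import statistics
from typing import Sequence, Tuple


def fold_change(
    ct_target_tissue: float,
    ct_ref_tissue: float,
    ct_target_calibrator: float,
    ct_ref_calibrator: float,
) -> float:
    """Relative expression by the comparative 2^-ΔΔCt method."""
    delta_ct_tissue = ct_target_tissue - ct_ref_tissue
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_tissue - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


def mean_and_se(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error (sample SD / sqrt(n)) of replicate fold changes."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se
```

As a worked example with made-up values, a target gene with Ct 20 in the pheromone gland and Ct 25 in the body, and a reference gene with Ct 18 in both, gives ΔΔCt = (20 - 18) - (25 - 18) = -5, that is, a 32-fold higher level in the gland.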