Background

The vestimentiferan annelid Riftia pachyptila lives around hydrothermal vents on the East Pacific Rise at a depth of 2600 meters. These giant tubeworms form dense aggregations and constitute a major component of the biomass in these deep-sea oases of life that rely on chemosynthetic primary production [1]. Adult vestimentiferans lack a mouth, gut and anus [2]. Instead, they possess a specialized tissue, called the trophosome, that contains symbiotic bacteria. This symbiosis with sulfide-oxidizing bacteria provides all the host's nutrition and is therefore obligatory [3]. Their larvae, however, possess a digestive tract [4] and are devoid of symbiotic bacteria, which they acquire from the environment. The acquisition of bacteria occurs through the skin, and the trophosome is established from mesodermal tissue. Apoptosis of infected cells in the host epidermis then occurs at the end of the colonization process [5].

Several studies have focused on the functioning of this symbiosis. Previous biochemical and enzymatic studies addressed the uptake of hydrogen sulfide [6, 7] and the transport of both oxygen and hydrogen sulfide by the giant extracellular hemoglobins [8–10]. The diffusion of carbon dioxide through the branchial plume [11] and its subsequent conversion into bicarbonate through the activity of carbonic anhydrase [12, 13] were also demonstrated. More recently, molecular techniques were used to better understand some aspects of the exchange mechanisms in the branchial plume and the trophosome, such as the existence of a carbonic anhydrase transcript [14]. The sequencing of the whole genome of the symbiont of Riftia pachyptila is currently in progress (Horst Felbeck, personal communication), and a proteomics approach carried out on the symbiont [15] revealed previously unsuspected carbon fixation pathways. However, no global genomic work on the host has been published to date.

Identification of differentially expressed transcripts (i.e. transcripts which differ in abundance between the samples being compared) has been conducted for the last ten years on symbiotic interactions between rhizobia and legumes (for review see [16]) thanks to improved molecular approaches such as Suppression Subtractive Hybridization (SSH). Morel and coworkers [17] constructed cDNA libraries by an SSH procedure and performed hybridizations on arrays between two compartments of the fungus Paxillus involutus living in symbiosis with the plant Betula pendula. These methods successfully identified differentially expressed sequences in this ectomycorrhizal symbiosis, suggesting differences in metabolism between the two studied compartments [17]. SSH appears to be a quick and efficient method to obtain many specific sequences. It enriches samples for differentially expressed transcripts by combining steps of suppression and normalization prior to differential screening, and it requires very little starting material.

A transcriptome analysis of a marine cnidarian-dinoflagellate symbiosis using microarrays to compare aposymbiotic and symbiotic stages of the host Anthopleura elegantissima revealed the existence of key genes involved in the maintenance of the symbiosis [18].

In Riftia pachyptila, aposymbiotic larvae and post-larvae are very small (less than 100 μm) and very difficult to obtain. In addition, the host cannot be kept alive without its symbionts. Therefore, a comparison between aposymbiotic and symbiotic states in R. pachyptila cannot be considered at present. Previous studies on the host were restricted to targeted molecular analyses, and no global molecular analysis of the host has been carried out to date. The aim of the present study was to identify host transcripts that could be involved in metabolite exchanges in the branchial plume on the one hand, and in metabolite exchanges with the symbionts in the trophosome on the other. We postulated that these specific protein-coding genes should be preferentially expressed in these two tissues, which are directly involved in the symbiotic way of life. Instead of the usual application of SSH, which compares the same tissue in two physiological states, we compared pairs of tissues from a single individual. The subtracted libraries obtained should therefore be enriched in specific sequences compared to a classical library built without any subtraction procedure. In theory, sequences shared by the key tissue and the reference tissue (housekeeping gene sequences in particular) should be eliminated by the suppression subtractive hybridization, so that only tissue-specific sequences are recovered in each library. We also maximized the chances of obtaining new sequences by using the normalization step of the SSH procedure, which increases the proportion of rare transcripts.

Results

General results of sequencing

Global results including the total number of obtained sequences, contigs, singletons, and redundancy rates are given in Table 1 for all the libraries (the body wall-subtracted branchial plume library (BR-BW), the branchial plume-subtracted body wall library (BW-BR), the body wall-subtracted trophosome library (TR-BW) and the trophosome-subtracted body wall library (BW-TR)). The redundancy rates of the libraries range from 80.5 to 95.6 %. This indicates that additional sequencing should bring few or no new sequences. The sequences obtained were assembled into 58, 45, 59 and 17 different sequences (each putatively representing one cDNA) respectively for each library. Of those, 38, 17, 36 and 6 appeared as singletons, respectively.

Table 1 Overall statistics based on the analysis of each library

Figure 1 shows the proportion of sequences with or without homology in the GenBank protein database for the four libraries: BR-BW (Fig. 1A), BW-BR (Fig. 1B), TR-BW (Fig. 1C) and BW-TR (Fig. 1D). These sequences are split into five categories: 1) mitochondrial sequences, 2) all processes sequences (with E-value < 1), 3) hypothetical sequences (with E-value < 1), 4) hypothetical sequences (with E-value > 1) and 5) no similarity found. If we consider that the last two categories cannot improve our knowledge, the proportions of cDNAs that matched with good homology scores in the GenBank database are 54.2 % in the BR-BW, 54.6 % in the BW-BR, 45.9 % in the TR-BW and 70.6 % in the BW-TR libraries. The choice of E-value = 1 as the threshold to assess the degree of similarity to protein sequences in the GenBank database is arbitrary: we are well aware that E-values of 10⁻³ to 10⁻⁵ are usually chosen but, given that the sequences we obtained are relatively short and that there is little molecular data on vestimentiferans or annelids in general, we decided to use this high threshold. For example, some cytochrome c sequences (BRbwC9) showed high E-values (0.11, see Table 2) although sequence identity reached 81% based on a 16 amino acid alignment.
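The five-way split described above can be written as a small decision rule. The following Python sketch is only an illustration and not the procedure actually used to build Figure 1: the function name, the two boolean flags, and the treatment of hits with an E-value of 1 or more (all placed in category 4) are assumptions.

```python
from typing import Optional

E_VALUE_THRESHOLD = 1.0  # permissive cut-off discussed in the text


def categorize_hit(e_value: Optional[float],
                   is_mitochondrial: bool = False,
                   is_hypothetical: bool = False) -> str:
    """Assign a best BLAST hit to one of the five categories of Figure 1.

    e_value is None when no hit was returned at all.
    Assumption: any hit with E-value >= 1 falls in category 4.
    """
    if e_value is None:
        return "no similarity found"
    if is_mitochondrial:
        return "mitochondrial"
    if e_value >= E_VALUE_THRESHOLD:
        return "hypothetical (E-value > 1)"
    if is_hypothetical:
        return "hypothetical (E-value < 1)"
    return "all processes (E-value < 1)"


# The cytochrome c example from the text (BRbwC9, E-value = 0.11) is kept
# at this threshold, whereas a conventional 1e-3 cut-off would discard it.
brbwc9_category = categorize_hit(0.11)
```

With the usual 10⁻³ cut-off, short but well-conserved fragments such as BRbwC9 would be moved to the uninformative categories, which is the trade-off the permissive threshold is meant to avoid.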

Table 2 List of contigs with best E-values obtained for the BR-BW library sequences
Figure 1

Proportion of sequences and contigs split into 5 main categories (mitochondrial, all processes sequences (E-value<1), hypothetical sequences (E-value<1), hypothetical sequences (E-value>1) and no similarity found). (A) Results for the BR-BW cDNA library. (B) Results for the BW-BR cDNA library. (C) Results for the TR-BW cDNA library. (D) Results for the BW-TR cDNA library.

Specific sequences obtained from each subtracted library

Subtracted branchial plume library (BR-BW)

Fig. 1A shows a strikingly high proportion of mitochondrial sequences (62.4%) compared to what is observed in the three other libraries. Table 2 lists the sequences with the best E-values obtained from the branchial plume library. Among the most redundant clones, we found a mitochondrial 16S ribosomal sequence (contigs 1 and 2, corresponding to two fragments of mt16S, contain 78 and 32 sequences, respectively). High homology scores were obtained for several contigs, in particular contigs 12 (carbonic anhydrase), 13 (Major Vault Protein), 14 (chitinase precursor), 15 (cathepsin L-like), 16 (BTG1 protein), 17 (α-tubulin), 18 (hydroxylamine reductase), and 24 (super cysteine-rich protein). The carbonic anhydrase cDNA obtained in the branchial plume (hereafter referred to as RpCAbr) was different from the one already sequenced from a trophosome cDNA sample [12] (hereafter referred to as RpCAtr). The RpCAbr full-length sequence (GenBank:EF490380, [19]) is only 66% identical in amino acids to RpCAtr (GenBank:AJ439711, [12]).

Subtracted body wall library (BW-BR)

The sequencing of the reciprocal library (i.e. BW-BR) revealed only 3 cDNAs in common with the BR-BW library: the two sections of the mitochondrial 16S large subunit rRNA (although they form a smaller proportion of the sequences), and the branchial carbonic anhydrase (RpCAbr) cDNA. Although it was not the main target, this library yielded interesting sequences involved in the formation of the tube and therefore expected to be specific to the body wall (Table 3), including a Riftia pachyptila exoskeleton β-chitin-binding transcript (contig 9) almost identical (2 differences over 74 amino acids aligned) to the one sequenced by Chamoy and coworkers [20]. We also obtained a different exoskeleton β-chitin-binding transcript (contig 10), whose highest homology score was with the previously sequenced one [20]. Surprisingly, contig 11 showed a high homology with galaxin, a protein present in the calcified exoskeleton of the coral Galaxea fascicularis. Two transcripts coding for respiratory proteins were also found in this library: a new extracellular hemoglobin linker (contig 14) [21], which matches the Sabella spallanzanii linker chain sequence, and an intracellular globin (contig 15).

Table 3 List of contigs with best E-values obtained for the BW-BR library sequences

Subtracted trophosome library (TR-BW)

Sequences obtained from this library were compared to the unpublished genomic sequences of the symbiont (with Horst Felbeck's permission) in order to verify that they were host-specific, and not a contamination from the symbionts that are contained in this tissue. The trophosome library yielded far fewer identifiable sequences (Table 4). Notably, among the identifiable sequences, we recovered the previously sequenced carbonic anhydrase transcript (RpCAtr) [12], and transcripts coding for a large number of globin chains (contigs 8–15), for a hemoglobin linker (contig 16), and for a myohemerythrin (contig 17). Among the cDNAs with a significant blast value (all processes sequences (E-value<1)), more than 56 % are respiratory pigment protein transcripts. These are partial fragments of the already known A1, A2, B1 and B2 chains of the giant extracellular hemoglobin of Riftia pachyptila, together with probably new A2 (contig 12) and B1 (contig 14) chains. The extracellular hemoglobin linker identified here (contig 16) is different from the one identified in the BW-BR library. This brings the number of known partial linker cDNAs up to three out of the four known types of linker chains [22]. A sequence coding for a serine-threonine rich protein was also found and matches a T-cell receptor protein sequence (contig 18, E-value = 0.58). Among the unknown sequences, the most abundant contig was composed of 24 sequences (contig 27).

Table 4 List of contigs with best E-values obtained on the TR-BW library sequences

Subtracted body wall library (BW-TR)

Overall, the general results found for the BW-TR library (Table 5) are similar to those found for the BW-BR library (Table 3), although with a lower number of contigs. However, given the very large number of exoskeleton β-chitin-binding sequences (contigs 3 and 4, comprising 24 and 59 sequences, respectively), we suspect that normalization was less efficient for these transcripts. The exoskeleton β-chitin-binding transcripts, galaxin, and myosin chains found in this library are the same as those found in the BW-BR library.

Table 5 List of contigs with best E-values obtained for the BW-TR library sequences

Checking the subtraction procedure and the specificity of the libraries

We only found 3 common cDNAs between the BR-BW and BW-BR libraries and 2 common cDNAs between the TR-BW and BW-TR ones. We were able to assess the degree of successful subtraction for several transcripts by performing regular PCR on subtracted and unsubtracted poly-A cDNA pools. Some results are shown in Fig. 2.

Figure 2

Typical PCR profiles obtained after amplification of fragments of interesting cDNAs. S = subtracted sample; UN = unsubtracted sample. (A) and (B) Abundant tissue-specific transcripts: exoskeleton β-chitin-binding transcript (A) and galaxin transcript (B). (C) and (D) Transcripts enriched after SSH procedure: chitinase precursor transcript (C) and RpCAtr (D). (E) and (F) Rare transcripts enriched after SSH procedure: RpCAbr transcript (E) and intracellular globin transcript (F). (G) Abundant transcript in one tissue and rare in other tissue: MVP transcript. (H) Unequally subtracted transcript: cytochrome c oxidase subunit I transcript. The faint bands appearing at a smaller size than expected in some wells are interpreted as non-specific amplification (possible primer dimerization under specific conditions).

The profiles of abundant tissue-specific transcripts are presented in Fig. 2A (exoskeleton β-chitin-binding transcript) and 2B (galaxin transcript). The exoskeleton β-chitin-binding transcript was clearly present in the body wall tissue (Fig. 2A) in both unsubtracted cDNA sample and subtracted BW-BR cDNA sample (before and after SSH procedure respectively). The same profile was obtained for the galaxin transcript (Fig. 2B).

Typical profiles of enrichment of transcripts after SSH procedure are shown in Fig. 2C,D. The chitinase precursor sequence is enriched by the SSH, appearing on agarose gel after 28 cycles instead of 33 cycles on the branchial plume cDNA (Fig. 2C) and RpCAtr sequence amplification is visible after 18 cycles instead of 28 cycles from the trophosome cDNA before the SSH procedure (Fig. 2D).

Fig. 2E and 2F show typical profiles of amplification of rare transcripts. RpCAbr amplification can be seen on the BW-TR cDNA pool after 23 cycles and not from the body wall unsubtracted cDNA, even at 33 cycles (Fig. 2E). Amplification of the intracellular globin sequence can be seen on the BW-BR cDNA pool after 23 cycles and not from the body wall unsubtracted cDNA, even after 40 cycles (Fig. 2F).

Fig. 2G shows results of the amplification profile of MVP. This transcript appeared abundant in one tissue (the branchial plume) and rare in another (the body wall). Only the subtraction procedure allowed its detection in the body wall (not detected in the unsubtracted sample, even after 33 cycles).

Finally, Fig. 2H illustrates the difference in SSH efficiency for the same transcript in two different subtraction procedures (BR-BW and BW-BR). Cytochrome c oxidase I (ccox I) was successfully amplified in both branchial plume and body wall unsubtracted cDNA pools after 18 and 23 cycles, respectively. No amplification was observed in the BW-BR subtracted cDNA pool, in contrast with the BR-BW one. It seems that the subtraction was successful in the BW-BR library, whereas this transcript could not be successfully subtracted in the BR-BW library.

Relative expression levels of some target genes over the three types of tissues

We used quantitative PCR on some transcripts to further assess tissue specificity and to gain data on the relative level of expression in each tissue starting from total cDNA. For all studied transcripts, PCRs were performed from different initial amounts of total cDNA in order to construct standard curves. The equations of the curves are reported in Table 6. All PCR efficiencies (E in Table 6), calculated from the slopes of the curves, varied between 92 and 107%. For the dilution range we chose, we could only amplify the Major Vault Protein (MVP) transcript in the branchial plume and the body wall tissues, the chitinase precursor (ChPr) in the branchial plume tissue, and the myohemerythrin (MH), T-cell receptor (TCR) and contig 27 from TR-BW library (TRbwC27, unknown protein) transcripts in the trophosome tissue.
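Amplification efficiency is conventionally derived from the slope of a standard curve as E = 10^(−1/slope) − 1. The formula used for Table 6 is not spelled out here, so the sketch below assumes this common convention and a curve of the form Ct = slope × log10(cDNA amount) + intercept. The dilution series and Ct values are invented for illustration and are not data from Table 6.

```python
def fit_standard_curve(log10_amounts, ct_values):
    """Ordinary least-squares fit of Ct = slope * log10(amount) + intercept."""
    n = len(log10_amounts)
    if n < 2 or n != len(ct_values):
        raise ValueError("need at least two paired points")
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def efficiency_percent(slope):
    """Assumed convention: E = 10^(-1/slope) - 1, expressed in percent."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


# Hypothetical ten-fold dilution series (NOT data from Table 6).
log10_amounts = [0.0, 1.0, 2.0, 3.0]
ct_values = [30.0, 26.7, 23.3, 20.0]

slope, intercept = fit_standard_curve(log10_amounts, ct_values)
efficiency = efficiency_percent(slope)
print(f"slope = {slope:.2f}, efficiency = {efficiency:.1f} %")
```

Under this convention, a slope close to −3.32 corresponds to a doubling of product at each cycle (100% efficiency), and shallower slopes give efficiencies above 100%.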

Table 6 Equations of the standard curves obtained by amplification from total cDNA samples of branchial plume, trophosome and body wall tissues

Relative expression levels were calculated between the different tissues of a whole organism after normalization of the transcript amplifications with the 18S reference gene. The results from the analysis of several individuals are shown in Fig. 3. The 16S ribosomal gene has 7.5-fold and 10-fold higher expression levels in the branchial plume compared to the trophosome and the body wall, respectively. The ccox I and ATP synthase F1 transcripts were equally present in the branchial plume and trophosome tissues but comparatively less abundant in the body wall tissue (about 16-fold and 43-fold, respectively). The new CA sequence (RpCAbr) is preferentially expressed in the branchial plume tissue compared to the trophosome (1,000-fold less expression) and the body wall (109-fold less expression) tissues. Conversely, the RpCAtr transcript was more abundant in the trophosome tissue than in the branchial plume (12-fold less expression) and the body wall (2,500-fold less expression) tissues. Relative quantification analyses also showed that cathepsin L-like genes are up-regulated in the branchial plume tissue compared to the trophosome (about 4-fold less expressed) and the body wall (about 7-fold less expressed) tissues of the worm. The MVP transcript was nearly 10-fold more abundant in the branchial plume than in the body wall tissue, but we could not detect it in the trophosome tissue total cDNA.
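The normalization to 18S and the use of a calibrator tissue can be illustrated with an efficiency-corrected relative quantification. The exact calculation behind Fig. 3 is not given in this section, so the sketch below assumes the widely used efficiency-corrected ratio model; the function name and all Ct values are hypothetical.

```python
def relative_expression(e_target, ct_target_cal, ct_target_sample,
                        e_ref, ct_ref_cal, ct_ref_sample):
    """Expression of a target in a sample tissue relative to a calibrator tissue.

    e_target and e_ref are amplification factors per cycle
    (2.0 corresponds to 100 % efficiency). Assumed model:
        ratio = e_target ** (Ct_cal - Ct_sample) for the target
              / e_ref    ** (Ct_cal - Ct_sample) for the reference gene
    """
    target_term = e_target ** (ct_target_cal - ct_target_sample)
    reference_term = e_ref ** (ct_ref_cal - ct_ref_sample)
    return target_term / reference_term


# Hypothetical Ct values: the target appears 10 cycles later in the sample
# tissue than in the calibrator, while the reference gene is unchanged.
ratio = relative_expression(
    e_target=2.0, ct_target_cal=20.0, ct_target_sample=30.0,
    e_ref=2.0, ct_ref_cal=12.0, ct_ref_sample=12.0,
)
fold_less = 1.0 / ratio
print(f"ratio = {ratio:.6f}, i.e. about {fold_less:.0f}-fold less expression")
```

A difference of about ten cycles at full efficiency thus corresponds to a fold change on the order of 1,000, which gives a sense of the Ct gaps implied by the largest differences reported above.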

Figure 3

Relative expression levels of ribosomal RNA 16S, ccox I, ATPF1, Cathepsin, RpCAbr, RpCAtr, and MVP transcripts in the branchial plume, trophosome and body wall tissues. For each transcript, the calibrator tissue was chosen as the tissue with the highest expression: the branchial plume was the calibrator for ribosomal RNA 16S, ccox I, Cathepsin, RpCAbr and MVP amplifications, and the trophosome was the calibrator for RpCAtr and ATPF1 amplifications. The number of tissue replicates (n) ranges from 3 to 4, and corresponds to the number of intra-individual tissue pairs available.

Relative expression calculations for ChPr, MH, TCR and TRbwC27 sequences could not be made because we could only generate standard curves from the branchial plume total cDNA (for the ChPr transcript) or from the trophosome total cDNA (for MH, TCR and TRbwC27), which indicates that these transcripts are tissue-specific.

Discussion

Use of the SSH method for the study of symbiosis

We used SSH on different tissues from a single individual to look for genes involved in the functioning of the symbiosis because it is not possible to obtain aposymbiotic adult Riftia pachyptila. The body wall was used as a reference tissue to find specific proteins expressed in the gills (main metabolite exchange organ with the milieu) and in the trophosome (organ that houses the symbiotic bacteria). We then focused our attention on some chosen transcripts for a quantitative analysis. The remaining unidentified sequences (many of which could correspond to 3'UTR portions of cDNA) could prove interesting. Their future identification will require either a RACE approach or hybridization on a full length cDNA library.

Efficiency of the SSH method

All 10 sequences that were studied more closely by quantitative PCR showed differential expression in agreement with the subtractive libraries in which they were found. A transcript obtained in a given library showed the highest expression in the expected tissue, as evidenced by rapid PCR validation (Fig. 2) and by quantitative PCR on several individuals (Fig. 3). In addition, although tubulin and actin are constitutively expressed in all cells of all tissues, only one tubulin transcript sequence was obtained from the BR-BW library, and one actin sequence from the BW-TR library. This demonstrates the adequate subtraction of these common sequences. However, some sequences are highly represented, possibly indicating a subtraction that was not as efficient. ccox I, for example, could not be eliminated in the BR-BW cDNA pool, but this may be due to the fact that it was more highly expressed in the branchial plume than in the body wall tissue. Strangely, however, we recovered the RpCAbr transcript from the BW-BR library (Table 3) although it was 109-fold less abundant in the body wall than in the branchial plume tissue (Fig. 3).

As noted by Ji and coworkers [23], SSH PCR favors highly differentially expressed genes. In our quantitative PCR results, some transcripts showed such high differential expression (e.g. RpCAbr in the branchial plume compared to the other tissues, and RpCAtr in the trophosome compared to the other tissues). These authors suggest that the primary application of SSH PCR should be the detection of dramatic alterations in gene expression, as is the case, for example, when profiling gene expression in two different tissues. Our use of SSH to compare pairs of tissues therefore seems appropriate.

Protein degradation and turnover in the branchial plume tissue

Some transcripts were preferentially expressed in the branchial plume tissue. Relative quantification of the cathepsin transcript (a degradation enzyme found in lysosomes) revealed higher expression in the branchial plume tissue than in the body wall (protected by the tube) and the trophosome. The plume is the only organ in direct contact with seawater, and is thereby strongly exposed to hydrogen sulfide and other toxic molecules such as heavy metals, which are abundant in the hydrothermal vent environment. Electron-dense organelles (EDO) seem to be very common in tissues of sulfide-adapted marine annelids. Such structures have previously been observed in both the Riftia pachyptila epidermal body wall [24] and the branchial plume organ (Ann Andersen, personal communication). Arp and coworkers [25] hypothesized that EDO structures could actually be secondary lysosomes. This would be in agreement with our results on cathepsin expression, found to be highest in this highly exposed tissue. Julian and coworkers [26] showed that even in sulfide-tolerant organisms like the annelid Glycera dibranchiata, sulfide exposure poisons the mitochondria, leading to depolarization that is not reversible. Lysosomes could degrade mitochondria that have been damaged by sulfide exposure. In addition, we found a Rab5 GDP/GTP exchange factor (BRbwC22), which is a member of the Ras superfamily of GTPases. This protein, involved in vesicle trafficking, is known to be located in early endosomes that precede the formation of lysosomes.

Interestingly, another pathway of degradation could also be involved since we obtained a Valosin-Containing Protein (VCP, BRbwC21), which is required in ubiquitin-proteasome degradation [27]. The existence of two VCP transcripts has recently been demonstrated in an annelid, the earthworm Eisenia fetida [28]. Our sequence shows the best homology scores with the predicted VCP protein sequence of Strongylocentrotus purpuratus [Expect = 0.014] and the eVCP-1 isoform from Eisenia fetida [Expect = 0.025] which is ubiquitously expressed in this worm [28].

If EDOs do represent autophagic degradation of organelles as suggested by Arp and coworkers [25], rapid replacement of organelles should take place [29]. The high expression level of ribosomal 16S, an essential gene for the translation of mitochondrial messenger RNAs into proteins, and the presence of some transcripts linked to transcription (BRbwC19 and 20, Table 2) are then consistent with a high protein turnover in this tissue.

Sulfide oxidation with the concomitant production of ATP by the mitochondria of the annelid Arenicola marina has been shown [30]. However, no protein sequence was found here that would suggest a similar property of R. pachyptila mitochondria.

Hydroxylamine reductase protein

Formerly known as prismane, hydroxylamine reductase is a member of the Hybrid-Cluster Protein (HCP) family and is thought to play a role in nitrogen metabolism. It catalyses the reduction of hydroxylamine to form ammonia using NADH. In rat liver mitochondria, this enzyme is firmly attached to the mitochondrial membrane [31] and its activity can prevent hydroxylamine from inhibiting mitochondrial respiration [32]. However, blasts of our sequence did not match the few eukaryotic sequences available but resulted in 100 bacterial sequence hits. The best ones are those of the actinobacterium Salinispora arenicola [Expect = 5e-19] and the α-proteobacterium Rhodospirillum rubrum [Expect = 5e-19]. Because this sequence did not match any sequence in the Riftia symbiont genomic database, it could be contamination from bacteria living close to the branchial plume of the worm.

Major Vault Protein gene expression

The Major Vault Protein (MVP) (100 kDa) is the major protein component of vaults, ribonucleoprotein particles of 13 MDa. Some studies established that vaults could be involved in nucleocytoplasmic transport of ribosomes and/or mRNA [33]. This would be consistent with our 16S expression results in the branchial plume tissue, in which this protein is probably transcribed at high levels. Other studies indicate the participation of MVP in drug resistance mechanisms, where it could act as a nucleocytoplasmic and vesicular transporter carrying drugs and/or metabolites to exocytotic vesicles or proton pumps [34, 35]. The MVP gene in Mytilus edulis was shown to be predominantly expressed in epithelia-rich tissues such as the gills and digestive gland, and could be involved in multixenobiotic resistance [36]. In our study, the MVP transcript is preferentially expressed in the branchial plume tissue compared to the body wall, while no MVP transcript was detected in the trophosome samples. Such a protein in the branchial plume tissue may serve to temporarily immobilize toxic molecules before they are processed.

Chitinase gene expression

Interestingly, a chitinase precursor was recovered as a branchial plume specific transcript. A previous report indicated chitinase activity in the opisthosome and branchial plume of R. pachyptila [37]. Chitin is a major component of the tube of R. pachyptila, produced by specialized glands located in the body wall and the vestimentum [38]. Chitinase activity was suggested to be involved in tube growth and tube shape modifications [37]. A chitinase sequence was recently discovered in the hydroid cnidarian Hydractinia [39], and a possible role of the chitinase enzyme in pattern formation and allorecognition was suggested. Interestingly, the transcript was exclusively expressed in ectodermal tissues of the animal, and the authors also suggested a possible role in host defense against pathogens. Such a hypothesis could be interesting to explore given our quantitative PCR experiments, because we could only amplify this transcript from cDNA from the branchial plume, the only organ in contact with the surrounding seawater.

Tissue-specific expression of different carbonic anhydrases

Our quantification analyses showed a higher abundance of the RpCAbr transcript in the branchial plume compared to the trophosome (present at very low levels) and the body wall tissues. In contrast, the RpCAtr transcript was very abundant in the trophosome compared to the branchial plume (medium levels) and the body wall tissues. Fluorescence in situ hybridization confirmed the co-expression of the two transcripts in the branchial plume, in contrast with the trophosome where only one transcript could be detected [19]. An alignment of these translated CA cDNAs with vertebrate and non-vertebrate CA protein sequences revealed the conservation of most amino acids involved in the catalytic site, indicating that the two proteins are probably functional if the cDNAs are translated [19].

Myohemerythrin, T-cell receptor, and unidentified transcripts from the trophosome library

A complete coding sequence obtained from the TR-BW library (contig 17, Table 4) showed a very high homology score with a myohemerythrin sequence from the sipunculan Sipunculus nudus [GenBank:CAG14944] (Expect = 1e-11). The complete Riftia sequence has an open reading frame of 120 amino acids. Myohemerythrin is an oxygen-binding protein that participates in the storage of oxygen in muscles. Such a protein could be involved in the regulation of cadmium levels in the gut of the annelid Nereis diversicolor [40]. In Hirudo medicinalis, it is thought to have indirect antibacterial properties, regulating free iron availability so as to deprive bacteria of the iron essential for their growth [41].

Both TCR and TRbwC27 cDNAs showed specific expression in the trophosome tissue, where they could be essential. The TCR transcript first caught our attention because it matched a sequence fragment coding for a T-cell receptor, which is a complex of integral membrane proteins that participates in the activation of T-cells in response to the presentation of an antigen. The trophosome is mostly composed of bacteriocytes, which house bacterial cells in intracellular vacuoles, and cellular recognition may be very important for the functioning of this tissue. As for TRbwC27, it is a large fragment of 273 nucleotides which is highly represented in our subtracted library but did not yield reliable Blastx homology E-values.

Methods

Animals and sampling

Specimens of Riftia pachyptila were collected at the Oasis site (17°25.385 S, 113°12.280 W) at a depth of 2600 meters along the South East Pacific Rise during the BIOSPEEDO 2004 cruise. For each individual, parts of the branchial plume, trophosome and body wall tissues were isolated on ice, placed in RNAlater (Ambion) for 24 h at 4°C and then frozen in liquid nitrogen.

RNA extraction

Plume, trophosome and body wall tissue samples were ground individually in liquid nitrogen under RNase-free conditions. For each tissue, total RNA was extracted using the RNAble buffer (Eurobio) following the manufacturer's instructions. Then, both for library constructions and for complete sequencing, messenger poly-A RNAs were purified using the oligo-dT resin column of the mRNA Purification Kit (Amersham).

Construction of subtractive tissue-specific cDNA libraries

Libraries were constructed from tissues taken from a single individual, thereby representing a single organism's transcriptome. A total of 4 libraries were produced: a branchial plume vs. body wall subtracted library (and its reciprocal) and a trophosome vs. body wall subtracted library (and its reciprocal). For all tissue pairs, cDNA synthesis as well as suppression subtractive hybridization (SSH, [42, 43]), including steps of adaptor ligation, subtractive hybridization, and selective amplification, were performed following the protocol of the Clontech PCR-Select™ cDNA Subtraction Kit (BD Biosciences).

In the BR-BW library, SSH was performed to produce a cDNA library enriched in branchial plume specific transcripts. The tester sample was the cDNA population from the branchial filaments (BR) and the driver sample was the cDNA population from the body wall tissue (BW). In the BW-BR library, the tester and driver samples were reversed, and SSH was performed to produce a cDNA library enriched in body wall specific transcripts. In the TR-BW library, the tester sample was the cDNA population from the trophosome (TR) and the driver sample was the cDNA population from the body wall tissue (BW). In the BW-TR library, the tester and driver samples were reversed.

Cloning and sequencing

For each SSH procedure, the whole amplification product was cloned into the TOPO®-TA cloning vector (Invitrogen), producing a range of cDNA fragment sizes. Nearly 200 cDNA fragments were sequenced for each library. Plasmid DNA from individual colonies was purified with the FlexiPrep kit (Amersham) and used in a dye-primer cycle sequencing reaction with T3 or T7 universal primers and the Big Dye® Terminator V3.1 Cycle Sequencing kit (Applied Biosystems). Reactions were then run on a 16-capillary 3130 Applied Biosystems sequencer.

Sequence analysis and homology search

Most of the chromatograms obtained after sequencing were processed with PHRED [44] and Seqclean software (TIGR, The Institute for Genomic Research, Rockville, MD, USA) to remove vector and adaptor sequences. Additional sequences were subsequently processed manually. Clustering was performed with the TGICL programs, Megablast and CAP3 [45]. Clusters and contigs were formed on the whole set of sequences and also individually for each of the four libraries. Contigs were then verified manually to detect possible chimeras. BLAST analyses of the cDNA library sequences were performed on the NCBI server. The assembled sequences were analyzed for homology with known sequences in databases using the BlastX and BlastN programs [46] and were also processed with the PhyloGena software [47], which combines homology searching and phylogenetic reconstruction, to verify the homology assignments.

The redundancy corresponds to the probability that a newly sequenced cDNA was previously obtained. Redundancy rate was calculated with the formula:

R = (1- (Nu/Nt)) × 100, where Nu is the number of unique sequences and Nt is the total number of sequences.
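
The redundancy formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the example counts are hypothetical and are not taken from the libraries described here.

```python
def redundancy_rate(n_unique: int, n_total: int) -> float:
    """Return R = (1 - Nu/Nt) * 100, the redundancy rate in percent."""
    if n_total <= 0:
        raise ValueError("n_total must be a positive number of sequences")
    if not 0 <= n_unique <= n_total:
        raise ValueError("n_unique must lie between 0 and n_total")
    return (1 - n_unique / n_total) * 100


# Hypothetical example: 200 sequenced clones collapsing into 150 unique sequences.
print(redundancy_rate(150, 200))  # 25.0
```

A library in which every clone is unique gives R = 0, and R approaches 100 as the same few sequences are recovered repeatedly.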

Validation of differential expression by simple PCR

Unsubtracted and subtracted PCR samples (obtained before and after the SSH procedure, respectively) were prepared as recommended in the Clontech PCR-Select™ cDNA Subtraction Kit (BD Biosciences). These samples were then diluted ten-fold in sterile milliQ water. For each transcript tested, PCRs were conducted on these diluted unsubtracted and subtracted samples with specific forward and reverse primers (Table 7). Each reaction mixture was composed of 1 μl of cDNA sample, 1.2 μl of specific forward primer (10 μM), 1.2 μl of specific reverse primer (10 μM), 22.4 μl of sterile water, 3 μl of 10X PCR reaction buffer, 0.6 μl of dNTP Mix (10 mM) and 0.6 μl of 50X Advantage cDNA Polymerase Mix (BD Biosciences). The following thermal cycling program was used for 33 cycles: 94°C for 30 s, 60°C for 30 s, and 68°C for 2 min. For each cDNA pool tested, a 5 μl aliquot was removed from the reaction mixture every 5 cycles, starting at the end of cycle 18.
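
As a quick consistency check on the protocol above, the following Python sketch sums the listed component volumes and enumerates the cycles at which aliquots are removed. The dictionary keys and function names are illustrative labels only, and the final cycle of 33 is taken from the cycling program stated in the text.

```python
# Component volumes in microlitres, as listed in the reaction mixture.
REACTION_MIX_UL = {
    "cDNA sample": 1.0,
    "forward primer (10 uM)": 1.2,
    "reverse primer (10 uM)": 1.2,
    "sterile water": 22.4,
    "10X PCR reaction buffer": 3.0,
    "dNTP mix (10 mM)": 0.6,
    "50X polymerase mix": 0.6,
}


def total_volume_ul(mix: dict) -> float:
    """Sum the component volumes, rounded to 0.1 ul to hide float noise."""
    return round(sum(mix.values()), 1)


def sampling_cycles(first: int = 18, step: int = 5, last: int = 33) -> list:
    """Cycles at whose end an aliquot is removed from the reaction."""
    return list(range(first, last + 1, step))


print(total_volume_ul(REACTION_MIX_UL))  # 30.0
print(sampling_cycles())                 # [18, 23, 28, 33]
```

With these figures, four 5 μl aliquots are drawn from a 30 μl reaction, and the 3 μl of 10X buffer corresponds to a 1X final concentration.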

Table 7 Primer sequences used for transcript amplification

SYBR Green quantitative PCR

Reverse transcription (RT)

A fresh RT reaction was carried out with random primers on each total RNA sample (branchial plume, trophosome and body wall). The reaction mixture was composed of 2 μl of M-MLV 5X RT buffer, 0.5 μl of BSA (10 mg/ml), 1 μl of total RNA (1.24 μg/μl), 2.5 μl of dNTP (4 mM total), 1.5 μl of Random Primer 9 (Ozyme) (100 ng/μl) and 3 μl of DEPC water. The reaction mixtures were then incubated at 80°C for 5 minutes and placed on ice. M-MLV RT (1 μl) was added to each reaction mixture, and all reactions were incubated at 42°C for 1 hour and finally placed on ice.

Amplification

Specific pairs of primers for some target genes (Table 7) were designed using the software Primer Express. The 18S rRNA transcript was chosen as a reference gene for the normalization of expression data and was amplified with the 18h and 18L primers designed by Halanych and coworkers [48]. For amplifications, the Power SYBR Green PCR master mix (PE Applied Biosystems) was used in 23 μl reaction mixtures on a Chromo4™ System CFB-3240 (BIORAD). Amplification conditions were 40 cycles with the following profile: 95°C for 30 s, 60°C for 30 s, and 72°C for 1 min.

Standard curves

In order to estimate the relative expression level of each transcript by the 2^{-ΔΔCt} method, we calculated the PCR efficiency of each transcript amplification to verify that it did not differ greatly from that of the reference transcript (18S rRNA). PCRs were performed on a dilution range of total cDNA from each tissue (from one sample), with starting amounts of total cDNA ranging from 6.2 pg to 620 ng. Each PCR was performed in triplicate. For each initial template quantity, we recorded the threshold cycle (Ct), which corresponds to the number of cycles required to reach a set quantity of amplified cDNA during the exponential phase. The standard curves were generated by plotting the Ct obtained for each dilution against the log of the initial template quantity.

Data analysis

For each transcript, the efficiency (E) was calculated from the slope (S) of the standard curve using the formula:

E = 10^{-1/S} - 1
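
A minimal sketch of this calculation in Python follows. It assumes that the slope S is obtained by ordinary least-squares regression of Ct on the log10 of the initial template quantity. The function names, the ten-fold dilution steps and the Ct values are hypothetical and serve only to illustrate the formula.

```python
import math


def standard_curve_slope(template_amounts, ct_values):
    """Least-squares slope of Ct against log10(initial template amount)."""
    if len(template_amounts) != len(ct_values) or len(ct_values) < 2:
        raise ValueError("need at least two paired (amount, Ct) points")
    xs = [math.log10(q) for q in template_amounts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ct_values) / len(ct_values)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("template amounts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    return sxy / sxx


def amplification_efficiency(slope):
    """E = 10^(-1/S) - 1; E = 1 means the product doubles every cycle."""
    return 10 ** (-1 / slope) - 1


# Hypothetical dilution series in pg (6.2 pg to 620 ng) with made-up Ct values.
amounts_pg = [6.2, 62, 620, 6200, 62000, 620000]
cts = [30.0, 26.6, 23.2, 19.8, 16.4, 13.0]

slope = standard_curve_slope(amounts_pg, cts)
print(round(slope, 2))                            # -3.4
print(round(amplification_efficiency(slope), 3))
```

A slope close to -3.32 corresponds to E close to 1, that is, a doubling of product at each cycle.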

Once the amplification efficiencies of the reference gene and the target gene had been verified to be approximately equal, we calculated the relative expression level for each gene analyzed. In each tissue, amplification of the target transcript was compared to the endogenous control amplification in order to obtain the normalized number of cycles (NNC):

NNC = Ct_{target} - Ct_{18S}

Then, for relative quantification, we used the 2^{-ΔΔCt} method [49] for individuals for which we had analyzed at least one pair of tissues. Relative quantification results were obtained by comparing levels of expression with those of the calibrator tissue, chosen as the tissue in which the highest expression was observed, using the following calculation:

Relative expression level = 2^{-(NNC_{sample} - NNC_{calibrator})}.
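
The two steps above (normalization to 18S, then comparison with the calibrator tissue) can be sketched as follows. The Ct values and function names are hypothetical. The calibrator is taken here as the tissue with the lowest NNC, which is one way of reading "highest expression" and is an assumption of this sketch.

```python
def normalized_cycles(ct_target: float, ct_18s: float) -> float:
    """NNC = Ct_target - Ct_18S for one tissue."""
    return ct_target - ct_18s


def relative_expression(nnc_sample: float, nnc_calibrator: float) -> float:
    """Relative expression level = 2^-(NNC_sample - NNC_calibrator)."""
    return 2 ** -(nnc_sample - nnc_calibrator)


# Hypothetical (Ct_target, Ct_18S) pairs for the three tissues.
ct_pairs = {
    "branchial plume": (22.0, 12.0),
    "trophosome": (25.0, 12.0),
    "body wall": (27.0, 12.0),
}

nnc = {tissue: normalized_cycles(t, r) for tissue, (t, r) in ct_pairs.items()}

# Assumption: the calibrator is the tissue with the lowest NNC (highest expression).
calibrator = min(nnc, key=nnc.get)

levels = {tissue: relative_expression(v, nnc[calibrator]) for tissue, v in nnc.items()}
print(calibrator)  # branchial plume
print(levels)      # {'branchial plume': 1.0, 'trophosome': 0.125, 'body wall': 0.03125}
```

With this choice of calibrator, the calibrator tissue has a relative level of 1 and every other tissue has a level at or below 1.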