Introduction

Global warming is associated with an increased frequency of environmental extremes, such as heat waves, floods, and droughts (IPCC 2007). Such extremes may be more important for the persistence of local populations than changes in mean conditions (Gaines and Denny 1993). In coastal–habitat-forming species, such as seagrasses or corals, losses caused by summer surface water extremes in temperature have already been reported (Cerrano et al. 2000; Hughes et al. 2003; Reusch et al. 2005). Central questions in marine ecology are thus how organisms physiologically adjust to such stress events, which molecular genetic mechanisms may confer plasticity and tolerance toward extreme conditions, and whether such plasticity itself has a heritable genetic basis and may evolve in the face of global warming (Hofmann et al. 2005; Reusch and Wood 2007).

Ecogenomic techniques are increasingly utilized in the marine realm and hold great promise for addressing some of the above questions (Hofmann et al. 2005; Dupont et al. 2007). Gene transcription profiling, in particular, is one important step toward identifying those genes and metabolic pathways that underlie ecologically important traits, such as stress tolerance (Feder and Mitchell-Olds 2003; Vasemägi and Primmer 2005; Ouborg and Vriezen 2007). In marine systems, transcription profiling has been successful in unravelling the genetic basis of temperature adaptation (Whitehead and Crawford 2006), of calcification in phytoplankton (Fujiwara et al. 2007), and of the response of marine plant and animal species to abiotic stresses (Pearson et al. 2001; Jenny et al. 2002; Hashimoto et al. 2004; Kassahn et al. 2007).

The construction of expressed sequence tag (EST) libraries is a convenient entry point for a whole suite of ecogenomic tools (Bouck and Vision 2007). EST libraries enable us to obtain DNA sequence information of expressed genes sufficiently detailed to tentatively characterize the underlying gene function by homology search. Bulk RNA containing messenger (m)RNA is extracted, converted into complementary (c)DNA, and subsequently cloned and sequenced using standard methods. EST libraries are a cost-effective tool to characterize genes important under particular conditions, as well as the starting point for the development of molecular genetic markers, such as gene-linked microsatellites and single nucleotide polymorphisms (SNP). In marine species, gene-linked microsatellites (EST-SSR = simple sequence repeats) were successfully identified, for example, in oyster (Wang and Guo 2007) and shrimp (Pérez et al. 2005).

Any EST library is specific for a certain tissue and experimental condition under which the mRNA was sampled. Conversely, when several EST libraries are obtained under different conditions, the contribution of reads to an identical tentative gene cluster is indicative of its expression strength, provided that libraries were not normalized (Okubo et al. 1992; Bouck and Vision 2007). In this way, important inferences on physiologic adjustments to changes in environmental parameters at the level of the transcriptome become possible, at least for those genes that are more highly expressed in at least one of the experimental conditions to be compared (Kore-eda et al. 2004; Kuo et al. 2004).

We present for the first time several EST libraries of a seagrass, Zostera marina (eelgrass). Z. marina is a habitat-forming, or ecosystem-engineering, angiosperm (sensu Jones et al. 1994), forming dense meadows along sedimentary shorelines, ranging from subarctic to subtropical latitudes. Experimental and observational evidence suggests that local populations from temperate regions experience mortality above a critical summer threshold temperature of 25°C (Williams 2001; Greve et al. 2003; Reusch et al. 2005), and such events are predicted to increase in the next decades (IPCC 2007).

Against this empirical background, our motivation was thus to assess how seagrass physiology is affected by stress events. In an attempt to obtain first data on transcriptional regulation under different temperature conditions, we conducted an analysis of the plants’ transcriptome comparing several nonnormalized EST libraries to identify genes that are critical under temperature stress. Subsequently, the response to temperature stress was compared with transcriptomic changes observable in the plant model Arabidopsis thaliana as a reference. Moreover, we wanted to mine the EST data for gene-linked microsatellites as a valuable resource for gene-linked genetic marker development.

Materials and Methods

Study Species

Zostera marina (eelgrass) is a widespread marine angiosperm or seagrass species exhibiting a mixture of sexual and clonal (vegetative) reproduction (Reusch et al. 1999). The species is monoecious and self-compatible and exhibits true subaqueous pollination (Reusch 2000). Systematically, seagrasses are polyphyletic clades within the monocotyledonous order Alismatales (Les et al. 1997). Eelgrass populations sampled for constructing the genetic libraries were in the south-western Baltic, a semi-enclosed sea with brackish water. At sampling sites Schilksee and Maasholm (south-western Baltic Sea, Germany), salinities throughout the year range from 12 to 20 g/kg. Samples were collected at 1.6- to 2.5-m depth.

Experimental Conditions

In total, plants were sampled under five different experimental conditions and tissue types. Four of those were the focus of the present study, whereas sequence data from another library (A) solely serve to improve the clustering and homology search (Table 1). Libraries C and D represent plant material under natural conditions, collected from the field in average summer or winter conditions, respectively. In addition, for construction of libraries E and F, entire plants sampled at the same site and time as the material for library D were exposed to increasing heat stress in 60-L aquaria filled with ambient Baltic Sea water. Aquaria were aerated and illuminated for 10 h with approximately 40 μE m⁻² s⁻¹ at the water surface. The water temperature was then increased to 17°C over 36 h for library E, and to 25°C within another 24 h for the plants that served as the source for library F. Note that, in addition to the temperature stress, we may have elicited additional stress responses, for example to translocation and altered light regime.

Table 1 Details on EST libraries constructed for eelgrass Zostera marina

RNA Extraction

Total RNA was extracted using the RNeasy plant kit (Qiagen, Hilden, Germany). In the case of libraries based on field-sampled material, cleaned tissue was frozen in liquid nitrogen < 20 min after uprooting; until then, plants were kept in ambient water. The RNA quality of an aliquot was checked on EtBr-stained agarose gels. RNA concentration was equilibrated among leaf and meristematic tissue in libraries C to F (Table 1) according to RNA concentration measurements using a NanoDrop spectrophotometer. Each library contained the pooled RNA of four to six genotypes.

Library Construction and DNA Sequencing

All libraries were constructed using the Creator SMART library construction kit (BD Clontech), using the LD PCR-based method. Between 22 and 28 PCR cycles were performed before size separation of inserts. The size-selected cDNA (fragments > 800 base pairs) was directionally ligated at the SfiI restriction site of the pDNR-lib vector (BD Clontech) and electroporated into E. coli strain DH10B (Invitrogen). Library D showed a low number of clones carrying an insert. Presence of inserts was checked via PCR using M13 primers to exclude empty clones. Dideoxy-termination DNA sequencing was performed on ABI sequencers (ABI 3130XL at MPI for Limnology, Plön, for libraries A, C, and D; ABI 3730XL at MPI Molecular Genetics, Berlin, for libraries E and F) after plasmid preparation, using the BigDye 3.1 sequencing chemistry and a forward M13 primer only (GTA AAA CGA CGG CCA GT).

Data Analyses and Bioinformatics

The raw sequence reads were quality-trimmed, poly-A and vector clipped using pregap4 (Staden et al. 2000). Vector flanking sequences surrounding the cloning sites, including the linker used in constructing the library, were included as parameters. Successfully trimmed EST reads were then assembled into tentative gene clusters using CAP3 (Huang and Madan 1999). Two parameters were specified: there was no reverse orientation of sequence reads, and one read of good quality is sufficient to build a consensus at a given position.

Tentative Unigene (TUG) Annotation

To infer putative functions of the identified tentative unigenes (TUGs; >100 nucleotides only), we performed homology searches against three protein databases: SwissProt, Gene Ontology (GO), and Prodom. For the SwissProt search, we used the BLASTX algorithm in conjunction with database version 52.0, with an Expect-value threshold of E ≤ 0.01. For further analysis, the first hit was used if it came from a plant; otherwise, we continued down the hit list until a plant hit was found. If no higher plant hit was present, we used the hit from the other species instead, as our initial material was not axenic and may have contained fungal parasites. The AmiGO tool (www.geneontology.org/cgi-bin/amigo/go.cgi) implementing BLASTX searches (Gish and States 1993) was used to identify similarities with proteins of the GO database (www.geneontology.org). The overlap in the annotations of all tentative unigenes among the databases was examined using Venn diagrams.
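The hit-selection rule described above (prefer the best plant hit, otherwise fall back to the best hit from any species) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the `Hit` record, its `is_plant` flag, and the function name `choose_annotation` are hypothetical, and the hits are assumed to have been parsed from BLASTX output beforehand.

```python
from typing import NamedTuple, Optional, Sequence


class Hit(NamedTuple):
    accession: str   # database accession of the matched protein
    e_value: float   # BLASTX Expect value
    is_plant: bool   # hypothetical flag derived from the hit's taxonomy


def choose_annotation(hits: Sequence[Hit], max_e: float = 0.01) -> Optional[Hit]:
    """Return the best plant hit passing the threshold, else the best hit overall."""
    passed = sorted((h for h in hits if h.e_value <= max_e), key=lambda h: h.e_value)
    for hit in passed:
        if hit.is_plant:
            return hit                       # first plant hit in order of E-value
    return passed[0] if passed else None     # no plant hit: keep the other species
```

A TUG whose best hit is fungal but which also has a weaker plant hit would thus be annotated with the plant protein, whereas a TUG with only non-plant hits keeps its best non-plant hit.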

Open Reading Frame Identification and Microsatellite Search

To predict the presence of nontranslated regions of the TUGs, we were interested in identifying the open reading frames (ORFs). ORF prediction was based on BLAST hits to all three databases used: SwissProt (SP), Prodom (PD), and Gene Ontology (GO). We considered the part of the ORF that matched an SP or GO protein or a PD domain as the seed. From there, we searched for the beginning and the end of the ORF. The end is defined by the last nucleotide before a STOP codon. To find the most likely beginning of an ORF, we moved in the 5′ direction until the last ATG (methionine) codon before a STOP codon was reached. Alternatively, if no STOP codon was found, the first nucleotide in frame was defined as the beginning. In the latter case, this was most likely the result of an EST/contig only partly overlapping with an ORF that extended further 5′. When ORF prediction was based solely on Prodom annotation, a tentative unigene can have more than one domain; these can overlap slightly and can be in different frames. We then considered only the open reading frame corresponding to the match with the lowest E-value. Note that 73 TUGs annotated with a reading frame had at least one stop codon within the region matching a protein or a domain, possibly because of sequencing errors or PCR/reverse transcription artifacts during the preparation of the libraries.
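The seed-extension procedure above can be illustrated with a short sketch. It reflects one reading of the description and rests on several assumptions that are not stated in the text: the seed lies on the forward strand, its coordinates are 0-based and end-exclusive, its length is a multiple of three, and, if no downstream STOP codon exists, the ORF is taken to run to the last complete codon. The function name `extend_orf` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extend_orf(seq, seed_start, seed_end):
    """Extend an in-frame seed [seed_start, seed_end) to a putative ORF.

    Returns (start, end), 0-based and end-exclusive; `end` is the position
    of the STOP codon, i.e. one past the last coding nucleotide.
    """
    seq = seq.upper()

    # 3' direction: walk codon by codon from the seed end to the first STOP.
    # Fallback (assumption): last complete in-frame codon of the sequence.
    end = seed_start + ((len(seq) - seed_start) // 3) * 3
    for pos in range(seed_end, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            end = pos
            break

    # 5' direction: remember the most upstream ATG seen before a STOP.
    start = seed_start % 3            # no upstream STOP: first nucleotide in frame
    upstream_atg = None
    for pos in range(seed_start - 3, -1, -3):
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            # Assumption: without any ATG, the seed start itself is used.
            start = upstream_atg if upstream_atg is not None else seed_start
            break
        if codon == "ATG":
            upstream_atg = pos
    return start, end
```

Everything 5′ of `start` and 3′ of `end` would then be treated as putative untranslated region, which is the information needed to place microsatellites relative to coding sequence.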

To search for simple sequence repeats or microsatellites, the MISA (MIcroSAtellite, http://www.pgrc.ipk-gatersleben.de/misa) Perl script was used. We searched for all possible motifs with a repeat motif length of two to six nucleotides. The minimal number of repetitions for dinucleotide repeats was six, and for repeat units of three to six nucleotides, it was five. The above determination of ORFs was used to predict the position of microsatellite repeats with respect to coding regions.
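The search criteria above can be expressed compactly with regular expressions. The following sketch is not the MISA script itself but a simplified stand-in that applies the same thresholds (at least six repeats for dinucleotide motifs, at least five for motifs of three to six nucleotides); the names `MIN_REPEATS`, `is_primitive`, and `find_ssrs` are hypothetical.

```python
import re

# Motif length -> minimal number of tandem repeats, as given in the text.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """False if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'AGAG')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return (motif, repeat_count, start) for each perfect tandem repeat found."""
    seq = seq.upper()
    found = []
    for unit, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                found.append((motif, len(match.group(0)) // unit, match.start()))
    return found
```

The primitive-motif check prevents a long dinucleotide run from being reported a second time as a tetra- or hexanucleotide repeat, and discards homopolymer runs, which fall outside the motif lengths considered here.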

Comparison of Gene Expression among EST Libraries

We compared global and individual gene expression patterns based on our nonnormalized cDNA libraries using the approaches proposed by Susko and Roger (2004). Our analyses of gene expression differences focussed on two a priori formulated hypotheses, corresponding to experimental conditions under which the libraries were constructed. We were first interested in a comparison of the natural undisturbed gene expression between summer and winter situation, and thus compared EST library C with D. As second comparison, we concentrated on library F as focal experimental condition and compared the composition of gene clusters with libraries D and/or E. If none of the latter nonstress libraries produced a significant outcome, we pooled the frequency of occurrence over D and E.

Binomial and Chi-square tests were used for gene-by-gene comparisons of expression (Susko and Roger 2004). We did not use any adjustment for multiple comparisons, for example control of the cumulative probability of experiment-wise false-positives at less than alpha as proposed by Benjamini and Hochberg (1995) and implemented by Susko and Roger (2004). Rather, we see those genes that individually seem to be over- or underexpressed among EST libraries constructed under different conditions as indicative for further studies. As a precaution, we first examined in a global test whether the transcriptome is generally different among conditions. Only if this was true at P ≤ 0.001 did we proceed with single-gene tests.
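To make the gene-by-gene comparison concrete, the sketch below shows one common form of an exact binomial test for read counts from two libraries. It is an illustration under stated assumptions and not necessarily the exact procedure of Susko and Roger (2004): the test conditions on the total number of reads of a TUG and asks whether their split between the libraries deviates from the split expected from the library sizes alone. The function name and the library sizes used in the example are hypothetical.

```python
from math import comb


def binomial_two_sided(x1, x2, n1, n2):
    """Two-sided exact P value for equal relative read frequency in two libraries.

    x1, x2: reads of one TUG in libraries 1 and 2
    n1, n2: total numbers of reads in libraries 1 and 2
    Under the null, x1 ~ Binomial(x1 + x2, n1 / (n1 + n2)).
    """
    total = x1 + x2
    p = n1 / (n1 + n2)
    pmf = [comb(total, k) * p**k * (1 - p) ** (total - k) for k in range(total + 1)]
    observed = pmf[x1]
    # Sum the probabilities of all outcomes at most as likely as the observed one.
    return min(1.0, sum(v for v in pmf if v <= observed * (1 + 1e-9)))


# Hypothetical example: 10 reads in one library, none in an equally sized one.
p_value = binomial_two_sided(10, 0, 1000, 1000)
```

Such a test also shows why TUGs with very few reads cannot reach significance: with few total reads, even the most extreme split has a sizeable probability under the null.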

Rocha et al. (2003) proposed that microsatellites are particularly abundant in stress-associated genes because they facilitate rapid evolution of expression levels as a result of high mutation rates in promoter regions. We were thus interested in comparing the abundance of microsatellites among those genes that were differentially expressed (d.e.) under heat stress with that among other TUGs. This was done in the comparison of libraries D+E vs. F only, using Chi-square tests.

Comparison with Arabidopsis thaliana

We were interested in similarities between the stress responses of Z. marina and the plant model Arabidopsis thaliana (mouse-ear cress). Any differentially expressed (d.e.) TUGs from Zostera were matched against the Arabidopsis thaliana protein data set from the MIPS Database website (http://mips.gsf.de/proj/plant/jsf/athal/index.jsp) using BLASTX (Expect value threshold E = 1e-04). The global hypergeometric test (Falcon and Gentleman 2007) was performed to detect functional gene ontology categories overrepresented in the resulting set of significant Arabidopsis BLAST hits. Hypergeometric probabilities were computed to assess whether the frequency of BLAST hits associated with a particular GO term was larger than expected. The method ignores the structure of the gene ontology, treats each GO term as independent from all others, and returns raw and adjusted P values. Bonferroni correction was used to adjust P values for multiple testing (Boyle et al. 2004). To avoid redundancy, connected GO terms with significant scores were excluded after the analysis.
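The overrepresentation test described above can be sketched in a few lines. This is a generic illustration of a hypergeometric upper-tail probability with Bonferroni adjustment, not the implementation of Falcon and Gentleman (2007); the function names and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k): probability of at least k term-annotated genes in a sample.

    N: genes in the reference set, K: of these annotated with the GO term,
    n: genes in the sample (here, significant BLAST hits), k: sample genes
    carrying the term.
    """
    denom = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / denom


def bonferroni(p_values):
    """Multiply each raw P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical toy example: 2 of 4 sampled genes carry a term held by 3 of 10 genes.
raw_p = hypergeom_upper_tail(2, 4, 3, 10)
```

Because each GO term is tested independently, parent and child terms can both come out significant for the same underlying genes, which is why connected significant terms were pruned afterwards.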

All d.e. eelgrass genes were then compared to a set of Arabidopsis stress specific proteins, the AtGenExpress (Kilian et al. 2007). This database contains 3095 (of 22,000) genes, which are only expressed in a specific stress condition in at least one of six time points. The genes are unique for a given stress condition and do not overlap with other stresses.

Zostera marina EST Database

Clipped and passed sequence reads were submitted to dbEST within GenBank (accession numbers AM766003–AM773228 and FC822029–FC823189). In addition to deposition of cleared reads in GenBank, all processed and assembled data are publicly available in a database at the Institute for Evolution & Biodiversity, accessible through a web interface (http://www.uni-muenster.de/Evolution/ebb/Services/zostera). This database is searchable for tentative gene ID (singletons and contigs), gene name and annotation key words, and microsatellites. TUG identifiers (singletons and contigs) used in the present study are identical to those in the database. A local BLAST server is also implemented.

Results

Assembly, ORF Prediction, and Annotation

In the global assembly, i.e. when pooling sequence reads of all 5 libraries, 8573 passed EST reads clustered to 3593 tentative unigenes (TUGs), 2496 of which were singlets and 1097 were contigs, i.e. clusters consisting of 2 or more reads.

The annotation using the different protein and domain databases yielded largely consistent results. We considered only TUGs that were ≥ 100 nt and hits in a positive reading frame. A core of 1893 TUGs could be annotated with all three databases (Fig. 1), whereas 753 TUGs could not be annotated at all, resulting in a total annotated fraction of 79% of all TUGs (2840/3593). Using the Prodom database, a total of 115 TUGs were associated with the key word “transcription,” whereas 35 (0.94%) were specifically designated as transcription factors. The majority of hits to the gene ontology (GO) database comprised genes of Arabidopsis thaliana (mouse-ear cress; 1857/3593) and Oryza sativa (rice; 853/3593). According to the database comparisons, 761 of all TUGs (singletons or contigs) contained the complete open reading frame, 899 contained no ORF or were not annotated, 1060 and 545 contained portions of the 3′ and 5′ untranslated region (UTR), respectively, whereas 481 genes contained stretches of both UTRs.

Fig. 1 Venn diagram showing the overlap of significant hits (E-value threshold < 0.0001) of Zostera marina TUGs among the different databases searched. GO = gene ontology database. Numbers are given for the initial database search before manual editing of some TUGs displaying identical BLASTX hits

After initial annotation, 2 or more TUGs showed the same highly significant BLASTX score for particular SwissProt proteins in 188 cases. In these cases, the assembly into TUGs was manually edited. In 84 cases, we obtained novel “merged” TUGs from 2 or sometimes 3 initial CAP3 TUGs, indicated as “merge” instead of “contig” in the database. As criterion for merging, tentative gene clusters were considered to belong to the same gene if they overlapped < 20% of their read length. Otherwise, they were considered recent duplicates, based on the assumption that clustering in CAP3 is correct in producing different clusters given sufficient sequence overlap. This slightly altered the overall library statistics (total TUGs 3496 including 84 merged TUGs; total annotated TUGs 2743 = 78.5%). The assessment of differential expression among libraries was done with the modified data set.

Microsatellite Identification

In total, we identified 210 genes (6.01%) that contained a total of 223 microsatellite motifs under the specified criteria, i.e. some TUGs contained more than one motif. Among these we found 82 dinucleotide repeats, with the majority being AG/TC and AT repeats, 113 trinucleotide repeats, and a few tetra-, penta-, and hexanucleotide repeats (Table 2). As hypothesized, trinucleotide repeat motifs are more abundant within the coding region of genes (55) than outside (24, Table 3). In contrast, repeat motifs containing fewer or more than three nucleotides, which result in frameshifts when undergoing slip-strand mutations, are primarily found in untranslated regions of a transcript (65) and are rare in ORFs (16). This difference was statistically significant (2 × 2 contingency table, df = 1, Chi-square = 42, P < 0.0001). Consistent with the prediction that microsatellites causing no frameshift may be more abundant within ORFs, the only detected hexanucleotide repeat ((AATACC)9; unigene ZMD04004) was found within an open reading frame. This TUG had two domains that resembled a zinc-finger domain in PRODOM (PD007661, E = 4e-16). The identity of the gene itself is unclear; it may be a transcription factor, consistent with the zinc-finger domain, or a salt-tolerance-like gene (SwissProt ID Q9SYM2, Arabidopsis thaliana, E = 4e-20).
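For readers who wish to retrace the contingency test, the following sketch computes a Pearson Chi-square statistic for a 2 × 2 table. The counts are those quoted in the text (trinucleotide repeats inside vs. outside ORFs, other motif lengths inside vs. outside ORFs); whether exactly these counts and this uncorrected form of the statistic underlie the reported value is an assumption, so the result may differ slightly from the published figure. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Rows: trinucleotide vs. other motif lengths; columns: within ORF vs. outside ORF.
statistic = chi_square_2x2(55, 24, 16, 65)

# With df = 1, values above 10.83 correspond to P < 0.001.
is_significant = statistic > 10.83
```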

Table 2 Composition and length of microsatellites detected among Zostera marina ESTs
Table 3 Position of microsatellites with respect to putative open reading frames (ORFs)

Comparison of Tentative Unigene Frequencies among Libraries

In a global comparison according to Susko and Roger (2004), divergent patterns of gene expression were detected among all pre-planned library contrasts (Table 4). We thus proceeded with a more detailed analysis of single genes that were differentially expressed (d.e.). Given the total number of TUGs in the libraries to be compared, the minimal frequency for detecting d.e. was four reads for the comparison library C vs. library D, and for library D + E vs. F. Accordingly, of the subset of 149 genes where differential expression is detectable, 7 were down- and 19 were up-regulated under winter conditions (library D) versus summer conditions (library C, supplementary Table S1). Qualitatively, many genes of the light reaction, in particular light harvesting proteins (as in TUGs contig 62, 114, merge 29, 71, 188) and reaction subunits themselves (contig 107, 172, 188) are more abundant under summer conditions (Table S1).

Table 4 Global comparison of EST library composition based on the frequency spectrum of single sequence reads contributing to tentative sequence clusters, according to Susko and Roger (2004)

Note that the comparison of libraries C and D lacked the statistical power of the other comparison, as the first library comprises only 1248 passed sequence reads (Table 5). Therefore, despite the very low P value for the global comparison, relatively few individual genes are d.e. In the remainder of this study, we thus focus our discussion on the comparison under experimentally induced stress conditions.

Table 5 Summary statistics of five eelgrass (Zostera marina) EST libraries

Our second comparison of libraries concerned the experimental response to heat (and possibly also uprooting and translocation) stress. As for the heat stress response, of 333 TUGs compared among the libraries D + E vs. F, 27 (8%) were up- and 36 (11%) were down-regulated under heat stress (Tables 6 and 7). Among the strongest responses was a 7-fold up-regulation of a putative photosystem I assembly protein (SwissProt Q3BAN1), and a 6-fold increase in a light harvesting, chlorophyll-binding protein (SwissProt P27495). Down-regulations observed were a 15-fold reduction in a chloroplast precursor gene (SwissProt Q6K953), and a 6-fold reduction in a metallothionein-like gene (SwissProt Q40256). Because in several cases, one library contributed no reads to the relevant gene cluster, frequencies could not be estimated, but fold-changes may even be higher in these cases.
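Fold changes between nonnormalized libraries are ratios of relative read frequencies, so library size has to be taken into account. The sketch below illustrates this, including the case mentioned above in which one library contributes no reads and the ratio is undefined. The function name and the read counts and library sizes in the example are hypothetical and are not taken from the tables.

```python
def fold_change(reads_ref, size_ref, reads_test, size_test):
    """Ratio of relative read frequencies (test over reference).

    Returns None when either library contributes no reads, because the
    ratio is then zero or undefined and cannot be estimated.
    """
    if reads_ref == 0 or reads_test == 0:
        return None
    return (reads_test / size_test) / (reads_ref / size_ref)


# Hypothetical example: 2 reads among 1000 vs. 14 reads among 1000.
up = fold_change(2, 1000, 14, 1000)        # about 7-fold higher in the test library
undefined = fold_change(0, 1000, 9, 1000)  # None: no reads in the reference library
```

A value below 1 indicates lower relative frequency in the test library; its reciprocal gives the fold reduction.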

Table 6 Zostera marina TUGs significantly up-regulated in library F (heat stress) with respect to library D (designated DF), E (EF), or both libraries pooled (DE-F)
Table 7 Zostera marina TUGs significantly down-regulated in library F (heat stress) with respect to library D (designated DF), E (EF), or both libraries pooled (DE-F), in descending order of total expression level

Among temperature-responsive genes, 7 of 27 (26%) up-regulated and 5 of 36 (14%) down-regulated TUGs had a role in photosynthesis, predominantly in the light reaction (photosystem I and II). Although several light harvesting complex proteins were up-regulated under laboratory exposure to higher temperatures (as in contigs 983, 787; merge29 and 73, Table 6), the reaction subunits themselves were down-regulated (as in contig 314, 326 and 341, putative homology to photosystem I and II reaction center subunit genes, Table 7). The dark reaction was also affected: a 10-fold down-regulation upon heat exposure was observed in the primary gene of photosynthetic carbon fixation, Rubisco (contig69).

Frequency of Microsatellites in Differentially Expressed Genes

When comparing the abundance of microsatellites among differentially expressed genes (comparison D + E vs. F only) with all other TUGs, no significant difference could be detected. Of all up-regulated genes, 5.16% carried a microsatellite, whereas of all down-regulated TUGs, 13.89% carried microsatellites. Neither frequency differed significantly from the global frequency of microsatellites among all TUGs (6.01%) in Chi-square tests.

Abundance and Diversity of Genes Encoding Stress Proteins

Among 3496 TUGs we found 9 genes encoding diverse families of heat shock proteins (HSP) that are known to be involved in mediating the response to high temperatures and other stresses (Boston et al. 1996). All of those were genes encoding HSPs of large molecular weight > 60 kDa (Table 8). We also identified one heat shock transcription factor B4 (singleton, ZMD01094, Table 8). Because HSP genes were too rare to allow frequency-based tests on single genes, we lumped them according to the GO category (biological function) response to heat. Interestingly, we find a higher frequency of HSP genes sensu lato under the summer conditions (library C vs. D, 7 vs. 11 reads; P = 0.035), but no significantly different frequencies among the “winter” and the heat stress libraries, with largely similar contribution of HSPs to the total number of reads [library D (7 reads), E (9), F (13)]. One stress-mediating gene that was significantly up-regulated under heat stress, a Mn-superoxide dismutase (contig901, SwissProt P35017, Table 6), may be involved in scavenging reactive oxygen species.

Table 8 Zostera marina putative HSP (heat shock protein) encoding genes and heat shock transcription factors

Comparison of Z. marina Differentially Expressed Genes Against Arabidopsis thaliana

Among the 63 TUGs that were differentially expressed (d.e.) in library F (heat stress 25°C) vs. D + E, a BLASTX search against the MIPS Arabidopsis database resulted in 48 significant hits. These were distributed over all five chromosomes in the Arabidopsis genome. There also were three significant hits in the chloroplast genome (ycf4, rps18, atpA). Among those 48 genes, we found a highly significant overrepresentation of several GO categories. There were significantly more differentially regulated genes involved in photosynthesis and in chromatin binding than expected by chance (GO molecular function, both P < 0.001). In terms of the GO category biological function, photosynthesis was more affected than any other process (P < 0.001). Finally, in terms of cellular components, mainly photosystem I components and light harvesting complexes were d.e., confirming the above qualitative finding on the composition of significantly up- or down-regulated genes on an individual basis (Tables 6 and 7).

When comparing Z. marina with AtGenExpress, a stress-specific database of Arabidopsis thaliana, 24/63 (38%) d.e. eelgrass genes also were reported to change expression levels in A. thaliana in response to stress (Table 9). The major organ of expression varied and comprised root and shoot. Interestingly, although many were responsive to the same stress type as in our Z. marina data (i.e. heat stress), several of these (10/23 = 43%) respond primarily to osmotic and salt stress in Arabidopsis. Whether this reflects functional changes of genes in Zostera after adaptation to the marine environment requires further study.

Table 9 Similarity among eelgrass (Zostera marina) TUGs and Arabidopsis thaliana stress response

Only approximately half of the Arabidopsis thaliana stress response genes with putative homology to Z. marina TUGs are predominantly expressed in the shoot, whereas the others are characteristic of the root, although the tissue used for constructing the Z. marina libraries did not contain root material. Whether this, too, reflects functional dissimilarity driven by the different habitat type or by the taxonomic affiliation of Arabidopsis (dicot) vs. Zostera (monocot) is unclear.

Further Characteristics of the Heat Stress Response

Under all experimental conditions, the transcriptome of Z. marina is dominated by a gene encoding a cysteine-rich metallothionein-like protein (mt3) comprising between 2.5% and 15% of all transcripts (contig479, SwissProt ID Q40256). Although one reported primary function of such genes is heavy metal homeostasis, in particular of copper (Guo et al. 2003), such a dominant frequency suggests that this gene must be responsible for other important functions as well. Note, however, that this putative metallothionein is down-regulated approximately 6-fold under temperature stress. Almost exactly the same down-regulation was observed in winter (library D) compared with average summer conditions (library C, Table S1). Interestingly, in the transcriptome of the Mediterranean seagrass species Posidonia oceanica we also find a similar TUG that is even more abundant (G. Procaccini, personal communication, 2007). Finally, we have probably identified several genes that have a high homology to Dictyostelium, a social amoeba or slime mold (Table 6). One of those genes, encoding a 26S proteasome regulatory subunit, shows a significant up-regulation under heat stress (merge17; SwissProt ID P02889).

Discussion

In this study, we found striking differences in gene expression among experimental conditions in a coastal marine angiosperm, the ecologically important seagrass species Zostera marina (eelgrass). Among the TUGs that consisted of ≥ 4 total sequence reads, we found several individual genes that revealed strikingly different expression patterns when subjected to heat stress, both in terms of up- and down-regulation.

A priori, we would expect the heat stress response among terrestrial and aquatic angiosperms to differ in a number of ways. First, temperature changes are always gradual in the thermally buffered aquatic environment, whereas rapid temperature fluctuations are possible on land. On the other hand, once critical temperatures are reached in the sea, they have to be sustained by the plant for a longer time and cannot be ameliorated by increasing transpiration. This may explain why we identified no small heat shock proteins even under imposed temperature stress in Zostera marina, whereas these genes are a major group of inducible heat shock genes among terrestrial angiosperms (Waters et al. 1996). It also may explain why the response among the HSP encoding genes is relatively weak (≤2-fold induction), suggesting that most HSP genes identified were constitutively expressed and may indicate longer-term acclimation to high temperature. Evidently, the preliminary data obtained in this study need to be verified by quantitative PCR or by macro- or microarray work. The latter is under development in Z. marina.

Notwithstanding, it is notable that in eelgrass, many of the d.e. TUGs have putative homologs in the plant model A. thaliana with its well-characterized stress response associated genes. This not only demonstrates the principal validity of comparative EST analysis but also indicates that many components of the stress response may be conserved among the flowering plants. On the other hand, the time to the last common ancestor of the genus Zostera and other families of monocotyledonous and dicotyledonous plants is on the order of 100 million years (Les et al. 1997). Therefore, the fraction of approximately a fifth of genes that revealed no database hits, even when compared against Prodom, indicates the phylogenetic distance of the seagrasses relative to well-studied plant genomic model species, such as rice or Arabidopsis. That the majority of database hits of eelgrass TUG queries revealed A. thaliana as the species with the lowest E-values probably reflects the abundance of its sequences in GenBank, or conversely, the relative paucity of data on monocotyledonous species. Many groups of monocotyledonous plants are currently poorly represented in transcriptomic/genomic databases (Jackson et al. 2006). Therefore, the TUGs of a seagrass species presented here also may serve as one initial attempt to close this gap within the order of Alismatidae.

Worth mentioning are not only those database hits that are indicative of higher plants but also those indicating a contamination of the EST library. Particularly noteworthy are three genes with high similarity to an amoeba species (Dictyostelium spp.), pointing to possible infection (or symbiotic association) of eelgrass with a protist. It is known that species of slime mold of the genus Labyrinthula are associated with eelgrass and caused massive die-offs (also dubbed wasting disease) in the 1930s (den Hartog 1970; Muehlstein et al. 1988). It is possible that a related slime mold, Dictyostelium, yielded the best hit because its genome is much better characterized than those of Labyrinthula species, for which only 22 genes are deposited in GenBank (as of September 20, 2007), but this requires further study.

Although some classical stress-associated genes, such as superoxide dismutase (SwissProt P35017), are more highly expressed under elevated temperatures, most classical heat shock proteins (HSP) had an overall frequency too low to allow tests based on single genes (<0.1% of transcripts). Nevertheless, when the HSPs are pooled, a significant 1.8-fold constitutive up-regulation under natural summer vs. winter conditions becomes apparent in libraries prepared from experimentally untreated material (libraries C and D). Such an effect was not detectable under experimentally induced stress, although a tendency toward induction was observable in the raw data. Other work in progress (Ransbotyn & Reusch, preliminary data, 2007) revealed, using quantitative real-time PCR assays, that two HSP70 genes (contig325 and singleton ZMC10006; SwissProt IDs P22953 and P09189) respond to an increase in temperature from 18°C to 25°C with moderate up-regulation (2- to 3-fold).
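The pooling step described above can be illustrated with a minimal sketch. It sums the EST counts of all members of a gene class within each library, normalizes by library size, and reports the ratio of the two pooled frequencies. The function name, the counts, and the library sizes are hypothetical placeholders chosen for illustration and are not values from this study; no significance test is implied.

```python
def pooled_fold_change(counts_a, counts_b, size_a, size_b):
    """Fold change (library A over library B) for a pooled gene class.

    counts_a, counts_b: per-gene EST counts of the class in each library
    size_a, size_b: total number of ESTs sequenced in each library
    """
    if size_a <= 0 or size_b <= 0:
        raise ValueError("library sizes must be positive")
    # Pool the rare genes first, then normalize by library size.
    freq_a = sum(counts_a) / size_a
    freq_b = sum(counts_b) / size_b
    if freq_b == 0:
        raise ValueError("class is absent from library B; ratio undefined")
    return freq_a / freq_b


# Hypothetical per-gene counts for one gene class in two libraries.
library_a_counts = [4, 3, 2, 2, 1]   # 12 ESTs in total
library_b_counts = [2, 2, 1, 1, 0]   # 6 ESTs in total
fold = pooled_fold_change(library_a_counts, library_b_counts,
                          size_a=3000, size_b=2000)
print(f"pooled fold change: {fold:.2f}")
```

Normalizing by library size matters whenever the libraries differ in sequencing depth; with equal library sizes the ratio reduces to the ratio of the pooled counts.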

Within our EST libraries we have identified a total of 223 microsatellites (simple sequence repeats) in 210 (∼6%) of all TUGs. They may serve as starting points for trait-associated genetic markers (van Tienderen et al. 2002). A similarly high frequency of putative microsatellite marker loci directly linked to genes has now been identified in many other plant species (overview in Li et al. 2004). In Z. marina, PCR-based assays have already been successfully developed for 14 of these candidates, and they proved to be polymorphic in natural populations (Oetjen and Reusch 2007a, b). As expected, trinucleotide repeats made up a higher fraction of the microsatellites within coding regions than outside ORFs, because their length variation does not cause frameshifts, whereas the inverse was true for repeat motifs whose length variation would cause frameshift mutations.
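How such repeats can be located in TUG sequences may be sketched as follows. This is a simple regular-expression scan for perfect tandem repeats and not necessarily the procedure used in this study; the minimum repeat numbers in MIN_REPEATS and both function names are assumptions made for illustration. The helper preserves_reading_frame encodes the frameshift argument above: only motifs whose length is a multiple of three can gain or lose copies without shifting the reading frame.

```python
import re

# Assumed thresholds: minimum number of tandem copies per motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_n in sorted(min_repeats.items()):
        # One motif copy followed by at least (min_n - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. AAA, or ATAT, which is already reported as AT).
            if motif in (motif + motif)[1:-1]:
                continue
            copies = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, copies))
    return hits


def preserves_reading_frame(motif):
    """True if adding or removing one motif copy keeps the reading frame."""
    return len(motif) % 3 == 0


# Hypothetical sequence containing one dinucleotide and one trinucleotide SSR.
example = "CCTT" + "AG" * 7 + "CTCT" + "GAA" * 5 + "CTTG"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies, preserves_reading_frame(motif))
```

Imperfect or compound repeats are not detected by this scan, so the result should be read as a lower bound on the number of candidate loci.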

To construct the cDNA libraries presented here, we used the SMART methodology, which involves a PCR amplification step before cloning to increase the full-length representation of transcripts (Herrler 2000). Although this step may lead to differential representation of mRNAs, empirical studies have verified that SMART PCR maintains the representation of transcript abundance (Herrler 2000; Seth et al. 2003). For comparing global gene expression patterns among experimental conditions, tissues, or different populations/species, subtractive hybridization is an alternative to comparing the gene composition of the transcriptome (Diatchenko et al. 1996; for a marine example see Pearson et al. 2001). With sequencing technologies becoming much cheaper in the foreseeable future (Margulies et al. 2005), we predict that global comparison of redundant (i.e., nonnormalized) EST libraries will become a routine tool for gaining first insights into the adjustment of gene regulation in response to environmental conditions, including stress (this study; Kuo et al. 2004). On the other hand, when sequencing moderate numbers of ESTs, for example, in the range of 1 × 10⁴ as in this study, meaningful information is obtained only for the most abundant transcripts. The statistical power to detect differential regulation also declines with decreasing global expression strength, which may bias the outcome of studies (Susko and Roger 2004), especially when particular gene classes differ nonrandomly in their expression strength under certain conditions. Nevertheless, the quantitatively important physiological processes are detectable with a moderate-sized EST library, which thus provides a global snapshot of organismal physiology.
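The dependence of statistical power on expression strength can be made concrete with a short calculation. The sketch below assumes two libraries of equal size and uses a one-sided exact binomial test conditional on the total number n of ESTs observed for a gene: under equal expression, the count in the first library follows a binomial distribution with n trials and probability 0.5. This is a simplified stand-in and not the test applied in this study or by Susko and Roger (2004); the function names and the chosen values of n are illustrative assumptions.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def critical_value(n, alpha=0.05):
    """Smallest c with P(X >= c) <= alpha under the null (p = 0.5).

    Returns n + 1 if no outcome is extreme enough to reject the null.
    """
    tail = 0.0
    for c in range(n, -1, -1):
        tail += binom_pmf(c, n, 0.5)
        if tail > alpha:
            return c + 1
    return 0


def power(n, fold, alpha=0.05):
    """Power to detect a given fold up-regulation from n ESTs in total."""
    c = critical_value(n, alpha)
    p_alt = fold / (1.0 + fold)  # expected share of ESTs in the induced library
    return sum(binom_pmf(k, n, p_alt) for k in range(c, n + 1))


# Power to detect a 2-fold difference for genes of increasing abundance.
for total_ests in (4, 5, 20, 100):
    print(total_ests, round(power(total_ests, fold=2.0), 3))
```

Under these assumptions, a gene represented by only a handful of ESTs across both libraries cannot reach significance at all, which is why pooling rare genes into functional classes can recover signal that single-gene tests miss.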

The metabolic adjustments among the most highly expressed genes identified in this study suggest a complex stress syndrome in Z. marina subjected to adverse conditions. Although light-harvesting proteins are up-regulated, we find a strong down-regulation of the photosynthetically active complexes themselves. Several of the major responsive genes, as well as candidate stress genes (in particular heat shock proteins and genes involved in scavenging reactive oxygen species), may serve as starting points for developing expression profiling techniques (Whitehead and Crawford 2006). The ultimate goal is to obtain a more exhaustive and fine-scale picture of plastic and constitutive changes in cellular metabolism associated with short- and long-term adaptation to extreme water temperatures.