Introduction

Compared with terrestrial ecosystems, little is known about marine biodiversity and about changes in species richness and ecosystem function. This is mainly because of sampling difficulties and problems in taxonomy. Many marine organisms, and especially their diverse developmental stages, such as (1) eggs and larvae of fishes and invertebrates, (2) zoo- and phytoplankton, and (3) benthic invertebrates, are cumbersome and difficult to identify by morphological characters. Classical microscopy methods are time-consuming and require a high degree of taxonomic expertise, which is currently in short supply. In many cases, identifying a species is the major bottleneck in marine biodiversity and ecosystem research, hampering the necessary monitoring of marine biodiversity. As an example, a review of 138 studies on invertebrate diversity in European seas showed that approximately one-third of the specimens could not be identified to species level (Schander and Willassen 2005).

DNA-based identification methods are by now established (Barlow and Tzotzos 1995) as powerful tools that offer unprecedented accuracy because of their inherently high resolution, which can reach the level of single base changes in a whole genome. Using these methods, the following marine animals have been investigated: eggs, larvae, and adults of fishes (Rocha-Olivares 1998; Noell et al. 2001; Fox et al. 2005; Ward et al. 2005), planktonic copepods (Bucklin et al. 1999), invertebrate larvae (Garland and Zimmer 2002; Barber and Boyce 2006), and prey in the gut content or feces of penguins, whales, and fishes (Jarman et al. 2002; Saitho et al. 2003).

Sequences of the small subunit of the rRNA gene are used as a standard method for identifying microbial organisms (Ludwig et al. 2004), and a fragment of the mitochondrial cytochrome oxidase I gene is in use as a “DNA barcode” for the identification of metazoans (Hebert et al. 2002; Ward et al. 2005). As shown above, a growing number of recently published studies use molecular genetic identification methods. Nevertheless, their application is still restricted, mainly because of methodological problems and because special knowledge and experience in molecular genetics are required. This is especially true if DNA-based identification is performed on microarray platforms, which are error-prone and difficult to quantify (Shi et al. 2006). Whereas most of the methods presently in use, such as PCR-based DNA amplification followed by sequencing, can handle only one or a few species at a time, DNA microarrays are believed to have the potential to identify hundreds of species in parallel and to differentiate them from an even larger number of related species.

Commonly, microarrays are glass microscope slides on which oligonucleotide probes are spotted that are complementary to the DNA target sequences to be analysed (Relógio et al. 2002; Pirrung 2002; Dufva 2005). The DNA target, which is usually fluorophore-labelled during PCR amplification, hybridises with the oligonucleotide probe on the microarray and can be detected by its label after washing steps. Whereas the application of DNA microarrays to gene expression analysis has already reached the routine level of high-throughput systems (Blohm and Guiseppi-Elie 2001; Hoheisel 2006), they have only recently been applied to the identification of organisms, such as microbes (Wang et al. 2002; Call et al. 2003; Korimbocus et al. 2005; Loy and Bodrossy 2005), animals (Pfunder et al. 2004), and plants (Rønning et al. 2005). For the identification of marine organisms, microarrays have been used for bacteria (Peplies et al. 2003; Peplies et al. 2004) and phytoplankton (Metfies and Medlin 2004; Metfies et al. 2005; Godhe et al. 2007). Other DNA-hybridisation methods have been applied to the identification of higher marine organisms, such as invertebrate larvae (Goffredi et al. 2006), copepods (Kiesling et al. 2002), and larvae of fishes (Rosel and Kocher 2002), but microarrays have not been used for this kind of study. Other applications of microarrays in research on marine organisms are gene expression analysis (Williams et al. 2003; Lidie et al. 2005; Wang et al. 2006; Cohen et al. 2007; Jenny et al. 2007) and genotyping in population genetics (Moriya et al. 2004; Moriya et al. 2007).

One of the methodological limitations of using microarrays is the design of species-specific probes. On the one hand, oligonucleotide probes designed in silico do not always exhibit the experimental hybridisation properties they were selected for and must be tested empirically. On the other hand, the molecular marker must have highly selective characteristics, such as low intraspecific and high interspecific variation. One of the most frequently used markers in the phylogenetics of fishes is the mitochondrial 16S rRNA gene. This gene has a well-characterized secondary structure (Meyer 1993; Ortí et al. 1996), and especially the loop regions exhibit many insertions, deletions, and substitutions that form highly variable molecular features, which usually allow the design of highly specific probes. This is underlined by a study on lionfishes (Kochzius et al. 2003), which showed that individuals of one species exhibit identical 16S rDNA haplotypes even though they were sampled at sites thousands of kilometres apart, whereas clear differences could be detected between closely related lionfish species.

This study describes the development of a DNA microarray and demonstrates the suitability of the 16S rRNA gene for designing oligonucleotide microarray probes that differentiate at least 11 fish species from European seas. Based on these data, a “Fish Chip” for approximately 50 fish species is under construction to support the identification of eggs and larval stages of species that are otherwise difficult to identify, and of adult or processed fishes in the fisheries industry.

Material and Methods

Sampling and DNA Extraction

To account for possible intraspecific sequence variation, fishes were collected in five different regions of the European seas: North Sea, Bay of Biscay, and Western, Central, and Eastern Mediterranean (Fig. 1, Table 1). Altogether, 267 individual samples from 79 fish species were investigated. In addition, sequences from the EMBL sequence database were included. Voucher specimens and tissue samples were preserved in absolute ethanol and stored at 4°C or were frozen at -20°C. DNA was extracted from gill filaments with the Agowa mag midi DNA isolation kit (AGOWA, Berlin, Germany) or from muscle tissue with the DNeasy tissue kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions.

Figure 1

Map with the sampling areas; NS North Sea; BB Bay of Biscay; WM Western Mediterranean; CM Central Mediterranean; EM Eastern Mediterranean.

Table 1 Number of sequences per fish species and sampling region

Polymerase Chain Reaction and Sequencing

A fragment of approximately 1380 bp from the mitochondrial 16S rRNA gene was amplified with the primer 16fiF140 (5′-CGY AAG GGA AHG CTG AAA-3′), which has a single-base modification compared with Palumbi et al. (1991, unpublished manuscript), and with the newly designed primer 16fiR1524 (5′-CCG GTC TGA ACT CAG ATC ACG TAG-3′). Polymerase chain reactions (PCR) with a total volume of 15 μl contained 1.5 μl 10 X reaction buffer, 1.5 μl dNTPs (10 mM), 0.05 μl of each primer (100 pmol/μl), 5 μl DNA-extract, 0.3 μl Teg polymerase (3 U/μl; Prokaria, Reykjavik, Iceland), and 6.6 μl deionized water. The thermal profile began at 94°C for 4 min, followed by 35 cycles of 94°C (30 s), 54°C (30 s), and 72°C (90 s), with a final step of 7 min at 72°C.
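The reaction mix above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original protocol: the dictionary and function names are illustrative, and the primer stock of 100 pmol/μl is treated as 100 μM.

```python
# Components of the 15 µl amplification reaction listed above (volumes in µl).
MIX_UL = {
    "10x reaction buffer": 1.5,
    "dNTPs (10 mM stock)": 1.5,
    "forward primer (100 pmol/µl)": 0.05,
    "reverse primer (100 pmol/µl)": 0.05,
    "DNA extract": 5.0,
    "Teg polymerase (3 U/µl)": 0.3,
    "deionized water": 6.6,
}


def total_volume_ul(mix):
    """Sum of all component volumes in µl."""
    return sum(mix.values())


def final_concentration(stock, volume_ul, total_ul):
    """Dilute a stock concentration into the total reaction volume (same unit as stock)."""
    return stock * volume_ul / total_ul


total = total_volume_ul(MIX_UL)
dntp_mM = final_concentration(10.0, MIX_UL["dNTPs (10 mM stock)"], total)
# 100 pmol/µl is numerically equal to 100 µM.
primer_uM = final_concentration(100.0, MIX_UL["forward primer (100 pmol/µl)"], total)
polymerase_units = MIX_UL["Teg polymerase (3 U/µl)"] * 3.0

print(f"total volume: {total:.2f} µl")
print(f"dNTPs: {dntp_mM:.2f} mM, each primer: {primer_uM:.2f} µM, polymerase: {polymerase_units:.1f} U")
```

The component volumes add up to the stated 15 μl, which is a quick way to catch transcription errors when a recipe is scaled to a different reaction volume.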

PCR products were purified by using the ExoSAP-IT for PCR clean-up (GE Healthcare, Uppsala, Sweden). The newly designed sequencing primer 16fiseq1463 (5′-TGC ACC ATT AGG ATG TCC TGA TCC AAC-3′) was used to sequence one strand of the amplified fragment using the BigDye Terminator Cycle Sequencing Kit (ver. 3.1, PE Biosystems, Foster City, USA). The sequencing reactions were run in an ABI Prism 3730 automated DNA Analyzer (Applied Biosystems, Foster City, USA) according to the manufacturer’s instructions.

Sequence Analysis and Oligonucleotide Probe Design

In the framework of this project, 267 sequences were acquired and 944 sequences were obtained from the EMBL sequence database and other projects, representing approximately 380 species of fish from European seas. Probes were designed based on 230 sequences obtained from 27 species (Table 1). A multiple alignment of these 230 sequences was performed with the programme Clustal W (Thompson et al. 1994) as implemented in BioEdit (version 7.0.4.1; Hall 1999) to ensure that all sequences represent a homologous fragment of the 16S rDNA. Before probe design, gaps were removed from each sequence. A computer program developed by the bioinformatics group of the Centre for Applied Gene Sensor Technology (CAG) and the Zentrum für Technomathematik (ZeTeM), both at the University of Bremen, was used to design species-specific oligonucleotide probes, which ideally cover all sequences of one species and do not match any other species (Nölte 2002). The following criteria were considered: (1) length of 23 to 27 bp, (2) melting temperature (Tm) of 81 to 85°C based on the unified model (SantaLucia 1998), (3) GC content of 52% to 54%, (4) secondary structure of the oligonucleotides and the target sequence, (5) possible dimer formation, and (6) the binding energy between the probe and the target sequence. Minimal free energy (mfe) structures were computed by using RNAfold (Hofacker et al. 1994). Probes that exhibit strong secondary structures or bind to a region of the target with a strong secondary structure were not used. If more than one probe qualified for a species, the one with the highest binding energy between probe and target was chosen. During the design phase, the selected oligonucleotide probes were already tested in silico against 1211 background sequences from approximately 380 fish species.
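Two of the design criteria, probe length and GC content, together with the basic idea of the in silico specificity test, can be illustrated with a short Python sketch. This is not the CAG/ZeTeM program: the function names and the example sequences are hypothetical, exact substring matching is a simplifying assumption, and melting temperature, secondary structure, dimer formation, and binding energy are not modelled.

```python
def gc_fraction(probe):
    """Fraction of G and C bases in an oligonucleotide."""
    probe = probe.upper()
    return sum(base in "GC" for base in probe) / len(probe)


def meets_basic_criteria(probe, min_len=23, max_len=27, min_gc=0.52, max_gc=0.54):
    """Check criteria (1) and (3) only: length of 23-27 nt and GC content of 52-54%."""
    return min_len <= len(probe) <= max_len and min_gc <= gc_fraction(probe) <= max_gc


def is_species_specific(probe, target_seqs, background_seqs):
    """Simplified specificity test: the probe must occur in every target
    sequence and in none of the background sequences (exact match only)."""
    covers_target = all(probe in seq for seq in target_seqs)
    hits_background = any(probe in seq for seq in background_seqs)
    return covers_target and not hits_background


# Hypothetical 25-mer with 13 G/C bases (52% GC); not a probe from this study.
example_probe = "GC" * 6 + "G" + "AT" * 6
targets = ["TTT" + example_probe + "AAA", "CCC" + example_probe + "TTA"]
# Background sequence differing from the probe site by a single base.
background = ["TTT" + example_probe[:-1] + "C" + "AAA"]

print(meets_basic_criteria(example_probe))
print(is_species_specific(example_probe, targets, background))
```

Exact matching is stricter than real hybridisation, where probes can bind targets with one or more mismatches, which is one reason why probes that pass such a screen still need to be tested experimentally.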

Preparation of DNA Microarrays and Hybridisation Experiments

Aminosilane (3-aminopropyltrimethoxysilane)-coated glass slides with a PDITC-linker (1,4-phenylendiisothiocyanate) from Asper Biotech (Tartu, Estonia) were used. Oligonucleotide probes (Thermo Hybaid, Ulm, Germany) with a 5′-amino-C6-modification were spotted in 150 mM Na3PO4 buffer (pH 8.5) at a concentration of 30 μM using a spotting robot based on a modified version of the contactless TopSpot® technology. The spotted volume of the oligonucleotide solution was 200 pl, producing a spot diameter of approximately 220 μm. Each probe was spotted in four replicates per block. An array contained five blocks, and three arrays were spotted on one microarray slide (Fig. 2). After spotting, the microarrays were incubated for 16 h in a wet chamber to ensure efficient covalent binding of the oligonucleotides. Finally, the microarrays were shrink-wrapped under a nitrogen atmosphere and could be stored at 4°C for up to 6 months.

Fig. 2

Layout of the microarray

DNA for hybridisation experiments was amplified and labelled with 5′-Cy5-modified primers. The primers 16sar-L (5′-CGC CTG TTT AAC AAA AAC AT-3′) and 16sbr-H (5′-CCG GTT TGA ACT CAG ATC ACG T-3′) amplify a fragment of approximately 600 bp from the mitochondrial 16S rRNA gene (Palumbi et al. 1991, unpublished manuscript). PCR reactions with a volume of 100 μl contained 10 μl 10 X reaction buffer, 4 μl dNTPs (5 mM), 2 μl of each primer (10 μM), 2 μl DNA-extract, 0.4 μl Taq polymerase (5 U/μl), 2 μl BSA (20 mg/ml), and 77.6 μl deionized water. The PCR thermal profile began at 95°C for 2 min, followed by 35 cycles of 95°C (30 s), 54°C (45 s), and 72°C (60 s), and a final step of 10 min at 72°C. The Cy5-labelled PCR product was purified using the QIAquick PCR Purification Kit (QIAGEN, Hilden, Germany).

Hybridisation experiments were performed with 11 target and 14 nontarget fish species. The nontarget species chosen were fishes that are closely related to the target species and from which false-positive signals could be expected according to the in silico specificity tests performed during the design phase of the probes (Table 2).

Table 2 Nontarget species tested in hybridisation experiments

A positive control at a concentration of 1 nM (5′-CGT GTG AGT CGA TGG ATC ATA-3′; 5′-Cy3-labelled) and 10 nM of the purified Cy5-labelled PCR product were hybridised to the microarray in a volume of 65 μl using GeneFrames® (ABgene House, Epsom, UK), which were applied to the microarray slides according to the manufacturer’s instructions (Fig. 2). Hybridisation was conducted at 50°C in a hybridisation oven. After 2 h of hybridisation, the GeneFrames® were removed and the microarrays were washed for 5 min each with 2 × SSC (sodium chloride trisodium citrate) buffer containing 0.05% SDS (sodium dodecyl sulphate), 1 × SSC containing 0.05% SDS, and 1 × SSC. Finally, the microarrays were dried in a centrifuge at 2000 rpm for 3 min.

Measurement of Fluorescence Signals and Data Analysis

Hybridisation signals were measured using an Axon 4000B fluorescence microarray scanner at 635 nm (Cy5) as well as at 532 nm (Cy3). The fluorescence signal analysis was conducted with the software GenePix 4.1 (Axon, Union City, USA). The fluorescence signals of the replicate spots of each probe were measured and their arithmetic mean was calculated. Data were removed from the analysis if the spots showed artefacts caused during the spotting process (e.g., inhomogeneous spots documented by a monitoring camera during spotting) or experimental artefacts (e.g., air bubbles). Background noise was corrected by subtracting the arithmetic mean of the negative control measurement from the arithmetic mean of the spot measurements. Negative values were set to zero.
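The signal processing described above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the GenePix workflow itself: the function name, the way artefact spots are flagged by index, and the example values are hypothetical.

```python
from statistics import mean


def corrected_signal(spot_values, negative_control_values, flagged=()):
    """Background-corrected probe signal.

    spot_values: fluorescence readings of the replicate spots of one probe.
    negative_control_values: readings of the negative control spots.
    flagged: indices of replicate spots excluded because of artefacts.
    Returns None if every replicate was excluded.
    """
    excluded = set(flagged)
    kept = [value for index, value in enumerate(spot_values) if index not in excluded]
    if not kept:
        return None
    # Subtract the mean negative control from the mean spot signal,
    # and set negative results to zero.
    return max(0.0, mean(kept) - mean(negative_control_values))


# Hypothetical readings in arbitrary fluorescence units.
print(corrected_signal([1000, 1200, 1100, 900], [100, 120, 80, 100]))
# Third replicate flagged as an artefact (e.g., an air bubble).
print(corrected_signal([1000, 1200, 5000, 800], [100, 120, 80, 100], flagged=[2]))
```

Clamping at zero means that probes with signals below the negative control are reported as showing no hybridisation rather than a negative intensity.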

Results

Probes for 11 commercially important fish species have been designed (Table 3). Positions of the oligonucleotide probes in the 16S rDNA fragment used for probe design are given in Fig. 3. Following the nomenclature of Ortí et al. (1996), the binding sites of the microarray probes for Engraulis encrasicolus, Sparus aurata, and Trigla lyra are located in the variable region j, whereas the probes for Boops boops, Helicolenus dactylopterus, Lophius budegassa, Pagellus acarne, Scomber scombrus, Scophthalmus rhombus, Serranus cabrilla, and Trachurus trachurus bind to the variable region l.

Table 3 Oligonucleotide probes for the identification of fish species from European seas
Fig. 3

Alignment (5′ > 3′) of representative 16S rDNA sequences from the target species with binding sites (light grey) of probes (5′ > 3′; probes hybridise to the reverse complementary target strand). Double stranded (dark grey) and single stranded regions (grey) of the secondary structure are indicated in the reference sequence of Pygoplites nattereri (Ortí et al. 1996; Accession number: U33590)

All single target hybridisations of the Cy5-labelled 16S rDNA fragment gave true-positive fluorescence signals for the corresponding probe (Fig. 4), but the hybridisation efficiency varied widely. The weakest probe-target pair gave a signal intensity of approximately 1,000 arbitrary units and the strongest almost 30,000 (Table 4). Under the experimental conditions selected, single target hybridisations with B. boops, E. encrasicolus, H. dactylopterus, L. budegassa, S. rhombus, S. aurata, T. trachurus, and T. lyra did not show any false-positive signals. Under these conditions, eight probes gave very weak false-positive signals with P. acarne, S. scombrus, and S. cabrilla, representing approximately 7% of all possible cross-hybridisation reactions. The values of these false-positive signals were several orders of magnitude lower than the true-positive signals.

Figure 4

Signals (background subtracted from absolute signal) of single target and nontarget hybridisations. White bars represent true-positive signals; false-positive signals are shown as grey bars. Numbers at the base of the bars indicate the number of measured spots. The number of hybridisations is given in brackets after target and nontarget names. Replication and absolute signal intensities (± standard deviation) of hybridised targets to the corresponding probe are given in Table 4.

Table 4 Target hybridisations

Single nontarget hybridisations showed 48 very weak false-positive signals, representing 31% of 154 possible cross-hybridisations. These false-positive signals were at least one order of magnitude weaker than the true-positive signal. Merlangius merlangus and Merluccius merluccius did not give any false-positive signal, whereas Dentex dentex, Diplodus vulgaris, Gadus morhua, Melanogrammus aeglefinus, Micromesistius poutassou, Mullus surmuletus, Pollachius pollachius, Pollachius virens, Psetta maxima, Serranus hepatus, Trachurus mediterraneus, and Trachurus picturatus showed two to eight cross-hybridisations each. Considering all possible cross-hybridisations in target and nontarget experiments together, only approximately 21% showed false-positive signals, which were usually very weak.
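The cross-hybridisation rates can be recomputed from the reported counts. The sketch below is a hedged cross-check in Python, not part of the original analysis. It assumes that the eight weak false-positive signals in the target experiments refer to 11 targets × 10 non-matching probes = 110 possible combinations (consistent with the reported 7%), and that the 154 nontarget combinations are 14 nontarget species × 11 probes.

```python
def percentage(observed, possible):
    """Share of observed false-positive signals among all possible combinations, in percent."""
    return 100.0 * observed / possible


N_PROBES = 11      # one probe per target species
N_TARGETS = 11
N_NONTARGETS = 14

# Assumption: each target can cross-hybridise with the 10 probes of the other targets.
target_possible = N_TARGETS * (N_PROBES - 1)
nontarget_possible = N_NONTARGETS * N_PROBES

target_false_positives = 8
nontarget_false_positives = 48

target_rate = percentage(target_false_positives, target_possible)
nontarget_rate = percentage(nontarget_false_positives, nontarget_possible)
pooled_rate = percentage(
    target_false_positives + nontarget_false_positives,
    target_possible + nontarget_possible,
)

print(f"targets: {target_rate:.1f}%, nontargets: {nontarget_rate:.1f}%, pooled: {pooled_rate:.1f}%")
```

Under these assumptions, the pooled figure is 56 false-positive signals out of 264 possible combinations.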

Discussion

The results show that the “Fish Chip” described enables the identification of 11 commercially important fish species from European seas within certain experimental limits. These limits arise primarily because the fluorescence signal intensities of true-positive hybridisation signals were heterogeneous. This phenomenon is commonly encountered in DNA microarray experiments (Peplies et al. 2003; Warsen et al. 2004; Korimbocus et al. 2005; Rønning et al. 2005; Tobler et al. 2006) and can probably be overcome only by an extreme methodological effort (Shi et al. 2006). The problems encountered when single-colour microarray experiments need to be quantified are severe and in part not yet solved, because complex parameters influence the results. These are the sequence-dependent hybridisation efficiency, specifically steric hindrance, secondary structures (Southern et al. 1999), and the relative position of the fluorescent label at the target (Zhang et al. 2005).

In some cases, duplex formation can be favoured by using spacers, which apparently give the capture probes a greater degree of freedom from their neighbouring molecules and from the surface (Southern et al. 1999), leading to an enhancement of signal intensity with increasing spacer length (Peplies et al. 2003).

An important criterion of probe design was the base composition (Southern et al. 1999); therefore, all but one of the capture oligonucleotides were designed to have a GC content of 52%, the exception having 54% (Table 3). The different hybridisation efficiencies of the capture probes and the varying sensitivity for the different fishes are still a severe disadvantage, because they can hamper the detection of a small number of fishes of one species in the presence of a larger number of individuals of another species in a mixed sample. These limits are presently under investigation with fish eggs and other biological material.

Because all oligonucleotide probes bind to the variable regions j and l of the 16S rRNA gene, which represent large single-stranded loops, the secondary structure is unlikely to be a factor contributing to the partially low sensitivity. Also, the POL effect, a phenomenon in which the fluorescence signal decreases with increasing distance between the probe binding site and the label on the target (Zhang et al. 2005), does not seem to be important in this case. Most of the probes used in this study hybridise to the variable region l of the 16S rDNA, and their distance from the fluorescence label is more or less identical.

Although cross-hybridisations occur, true-positive signals could clearly be differentiated from false-positive signals because of their generally higher signal. Most false-positive signals occurred when the 16S rDNA fragment of nontarget species was hybridised on the microarray.

Testing of nontarget species is seemingly important in studies using DNA microarrays for the identification of organisms, because even if a comprehensive sequence background is utilised for probe design and even if extensive in silico testing has been performed, the specificity of oligonucleotide probes has to be evaluated with closely related species and specifically those that show cross-hybridisations in silico.

This study shows that the 16S rRNA gene of fishes is suitable to design oligonucleotide probes that are able to differentiate eleven fish species from European seas by single target hybridisation on a microarray. Such a “Fish Chip” can hopefully be applied in marine environmental and fisheries research, as well as in fisheries and food control if the uneven hybridisation signal intensities of the different probe-target pairs can be improved or compensated.

The correct identification of fish eggs and larvae is crucial for fish stock assessment based on ichthyoplankton surveys. Genetic identification has shown that the majority of eggs in the Irish Sea, wrongly believed to be from cod, were actually from whiting, leading to an overestimation of cod stocks (Fox et al. 2005). A study on food fish in the United States revealed that three-quarters of fish sold as “red snapper” were mislabelled and belonged to other species (Marko et al. 2004), a situation that needs better analytical tools to be changed. The European Union (EU) also has strict regulations for seafood labelling, which must include, for example, the species name (EU Council Regulation No 104/2000; EU Commission Regulation No 2065/2001). Approximately 420 species of fish are sold in the German market alone, making a reliable identification method urgently necessary to protect the customer. DNA microarrays might have the potential to fulfill these requirements.

Recent efforts in compiling sequences of fishes, such as the European projects “FishTrace” (http://www.fishtrace.org) and “Fish & Chips” (http://www.fish-and-chips.uni-bremen.de; Kochzius et al. 2007), as well as the international “Fish Barcode of Life Initiative” (http://www.fishbol.org; Ward et al. 2005), will provide the necessary sequence background for the design of species-specific oligonucleotide probes for the development of DNA microarrays for the identification of fishes.