3.1 Construction of Whole Genome DNA Library of E. curvistellata
A whole-genome DNA library of E. curvistellata was constructed using a next-generation DNA sequencer. Three sequencing runs in total yielded 27,939,250 reads, which were assembled into 442,583 contigs with an average length of 427 nucleotides and a median length of 420 nucleotides (Table 16.1). The longest contig is 145,960 nucleotides and the shortest is 18 nucleotides. The total number of assembled nucleotides was 190,209,345. This number can be taken as a rough estimate of the genome size of the species and is reasonably close to that of Amphimedon queenslandica, 167 Mbp (Srivastava et al. 2010).
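The summary values reported for the assembly (contig count, total nucleotides, average, median, longest, and shortest contig) can be derived from a list of contig lengths. The following is a minimal sketch in Python; the function name `assembly_stats` and the toy lengths are hypothetical and are not taken from the E. curvistellata data or from the assembler actually used.

```python
from statistics import mean, median


def assembly_stats(contig_lengths):
    """Summarize a list of contig lengths given in nucleotides."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    return {
        "contigs": len(contig_lengths),
        "total_nt": sum(contig_lengths),    # rough proxy for genome size
        "mean_nt": mean(contig_lengths),
        "median_nt": median(contig_lengths),
        "longest_nt": max(contig_lengths),
        "shortest_nt": min(contig_lengths),
    }


# Toy input: hypothetical lengths, NOT the E. curvistellata contigs.
toy_lengths = [18, 400, 420, 440, 900]
stats = assembly_stats(toy_lengths)
print(stats)
```

Using the summed contig length as a genome-size estimate, as in the text, assumes that the contigs neither overlap substantially nor omit large parts of the genome.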
Table 16.1 Summary of whole genome sequencing and assembly
To evaluate the quality of the library, we ran a BLAST search with the Aphrocallistes vastus Cox3 gene (GenBank accession no. EU000309.1) (Rosengarten et al. 2008) as a query. As a result, we obtained a single contig, contig_1075, containing not only the Cox3 gene but also the whole mitochondrial DNA sequence (19,700 bp). This result indicates that the library is qualitatively sound and can be useful for gene searching.
3.2 Search for Silicatein Gene
The axial filament was obtained in intact form by dissolving the silica spicules of T. aurantia (Shimizu et al. 1998). Although an axial filament was observed in cross sections of Euplectella silica spicules under the scanning electron microscope (Weaver et al. 2007), neither axial filaments nor any other filamentous material was obtained in our attempt. The extract contained proteins, but none of the proteins we examined had silicatein sequences. On the other hand, a partial silicatein cDNA from E. aspergillum has been deposited in the DNA sequence database (FR748156) (Table 16.2). In addition, silicateins or silicatein-like sequences have been reported from the hexactinellid sponges Aulosaccus sp. (Veremeichik et al. 2011), C. meyeri (Müller et al. 2008a), and M. chuni (Müller et al. 2008b).
Table 16.2 Identification of the genes related to hexactinellid biosilica
To test for a silicatein sequence in the E. curvistellata genome, a local BLAST search was run with these hexactinellid silicatein sequences, as well as the T. aurantia silicateins, as queries and the E. curvistellata genomic DNA library as the database. For the partial silicatein cDNA from E. aspergillum, no contig was hit with an E value below 10. Similarly, no hit was obtained when T. aurantia silicateins α (AF032117) and β (AF098670), Aulosaccus sp. silicatein-like (ACU86976), C. meyeri silicatein (CAP49202), and M. chuni silicatein (CAZ04880) were used as queries.
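The screening criterion used above (no contig hit with an E value below 10) can be expressed as a simple filter over BLAST results. The sketch below assumes the results are available in the default 12-column tabular format of BLAST+ (`-outfmt 6`); the text does not state which BLAST program or output format was actually used, and the function name `filter_hits` and the example rows are hypothetical.

```python
def filter_hits(tabular_text, max_evalue=10.0):
    """Return (query, subject, evalue) tuples with an E value below max_evalue.

    Assumes the 12 default columns of BLAST+ tabular output (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        evalue = float(fields[10])
        if evalue < max_evalue:
            hits.append((fields[0], fields[1], evalue))
    # Best (smallest) E value first.
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical rows for illustration only; not results from this study.
example = "\n".join([
    "query_A\tcontig_a\t85.0\t120\t18\t0\t1\t120\t300\t659\t2e-30\t110",
    "query_A\tcontig_b\t40.0\t30\t18\t0\t5\t34\t10\t99\t25\t20.1",
])
significant = filter_hits(example)
print(significant)
```

An empty list returned for a given query corresponds to the "no hit" outcome reported for the silicatein queries.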
The amino acid sequence KNSWG is widely conserved in silicateins and cathepsin L: residues 296–300 of T. aurantia silicatein α (Shimizu et al. 1998, AF032117), 296–300 of Suberites domuncula silicatein (Krasko et al. 2000, AJ272013), 292–296 of S. domuncula cathepsin L (Müller et al. 2003, AJ784224), and 299–303 of human cathepsin L (Gal and Gottesman 1988, X12451). When this sequence was used as a query, contig_7117 (8869 bp) was hit. The contig contains the stop codon, but not the first Met. The 5′ region, contig_50860, was obtained by running a BLAST search with contig_7117 as a query. The coding region spans 1343 bp in total and comprises 4 exons, encoding 324 amino acid residues, and 3 introns. The predicted protein sequence is more similar to those of sponge and human cathepsin L than to those of silicateins. In addition, the cysteine at the position corresponding to the catalytic residue and the surrounding sequences in cathepsin L are conserved in the contig, indicating that the gene encodes cathepsin L and not silicatein. The exon boundaries in the cathepsin L gene of E. curvistellata are identical to those of exon 2/exon 3, exon 3/exon 4, and exon 4/exon 5 in the human cathepsin L gene, which consists of eight exons and seven introns (Chauhan et al. 1993), suggesting that the exon-intron structure of cathepsin L in E. curvistellata is conserved in the human gene.
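The idea of locating a short conserved peptide such as KNSWG in genomic DNA can be illustrated with a plain six-frame translation scan. This is a simplified stand-in for the BLAST search described above, not the procedure the authors used: it finds only exact matches and ignores introns. The standard genetic code is assumed, and the function names and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate DNA from its first base; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_motif(dna, motif="KNSWG"):
    """Return (strand, frame, offset) for each exact motif occurrence.

    frame is 1, 2, or 3; offset is the 0-based nucleotide position of the
    first motif codon on the scanned strand ('-' = reverse complement).
    """
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    hits = []
    for strand, seq in (("+", dna), ("-", reverse)):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(motif)
            while pos != -1:
                hits.append((strand, frame + 1, frame + 3 * pos))
                pos = protein.find(motif, pos + 1)
    return hits


# Toy sequence (hypothetical): two padding bases, then codons for K-N-S-W-G.
toy = "CC" + "AAAAATTCTTGGGGT" + "A"
print(find_motif(toy))
```

Because a five-residue motif can be split by an intron or occur by chance, a hit from such a scan would still need to be confirmed by examining the surrounding sequence, as was done for the contig in the text.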
The result of our BLAST search for silicatein in the E. curvistellata genome is consistent with the fact that no silicatein or silicatein-like proteins were obtained on dissolution of the silica spicules. A previous transcriptome analysis concluded that no silicatein gene was identified in the hexactinellid A. vastus (Riesgo et al. 2015). It is therefore unlikely that silicatein exists in all species of the class Hexactinellida. However, before a conclusion is drawn, further research should be performed on the hexactinellid species for which evidence of silicateins has been reported.
3.3 Search for Genes Associated with Silicon Biomineralization
Genes for glassin were assigned by conducting a similarity search with the glassin cDNA as a query. Two contigs, contig_14569 and contig_22997, cover the 5′ and 3′ regions of the glassin gene, respectively, and overlap each other. Some mismatches were observed in the overlapping region and in the 3′ region, indicating that the assembly is incomplete in complicated regions such as those containing repetitive sequences. Therefore, further refinement of the library may be required.
Ehrlich and Worch (2007) reported chitin in E. aspergillum as an organic component of its siliceous skeletal system. A gene encoding chitin synthase was searched for using A. queenslandica chitin synthase 2-like and 3-like protein sequences (XP_011402997 and XP_003389565, respectively) as queries, and contig_18557 (12,682 bp), containing a 4335 bp open reading frame encoding 1445 amino acid residues, was obtained. Our result suggests that Euplectella is capable of chitin synthesis and is thus consistent with the previous observation of chitin in Euplectella.