Cell types in Komodo dragon blood
A sample of blood was obtained from a Komodo dragon named Tujah at the Saint Augustine Alligator Farm Zoological Park in accordance with required safety and regulatory procedures, and with appropriate approvals. At the time of collection, we were interested in collecting both genomic DNA for sequencing and mRNA to generate a cDNA library to facilitate our proteomic studies. In birds, the heterophils (white blood cells) are known to express multiple antimicrobial peptides [30]. Antimicrobial peptides identified from chicken heterophils exhibit significant antimicrobial [31, 32] and host-directed immunomodulatory activities [29]. Accordingly, after obtaining an initial sample of fresh Komodo dragon blood, we allowed the white blood cells to settle out of the blood and collected them because they were likely to be involved in antimicrobial peptide expression. The collected Komodo dragon white blood cells were then divided evenly, with half processed for the isolation of genomic DNA in preparation for sequencing and library generation, and the other half reserved for mRNA extraction for our proteomic studies.
We then performed smears and identified the various cell types that we observed. Immune cell identification in Komodo dragon blood is challenging due to limited published literature for reference. The various cell types that were observed in Wright-stained blood smears are shown in Fig. 2. We identified these cells based on similarity to the immune cells we had previously identified in the American alligator blood [12]. Of interest were the large and elongated nucleated red blood cells of this reptile. In addition, we were able to identify heterophils (similar to granulocytes), a probable source of cathelicidin peptides, as well as monocyte and lymphocyte cells.
A second sample of Komodo dragon blood was later collected and processed for genomic DNA extraction by Dovetail Genomics for additional sequencing. The researchers at Dovetail Genomics did not separate white blood cells, and instead extracted DNA from cells pelleted directly from whole blood.
Assembly and annotation of the Komodo dragon genome
Previous analyses of Komodo dragon erythrocytes using flow cytometry estimated the genome to be approximately 1.93 Gb in size [33]. Using deep Illumina sequencing and Dovetail approaches, we obtained a draft genome assembly of 1.60 Gb, similar in size to the 1.78 Gb genome of the lizard A. carolinensis [34]. The draft assembly contains 67,605 scaffolds with an N50 of 23.2 Mb (Table 1). A total of 17,213 genes were predicted, and 16,757 (97.35%) of them were annotated. Completeness estimates with CEGMA [35] were 56% (‘complete’) and 94% (‘partial’). The estimated percentage of repeats in the genome is 35.05%, with the majority being LINEs (38.4%) and SINEs (5.56%) (Additional file 1: Fig. S1 & Additional file 2: Table S1). Genomic data will be available at NCBI, with raw sequencing reads deposited in the Sequence Read Archive (#SRP161190) and the genome assembly at DDBJ/ENA/GenBank under the accession #VEXN00000000. The assembly version described in this paper is VEXN01000000.
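As a minimal sketch of how the scaffold N50 and the annotated-gene percentage quoted above are conventionally computed: the N50 is the length of the scaffold at which the cumulative length, counted from the longest scaffold downward, first reaches half of the total assembly length. The function name and the toy scaffold lengths below are illustrative assumptions, not data from the Komodo dragon assembly; only the gene counts are taken from the text.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (standard definition)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy scaffold lengths (hypothetical, not the Komodo dragon scaffolds).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))

# Gene counts reported in the text: 16,757 annotated out of 17,213 predicted.
annotated_pct = 100 * 16757 / 17213
print(round(annotated_pct, 2))
```

On real data, the lengths would be read from the assembly FASTA or an index file rather than typed in by hand.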
Table 1 Genome assembly attributes
Identification of potential innate immunity and antimicrobial peptide genes
Innate immunity in reptiles is a critical aspect of their evolutionary success, but it remains poorly understood in these animals. Innate immunity is defined as those aspects of immunity that are not antibodies and not T-cells. Innate immune responses to invading pathogens can include the expression of cytokines; the activation and recruitment of macrophages, leukocytes and other white blood cells; and the expression of antimicrobial peptides such as defensins and cathelicidins [13, 15].
We have taken a genomics-based approach [36] to identifying innate immunity genes in the Komodo dragon genome in this work. We have sequenced the Komodo genome and examined it for genes and clusters of important innate immunity antimicrobial peptide genes (β-defensins, ovodefensins and cathelicidins), which are likely involved in expressions of innate immunity in this giant lizard.
β-Defensin and related genes in Komodo genome
Defensins are one example of disulfide-stabilized antimicrobial peptides, with β-defensins being a uniquely vertebrate family of disulfide-stabilized, cationic antimicrobial peptides involved in the resistance to microbial colonization at epithelial surfaces [37,38,39]. The β-defensin peptides are defined by a characteristic six-cysteine motif with conserved cysteine residue spacing (C–X6–C–X(3–5)–C–X(8–10)–C–X6–CC) [40] and associated disulfide bonding pattern (Cys1-Cys5, Cys2-Cys4 and Cys3-Cys6); however, variations in the number of and spacing between cysteine residues have been observed. As with other cationic antimicrobial peptides, β-defensins typically exhibit a net positive (cationic, basic) charge.
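The six-cysteine spacing motif described above can be written as a regular expression. The following is a rough sketch rather than the search procedure used in this study: the function names, the `last_gap` parameter, and the synthetic test sequences are all assumptions introduced for illustration, and X is taken to mean any residue.

```python
import re


def beta_defensin_pattern(last_gap=(6, 6)):
    """Build a regex for C-X6-C-X(3-5)-C-X(8-10)-C-X6-CC.

    last_gap is a hypothetical knob giving the allowed (min, max) number of
    residues before the final CC pair, so the motif can be relaxed.
    """
    lo, hi = last_gap
    return re.compile(
        r"C[A-Z]{6}C[A-Z]{3,5}C[A-Z]{8,10}C[A-Z]{%d,%d}CC" % (lo, hi)
    )


def find_motif(seq, last_gap=(6, 6)):
    """Return the first motif match in seq, or None if there is none."""
    match = beta_defensin_pattern(last_gap).search(seq.upper())
    return match.group(0) if match else None


# Synthetic sequences (not real peptides) with 6 and 5 residues before CC.
canonical = "MK" + "C" + "A" * 6 + "C" + "G" * 4 + "C" + "S" * 9 + "C" + "K" * 6 + "CC" + "R"
short_tail = "MK" + "C" + "A" * 6 + "C" + "G" * 4 + "C" + "S" * 9 + "C" + "K" * 5 + "CC" + "R"

print(find_motif(canonical) is not None)
print(find_motif(short_tail) is not None)
print(find_motif(short_tail, last_gap=(5, 7)) is not None)
```

A strict pattern of this kind would miss domains whose spacing departs from the canonical motif, which is why the final gap is left adjustable in this sketch.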
One of the first extensive reports of an in vivo role for β-defensin peptide expression in reptiles is the inducible expression of β-defensins in wounded anole lizards (Anolis carolinensis) [10, 11, 14, 41,42,43]. Reptile neutrophils appear to have granules that contain both cathelicidin-like peptides and β-defensin peptides. β-defensin-like peptides are also found in reptile eggs [26]. It is well known that some species of lizard can lose their tails as a method of predator escape, and that these tails then regenerate from the wound site without inflammation or infection. β-defensin peptides are expressed both within the azurophilic granulocytes in the wound bed and in the associated epithelium [41, 43], and are observed in phagosomes containing degraded bacteria. There is a distinct lack of inflammation in the wound, which is associated with regeneration, and two β-defensins in particular are expressed at high levels in the healing tissues [10, 42]. Overall, there appears to be a significant role for the β-defensins in wound healing and regeneration in the anole lizard [44].
β-defensin genes have been generally observed to reside in clusters within the genomes of vertebrates [45, 46]. In humans, as many as 33 β-defensin genes were identified in five clusters [47, 48]. Recently, analyses of the genomes of several avian species including duck, zebra finch and chicken revealed that the genome of each species contained a β-defensin cluster [49,50,51,52]. A β-defensin-like gene cluster has recently been identified in the anole lizard (Prickett, M.D., unpublished work in progress), which is closely related to the Komodo dragon [13]. Interestingly, the cathepsin B gene (CTSB) has been identified as a strong marker for β-defensin clusters in humans, mice, and chickens [51]. Thus, we examined the Komodo genome for the cathepsin B gene (CTSB) as a potential marker to aid in the identification of the β-defensin cluster(s) therein.
Through these analyses, we identified a total of 66 potential β-defensin genes in the Komodo dragon genome, of which 18 are thought to be Komodo dragon-specific β-defensin genes (Table 2). The β-defensin genes identified from the Komodo dragon genome exhibit variations in cysteine spacing, gene size, the number of cysteine residues that comprise the β-defensin domain, and the number of β-defensin domains. With respect to the conserved cysteine residue spacing, especially at the end (C–X6–C–X(3–5)–C–X(8–10)–C–X6–CC), we found considerable variability in our analysis of the β-defensin genes in the Komodo dragon genome: five Komodo dragon β-defensin genes have seven residues between the last cysteines, 16 have six residues between the last cysteines, 42 have five residues between the last cysteines, and three Komodo dragon β-defensin genes exhibit more complex cysteine-residue spacing patterns (Table 2).
Table 2 Identified Komodo dragon defensin genes grouped based on scaffold locations of gene clusters
As with birds and other reptiles, the majority of Komodo dragon defensin genes appear to reside in two separate clusters within the same syntenic block (Fig. 3). One cluster is a β-ovodefensin cluster flanked on one end by the gene for XK, Kell blood group complex subunit-related family, member 6 (XKR6) and on the other end by the gene for Myotubularin related protein 9 (MTMR9). The intercluster region of circa 400,000 bp includes the genes for Family with sequence similarity 167, member A (FAM167A); BLK proto-oncogene, Src family tyrosine kinase (BLK); Farnesyl-diphosphate farnesyl transferase 1 (FDFT1); and CTSB (cathepsin B), which is a flanking gene for the β-defensin cluster (Fig. 3). In birds, turtles, and crocodilians, the other end of the β-defensin cluster is followed by the gene for Translocation associated membrane protein 2 (TRAM2). As is the case with all of the other squamate (lizards and snakes) genomes surveyed, the flanking gene for the end of the β-defensin cluster cannot be definitively determined at present as there are no squamate genomes with intact clusters available.
The end of the cluster could be flanked by XPO1, by TRAM2, or by neither. Two of the three genes found on scaffold 45 with TRAM2 (VkBD80a, VkBD80b) are nearly identical and potentially the result of an assembly artifact. The genes are orthologs of the final gene in the avian, turtle, and crocodilian β-defensin clusters. The anole ortholog of this gene is isolated and is not associated with TRAM2, XPO1, or any other β-defensins, and there are no β-defensins found in the proximity of anole TRAM2. Two of the seven genes associated with XPO1 have orthologs with one of the five anole genes associated with XPO1, but it cannot be determined in either species whether these are part of the rest of the β-defensin cluster or part of an additional cluster. The snake orthologs are associated with TRAM2 but are not part of the cluster.
Structural diversity
Diversity can be seen in variations in structure of the β-defensin domain. Typically, a β-defensin consists of 2–3 exons: a signal peptide, an exon with the propiece and β-defensin domain with six cysteines, and in some cases, a short third exon. Variations in the number of β-defensin domains, exon size, exon number, atypical spacing of cysteines, and/or the number of cysteines in the β-defensin domain can be found in all reptilian species surveyed (unpublished). There are three β-defensins with two defensin domains (VkBD7, VkBD34, and VkBD43) and one with three defensin domains (VkBD39). The Komodo dragon β-defensin genes VkBD12, VkBD13, and VkBD14 and their orthologs in anoles have atypically large exons. The group of β-defensins between VkBD16 and VkBD21 also have atypically large exons. Atypical spacing between cysteine residues is found in three β-defensins, VkBD20 (1–3–9–7), VkBD57 (3–4–8–5), and VkBD79 (3–10–16–6). There are four β-defensins with additional cysteine residues in the β-defensin domain: VkBD6 with 10 cysteine residues, and a group of three β-defensins, VkBD16, VkBD17, and VkBD18, with eight cysteine residues.
The two β-defensin domains of VkBD7 are homologous to the one β-defensin domain of VkBD8 with orthologs in other species of Squamata. In the anole lizard A. carolinensis there are two orthologs, LzBD6 with one β-defensin domain and the non-cluster LzBD82 with two β-defensin domains. The orthologs in snakes (SnBD5 and SnBD6) have one β-defensin domain. VkBD34 is an ortholog of LzBD39 in anoles and SnBD15 in snakes. VkBD39 and VkBD43 consist of three and two homologous β-defensin domains respectively, which are homologous to the third exons of LzBD52, LzBD53, and LzBD55, all of which have two non-homologous β-defensin domains. VkBD40 with one β-defensin domain is homologous to the second exons of LzBD52, LzBD53, LzBD54 (with one defensin domain), and LzBD55.
An increase in the number of cysteines in the β-defensin domain results in the possibility of forming additional disulfide bridges. Examples of this variation can be found in the psittacine β-defensin, Psittaciforme AvBD12 [52]. The β-defensin domain of VkBD6 appears to consist of 10 cysteines, four of which are part of an extension after a typical β-defensin domain with an additional paired cysteine (C-X6-C-X4-C-X9-C-X6-CC-X7-C-X7-CC-X5-C). The group of Komodo β-defensins VkBD16, VkBD17, and VkBD18, in addition to having an atypical cysteine spacing, also have eight cysteines within a typical number of residues. The β-defensin following this group, VkBD19, is a paralog of these three genes; however, its β-defensin domain contains the more typical six cysteine residues.
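The spacing notation used for these domains can be recovered from a peptide sequence by measuring the number of residues between consecutive cysteines. The sketch below is illustrative only: both helper names are hypothetical, and because the VkBD6 sequence itself is not given here, a placeholder domain is rebuilt from the spacing pattern quoted in the text.

```python
def cysteine_gaps(domain):
    """Return the number of residues between each pair of consecutive cysteines."""
    positions = [i for i, aa in enumerate(domain.upper()) if aa == "C"]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def domain_from_gaps(gaps, filler="X"):
    """Build a placeholder domain (filler residues only) with the given gaps."""
    return "C" + "".join(filler * gap + "C" for gap in gaps)


# Gaps read off the VkBD6 pattern C-X6-C-X4-C-X9-C-X6-CC-X7-C-X7-CC-X5-C.
vkbd6_gaps = [6, 4, 9, 6, 0, 7, 7, 0, 5]
placeholder = domain_from_gaps(vkbd6_gaps)

print(placeholder.count("C"))      # number of cysteines in the pattern
print(cysteine_gaps(placeholder))  # recovers the gap list above
```

A gap of 0 corresponds to an adjacent CC pair, so the pattern above contains ten cysteines in total, in agreement with the count given for VkBD6.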
The gene structures of these Komodo β-defensin genes are subject to confirmation with supporting evidence. There are a number of atypical structure elements in anole lizards including additional non β-defensin domain exons or larger exons.
Analyses of the peptide sequences encoded by the newly identified Komodo dragon β-defensin genes revealed that the majority (53 out of 66) of them are predicted to have a net positive charge at physiological conditions, as is typical for this class of antimicrobial peptide (Table 3). However, it is notable that four peptides (VkBD10, VkBD28, VkBD30 and VkBD34) are predicted to be weakly cationic or neutral (+0.5 to 0) at pH 7, while nine peptides (VkBD3, VkBD4, VkBD11, VkBD19, VkBD23, VkBD26, VkBD35, VkBD36 and VkBD37) are predicted to be weakly to strongly anionic. These findings suggest that, while these peptides exhibit canonical β-defensin structural features and reside in β-defensin gene clusters, one or more of these genes may not encode β-defensin-like peptides or canonical β-defensins, because β-defensins are typically cationic and their positive charge contributes to their antimicrobial activity.
Table 3 Physical properties of identified β-defensin peptides
Identification of Komodo dragon ovodefensin genes
Ovodefensin genes have been found in multiple avian and reptile species [26], with expression found in egg white and other tissues. Ovodefensins including the chicken peptide gallin (Gallus gallus OvoDA1) have been shown to have antimicrobial activity against the Gram-negative E. coli and the Gram-positive S. aureus. Presumptive β-ovodefensins are found in a cluster in the same syntenic block as the β-defensin cluster in birds and reptiles. There have been 19 β-ovodefensins found in A. carolinensis (one with an eight-cysteine β-defensin domain) and five in snakes (four with an eight-cysteine β-defensin domain) (Prickett, M.D., unpublished work in progress). The Komodo dragon cluster consists of six β-ovodefensins (Tables 4 and 5). Two of these may be Komodo dragon specific. VkOVOD1, which is a pseudogene, is an ortholog of SnOVOD1 in addition to the first β-ovodefensin in turtles and crocodilians. The defensin domains of VkOVOD3, VkOVOD4, and VkOVOD6 each consist of eight cysteines; these genes are orthologs of SnOVOD2, SnOVOD3, and SnOVOD5, respectively. VkOVOD4 and VkOVOD6 are orthologs of LzOVOD14.
Table 4 Ovodefensin peptides predicted in the Komodo dragon genome
Table 5 Physical properties of identified ovodefensin peptides
Identification of the Komodo dragon cathelicidin genes
Cathelicidin peptide genes have recently been identified in reptiles through genomics approaches [13]. Several cathelicidin peptide genes have been identified in birds [52, 54,55,56,57,58], snakes [59, 60] and the anole lizard [11, 14, 61]. The release of functional cathelicidin antimicrobial peptides has been observed from chicken heterophils, suggesting that reptilian heterophils may also be a source of these peptides [30, 62]. Alibardi et al. have identified cathelicidin peptides expressed in anole lizard tissues, including in association with heterophils [11, 14, 61]. Cathelicidin antimicrobial peptides are thought to play key roles in innate immunity in other animals [29] and so likely play this role in the Komodo dragon as well.
In anole lizards, the cathelicidin gene cluster, consisting of 4 genes, is organized as follows: <FASTK> cathelicidin cluster <KLHL18>. We searched for a similar cathelicidin cluster in the Komodo dragon genome. Searching the Komodo dragon genome for cathelicidin-like genes revealed a cluster of three genes that have a “cathelin-like domain”, which is the first requirement of a cathelicidin gene, located at one end of scaffold 84. However, this region of scaffold 84 has assembly issues with gaps, isolated exons, and duplications. Identified Komodo dragon cathelicidin genes have been named after their anole orthologs. Two of the Komodo dragon cathelicidins (Cathelicidin2 and Cathelicidin4.1) are in sections with no assembly issues. By contrast, Cathelicidin4.2 was constructed using a diverse set of exons 1–3 and a misplaced exon 4 to create a complete gene, which is paralogous to Cathelicidin4.1. As the cluster is found at one end of the scaffold, there may be additional unidentified cathelicidins that are not captured in this assembly.
A common feature of cathelicidin antimicrobial peptide gene sequences is that the N-terminal cathelin domain encodes at least 4 cysteines. In our study of alligator and snake cathelicidins, we also noted that a three-residue pattern consisting of VRR or a similar sequence typically follows the last cysteine and immediately precedes the predicted C-terminal cationic antimicrobial peptide [12, 13, 15, 60, 63]. Additional requirements of a cathelicidin antimicrobial peptide gene sequence are that it encodes a net-positive charged peptide in the C-terminal region, that this peptide is typically encoded by the fourth exon, and that it is typically approximately 35 aa in length (range 25–37) [13, 15]. Since the naturally occurring protease responsible for cleavage and release of the functional antimicrobial peptides is not known, prediction of the exact cleavage site is difficult. Table 6 lists the predicted amino acid sequences for each of the identified Komodo dragon cathelicidin gene candidates. Applying these criteria to each sequence, we predicted whether each candidate cathelicidin gene may encode an antimicrobial peptide.
Table 6 Predicted cathelicidin antimicrobial peptide gene sequences
It can be seen that the predicted N-terminal protein sequence of Cathelicidin2_VARKO (VK-CATH2) contains four cysteines (underlined, Table 6). However, there is not an obvious “VRR” or similar sequence in the ~10 amino acids following the last cysteine residue as we saw in the alligator and related cathelicidin sequences [12, 13, 15]. In addition, analysis of the 35 C-terminal amino acids reveals a predicted peptide sequence lacking a net positive charge. For these reasons, we predict that the Cathelicidin2_VARKO gene sequence does not encode an active cathelicidin antimicrobial peptide at its C-terminus (Table 7).
Table 7 Predicted active cathelicidin peptides and calculated properties (APD3 [64])
For the identified Cathelicidin4.1_VARKO gene, the predicted cathelin domain includes the requisite four cysteine residues (Table 6), and the sequence “VTR” is present within 10 amino acids of the last cysteine, similar to the “VRR” sequence in the alligator cathelicidin gene [12, 13, 15]. The 33-aa C-terminal peptide following the “VTR” sequence is predicted to have a net +12 charge at physiological pH, and a large portion of the sequence is predicted to be helical [65, 66], which is consistent with cathelicidins. The majority of known cathelicidins contain segments with significant helical structure [67]. Finally, analysis of the sequence using the Antimicrobial Peptide Database indicates that the peptide is potentially a cationic antimicrobial peptide [64]. Hence, we predict that this gene likely encodes an active cathelicidin antimicrobial peptide, called VK-CATH4.1 (Table 7).
In addition, this peptide demonstrates some homology to other known antimicrobial peptides in the Antimicrobial Peptide Database [64]; the alignment of VK-CATH4.1 with these peptides is shown in Table 8. It shows a particularly high degree of sequence similarity to cathelicidin peptides identified from squamates, with examples included in Table 8. Thus, the predicted VK-CATH4.1 peptide has many of the hallmark characteristics of a cathelicidin peptide and is a strong candidate for further study.
Table 8 Comparison to other cathelicidins
For the identified Cathelicidin4.2_VARKO gene, the predicted cathelin domain includes the requisite four cysteine residues (Table 6). As was noted in the Cathelicidin4.1_VARKO gene, the sequence “VTR” is present within 10 amino acids of the fourth cysteine residue, and immediately precedes the C-terminal segment, which encodes a 30-aa peptide that is predicted to be antimicrobial [64]. The amino acid sequence of the C-terminal peptide is predicted to have a net +10 charge at physiological pH, and it demonstrates varied degrees of homology to other known antimicrobial peptides in the Antimicrobial Peptide Database [64]. Thus, like VK-CATH4.1, this candidate peptide also exhibits many of the hallmark characteristics associated with cathelicidin peptides, and is a second strong candidate for further study. Table 8 shows the homology and alignment of VK-CATH4.2 with known peptides from the Antimicrobial Peptide Database. Finally, the gene sequence encoding the functional peptide VK-CATH4.2 is found on exon 4, which is the typical location of the active cathelicidin peptide. This exon encodes the peptide sequence LDRVTRRRWRRFFQKAKRFVKRHGVSIAVGAYRIIG.
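The length and charge quoted for VK-CATH4.2 can be checked against the exon 4 sequence given above. This is a simplified sketch, not the APD3 calculation cited in the text: the helper names are hypothetical, and the charge estimate simply counts Lys and Arg as +1 and Asp and Glu as -1, ignoring histidine and the peptide termini.

```python
# Exon 4 peptide sequence of Cathelicidin4.2_VARKO as given in the text.
EXON4 = "LDRVTRRRWRRFFQKAKRFVKRHGVSIAVGAYRIIG"


def peptide_after(seq, motif):
    """Return the part of seq that follows the first occurrence of motif."""
    idx = seq.find(motif)
    if idx < 0:
        return None
    return seq[idx + len(motif):]


def simple_net_charge(peptide):
    """Crude net charge: (K + R) minus (D + E); His and termini are ignored."""
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative


candidate = peptide_after(EXON4, "VTR")
print(candidate)                     # RRWRRFFQKAKRFVKRHGVSIAVGAYRIIG
print(len(candidate))                # 30
print(simple_net_charge(candidate))  # 10
```

Under these simplifying assumptions, the residues following “VTR” form a 30-residue peptide with a net charge of +10, consistent with the values reported for VK-CATH4.2.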
The predicted peptide VK-CATH4.2 is highly homologous with peptides from other predicted cathelicidin genes, with similar predicted C-terminal peptides, from A. carolinensis, G. japonicus, and P. bivittatus (Table 8). Residues 2–27 of VK-CATH4.2 are 65% identical and 80% similar to the anole Cathelicidin-2 like predicted C-terminal peptide (XP_008116755.1, aa 130–155). Residues 2–30 of VK-CATH4.2 are 66% identical and 82% similar to the gecko Cathelicidin-related predicted C-terminal peptide (XP_015277841.1, aa 129–151). Finally, aa 2–24 of VK-CATH4.2 are 57% identical and 73% similar to the Cathelicidin-related OH-CATH-Like predicted C-terminal peptide (XP_007445036.1, aa 129–151).