Scorpion Venom Gland Transcriptomics

  • Martha Rendón-Anaya
  • Thalita S. Camargos
  • Ernesto Ortiz
Living reference work entry


For decades, the study of venomous animals has focused on the isolation and biochemical characterization of specific venom components that have medical or biotechnological importance. Scorpions have been extensively studied from this perspective, which has led to the identification of hundreds of different transcripts encoding toxic peptides. However, scorpions are interesting organisms not only because of their toxin diversity but also because they represent the most ancient terrestrial animals identified in the fossil record. About 2,000 species have been described around the world, which also implies that scorpions are extremely well-adapted arthropods that have managed to survive in different environmental conditions. Even though their ancient divergence makes scorpions interesting model organisms for evolutionary inferences, little is known about the genomic organization, speciation events, and population dynamics of these arthropods.

Different “omic” approaches have become a very powerful strategy for understanding the complexity of venomous animals. Transcriptomics, in particular, has been widely used to explore the transcriptional diversity of venom glands of several scorpion species. Recently, high-throughput sequencing platforms have substantially improved our capacity to describe biological features of scorpions but, most importantly, have outlined new directions toward a more complete understanding of the evolution of these arthropods.

This chapter describes the transcriptomic strategies followed over the last two decades, from cDNA cloning to next-generation sequencing methods. Some biological and evolutionary questions about scorpion speciation and venom diversification are also addressed. Finally, some future directions for the field are proposed.




The transcriptome is the complete set of RNAs that is present in a cell, a tissue, or an organ at any given time. This includes the protein-coding messenger RNAs (mRNAs), ribosomal (rRNAs), transfer (tRNAs), and other noncoding or small RNAs. Since the transcriptome is specific for a particular cell type and is affected by the specific conditions where those cells live, including the external environment, it can be considered a snapshot of the genes that are actively expressed in those conditions at any given time. The study of particular transcriptomes therefore allows focusing on the subset of genes expressed in the relevant cell types or tissues without having to study the complete set of gene products that the organism’s genome encodes. This is particularly important for venomous species, such as scorpions, since they are relevant not only for their biology or ecology but mostly for the venom they produce, which is restricted to a specific organ: the venom gland.

Several tools and techniques have been developed in order to obtain a comprehensive profile of the transcripts present in a given cell population. Historically, the possibility of cloning individual DNA molecules that are complementary (cDNAs) to mRNAs coding for proteins of interest led to the development of cDNA libraries. These libraries constitute collections of cDNA sequences cloned into vectors. The first scorpion venom gland cDNA library was reported in 1989 (Bougis et al. 1989) for the North African scorpion Androctonus australis. From then on, for more than two decades, the construction of cDNA libraries has been the main route to the discovery of new protein precursor sequences in many other scorpion species. Sequencing of cDNA libraries has usually been performed by the traditional and expensive chain-termination (Sanger) method, which limits the number of individual clones sequenced and therefore the amount of information mined from the libraries. With the advent of high-throughput sequencing technologies, whose application to transcriptomes is known as RNA-seq, the amount of information that can be gathered from cDNA libraries has grown enormously. Scorpions are not an exception, and by now, reports have emerged describing the whole transcriptome sequencing of two species using next-generation methods: the African scorpion Pandinus imperator (Roeding et al. 2009) and the Mexican Centruroides noxius (Rendón-Anaya et al. 2012).

The general principles and specific details of cDNA library construction will be considered, covering both the traditional and the high-throughput methodologies for transcriptome analysis, with their advantages and disadvantages. The most relevant examples of applications in the study of scorpion transcriptomes will be reviewed. The future of the field of transcriptomics for scorpions will also be addressed.

Scorpion cDNA Library Construction

A cDNA library consists of a collection of DNA sequences that are complementary to the RNAs that are present in a group of cells in a particular environment, individually cloned into vectors. The nature of the study to be performed determines the conditions in which the specimens are kept before the cells are harvested, the choice of the tissue or organ from which they are dissected, and the cDNA library construction protocols employed. Scorpions are undoubtedly relevant members of the ecosystems they inhabit, but the interest they attract is mainly due to their capacity to produce a very complex venom, with a variety of bioactive components. Among them, there is a diverse group of toxins and other peptides with great potential as therapeutics and as tools for studying molecular interactions, with a special emphasis on ion channels. Transcriptome analysis in these organisms is therefore usually directed to unraveling the expressed peptide components of the venoms. In scorpions, the venom is produced exclusively in the venom glands, two very well-delimited structures within the last segment of their metasoma (tail), the telson. Hence, the telson is the tissue that is usually processed to produce the cDNA libraries (Fig. 1). The telson is easily dissected and homogenized to release the cell contents, including the RNAs, into an RNase-inactivating buffer. It is best to select healthy individuals from their natural environment for this procedure, and it is advisable to use freshly collected specimens whenever possible. Though the collection of specimens is unavoidable, the unlimited availability of cloned sequences after the library is constructed eliminates the need for further collection, which far outweighs the problem of removing a few individuals from the environment. It should also be noted that differences in the venom profiles between the two sexes have been found (De Sousa et al. 2010). Thus, it could be reasonable to generate independent cDNA libraries for males and females. In order to stimulate mRNA production, the specimens are usually milked to depletion 2–5 days before the telson is dissected.
Fig. 1

Scorpion cDNA library construction steps: (a) Telson dissection. (b) Tissue homogenization. (c) Total RNA purification. (d) First-strand cDNA synthesis by reverse transcription. (e) Double-stranded cDNA synthesis by PCR. (f) Ligation into vector. (g) Bacterial transformation. (h) Growth and selection of colonies

The first cDNA strand is generated by an RNA-dependent DNA polymerase, the reverse transcriptase (RT), which creates a complementary copy of the RNAs. All known DNA polymerases need a primer to function, and the reverse transcriptase is not an exception. The choice of primers is another important consideration. Most of the time, the interest resides in the protein-coding mRNAs, but in other cases, a complete study, including the noncoding RNAs, might be the objective. In the first case, the goal is to make copies of the mRNAs exclusively. Like most eukaryotic mRNAs, the majority of scorpion mRNAs are polyadenylated at the 3′ end, so a poly-dT primer will selectively prime the reverse transcription of mainly the mRNAs. In the second case, a mix of random hexameric primers is the choice, and all RNAs will be reverse-transcribed.

For the generation of the second cDNA strand, there are a large number of choices available, including different commercial kits with different strategies. The simplest way is to treat the RNA/DNA hybrid product from the previous step with a combination of RNase H and a DNA polymerase. The RNase H will randomly cut the RNA in the hybrid, providing small RNA primers for the DNA polymerase to synthesize the second strand. Several kits provide ways to generate cDNAs flanked by adaptors that facilitate the cloning of the cDNAs into vectors, either by ligation (providing sites for restriction endonucleases) or by recombination (providing the specific recombinase target sequences).

Once the double-stranded cDNA is produced, it is cloned into the chosen vector (either plasmidic or phage derived), and individual clones (transformed bacteria or lysis plaques, respectively) can be isolated if desired. The library can then be screened by traditional methods or be used for massive sequencing.

There are several intrinsic advantages in generating scorpion cDNA libraries that are independent of the screening and characterization methods. One was mentioned earlier: once the library is obtained, there is no further need to extract living specimens from their environments in order to fulfill the need for DNA sequences for a wide range of projects. Large enough cDNA libraries made from scorpion venom glands should contain all the sequence information related to all the protein components of the venom. This remains true for underrepresented transcripts corresponding to poorly expressed components (though their isolation and identification pose some challenges; see below), even those that are basically undetectable by standard proteomic methods. Since cDNA is copied from mature RNAs, it contains no intron sequences and no information on regulatory sequences (for that, a genomic DNA library should be created instead). On the other hand, the lack of introns is quite an advantage, since the sequence of the encoded protein can be unambiguously assigned. Cloned cDNA sequences constitute a major source of DNA for protein expression in different heterologous systems, including bacteria, thanks again to the lack of introns (Quintero-Hernández et al. 2011). Many scorpion peptides are translated from mRNA as precursors: pro-peptides (endogenous peptides, not secreted) and pre-pro-peptides (those with a secretion signal peptide). Posttranslational modifications generate the mature peptides. The precursor sequence information is lost in proteomic analysis of the mature peptides, but transcriptomes retain it. Some other posttranslational modifications can be predicted from the cDNA sequence, including putative phosphorylation or glycosylation sites, disulfide bridges (though the specific connectivity cannot always be assigned), or the amidation of the C-terminus, which is a very relevant modification for scorpion toxins, as some amidated toxins have been shown to have a higher affinity for their target than the non-amidated variants (Benkhadir et al. 2004). The availability of the complete derived amino acid sequence allows the design of synthetic peptides and, from them, the generation of antibodies that will aid in the localization and purification of the parental proteins. The comparison of the predicted peptide sequences could aid in designing immunogens for the generation of antivenom sera with cross-reactivity against the main toxic components of geographically related species (Becerril et al. 1996) (see chapter “Antivenoms: Recombinant Neutralizing Antibodies, A New Generation of Antivenoms”).
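
As an illustration of how a C-terminal amidation can be predicted from a cDNA, the following minimal Python sketch translates the first open reading frame of a precursor and flags a putative amidation signal. It assumes the standard genetic code and the commonly described signal of a C-terminal glycine, optionally followed by basic residues (Lys/Arg) that are removed during processing. The function names and the example sequence are hypothetical and do not correspond to any real toxin.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cdna):
    """Translate from the first ATG up to (not including) the first stop codon."""
    seq = cdna.upper()
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an ambiguous codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def amidation_signal(protein):
    """Return a putative C-terminal amidation signal (G, GK, GR, GKR, ...) or None.

    Assumption: a Gly that becomes C-terminal once trailing Lys/Arg residues
    are trimmed marks a candidate amidation site. This is a prediction only.
    """
    trimmed = protein.rstrip("KR")
    if len(trimmed) > 1 and trimmed.endswith("G"):
        return protein[len(trimmed) - 1:]
    return None


# Hypothetical precursor cDNA: 5' UTR, ATG AAA TGC GGC AAA, stop codon.
precursor = translate("GCATGAAATGCGGCAAATAA")
print(precursor, amidation_signal(precursor))
```

A hit only marks a candidate: whether the mature peptide is actually amidated has to be confirmed experimentally, for example by mass spectrometry of the venom.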

Scorpion cDNA Library Screening by Traditional Methods

When the cDNA library construction methodologies became widely available, different methods for library screening were also developed. As mentioned earlier, the first reported scorpion cDNA library was screened by means of colony hybridization with specific oligonucleotide probes designed from the known peptide sequence of the mature toxins. The authors were able to isolate the clones containing the precursors for AaHII and several other toxins specific for mammals and insects from some 400,000 different clones (Bougis et al. 1989).

A similar approach based on the reverse translation of known mature peptide sequences to design specific oligonucleotides, but relying on the polymerase chain reaction (PCR), was first employed to amplify the cDNA precursor sequence of the BjIT2 toxin from the Asian scorpion Buthotus judaicus (Gurevitz et al. 1990). Since the primers are designed from the sequence of the mature toxin, the amplified sequence is only a fragment or a subsequence of the complete cDNA. This strategy also allowed the cloning of the cDNAs of Na+ channel-blocking toxins from the Mexican scorpion C. noxius Hoffmann (Becerril et al. 1993). The amplified partial cDNA can be used to determine the complete cDNA sequence by means of the RACE (Rapid Amplification of cDNA Ends) protocols. After the first-strand cDNA is synthesized by reverse transcription, an adaptor of known sequence is added at the end corresponding to the 5′ end of the mRNA. A forward primer specific for the 5′ adaptor sequence is used with a reverse gene-specific primer (derived from the cloned fragment) to amplify the 5′ region of the cDNA (5′-RACE), and a forward gene-specific primer (also from the fragment) is used together with a poly(dT) primer to amplify the 3′ region (3′-RACE). The complete cDNAs for the precursors of the neurotoxins from the Chinese scorpion Mesobuthus martensii Karsch, BmK AS and BmK AS-1, BmP01, BmP03, BmP05, and the insect-specific BmK IT-AP were the first to be cloned using the 5′-RACE and 3′-RACE techniques (Lan et al. 1999; Wu et al. 1999; Xiong et al. 1999). To date, more than a hundred different scorpion cDNAs have been isolated and sequenced with these tools (for a comprehensive review, see Quintero-Hernández et al. 2011).
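
Designing oligonucleotides by reverse translation has a practical cost: because the genetic code is redundant, each peptide stretch corresponds to a pool of possible nucleotide sequences. The sketch below, written under the assumption of the standard genetic code, counts how many distinct sequences encode a peptide and picks the least degenerate window for a primer. The helper names and the example peptide are illustrative only and are not taken from the cited studies.

```python
from collections import Counter

# One letter per codon of the standard genetic code (T, C, A, G order);
# counting the letters gives the number of codons for each amino acid.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_AA = Counter(AMINO)


def primer_degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    total = 1
    for aa in peptide.upper():
        if aa not in CODONS_PER_AA or aa == "*":
            raise ValueError(f"unknown amino acid: {aa!r}")
        total *= CODONS_PER_AA[aa]
    return total


def least_degenerate_window(peptide, width):
    """Return the peptide window of the given width with the lowest degeneracy."""
    windows = [peptide[i:i + width] for i in range(len(peptide) - width + 1)]
    return min(windows, key=primer_degeneracy)


# Hypothetical mature peptide fragment, used only to show the calculation.
fragment = "LSRMWKCD"
best = least_degenerate_window(fragment, 4)
print(best, primer_degeneracy(best))
```

Windows rich in Met and Trp (one codon each) keep the primer pool small, whereas Leu, Ser, and Arg (six codons each) inflate it quickly.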

A different strategy for cDNA library exploration is the sequencing of more or less randomly selected clones, which yields expressed sequence tags (ESTs) (Adams et al. 1991, 1993). Bacteria from the library are plated at a dilution that allows individual clones to be isolated and analyzed by colony PCR (a PCR technique in which the template DNA is not purified: total DNA from lysed cells is added to the reaction). Vector-specific primers flanking the cloning region are used for amplification. Gel electrophoresis is then used to analyze the PCR products and to select the clones that will be sequenced. The selection is based on the colony PCR product size, so that a heterogeneous group of colonies is chosen for vector purification and sequencing of the cloned cDNA. Since no gene-specific primer is used, the results reflect the cDNA size range selected for sequencing. Sometimes, no such selection is applied and random clones are directly sequenced. This holistic approach gives a partial snapshot of the specimen’s whole transcriptome. Otherwise, the focus can be put on large cDNAs (coding for enzymes such as phospholipases, proteases, etc.) or smaller ones (coding for toxins, antimicrobial peptides, etc.).

The application of EST sequencing to uncover the global gene expression of a venom gland, as well as the description of the toxins potentially represented in a venom composition by transcriptomic studies, was first described for snake venom (Junqueira-de-Azevedo et al. 2002). The value of this strategy for scorpion venom study was then demonstrated with a cDNA library from the Mexican scorpion Hadrurus gertschi. The authors were able to isolate and sequence 147 ESTs, some even coding for undescribed putative toxins and other proteins (Schwartz et al. 2007). The number of sequences made available per study has been increasing since the first report. For example, the random sequencing of the venom gland cDNA libraries from Lychas mucronatus, Isometrus maculatus, and Scorpiops margerisonae resulted in 551, 743, and 730 ESTs, respectively (Ma et al. 2012), illustrating the power of this technique.

To date, the random sequencing of ESTs has been conducted with the following buthid species: Buthus occitanus israelis (Kozminsky-Atias et al. 2008), Tityus discrepans (D’Suze et al. 2009), Lychas mucronatus (Ruiming et al. 2010 and Ma et al. 2012), Hottentotta judaicus (Morgenstern et al. 2011), Isometrus maculatus (Ma et al. 2012), Tityus serrulatus (Alvarenga et al. 2012), Tityus stigmurus (Almeida et al. 2012), and Centruroides tecomanus (Valdez-Velázquez et al. 2013). Within the non-buthids, reports are available for Hadrurus gertschi (Caraboctonidae, Schwartz et al. 2007), Opisthacanthus cayaporum (Liochelidae, Silva et al. 2009), Scorpiops jendeki (Euscorpiidae, Ma et al. 2009), Heterometrus petersii (Scorpionidae, Ma et al. 2010), Pandinus cavimanus (Scorpionidae, Diego-García et al. 2012), Scorpiops margerisonae (Euscorpiidae, Ma et al. 2012), and Urodacus yaschenkoi (Urodacidae, Luna-Ramírez et al. 2013).

As more transcriptome analyses of this kind have become available in the literature, a striking pattern of expression for toxins and other peptides has begun to appear. For example, it is well known that the large majority of the scorpion species that are of medical relevance belong to the Buthidae family. They are (in)famous for the Na+ channel-modulating toxins present in their venoms that are capable of depolarizing the axonal membranes, with dire consequences for the stung victims, including death (see chapter “Molecular Description of Scorpion Toxin Interaction with Voltage-Gated Sodium Channels”). Transcriptomes from the milked venom gland of members of the Buthidae family show that their most abundant transcripts are precisely those for the Na+ channel-modulating toxins, with relative abundances as high as 54.2 % for B. occitanus (Kozminsky-Atias et al. 2008), while they are basically absent in non-buthid species. In the non-buthid scorpions, on the other hand, transcripts coding for K+ channel-blocking β-toxins (see chapter “Venom Protein Families: Potassium Channels Blockers from Scorpion Venoms”) and for antimicrobial and cytolytic peptides are more abundant (Diego-García et al. 2007). A somewhat surprising finding is that, within the Buthidae family, the expression level of those Na+ channel toxins seems to be very dependent on the physiological state of the venom gland. Two transcriptomic analyses that used the resting (not milked) venom gland as starting material reported relatively lower levels of transcripts for Na+ channel toxins: 1.3 % for T. stigmurus (Almeida et al. 2012) and 6.7 % for H. judaicus (Morgenstern et al. 2011). There is still the possibility that these results reflect the particularities of these two scorpions, since their transcriptomes from milked glands are not available for comparison.
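
The relative abundances quoted above are simple proportions of annotated ESTs. As a rough sketch of the calculation, assuming each sequenced clone has already been assigned to one functional category, the percentages can be tabulated as follows. The category labels and counts are invented for the example and are not data from any of the cited libraries.

```python
from collections import Counter


def relative_abundance(annotations):
    """Percentage of ESTs per category, given one category label per EST."""
    counts = Counter(annotations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical annotation of eight ESTs from a venom gland library.
ests = ["Na+ toxin"] * 3 + ["K+ toxin"] + ["other"] * 4
for category, percent in relative_abundance(ests).items():
    print(f"{category}: {percent:.1f} %")
```

Because only a small subset of clones is sequenced in an EST survey, such percentages are estimates whose precision depends on the number of clones sampled.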

Although the need to isolate individual clones for sequencing can be seen as painstaking, the reward is that the experimenter has the clones perfectly identified and matched with the determined sequences, which facilitates further experiments with those sequences. They can be subcloned into expression vectors to heterologously produce the encoded proteins. The cDNA sequences can also be used as templates to generate probes (e.g., by primer extension) for the screening of scorpion genomic DNA libraries, for genetic analyses or Southern blots, for the screening for homologous genes in other scorpion species, for studying gene expression, and for interference experiments, among many other applications.

The disadvantage of these methods resides in their cost. Since the sequencing is performed by the expensive Sanger method, they are neither easy nor cheap to scale up. This implies that only a relatively small subset of all clones gets sequenced; thus, the results are only partially quantitative and biased toward abundant transcripts. A relevant component of the scorpion transcriptomes is therefore missed: the abundance of weakly expressed transcripts. Tag-based methods have been developed to overcome this limitation, including serial analysis of gene expression (SAGE, Velculescu et al. 1995), massively parallel signature sequencing (MPSS, Brenner et al. 2000), and cap analysis of gene expression (CAGE, Kodzius et al. 2006). None of them has been applied to scorpion transcriptomes, mainly because they need a reference genome to work properly.

Scorpion Transcriptome Analysis with High-Throughput Technologies

With the advent of so-called next-generation or high-throughput sequencing technologies, the possibility of massively sequencing cDNA libraries (RNA-seq or whole transcriptome shotgun sequencing, WTSS) has become a reality (Morin et al. 2008). For RNA-seq, a library of cDNA fragments with flanking adaptors at one or both ends is created from RNA. Only small amounts of RNA are required, since the cDNAs are not cloned. The fragments are then subjected to direct high-throughput shotgun sequencing from the adaptors to generate a large collection of short reads (30–700 nucleotides, depending on the DNA sequencing technology used). Following sequencing, the reads are aligned to a reference genome (when one is available) or assembled de novo to produce a detailed transcription map with both the sequences of all transcripts and their expression levels (Wang et al. 2009).

Two different reports have taken advantage of massive sequencing, both of them using the 454 pyrosequencing platform, to explore the transcriptional universe of the scorpion venom glands. Pyrosequencing relies on the “sequencing by synthesis” principle: a complementary strand is synthesized enzymatically from the single-stranded cDNA template. The four nucleotides are added to the reaction separately and in a fixed order. The unreacted nucleotides are degraded before the next one is added. Every time a complementary nucleotide is incorporated by the DNA polymerase, pyrophosphate (PPi, hence the name of the method) is stoichiometrically released and detected by an enzymatic chemiluminescent reaction, revealing the letter in the sequence. The intensity of the light is proportional to the number of times this same nucleotide is present in a row. The process is repeated until the sequencing is completed (Ronaghi et al. 1998; Margulies et al. 2005). With current capacities, an average of 700 nucleotides are determined per read, and array-based instruments are capable of generating a million reads per run, so that 700 Mb can be read in a single run with a single machine.
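
The base-calling logic described above can be illustrated with a short sketch. It assumes an idealized, noise-free readout in which each flow gives a light signal close to an integer (the homopolymer length) and a cyclic flow order of T, A, C, G; the flow order, the function names, and the example signals are assumptions for the example, not instrument specifications. The second function reproduces the throughput arithmetic from the text.

```python
def decode_flowgram(intensities, flow_order="TACG"):
    """Convert per-flow light intensities into a nucleotide sequence.

    Each flow adds one nucleotide species; the rounded signal is taken as the
    number of identical bases incorporated in a row (0 means no incorporation).
    """
    bases = []
    for i, signal in enumerate(intensities):
        nucleotide = flow_order[i % len(flow_order)]
        bases.append(nucleotide * max(0, round(signal)))
    return "".join(bases)


def run_yield_mb(reads_per_run, mean_read_length_nt):
    """Total sequence per run in megabases."""
    return reads_per_run * mean_read_length_nt / 1e6


# Hypothetical signals for six flows (T, A, C, G, T, A).
print(decode_flowgram([1.02, 0.05, 2.1, 0.9, 0.0, 3.0]))
# One million reads of 700 nucleotides each, as quoted in the text.
print(run_yield_mb(1_000_000, 700))
```

In real data the signal for long homopolymers becomes harder to round to the correct integer, which is a known source of insertion and deletion errors in pyrosequencing reads.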

The African scorpion P. imperator was the first species to be examined under this approach (Roeding et al. 2009). In contrast to the abovementioned reports, this study did not focus on the discovery of new toxins, but rather aimed at making a comprehensive multigene-based phylogenetic analysis of arthropods. C. noxius, a buthid, was the second species to be analyzed with this platform (Rendón-Anaya et al. 2012). In this report, the transcriptomes of resting and active (milked) venom glands were compared and contrasted with a third cDNA library obtained from RNA extracted from the body after telson removal. Altogether, around 19,000 different potential transcripts were identified. The functional annotation revealed that the main components of the small RNA machinery are conserved in C. noxius, consistent with the wide distribution among eukaryotes of microRNAs (miRNAs) as a posttranscriptional control mechanism. Additionally, a phylogenomic analysis of concatenated coding genes uncovered important differences in the evolutionary rates of specific sets of genes. By means of a quantitative analysis of the transcriptional profiles of the two telson conditions, several regulatory and metabolic responses were detected, such as a high representation of differentially expressed genes in carbohydrate, lipid, and amino acid metabolism, proteasome activity, membrane transport, and signal transduction pathways.

Although pyrosequencing was the methodology of choice for the two massive transcriptome analyses of scorpions reported to date, several other high-throughput sequencing methods are now available to researchers, each with its advantages and disadvantages (Liu et al. 2012). These technologies have dramatically reduced sequencing costs while significantly increasing throughput. The field is developing rapidly, and improved, cheaper technologies continue to emerge (Schadt et al. 2010). There is no doubt that these technologies will drive the field of scorpion transcriptomics in the near future, replacing the Sanger-based sequencing methods.

Future Directions for Scorpion Transcriptomics

Even though these analyses have provided a large amount of molecular data, it is easy to see how far we are from really understanding the evolution of scorpion venom. So far, there is some evidence that important toxin genes might be overexpressed right after the electrical milking of the venom glands (Morgenstern et al. 2011; Rendón-Anaya et al. 2012), but which other factors alter these toxin profiles? It has been suggested that the length and location of introns in toxin genes could alter their expression to some extent, as observed for K+ channel toxins in M. martensii (Nie et al. 2012; Zeng et al. 2012). It is also possible that environmental conditions determine the venom composition. Indeed, the transcriptome of the venom glands from L. mucronatus (Ruiming et al. 2010) revealed important differences in the transcriptional profiles of two geographically distinct populations (sourced from Yunnan and Hainan). Other factors such as sex and age might affect the transcriptional and proteomic profiles of the venom glands as well, which should necessarily imply differences at the regulatory level that need to be explored. In a recent report, it was observed that miRNAs could be involved in the control of the venom phenotype in the rattlesnake Crotalus simus simus (Durban et al. 2013). In this particular case, the comparison of the transcriptional activity of the venom glands of neonate and adult specimens suggested that age-dependent changes in the concentration of miRNAs modulating the transition from a crotoxin-rich to a metalloproteinase-rich venom from birth through adulthood could potentially explain the proteomic differences in the venom composition of C. s. simus. Given this evidence, the natural question is: what is the potential role of the scorpion miRNA machinery in the venom gland during toxin production (Rendón-Anaya et al. 2012)?

Another unsolved question that researchers have evaluated for a long time is how scorpions achieve such a wide diversity of toxin peptides. A parsimonious explanation would be gene duplication and functional diversification accompanied by strong positive selection. An intriguing example, for which this alternative has been examined, is the venomous mammal Ornithorhynchus anatinus (Wong et al. 2012). The combination of transcriptomic data with the reference genome revealed that only 16 of 107 platypus genes with high similarity to known toxins evolved through gene duplication, suggesting that gene duplications alone do not explain the “venome” of the platypus. This leads to the possibility that other mechanisms, such as alternative splicing and mutation, may be important in venom innovation. This has also been proposed for some scorpion toxins from M. martensii (Zeng et al. 2012), in which the BmKbpp toxin was proposed to be the result of a recombination event at the transcript level, opening the possibility of trans-splicing driving the functional diversification of venom peptides.

From this discussion, it becomes clear that, in order to understand the venom complexity of scorpions and to gain other significant biological insights, a comparative genomic approach should become the next step in the field (Fig. 2a–c). Scorpions represent an excellent evolutionary model and unexplored subjects for population genetic studies, as they are the most ancient terrestrial animals identified in the fossil record. Cladistic and phylogenetic analyses (Pisani et al. 2004; Jeyaprakash and Hoy 2009) suggested that they arose ~350 Ma ago, and, after a physical separation upon the partition of the African and South American continents (~150 Ma ago), several speciation events gave rise to different genera such as Buthus, Mesobuthus, Parabuthus, Hottentotta, Leiurus, and Androctonus in Africa and Asia and Tityus and Centruroides in South and North America, respectively (Fet et al. 2003). Speciation of Asian scorpions in particular was recently associated with climate changes (aridifications and glaciations) during the Mid-Miocene and Pleistocene periods. Using one mitochondrial and three nuclear genes, a data set limited in the number of sequences, it was possible to describe how intensified aridifications from the Mid-Miocene onward drove the diversification of Mesobuthid scorpions and that a switch to a more humid habitat, which occurred close to the most recent common ancestor of M. martensii and the lineage of M. caucasicus, led to the adaptation of M. martensii to a humid environment (Shi et al. 2013). Such results underline the importance of phylogeographic studies in our interpretation of scorpion history and venom evolution (Fig. 2d, e). Furthermore, considering the geographic overlap of scorpion species in particular areas, it is easy to imagine that outcrossing and successive admixture, occurring after speciation but before reproductive barriers are established, are feasible events, opening the possibility that the genome organization might be a mosaic of genomic fragments from parental scorpion species. In spite of these interesting evolutionary aspects, only a few reports have tried to elucidate genomic features of scorpions, and these have led to contrasting chromosome numbers and genome size estimations. Karyotype determination has concluded that scorpion chromosomes vary in number and morphology, ranging from a diploid number of <10 up to >100 chromosomes (Schneider et al. 2009; Schneider and Cella 2010). Additionally, flow cytometry experiments indicate that the genome size of buthid scorpions might range between 600 Mbp (M. martensii, Li et al. 2009) and 880 Mbp (Centruroides vittatus, Hanrahan and Johnston 2011).
Fig. 2

Future directions in scorpion genomic studies. (a) The generation of a reference genome is the first step toward deeper comparative analysis. (b) Genome re-sequencing, mapping, and SNP identification will depict the genetic diversity of scorpion species. (c) Functional annotation and metabolic reconstruction can be achieved after genome assembly. (d) Detailed phylogenomic reconstructions using orthologous and paralogous genes identified after whole genome comparison will illustrate the real tree of scorpion species. (e) Phylogenomic observations can ultimately be related to geographic conditions, climate changes, and population dynamics

The most important limitation for the maturation of a genomic strategy that would naturally lead to population genomics and phylogeographic studies has been the cost of obtaining true genome-scale data. Nevertheless, new sequencing tools such as restriction site-associated DNA (RAD) sequencing, a method that simultaneously types and scores thousands of sequence variants (such as single nucleotide polymorphisms, SNPs), open the possibility of gathering genomic information across multiple individuals at a genome-wide scale in natural populations (Hohenlohe et al. 2012). These technologies should become increasingly important for evolutionary genetics, even in organisms with few genomic resources like scorpions.
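
The idea behind RAD sequencing, sampling the genome only next to restriction sites, can be sketched in silico. The following minimal example locates the recognition sites of a restriction enzyme in a sequence and extracts the short tag downstream of each site. EcoRI (GAATTC) is used purely as an example enzyme, and the function names, tag length, and toy sequence are assumptions for illustration rather than part of any published RAD protocol.

```python
def restriction_sites(sequence, recognition_site="GAATTC"):
    """Return the 0-based start positions of every recognition site."""
    seq = sequence.upper()
    site = recognition_site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def rad_tags(sequence, recognition_site="GAATTC", tag_length=8):
    """Map each site position to the sequence immediately downstream of it."""
    seq = sequence.upper()
    offset = len(recognition_site)
    return {
        pos: seq[pos + offset:pos + offset + tag_length]
        for pos in restriction_sites(seq, recognition_site)
    }


# Toy sequence with two EcoRI sites.
toy = "AAGAATTCTTTTGAATTCGG"
print(restriction_sites(toy))
print(rad_tags(toy))
```

Comparing the tags recovered at the same site across individuals is what allows variants such as SNPs to be scored without a reference genome.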

Existing model organisms are limited when it comes to answering evolutionary and ecological questions. Advances in sequencing have radically expanded the reach of genetic studies to non-model organisms and thus should allow us to exploit the potential of these fascinating arthropods in the short term.



  1. Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, Kerlavage AR, McCombie WR, Venter JC. Complementary DNA sequencing: expressed sequence tags and human genome project. Science. 1991;252(5013):1651–6.
  2. Adams MD, Soares MB, Kerlavage AR, Fields C, Venter JC. Rapid cDNA sequencing (expressed sequence tags) from a directionally cloned human infant brain cDNA library. Nat Genet. 1993;4(4):373–80.
  3. Almeida DD, Scortecci KC, Kobashi LS, Agnez-Lima LF, Medeiros SR, Silva-Junior AA, Junqueira-de-Azevedo I de L, Fernandes-Pedrosa M de F. Profiling the resting venom gland of the scorpion Tityus stigmurus through a transcriptomic survey. BMC Genomics. 2012;13:362.
  4. Alvarenga ER, Mendes TM, Magalhães BF, Siqueira FF, Dantas AE, Barroca TM, Horta CC, Kalapothakis E. Transcriptome analysis of the Tityus serrulatus scorpion venom gland. Open J Genet. 2012;2(4):210–20.
  5. Becerril B, Vázquez A, García C, Corona M, Bolivar F, Possani LD. Cloning and characterization of cDNAs that code for Na+-channel-blocking toxins of the scorpion Centruroides noxius Hoffmann. Gene. 1993;128:165–71.
  6. Becerril B, Corona M, Coronas FI, Zamudio F, Calderon-Aranda ES, Fletcher Jr PL, Martin BM, Possani LD. Toxic peptides and genes encoding toxin gamma of the Brazilian scorpions Tityus bahiensis and Tityus stigmurus. Biochem J. 1996;313:753–60.
  7. Benkhadir K, Kharrat R, Cestele S, Mosbah A, Rochat H, El Ayeb M, Karoui H. Molecular cloning and functional expression of the alpha-scorpion toxin BotIII: pivotal role of the C-terminal region for its interaction with voltage-dependent sodium channels. Peptides. 2004;25:151–61.
  8. Bougis PE, Rochat H, Smith LA. Precursors of Androctonus australis scorpion neurotoxins. Structures of precursors, processing outcomes, and expression of a functional recombinant toxin II. J Biol Chem. 1989;264:19259–65.
  9. Brenner S, Johnson M, Bridgham J, Golda G, Lloyd DH, Johnson D, Luo S, McCurdy S, Foy M, Ewan M, Roth R, George D, Eletr S, Albrecht G, Vermaas E, Williams SR, Moon K, Burcham T, Pallas M, DuBridge RB, Kirchner J, Fearon K, Mao J, Corcoran K. Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat Biotechnol. 2000;18(6):630–4.
  10. D’Suze G, Schwartz EF, García-Gómez BI, Sevcik C, Possani LD. Molecular cloning and nucleotide sequence analysis of genes from a cDNA library of the scorpion Tityus discrepans. Biochimie. 2009;91:1010–9.
  11. de Junqueira-de-Azevedo I L, Ho PL. A survey of gene expression and diversity in the venom glands of the pitviper snake Bothrops insularis through the generation of expressed sequence tags (ESTs). Gene. 2002;299(1–2):279–91.
  12. De Sousa L, Borges A, Vásquez-Suárez A, Op den Camp HJ, Chadee-Burgos RI, Romero-Bellorín M, Espinoza J, De Sousa-Insana L, Pino-García O. Differences in venom toxicity and antigenicity between females and males Tityus nororientalis (Buthidae) scorpions. J Venom Res. 2010;21(1):61–70.
  13. Diego-García E, Schwartz EF, D’Suze G, Roman-Gonzalez SA, Batista CV, Garcia BI, Rodriguez de la Vega R, Possani LD. Wide phylogenetic distribution of Scorpine and long-chain beta-KTx-like peptides in scorpion venoms: identification of “orphan” components. Peptides. 2007;28:31–7.
  14. Diego-García E, Peigneur S, Clynen E, Marien T, Czech L, Schoofs L, Tytgat J. Molecular diversity of the telson and venom components from Pandinus cavimanus (Scorpionidae Latreille 1802): transcriptome, venomics and function. Proteomics. 2012;12(2):313–28.
  15. Durban J, Pérez A, Sanz L, Gómez A, Bonilla F, Rodríguez S, Chacón D, Sasa M, Angulo Y, Gutiérrez JM, Calvete JJ. Integrated “omics” profiling indicates that miRNAs are modulators of the ontogenetic venom composition shift in the Central American rattlesnake, Crotalus simus simus. BMC Genomics. 2013;14:234.
  16. Fet V, Gantenbein B, Gromov AV, Lowe G, Lourenço WR. The first molecular phylogeny of Buthidae (Scorpiones). Euscorpius. 2003;4:1–10.
  17. Gurevitz M, Zlotkin E, Zilberberg N. Characterization of the transcript for a depressant insect selective neurotoxin gene with an isolated cDNA clone from the scorpion Buthotus judaicus. FEBS Lett. 1990;269:229–32.
  18. Hanrahan SJ, Johnston JS. New genome size estimates of 134 species of arthropods. Chromosome Res. 2011;19:809–23.
  19. Hohenlohe PA, Catchen J, Cresko WA. Population genomic analysis of model and nonmodel organisms using sequenced RAD tags. Methods Mol Biol. 2012;888:235–60.
  20. Jeyaprakash A, Hoy MA. First divergence time estimate of spiders, scorpions, mites and ticks (subphylum: Chelicerata) inferred from mitochondrial phylogeny. Exp Appl Acarol. 2009;47:1–18.
  21. Kodzius R, Kojima M, Nishiyori H, Nakamura M, Fukuda S, Tagami M, Sasaki D, Imamura K, Kai C, Harbers M, Hayashizaki Y, Carninci P. CAGE: cap analysis of gene expression. Nat Methods. 2006;3(3):211–22.
  22. Kozminsky-Atias A, Bar-Shalom A, Mishmar D, Zilberberg N. Assembling an arsenal, the scorpion way. BMC Evol Biol. 2008;8:333.
  23. Lan ZD, Dai L, Zhuo XL, Feng JC, Xu K, Chi CW. Gene cloning and sequencing of BmK AS and BmK AS-1, two novel neurotoxins from the scorpion Buthus martensi Karsch. Toxicon. 1999;37:815–23.
  24. Li S, Ma Y, Jang S, Wu Y, Liu H, Cao Z, Li W. A HindIII BAC library construction of Mesobuthus martensii Karsch (Scorpiones:Buthidae): an important genetic resource for comparative genomics and phylogenetic analysis. Genes Genet Syst. 2009;84:417–24.
  25. Liu L, Li Y, Li S, Hu N, He Y, Pong R, Lin D, Lu L, Law M. Comparison of next-generation sequencing systems. J Biomed Biotechnol. 2012;2012:251364.
  26. Luna-Ramírez K, Quintero-Hernández V, Vargas-Jaimes L, Batista CV, Winkel KD, Possani LD. Characterization of the venom from the Australian scorpion Urodacus yaschenkoi: molecular mass analysis of components, cDNA sequences and peptides with antimicrobial activity. Toxicon. 2013;63:44–54.
  27. Ma Y, Zhao R, He Y, Li S, Liu J, Wu Y, Cao Z, Li W. Transcriptome analysis of the venom gland of the scorpion Scorpiops jendeki: implication for the evolution of the scorpion venom arsenal. BMC Genomics. 2009;10:290.
  28. Ma Y, Zhao Y, Zhao R, Zhang W, He Y, Wu Y, Cao Z, Guo L, Li W. Molecular diversity of toxic components from the scorpion Heterometrus petersii venom revealed by proteomic and transcriptome analysis. Proteomics. 2010;10:2471–85.
  29. Ma Y, He Y, Zhao R, Wu Y, Li W, Cao Z. Extreme diversity of scorpion venom peptides and proteins revealed by transcriptomic analysis: implication for proteome evolution of scorpion venom arsenal. J Proteomics. 2012;75(5):1563–76.
  30. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Irzyk GP, Jando SC, Alenquer ML, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg JM. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005;437:376–80.
  31. Morgenstern D, Rohde BH, King GF, Tal T, Sher D, Zlotkin E. The tale of a resting gland: transcriptome of a replete venom gland from the scorpion Hottentotta judaicus. Toxicon. 2011;57:695–703.
  32. Morin RD, Bainbridge M, Fejes A, Hirst M, Kryzwinski M, Pugh TJ, McDonald H, Varhol R, Jones SJM, Marra MA. Profiling the HeLa S3 transcriptome using randomly primed cDNA and massively parallel short-read sequencing. Biotechniques. 2008;45(1):81–94.
  33. Nie Y, Zeng XC, Luo X, Wu S, Zhang L, Cao H, Zhou J, Zhou L. Tremendous intron length differences of the BmKBT and a novel BmKBT-like peptide genes provide a mechanical basis for the rapid or constitutive expression of the peptides. Peptides. 2012;37:150–6.
  34. Pisani D, Poling LL, Lyons-Weiler M, Hedges SB. The colonization of land by animals: molecular phylogeny and divergence times among arthropods. BMC Biol. 2004;2:1.
  35. Quintero-Hernández V, Ortiz E, Rendón-Anaya M, Schwartz EF, Becerril B, Corzo G, Possani LD. Scorpion and spider venom peptides: gene cloning and peptide expression. Toxicon. 2011;58(8):644–63.
  36. Rendón-Anaya M, Delaye L, Possani LD, Herrera-Estrella A. Global transcriptome analysis of the scorpion Centruroides noxius: new toxin families and evolutionary insights from an ancestral scorpion species. PLoS One. 2012;7(8):e43331.
  37. Roeding F, Borner J, Kube M, Klages S, Reinhardt R, Burmester T. A 454 sequencing approach for large scale phylogenomic analysis of the common emperor scorpion (Pandinus imperator). Mol Phylogenet Evol. 2009;53:826–34.
  38. Ronaghi M, Uhlén M, Nyrén P. A sequencing method based on real-time pyrophosphate. Science. 1998;281(5375):363–5.
  39. Ruiming Z, Yibao M, Yawen H, Zhiyong D, Yingliang W, Zhijian C, Wenxin L. Comparative venom gland transcriptome analysis of the scorpion Lychas mucronatus reveals intraspecific toxic gene diversity and new venomous components. BMC Genomics. 2010;11:452.
  40. Schadt EE, Turner S, Kasarskis A. A window into third-generation sequencing. Hum Mol Genet. 2010;19(R2):R227–40.
  41. Schneider MC, Cella DM. Karyotype conservation in 2 populations of the parthenogenetic scorpion Tityus serrulatus (Buthidae): rDNA and its associated heterochromatin are concentrated on only one chromosome. J Hered. 2010;101:491–6.
  42. Schneider MC, Zacaro AA, Pinto-Da-Rocha R, Candido DM, Cella DM. A comparative cytogenetic analysis of 2 Bothriuridae species and overview of the chromosome data of Scorpiones. J Hered. 2009;100:545–55.
  43. Schwartz EF, Diego-García E, Rodríguez de la Vega RC, Possani LD. Transcriptome analysis of the venom gland of the Mexican scorpion Hadrurus gertschi (Arachnida: Scorpiones). BMC Genomics. 2007;8:119–28.
  44. Shi CM, Ji YJ, Liu L, Wang L, Zhang DX. Impact of climate changes from Middle Miocene onwards on evolutionary diversification in Eurasia: insights from the mesobuthid scorpions. Mol Ecol. 2013;22:1700–16.
  45. Silva ECN, Camargos TS, Maranhão AQ, Silva-Pereira I, Paulino L, Possani LD, Schwartz EF. Cloning and characterization of cDNA sequences encoding for new venom peptides of the Brazilian scorpion Opisthacanthus cayaporum. Toxicon. 2009;54:252–61.
  46. Valdez-Velázquez LL, Quintero-Hernández V, Romero-Gutiérrez MT, Coronas FIV, Possani LD. Mass fingerprinting of the venom and transcriptome of venom gland of scorpion Centruroides tecomanus. PLoS One. 2013;8(6):e66486.
  47. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270(5235):484–7.
  48. Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10(1):57–63.
  49. Wong ES, Papenfuss AT, Whittington CM, Warren WC, Belov K. A limited role for gene duplications in the evolution of platypus venom. Mol Biol Evol. 2012;29:167–77.
  50. Wu JJ, Dai L, Lan ZD, Chi CW. Genomic organization of three neurotoxins active on small conductance Ca2+-activated potassium channels from the scorpion Buthus martensi Karsch. FEBS Lett. 1999;452:360–4.
  51. Xiong YM, Lan ZD, Wang M, Liu B, Liu XQ, Fei H, Xu LG, Xia QC, Wang CG, Wang DC, Chi CW. Molecular characterization of a new excitatory insect neurotoxin with an analgesic effect on mice from the scorpion Buthus martensi Karsch. Toxicon. 1999;37:1165–80.
  52. Zeng XC, Wang S, Nie Y, Zhang L, Luo X. Characterization of BmKbpp, a multifunctional peptide from the Chinese scorpion Mesobuthus martensii Karsch: gaining insight into a new mechanism for the functional diversification of scorpion venom peptides. Peptides. 2012;33:44–51.

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Martha Rendón-Anaya (1)
  • Thalita S. Camargos (2)
  • Ernesto Ortiz (3; email author)

  1. Laboratorio Nacional de Genómica para la Biodiversidad, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, México
  2. Departamento de Ciências Fisiológicas, Instituto de Ciências Biológicas, Universidade de Brasília, Brasília, Brasil
  3. Departamento de Medicina Molecular y Bioprocesos, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, México
