Introduction

The evolution of new protein motifs with specific amino acid sequences, via classical Darwinian trajectories, has been challenged (Behe 1998, 2001, 2002, 2007, 2009; Behe and Snoke 2004, 2005) by authors who attribute randomness to molecular change and a deleterious nature to intermediate mutations (rather than neutrality or selective advantage), who posit insufficient geological time or population size for molecular improvements to occur, and who invoke “design” (= supernatural causation) for the materialization of complex molecular structures (Nelson 1996; Luskin and Gage 2008). This logic has been dismissed by researchers (Schneider 2000; Pennock 2001; Long et al. 2003; Young and Edis 2004; Lynch 2005; Forrest and Gross 2007; Petto and Godfrey 2007; Durrett and Schmidt 2008, 2009; Schneiderman and Allmon 2009; Paz-y-Miño C. and Espinosa 2010a) and journal editors (Hermodson 2005) based on fundamental evolutionary premises: (1) large variation in mutation rate between and within lineages, and/or protein sites, is susceptible to positive selection; (2) protein-site mutagenesis is associated with mutation and acceptance rates at multiple sites in a genome (= compensatory changes); (3) new protein functions after domain junction can experience faster evolution (e.g., fused genes); and (4) selection acts continuously and cumulatively (= “editing role”) on intermediate protein forms, increasing and maintaining molecular diversity, and expediting molecular evolution. Thus, a single emergence of primordial genetic sequences or of protein-adaptive change, as proposed by “design creationism,” is highly improbable.

Here, we use slot-machine probabilities (the “jackprot” model) and ion channel evolution to illustrate how mutation rate coupled with natural selection has expedited the diversification of ion channels, from simple two-transmembrane (2TM) proteins to complex, multi-domain (6TM) molecules highly tuned to respond to environmental stimuli and regulate ion passage through the cell membrane. Ion channels are essential to the ionic homeostasis of all cells and crucial to the hyperpolarization and depolarization of neurons (Kress and Mennerick 2009; Miller 2009); ultimately, irritability of individual neurons, communication among neuronal networks, ganglia activity, and brain power depend on ion-channel function.

Ion Channels as Exemplars of Protein Evolution

Ion channels are integral proteins in the plasma membrane of all cells and probably all organisms. A single or limited number of prokaryotic precursors gave rise to the large diversity of modern ion channels (Derst and Karschin 1998; Durell et al. 1999; Anderson and Greenberg 2001; Martinac et al. 2008). Their genetic evolution is very complex and includes numerous gene duplications (orthologous and paralogous in prokaryotes and eukaryotes), vast nucleotide change, and elaborate alternative splicing (Miller 2000; Anderson and Greenberg 2001; Sansom et al. 2002; Pichon et al. 2004; Hill et al. 2008). For didactic purposes, we summarize ion-channel diversification as follows: the simplest forms of ion channels probably consisted of two transmembrane (M1 + M2 = 2TM) hydrophobic domains with a pore-forming loop (P) in the middle (Fig. 1a); some modern potassium (K+) channels are tetramers (4 × 2TM) of this type (e.g., KcsA K+ of Streptomyces lividans, Fig. 1b and below). Additional transmembrane segments have evolved attached to the basic 2TM motif, generating 6TM proteins (one-subunit-6TM), whose ancestral gene sequences have duplicated further into assemblages of two or four 6TM-linked subunits (two-subunits- or four-subunits-6TMs; Fig. 2). Assemblages of two 2TM-linked subunits (2 × 2TM; Fig. 2), or of 6TM- and 2TM-linked subunits (6TM + 2TM; not depicted), have also been described (Durell et al. 1999; Anderson and Greenberg 2001), suggesting great variability in the pattern of protein assemblage and gene fusion during ion channel evolution.

Fig. 1 a Cell membrane topology of a 2TM potassium (K+) channel: fundamental building blocks consist of multiples of transmembrane domains (M1 and M2), a pore-forming loop (P), and a signature (s) sequence of amino acids (TT V/I GYG or “ion selectivity filter”) highly conserved in K+ channels across taxa; intra- and extracellular segments have specific names (i.e., pre-M1, Turret and extended regions, post-M2); K+ ions are represented by small spheres. b Two subunits (cutaway view) of the tetrameric (4 × 2TM) structure of KcsA, one of the simpler prokaryotic K+ channels from the soil bacterium Streptomyces lividans, are depicted

Fig. 2 Simplified pattern of ion channel evolution. Ancestral 2TM-like proteins probably gave origin to the complex families of ion channels known today. A precursor building block of two transmembrane domains, containing the pore-forming loop (P), is conserved in channels of many taxa. Gradual and cumulative nucleotide mutations, combined with gene duplications (orthologous and paralogous diversification), have given origin to the multiple-subunit-2TM and multiple-subunit-6TM channels

Cellular metabolism, including osmoregulation, secretory processes, signal transduction, and ion homeostasis, triggered the evolution of ion-transporting proteins in the plasma membrane (Derst and Karschin 1998), a function later “exapted” (= new adaptive role) to electrical excitability and signaling (via hyper/de/polarization) and communication (via networking) among neurons. Cells capable of detecting (sensing, e.g., mechanoreceptors), responding to, and controlling the differential concentration of ions inside and outside the plasma membrane, by means of specialized proteins, probably evolved into primordial neurons (Galliot et al. 2009).

We consider ion channels didactic exemplars of protein evolution, in the context of the “jackprot” model (below), for various reasons: (1) ion-channel genetics, genomics, proteomics, cell and tissue localization, electrophysiology, response to neurotoxins or medical drugs, bioinformatics, structural modeling, X-ray crystallography, and involvement in prevalent diseases or “channelopathies” (e.g., genetically defective K+ channels: type II diabetes, cardiac arrhythmia and epilepsy; chloride (Cl−) channels: cystic fibrosis; Ca++ channels: Parkinson’s disease; and concerted activity of K+ and Cl− channels: tumor metastasis) have been widely documented (Capener et al. 2002; Kunzelmann 2005; Rogers et al. 2006; Sontheimer 2008); (2) K+, Na+, and Ca++ channels are textbook case studies in neurobiology and electrophysiology of neurons and muscle cells (Kress and Mennerick 2009); their roles in action potentials, neuromuscular junctions, and cardiac rhythm are familiar to wide audiences; (3) the evolutionary patterns of ion-channel diversification, from simpler 2TM-like ancestors to more complex multiple-subunit 6TMs (Fig. 2), can be inferred from genomic analyses within (paralogous gene families) and between taxonomic lineages (orthologous gene families); (4) comparative DNA and amino-acid sequence analyses (e.g., Homo vs. Rattus vs. Mus vs. Drosophila vs. Caenorhabditis vs. Paramecium vs. Escherichia vs. Arabidopsis; Doyle et al. 1998; Shealy et al. 2003) reveal classical Darwinian patterns of ion-channel evolution via cumulative single-nucleotide mutations, gene duplications and fusions, and protein-domain junctions; and (5) neuronal networks, ganglia activity, and brain functions depend on ion channels for sensitivity (i.e., touch/pressure/vibration, sound, light, chemosignals/odor, and electric fields) and electrical transmission of stimuli, motor (neuromuscular junction for voluntary or reflex movement) or excretory response (neuro-endocrine stimulation), behavior, and consciousness (Galliot et al. 2009; Kress and Mennerick 2009; Miller 2009). Thus, the ubiquitous inclusion of ion channels in significant empirical and practical aspects of the biology and health-related careers’ curricula makes them unique didactic tools for communicating evolutionary principles to all audiences, and promoting evolution literacy (innovation in science education has been prioritized by authors concerned with the misleading role of “design creationism” in public-outreach campaigns; Paz-y-Miño C. and Espinosa 2009a, b; Paz-y-Miño C. and Espinosa 2010a, b, c).

Slot-machine Probabilities and the “Jackprot”

The “jackprot” uses simplified slot-machine probability principles to demonstrate how mutation rate coupled with natural selection suffices to explain the origin and evolution of highly specialized proteins, such as the single- (K+) or multiple-subunit 6TM (Na+ and Ca++) channels. Winning the “jackprot,” or highest-fitness complete-peptide sequence, requires gradual and cumulative smaller “wins” (rewarded by selection) at the first, second, and third nucleotide positions in each of the codons coding for a polypeptide (= “jackdons” that lead to “jackacids” that lead to the “jackprot”; Fig. 3). A slot machine represents the cellular chemical apparatus, itself a product of Darwinian evolution, required to generate, step by step, each of the three nucleotides coding for an amino acid. The probability of getting the correct triplet, for example, the start codon ATG (methionine), in a single attempt (or winning the “jackacid”), is equal to one in 64, or one divided by 4 × 4 × 4 (i.e., the number of possible nucleotides per position multiplied by itself three times). But because molecular evolution occurs gradually, a naturalistic assumption of the “jackprot” model, each time any of the correct nucleotides is generated by the slot machine, natural selection rewards it and keeps it (partial nucleotide win in a codon or “jackdon”). Therefore, the odds of arriving, nucleotide by nucleotide, at the ATG sequence are one in 12, or one divided by 4 + 4 + 4 (i.e., the sum of the four possible nucleotides at each of the three positions, equivalent to an average of 12 single-nucleotide draws), a much faster evolutionary process. Note that the sequential and additive arrival at the phenotypically meaningful sequence of A plus T plus G represents, in reality, the accumulation of events fixed by natural selection during protein evolution, which entails clustered changes of multiple parts, and at diverse locations, within functional domains.
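
The contrast between 64 and 12 can be checked with a short simulation. The sketch below is only an illustration in Python, not the authors' program; the function names, the fixed seed, and the number of runs are assumptions made for the example. It reads both figures as average numbers of attempts: whole-codon redraws when nothing is retained, and single-nucleotide draws when each correct nucleotide is kept.

```python
import random

NUCLEOTIDES = "AGCT"

def attempts_without_selection(target, rng):
    """Redraw the whole codon until every position matches at once."""
    attempts = 0
    while True:
        attempts += 1
        draw = "".join(rng.choice(NUCLEOTIDES) for _ in target)
        if draw == target:
            return attempts

def draws_with_selection(target, rng):
    """Draw one position at a time, keeping each nucleotide once it matches."""
    draws = 0
    for wanted in target:
        draws += 1                      # first draw at this position
        while rng.choice(NUCLEOTIDES) != wanted:
            draws += 1                  # one more draw after each mismatch
    return draws

def mean(values):
    return sum(values) / len(values)

rng = random.Random(1)                  # fixed seed (assumption) for repeatability
runs = 5000                             # illustrative number of repetitions
mean_without = mean([attempts_without_selection("ATG", rng) for _ in range(runs)])
mean_with = mean([draws_with_selection("ATG", rng) for _ in range(runs)])

print(f"without selection: about {mean_without:.1f} attempts (theory: {4 ** 3})")
print(f"with selection:    about {mean_with:.1f} draws (theory: {4 * 3})")
```

One simplification is worth keeping in mind: an "attempt" without selection redraws three nucleotides at once, whereas a "draw" with selection produces a single nucleotide, so the two counts are compared here in the same loose sense as in the text.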

Fig. 3 The “jackprot” model of protein evolution. A slot machine (lower left) represents the cellular apparatus required to generate each of the three nucleotides coding for an amino acid, for example, the start codon ATG (methionine). The probability of generating ATG in a single attempt, without the influence of natural selection, is equal to one in 64 (1/(4 × 4 × 4)); however, each time a biologically meaningful nucleotide is generated by the slot machine (mimicking mutation rate), natural selection would keep it as a building block of a codon and as a partial win, or “jackdon.” Thus, the probability of arriving under selection at the ATG sequence would be equal to one in 12 (1/(4 + 4 + 4)). Winning the “jackprot,” or highest-fitness complete-peptide sequence, for example, 160 amino acids plus one stop codon in the sequence of KcsA, a K+ channel from the bacterium Streptomyces lividans, would require gradual and cumulative smaller wins (“jackdons”) at each nucleotide position, which lead to larger rewards when a correct amino acid is generated by the slot machine (“jackacids”), and which subsequently lead to the “jackprot” or the complete 161 codons. The smaller slot machines represent each cell apparatus necessary to generate the first ten amino acids of KcsA (complete sequence available at GenBank Z37969; Swiss-Prot P0A334); the genomic sequence, letter coding/acronym, and the number of codons coding for that specific amino acid within the genetic code are shown below each machine (e.g., ATG, M met, and one in 64)

The genomic and amino acid sequences of the well-studied 2TM K+ channel from the soil bacterium S. lividans (KcsA K+; Schrempf et al. 1995; Doyle et al. 1998; Lu et al. 2001; Shealy et al. 2003; Williamson et al. 2003; Doyle 2004) help us exemplify how the “jackprot” works. KcsA is one of the simpler K+ channels (Fig. 1b): 483 nucleotides code for its 160 amino acids plus a stop codon (GenBank Z37969; Swiss-Prot P0A334). KcsA probably retains many features of earlier 2TM ancestors; the amino acid sequences of the M1 and M2 domains resemble the transmembrane segments immediately connected to the pore region in K+ channels of prokaryotes, invertebrates, vertebrates, and plants (Doyle et al. 1998; Williamson et al. 2003). The pore signature sequence is nearly identical (TT V/I GYG) to that of bacteria, protists, fruit flies, nematodes, mice, rats, and humans (Doyle et al. 1998; Lu et al. 2001), suggesting a common origin of all these channels.

Although randomness can be a statistical component of mutation rate in ion channel evolution, synergistic biological restrictions (e.g., structural compatibility of purine:pyrimidine pairing in DNA; differential codon representation per amino acid; residue site specificity for plasma membrane hydrophobicity and hydrophilicity; pore- and signature-sequence location in the non-polar region of the plasma membrane; and codon bias intrinsic to taxonomic lineages, below) impose directionality on molecular assemblage, and KcsA exemplifies it. Natural selection has tinkered with molecular improvements in ancestors of KcsA by favoring and retaining adaptive peptide sequences for optimal function.

Why Is Evolution Not a Random Process?

We address this question as follows: (1) the probability of arriving randomly at the correct arrangement of the 483 nucleotides coding for KcsA is one in four (A, G, C, or T) at each nucleotide position, multiplied over all 483 positions, or one in 4^483; the probability of generating by chance the correct codon sequence for the 160 amino acids of KcsA, plus one stop codon, is one in 64 (the number of codons in the genetic code) multiplied over all 161 codons, or one in 64^161. This could occur once every 46 million years, assuming a mutation rate of one nucleotide every 95,000 years and 2,085 generations per year (S. lividans reproduces every 4.2 hours; Palacin et al. 2003); this didactic estimate is based on an average mutation rate of 5.0 base pairs every 10^10 nucleotides per generation (Drake 1991; Drake et al. 1998; Lynch 2006; Bentley et al. 2008; note that our estimate disregards the tetrameric configuration of KcsA, whose structural and functional assemblage must have required additional time-consuming evolution). But mutations are complex, occur in clusters, and occur at different rates within and between genes (= “hot spots” in the genome), and networks of genes can coevolve (e.g., interacting ion-channel genes), thus increasing and maintaining informational complexity, decreasing uncertainty, and expediting evolution (for detailed discussions on computational methods and theoretical implications see Schneider 2000; Lynch 2005, 2006; Durrett and Schmidt 2008; Stern and Orgogozo 2009). Interestingly, the first ancestors of Streptomyces species appeared as recently as 450 million years ago (Chater 2006), and S. lividans’ clade (violaceoruber/coelicolor) apparently separated from its sister clade, avermitilis, 220 million years ago (Hatano et al. 1994; Kawamoto and Ochi 1998; Duangmal et al. 2005; Chater and Chandra 2006; Ventura et al. 2007). S. lividans is probably a very recent taxon, much younger than 220 million years old, and likely more recent than the 46 million years needed to generate at random one of its plasma membrane proteins, KcsA (below). (2) Because nucleotide transitions (A/G to G/A or C/T to T/C) are more probable than transversions (purine to/from pyrimidine), due to structural and polar affinity between complementary bases, the sole random arrival at the correct arrangement of the 483 nucleotides of KcsA would be reduced to one in two, rather than one in four (above), nucleotides per complementary position of DNA sequence, or one in 2^483 (a much faster process than one in 4^483). Note also that redundancy in codon coding (nine amino acids are coded by two codons each, five by four, three by six, one by three, and two by one) determines differential probability of amino acid site allocation; for example, amino acids coded by two codons each (phe, tyr, his, gln, asn, lys, asp, glu, and cys) have a two in 64 probability of being allocated in a peptide sequence; in contrast, amino acids coded by six codons each (leu, ser, and arg) have a six in 64 probability of participating in the protein. This implies that amino acids coded by six codons each would be three times more frequent in KcsA than those coded by two codons each. But this is not the case (Table 1): although amino acids coded by six codons each occur at an average frequency of 9.7% (wide range [r = 3.7–14.9]), no different from the 9.3% expected by chance, the amino acids coded by two codons each are four times less frequent than those coded by six codons each; they occur at an average frequency of 2.2% [r = 0.0–5.5], rather than the 3.1% expected by chance. Note that cys (coded by two codons) does not even occur in KcsA, although, according to chance, it should be present at a frequency of 3.1%.
Further discrepancy between observed and expected frequency of occurrence applies to the rest of the amino acids of KcsA: those coded by four codons each (gly, thr, ala, val, and pro) occur at an average frequency of 8.6% [r = 3.1–13.6], rather than the 6.2% expected by chance; only ile is coded by three codons and occurs at a frequency of 1.8%, rather than the 4.6% expected by chance, while met and trp are coded by one codon each and occur at frequencies of 2.4% and 3.1%, respectively, rather than the 1.5% expected by chance (Chi-square = 52.91; df = 19; p ≤ 0.001). (3) Although the frequency of polar (P) plus electrically charged (EC) amino acids (N = 12) in the genetic code does not differ from that of their non-polar (NP) counterparts (N = 8; binomial two-tailed test, n.s.), peptide site specificity for plasma membrane hydrophobicity and hydrophilicity follows a non-random pattern in KcsA (Table 2): P + EC versus NP amino acids are unequally distributed across the lipid bilayer (Chi-square = 21.20; df = 5; p ≤ 0.001); NP residues are significantly more frequent than P + EC amino acids in the hydrophobic regions M1 (binomial two-tailed test; p = 0.014) and M2 (binomial two-tailed test; p = 0.04), while P + EC residues in the post-M2 segment are significantly more frequent than NP residues inside the cytoplasmic environment (binomial two-tailed test; p = 0.003), evidence of strong selective pressure for residue location; the phenomenon is striking considering the overall abundance of polarity in the 160 amino acids of KcsA (77 P + EC vs. 83 NP; binomial two-tailed test, n.s.).
(4) Non-random patterns of third-codon-position sequence and overall nucleotide content are also evident in KcsA: 89% GC versus 11% AT in the third codon position, and 68% GC versus 32% AT in overall content (data generated from genomic sequence; NCBI-GenBank Z37969), rather than the 1:1 ratio, in each case, expected by chance (values coincide with the high GC frequency for the third nucleotide position and the high overall GC content described for S. lividans; Wright and Bibb 1992; Fuglsang 2005; Wu et al. 2005); selection at translation has favored the codon bias composition of KcsA, intrinsic to its lineage (Wright and Bibb 1992).
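
The chance expectations quoted under point (2) follow directly from the standard genetic code. As a hedged illustration (Python, with variable names chosen here for the example and not taken from the paper), the sketch below rebuilds the standard codon table, groups the 20 amino acids by how many codons specify each, and computes the corresponding fraction of the 64 codons. It also checks that the two search spaces named under point (1) are the same number, since 64 = 4 × 4 × 4 and 161 × 3 = 483.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; "*" marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid (one-letter codes), stop codons excluded.
codons_per_residue = Counter(r for r in CODON_TABLE.values() if r != "*")

# How many amino acids fall in each degeneracy class (1, 2, 3, 4, or 6 codons).
degeneracy_classes = Counter(codons_per_residue.values())

def chance_frequency(residue):
    """Fraction of the 64 codons that specify the given amino acid."""
    return codons_per_residue[residue] / 64

# One in 4^483 and one in 64^161 describe the same search space.
same_search_space = (4 ** 483 == 64 ** 161)

for n_codons in sorted(degeneracy_classes, reverse=True):
    share = 100 * n_codons / 64
    print(f"{degeneracy_classes[n_codons]} amino acids with {n_codons} codons: "
          f"{share:.3f}% each by chance")
print("4^483 equals 64^161:", same_search_space)
```

The percentages given in the text (9.3%, 6.2%, 4.6%, 3.1%, and 1.5%) match these fractions of 64 when cut to one decimal place. The observed frequencies in Table 1 are not reproduced here.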

Table 1 KcsA amino acid (aa) composition (N = 160 aa, plus one stop codon)
Table 2 KcsA amino acid site specificity for plasma membrane hydrophobicity and hydrophilicity
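
The binomial comparisons reported alongside Table 2 can be illustrated with the one pair of counts quoted in full in the text, 77 P + EC versus 83 NP residues out of 160. The sketch below is an assumption about the procedure, not a record of the software the authors used: it computes an exact two-tailed binomial p-value against an even split by summing the probabilities of all outcomes that are no more likely than the observed one.

```python
from math import comb

def binomial_two_tailed_p(successes, trials, p=0.5):
    """Exact two-tailed p-value for `successes` out of `trials` under chance p."""
    pmf = [
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Sum every outcome at most as probable as the observed one
    # (small tolerance guards against floating-point ties).
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)

# Counts taken from the text: 77 polar + charged vs. 83 non-polar residues.
p_value = binomial_two_tailed_p(77, 160)
print(f"two-tailed p for 77 of 160: {p_value:.2f}")
```

A p-value far above 0.05 is in line with the "n.s." reported for the overall composition; the per-region counts behind the M1, M2, and post-M2 tests are in Table 2 and are not assumed here.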

Winning the “Jackprot”

We ran a simulation to generate, under selection, the genomic sequence coding for the 160 amino acids plus one stop codon of KcsA (Table 3). By blindly drawing from a hat one of four marked marbles (A, G, C, and T), each representing a nucleotide, we generated observed-under-selection values (= the number of draws it took until the nucleotide matching the sequence was drawn). The total observed-under-selection value for the correct nucleotide sequence (= 1,799) was lower than, and statistically different from, the total expected-under-selection value (= 1,932, or 4 + 4 + 4 per codon, above; Chi-square = 339.08; df = 160; p ≤ 0.05), and it also differed from what would be expected without selection (Chi-square = 7,081.95; df = 160; p ≤ 0.001). The correct nucleotides for the first, second, and third positions were generated in an average of 3.3, 4.0, and 4.7 steps, respectively, and the correct codons in an average of 11.1 steps. The effect of selection was such that the “jackprot” generated the 161 codons in about one sixth (1,799 vs. 10,304) of the number of steps expected by chance! This implies that a protein similar to KcsA could evolve in just eight million years, instead of 46 million years, as computed above.
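
The marble-drawing procedure can be sketched in a few lines of Python. This is a hedged illustration, not the authors' Java program: the function names and the seed are assumptions, and because the KcsA sequence is not reproduced in this article, a placeholder string of the same length (483 nucleotides) stands in for it. Under the model the number of draws per position does not depend on which nucleotide is the target, so the placeholder affects none of the totals. The last function rescales the 46-million-year estimate by the ratio of observed draws to draws expected by chance, using the figures reported above.

```python
import random

NUCLEOTIDES = "AGCT"

def draws_until_match(wanted, rng):
    """Draw marbles with replacement until the wanted nucleotide appears."""
    draws = 1
    while rng.choice(NUCLEOTIDES) != wanted:
        draws += 1
    return draws

def jackprot_run(sequence, rng):
    """Per-position draw counts for one pass over the target sequence."""
    return [draws_until_match(nucleotide, rng) for nucleotide in sequence]

def scaled_time(years_by_chance, observed_draws, draws_by_chance):
    """Rescale a time estimate by the observed-to-chance ratio of draws."""
    return years_by_chance * observed_draws / draws_by_chance

# Placeholder target (assumption): 161 codons, not the real KcsA sequence.
PLACEHOLDER_SEQUENCE = "ATG" * 161

rng = random.Random(2010)                     # arbitrary fixed seed
counts = jackprot_run(PLACEHOLDER_SEQUENCE, rng)
total_draws = sum(counts)

n_codons = len(PLACEHOLDER_SEQUENCE) // 3
expected_with_selection = 4 * len(PLACEHOLDER_SEQUENCE)   # 4 + 4 + 4 per codon
expected_without_selection = 64 * n_codons                # 64 per codon

print("draws in this run:", total_draws)
print("expected with selection:", expected_with_selection)
print("expected without selection:", expected_without_selection)

# Figures reported in the text: 1,799 observed draws, 46 million years by chance.
years = scaled_time(46_000_000, 1_799, expected_without_selection)
print(f"rescaled estimate: about {years / 1e6:.1f} million years")
```

Each run gives a different total scattered around 1,932, which is why a single observed value such as 1,799 can fall below the expectation under selection.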

Table 3 Jackprot probabilities applied to the genomic sequence coding for 160 amino acids (aa) and a stop codon of KcsA

The Jackprot Simulation: Computer Programs and Online Interface

We wrote a computer program in JAVA APPLET version 1.0 and designed an online interface, The Jackprot Simulation (http://faculty.rwu.edu/cbai/JackprotSimulation.htm), to model a numerical interaction between mutation rate and natural selection during a scenario of polypeptide evolution. Instructors and/or students can access the simulation online and run exemplar statistics identical to those in Table 3, and also cut and paste any cDNA or nucleotide sequence obtained from the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/nucleotide/) or alternative sources; The Jackprot Simulation will generate statistics analogous to those in Table 3. The online interface is user-friendly and self-explanatory, and we provide a comprehensive description of how to use it in S1 in the Electronic Supplementary Materials. The computer programs JackprotSimulation.java and JackprotSupport.java are also available in S2 and S3 in the Electronic Supplementary Materials.

Conclusions

The “jackprot” helps us understand how natural selection introduces speed into molecular evolution by interacting with mutation rate and retaining complex molecular structures and assemblages of high fitness value. Ion channels are ideal examples to illustrate how biological constraints have driven channel diversification from simpler 2TM-like ancestors to the complex single- (K+) or multiple-subunit 6TM (Na+ and Ca++) proteins. Because of their ubiquitous distribution across taxa, relevance in biology, neurobiology, and health-career curricula, and significance in modern behavioral and cognitive studies, ion channels are a sophisticated yet friendly didactic tool for communicating evolutionary principles to all audiences. Alternative perspectives to Darwinian evolution, which attribute randomness to molecular change, a deleterious nature to single-gene mutations, and insufficient geological time or population size for molecular improvements to occur, or which invoke “design creationism” to account for complexity in molecular structures and biological processes, are empirically unfounded and conceptually wrong.