Background

Organisms must be able to respond to their environment to survive. In plants, mechanisms have evolved for sensing and responding to hormonal and environmental signals, both biotic (for example, pathogens) and abiotic (for example, heat, cold, light and salt/drought stresses). To elicit a response, the perceived signal must be conveyed to the cellular machinery. Messengers such as Ca2+, cyclic nucleotides (cAMP and cGMP), hydrogen peroxide (H2O2) and nitric oxide transduce the perceived stimulus to proteins that initiate a response. Ca2+ is one of the important messengers that mediate plant responses to hormones, developmental cues and external stimuli [1,2,3]. It is implicated in regulating such diverse and fundamental cellular processes as cytoplasmic streaming, thigmotropism, gravitropism, cell division, cell elongation, cell differentiation, cell polarity, photomorphogenesis and plant defense and stress responses [1,3,4]. The Ca2+ concentration in the cytoplasm ([Ca2+]cyt) is maintained in the nanomolar range (approximately 100-200 nM), whereas the concentration in organelles and the cell wall is in the millimolar range [1,3,5,6]. Several signals (hormonal, abiotic and biotic) have been shown to cause transient elevation of [Ca2+]cyt [1,2,5,6,7]. This transient increase in [Ca2+]cyt is sensed by Ca2+-binding proteins [4,8,9]. The conformation of the Ca2+-binding protein changes on binding Ca2+, resulting in modulation of its activity or its ability to interact with other proteins or nucleic acids and modulate their function or activity. The Ca2+ sensors in plants can be broadly divided into four major classes [2,9]: calmodulin (CAM) (class A), CAM-like and other EF-hand-containing Ca2+-binding proteins (class B), Ca2+-regulated protein kinases (class C) and Ca2+-binding proteins without EF-hand motifs (class D).

Three classes (A, B, and C) contain proteins with EF-hand motifs. This motif is a helix-loop-helix structure that binds a single Ca2+ ion [10]. The loop consists of 12 residues with the pattern X*Y*Z*-Y*-X**-Z. The residues X, Y, Z, -Y, -X, -Z participate in binding Ca2+ and the intervening residues are represented by asterisks (*). Asp or Asn is usually found at X and Y; Asp, Asn, or Ser at Z; a variety of residues at -Y; usually Asp, Asn, or Ser at -X, but this position is more variable; and usually Glu at -Z [11]. The helix-loop-helix is only 29 residues long, the E α-helix being residues 1-10, the loop 10-21, and the F α-helix 19-29 [11]. Residue 1 is often Glu (E), and a Gly at residue 15 is highly conserved, as is Ile at residue 17. It has been reported that some EF-hand domains do not bind Ca2+ [11].
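The loop pattern described above can be illustrated with a minimal Python sketch that scans a sequence for 12-residue windows satisfying the most strongly constrained positions (X, Y, Z and -Z). This is a simplification for illustration only: it is not the PROSITE or Pfam EF-hand definition used by the prediction programs, the function name find_ef_loops is hypothetical, and the example sequence is a toy string rather than a sequence from this study.

```python
import re

# Loop positions 1 (X), 3 (Y), 5 (Z) and 12 (-Z) are constrained as in the text:
# Asp/Asn at X and Y, Asp/Asn/Ser at Z, Glu at -Z.
# Positions 7 (-Y) and 9 (-X) are left unconstrained because the text
# describes them as variable. The lookahead allows overlapping hits.
EF_LOOP = re.compile(r"(?=([DN].[DN].[DNS].{6}E))")

def find_ef_loops(seq):
    """Return (start, loop) pairs (0-based start) for candidate EF-hand loops."""
    return [(m.start(), m.group(1)) for m in EF_LOOP.finditer(seq.upper())]

if __name__ == "__main__":
    toy = "AAAADKDGDGTITTKEAAAA"  # toy sequence, for illustration only
    for start, loop in find_ef_loops(toy):
        print(start, loop)
```

A match from such a pattern only marks a candidate loop. As noted above, some EF-hand domains do not bind Ca2+, so a hit would still need experimental confirmation.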

In quiescent cells, proteins with EF-hands are in an apoprotein form; when [Ca2+]cyt increases they bind Ca2+ and change their conformation. Some EF-hands can also bind Mg2+ (for example, the third and fourth EF-hands of troponin C bind Ca2+/Mg2+, whereas the first and second EF-hands are Ca2+ specific [12]). Ca2+/Mg2+ discrimination relies on the affinities of the EF-hands for these cations, which is dependent on the types of amino-acid residues in the binding loop [11,13].

EF-hands can be present in proteins with no other known domains, as is the case for CAM, or in proteins with other domains such as a protein kinase. In most cases EF-hand motifs are found in pairs, and proteins with four EF-hands usually have two domains with a pair of EF-hands in each. Calpain is an exception to the pairing rule. It comprises a large subunit with five EF-hands at the carboxyl terminus and a small subunit that also has five EF-hands. The two unpaired hands in these subunits pair to form a heterodimer [14]. The large superfamily of EF-hand proteins has been divided into 66 subfamilies on the basis of differences in number and organization of EF-hand pairs, amino-acid sequences within or outside the motifs, affinity for Ca2+ and/or selectivity and affinity for target proteins [11]. Of the subfamilies, 28 consist of a unique single member. The EF-hand proteins used in the classification by Nakayama et al. [11] include proteins from animals, plants, fungi and protists, with plants represented in only nine of the 66 subfamilies.

Several EF-hand proteins have been identified in the model plant Arabidopsis thaliana, including several CAMs [15,16,17,18], a Ca2+-binding protein (CaBP-22)[19], touch-induced proteins TCH2 and TCH3 [20], centrin [21], Ca2+-dependent protein kinases (CPKs) [22], calcineurin B-like proteins/salt-overly-sensitive3 family (CBLs/SOS3) [23,24], fimbrins [25], respiratory burst oxidase homologs (Rbohs) [26,27], a phospholipase [28,29], channel proteins [30,31], a NAD(H)-dependent glutamate dehydrogenase [32], a protein phosphatase [33,34], a NaCl-inducible protein [35], and a Ca2+-binding protein in pollen [36]. Some have been identified by screening with animal homologs whereas others have been identified by sequencing the genes induced in response to biotic and abiotic signals. The first method would miss plant-specific EF-hand proteins and the second method relies on comprehensive analysis of all genes that might be induced or activated by various signals. With the recent completion of the sequencing of the A. thaliana genome, the first plant genome to be sequenced, new methods of identifying genes encoding proteins with specific domains have become possible [37]. Insight into the function of the proteins can be gained by identifying and characterizing EF-hand proteins encoded in the Arabidopsis genome. Classification of the EF-hand-containing proteins can be a starting point in identification of Ca2+-binding proteins that might be involved in a particular cell process. With this in mind, we searched the Arabidopsis genome for genes encoding proteins with EF-hand motifs. We used three approaches to identify EF-hand-containing proteins in Arabidopsis. First, we analyzed data from the Munich Information Center for Protein Sequences (MIPS) A. thaliana database (MAtDB) [38,39]; second, we carried out BLAST searches using different known EF-hand sequences (nucleotide and amino acid) against the Arabidopsis genome database and third, we searched the literature. 
We identified 250 EF-hand or putative EF-hand proteins. This estimate represents a maximum number of possible EF-hand-containing proteins in the Arabidopsis genome as our analysis was very inclusive. Of these 250 proteins, only 47 have been reported in the literature. Of the 250, 73 were identified by only one prediction program as having an EF-hand and the rest were identified by two or more programs. Further study is needed to verify Ca2+-binding activity of many of these proteins. Each protein sequence was analyzed for domains other than EF-hands. Several have a variety of domains, which may be useful in determining protein function.

Results and discussion

Identification of EF-hand-containing proteins

To identify EF-hand-containing proteins in Arabidopsis, the protein sequences listed as having EF-hands in the InterPro Domain Table at MAtDB [39] were retrieved. Each protein sequence was then analyzed for the presence of an EF-hand motif and other domain(s) using InterProScan [40]. There are many databases for analyzing proteins using different approaches to search for patterns, profiles and hidden Markov models [41]. InterProScan was chosen because it integrates SWISS-PROT, PROSITE, PRINTS, Pfam, ProDom, SMART and TIGRFAMs programs into a single comprehensive format. Therefore, scanning one site is the equivalent of scanning seven databases that use different approaches [40]. The InterPro Domain Table at MAtDB listed 219 proteins as having EF-hands. Eighteen sequences did not have EF-hands identifiable by InterProScan and so were eliminated from our analysis. We also did sequence-similarity searches using three different EF-hand proteins that have been characterized in Arabidopsis. The nucleotide and protein sequences of Arabidopsis CAM4, a protein containing four EF-hands, were used to do BLAST searches (TBLASTN, BLASTP) against the Arabidopsis genome at MAtDB [39]. We also used the protein sequences of a Ca2+-dependent protein kinase (CPK1) and a small protein with one EF-hand domain (At2g46600). Proteins showing similarity to these proteins were checked for the presence of EF-hands using InterProScan as above. Additional EF-hand proteins were found that had not been included in the MAtDB InterPro Domain Table. We also searched the literature for reports of EF-hand-containing proteins in Arabidopsis that had been identified by various experimental approaches. Additional EF-hand proteins were identified from this search. Together, these searches resulted in identification of a possible total of 250 EF-hand-containing proteins (Tables 1,2,3). Seventy-three of the EF-hands were identified by only one of the seven prediction programs included in InterProScan. 
These proteins, which are indicated in bold in Table 1, could be false positives. Further studies are needed to verify the Ca2+-binding ability of these putative EF-hands. It is, however, worth noting that the activity of two of the proteins in this category (AtPLC1 and KCO1) has been shown to be dependent on Ca2+ [31,42]. All proteins are listed by their protein ID number except CAM6, which has not been assigned an ID number. CPKs and closely related CRKs (CPK-related protein kinases) are listed in Table 2, as this is a large family of proteins that has been relatively well studied in Arabidopsis.
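The counts reported in the text can be cross-checked with a short calculation. The sketch below uses only numbers stated in this article; the variable names are ours, and the split of the total into retained MAtDB entries plus entries added by the BLAST and literature searches is an assumption about how the 250 was reached, not a breakdown given in the text.

```python
# Counts stated in the text.
listed_in_matdb = 219         # proteins listed with EF-hands in the InterPro Domain Table
not_confirmed = 18            # no EF-hand identifiable by InterProScan, eliminated
total_identified = 250        # EF-hand or putative EF-hand proteins
single_program = 73           # identified by only one prediction program
reported_in_literature = 47   # previously reported in the literature

# Derived values (assumption: total = retained MAtDB entries + additional finds).
retained_from_matdb = listed_in_matdb - not_confirmed
added_by_blast_and_literature = total_identified - retained_from_matdb
multi_program = total_identified - single_program
previously_unreported = total_identified - reported_in_literature

if __name__ == "__main__":
    print("retained from MAtDB:", retained_from_matdb)
    print("added by BLAST and literature searches:", added_by_blast_and_literature)
    print("identified by two or more programs:", multi_program)
    print("not previously reported:", previously_unreported)
```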

Table 1 EF-hand-containing proteins (excluding CPKs and CRKs) in Arabidopsis
Table 2 Summary of the CPKs and CRKs in the Arabidopsis genome
Table 3 Plant proteins not identified as EF-hand-containing proteins using InterProScan but known to bind Ca2+

The InterPro domain table also lists the EF-hand-containing proteins for Saccharomyces cerevisiae (29), Caenorhabditis elegans (139) and Drosophila melanogaster (132). The number of EF-hand proteins in the human genome was given as 83, with a note that the number may be an underestimate as a result of the stringent E-value cutoff used for the analysis [43]. Figure 1a shows a comparison of the number of EF-hand-containing proteins in sequenced eukaryotic organisms and the percentage of the total number of genes represented by genes encoding EF-hand proteins. Our analysis revealed that there is possibly a very large number of EF-hand proteins in Arabidopsis.

Figure 1

(a) A comparison of the number of genes encoding putative EF-hand proteins (green) in different species and their percentage of the total number of genes (blue). At, Arabidopsis thaliana; Ce, Caenorhabditis elegans; Dm, Drosophila melanogaster; Sc, Saccharomyces cerevisiae; Hs, Homo sapiens. (b) The number of Arabidopsis proteins having 1, 2, 3, 4, 5 or 6 EF-hands.

We used TargetP [44] to identify cellular targeting signals in all the EF-hand proteins. The results from this analysis show that EF-hand proteins are present in all major subcellular compartments (Tables 1,2,3).

Ca2+-binding proteins with no recognized EF-hand

Table 3 lists proteins that were reported in the literature to contain EF-hand-like domains but in which InterProScan did not identify an EF-hand. These proteins were, however, shown to bind Ca2+. Proteins with sequence similarity to them are also included in Table 3. We did not include the Table 3 proteins in the total number of EF-hand proteins or in the phylogenetic analysis.

Caleosins are proteins with similarity to a rice protein that was shown to bind Ca2+ [45]. A localization study in rapeseed, using an antibody to AtClo1, showed the presence of this protein in the ER and lipid bodies [46]. Clo3 was shown to be induced by abscisic acid and to bind Ca2+ [47].

The InterPro documentation IPR000308 for 14-3-3 proteins describes them as a large family of proteins that are primarily homo- or heterodimeric within all eukaryotic cells. They appear to effect intracellular signaling by regulating the catalytic activity of the bound protein, by regulating interactions between the bound protein and other proteins, or by controlling the localization of the bound protein. The 14-3-3 protein GF14ω was shown to bind Ca2+ and the binding was localized to loop 8 of GF14ω [48,49]. Seven other proteins showing strong similarity to GF14ω have the exact sequence in the loop considered to be the EF-hand and so were included in Table 3. Several other proteins showing similarity were divergent in this loop and so were not included.

SUB1 was identified as a protein involved in the cryptochrome and phytochrome signaling pathways [50]. Guo et al. [50] identified two EF-hand-like domains and demonstrated binding of Ca2+ by SUB1. SUL1 and 2 are proteins showing similarity to SUB1 but their sequences diverge somewhat in the EF-hand domains.

Number of EF hands

The number of EF-hands in each protein varied from one to six. Figure 1b shows the number and percentage of proteins having a specific number of EF-hands. As stated above, most EF-hand proteins have pairs of EF-hands, which facilitate binding of Ca2+ [11]. There are a large number of proteins with an odd number of EF-hand motifs (1, 3 or 5). Several possibilities are suggested by this observation. The proteins with an odd number of EF-hand domains may function as homo- or heterodimers, they may bind Ca2+ in a weaker manner, there may be another 'cryptic' Ca2+-binding motif that is not identifiable, but is functional, or they may not bind Ca2+ at all. Many of the proteins containing a single EF-hand motif were identified by only one prediction program and could be false positives.
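A tally of this kind can be sketched in a few lines of Python. The example below uses only the EF-hand counts that are stated in this article for a handful of characterized proteins; it does not reproduce the full data set behind Figure 1b, and the function names are hypothetical.

```python
from collections import Counter

# EF-hand counts stated in this article for a few characterized proteins.
ef_hand_counts = {"CAM4": 4, "TCH2": 4, "TCH3": 6, "AtCP1": 3, "KIC": 1, "KCO1": 1}

def tally_by_count(counts):
    """Map each EF-hand number to (number of proteins, percentage of proteins)."""
    tally = Counter(counts.values())
    total = len(counts)
    return {n: (tally[n], 100.0 * tally[n] / total) for n in sorted(tally)}

def odd_numbered(counts):
    """Names of proteins with an odd number of EF-hands (unpaired motif)."""
    return sorted(name for name, n in counts.items() if n % 2 == 1)

if __name__ == "__main__":
    for n, (num, pct) in tally_by_count(ef_hand_counts).items():
        print(f"{n} EF-hand(s): {num} protein(s), {pct:.1f}%")
    print("odd number of EF-hands:", odd_numbered(ef_hand_counts))
```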

Examples of these possibilities can be seen in EF-hand proteins that have been isolated and characterized previously. The K+ channel protein (KCO1) has one identifiable EF-hand but another region within the protein also shows similarity to an EF-hand. Although Ca2+ binding by KCO1 was not tested, the activity of the channel was shown to be Ca2+-dependent [31]. AtPLC1, one of a small family of phosphatidylinositol-specific phospholipase Cs (PLCs), has a putative EF-hand but Ca2+ binding was not evaluated [42]. No other AtPLC has an EF-hand domain but the amino-terminal sequences of several other family members have two sets of α-helices that may correspond to EF-hand domains [29]. The putative EF-hand loop of AtPLC1 lies between two of the α-helices. The actin-binding activity of most fimbrins is inhibited by Ca2+ [51]. AtFIM1 was shown to be Ca2+-independent, suggesting this single-EF-hand protein does not bind Ca2+ [25,51]. The respiratory burst oxidase family (see Table 1) has nine members in Arabidopsis that have either one or two EF-hands [26,27]. An alignment (data not shown) of these proteins, however, shows the presence of EF-hand-like sequences for the missing EF-hand domain in the one-EF-hand proteins. Keller et al. [27] identified two EF-hand domains in RbohA (RbohF in this report and in Torres et al. [26]), both of which bind Ca2+ in vitro, although only one is recognized by InterProScan. The ability to bind Ca2+ was not addressed for the single-EF-hand proteins ABI1 or GDH2 [32,33]. The CBL/SOS3 family of proteins (see Table 1) shows the presence of three EF-hand domains [23,52]. Kudla et al. [23], however, identified a sequence that represents a variation of the EF-hand domain that may be a fourth EF-hand. AtCP1, a protein with three EF-hand domains, also has a fourth EF-hand-like sequence at the end of the protein but it is truncated and may not be functional [35].

The reported proteins with two EF-hand domains include two pollen-associated proteins and the Rboh proteins (see Table 1). Three of the proteins with two possible EF-hands were identified by only one prediction program. The CAM family and proteins closely related to CAM - CaBP-22, PM129, TCH2 and centrin (Table 1) - and most of the CPKs (Table 2) have four EF-hand domains. The two proteins with six EF-hands are TCH3 and an unknown protein (At4g27790) which are only 13% similar and thus are not likely to represent duplicate genes. Although the significance of the number of EF-hand domains in various proteins is not known, they may differ in their affinity for Ca2+ and, thereby, function to fine tune Ca2+-mediated cellular activities.

Identification of other domains in EF-hand-containing proteins

Table 4 lists the other domains found in the EF-hand-containing proteins, their InterProScan accession numbers and general type of protein or domain. As shown in Table 1, some of the proteins predicted to have other domains have putative EF-hand motifs identified by only one prediction program (shown in bold in Table 1). Schematic diagrams of representative EF-hand proteins are shown in Figure 2. As can be seen, the calmodulin-like proteins have the EF-hands distributed throughout the protein and no other domain is present. Other proteins have the EF-hands at one end or the other, or in the middle of the protein with enzymatic or regulatory domains preceding or following the EF-hands.

Figure 2

Schematic diagrams of representative EF-hand proteins. The number of amino acids is given at the end of each diagram. Domain names are written above the domain except as given in the key. PI-PLC-X(Y) and C2, phosphatidylinositol-specific phospholipase C subdomains. ABI1, ABA-insensitive 1; APC1, Arabidopsis pollen Ca2+-binding protein; AtCP1, Arabidopsis thaliana Ca2+-binding protein; AtFIM1, Arabidopsis thaliana fimbrin 1; AtPLC1, Arabidopsis thaliana phosphatidylinositol-specific phospholipase C; CAM2, calmodulin 2; CaBP22, 22 kDa Ca2+-binding protein; CBL/SOS3, calcineurin B-like, salt-overly-sensitive protein; CH, calponin homology; CLO1, caleosin1; CPK, Ca2+-dependent protein kinase; CRK, CPK-related kinase; GDH2, NAD(H)-dependent glutamate dehydrogenase; GTPase, small GTPase-like protein (At3g63150); KCO1, potassium channel outwardly rectifying protein 1; KIC, KCBP-interacting CCD-1-like protein; MCP, mitochondrial carrier protein (At5g61810); PM129, protein isolated from plasma-membrane enriched library; PPA, protein phosphatase 2A-like protein (At1g03960); PYR, pyridine nucleotide-disulfide oxidoreductase (At2g20800); RbohA, respiratory burst oxidase homolog; TCH2 and TCH3, touch-induced proteins. // indicates a break in the protein.

Table 4 Summary of various domains present in Arabidopsis EF-hand-containing proteins and their InterPro accession numbers

Some of the domains listed in Table 4 either contain an EF-hand within the domain or are a specific type of EF-hand. These include EPS15 repeats, calflagin, recoverin and S100/IcaBP. EPS15 repeats are protein-protein interaction modules of about 95 residues that were first identified in tyrosine kinase substrates EPS15 and 15R. The first of three sub-domains in EPS15 may include a Ca2+-binding domain of the EF-hand type. Calflagins are flagellar Ca2+-binding proteins found in Trypanosoma cruzi and T. brucei that have motifs similar to EF-hands. Recoverin is a retinal Ca2+-binding protein that belongs in the EF-hand family of proteins.

Some of the EF-hand proteins contain an Arabidopsis retrotransposon (ATHILA) ORF-1 protein domain and one has an En/Spm-like transposon protein domain. As shown in Table 4, domains found in various enzymes are present in many of the EF-hand proteins. Because of the presence of EF-hand motifs in these proteins, regulation of these putative enzymes is likely to be Ca2+-dependent. The diversity of enzymes that contain EF-hand(s) indicates that a wide range of cellular processes is likely to be regulated by Ca2+. It is also of interest that some proteins in a family have EF-hands and others do not, suggesting differential regulation of protein family members.

Several identified domains indicate that some EF-hand proteins interact with other proteins (or themselves) or with nucleic acids. Table 4 lists domains that are involved in interaction with protein or DNA. Cell processes that EF-hand proteins may be involved in are suggested by domains found in transcription or translation proteins including elongation factors and the bHLH domain found in transcription factors. Domains such as potassium channels, pollen allergen Bra r II, mitochondrial carrier proteins, the cation (Ca2+ and Na+) pore region and nucleoside transporters suggest possible functions. One EF-hand protein has a domain found in NPH3 protein, a photoreceptor-interacting protein that is essential for phototropism. Another EF-hand protein has a jacalin domain found in lectins. These domains in EF-hand proteins should help in evaluating the function of these proteins. For instance, the EF-hand protein identified as having a pollen allergen Bra r II domain is similar to an EF-hand protein from pollen (APC1) isolated by Rozwadowski et al. [36]. They showed APC1's affinity for Ca2+ and the potential for a Ca2+-dependent conformational change.

Motifs such as the ATP/GTP-binding region suggest that the proteins containing them interact with or bind to certain molecules. PTM is the site for attachment of phosphopantetheine (the prosthetic group of acyl carrier proteins in some multienzyme complexes), PTS_HPR_SER is a serine phosphorylation site found in HPr (a protein in the phosphoenolpyruvate-dependent sugar phosphotransferase system in bacteria) and UIM (ubiquitin interaction motif) is a receptor for polyubiquitination of polypeptide chains.

Phylogenetic analysis of EF-hand-containing proteins

The full-length sequences of all proteins identified by InterProScan as containing an EF-hand (including those identified by only one database) were aligned using MEGALIGN (DNAstar). Phylogenetic analysis was carried out by PAUP 4.08a using a heuristic search method. A consensus tree was generated from all saved trees. This tree was used to identify groups of EF-hand proteins and closely related proteins. Five major groups of proteins could be identified. Figure 3 shows the overall tree with a few representative members of each group. A sixth group includes members that did not fall into the other five groups. Figures 4,5,6,7,8,9 are the expanded trees for each group.
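The idea of summarizing a set of saved trees by how often each branch appears can be sketched as follows. This is an illustration of the general principle only, not PAUP's implementation: each tree is reduced to a set of clades (groups of taxa), the function names are hypothetical, and the three toy trees are invented for the example and are not results from this analysis.

```python
from collections import Counter

def clade_support(trees):
    """Percentage of trees in which each clade appears.

    trees: sequence of trees, each given as a set of frozensets of taxon names.
    """
    trees = list(trees)
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: 100.0 * n / len(trees) for clade, n in counts.items()}

def majority_rule(trees, threshold=50.0):
    """Clades found in more than `threshold` percent of the trees."""
    return {c for c, pct in clade_support(trees).items() if pct > threshold}

if __name__ == "__main__":
    # Toy trees (invented): two group CAM2 with CAM4, one groups CAM2 with TCH2.
    t1 = {frozenset({"CAM2", "CAM4"}), frozenset({"CAM2", "CAM4", "TCH2"})}
    t2 = {frozenset({"CAM2", "CAM4"}), frozenset({"CAM2", "CAM4", "TCH2"})}
    t3 = {frozenset({"CAM2", "TCH2"}), frozenset({"CAM2", "CAM4", "TCH2"})}
    for clade, pct in clade_support([t1, t2, t3]).items():
        print(sorted(clade), round(pct, 1))
```

With 100 saved trees, the percentage for a clade equals the number of trees containing it, which is how the branch numbers in a consensus tree of this kind are usually read.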

Figure 3

Phylogenetic tree showing the overall relatedness of the EF-hand proteins. All EF-hand proteins were aligned using MEGALIGN (DNAstar) and analyzed using a heuristic method in PAUP 4.08a. Numbers represent the number of times the branch appeared in 100 saved trees. The tree was reduced by hand to show a few representative proteins for each major group. The expanded groups are shown in Figures 4,5,6,7,8,9.

Figure 4

Group I tree showing all proteins included in this group. None has been published in the literature.

Figure 5

Group II tree showing all proteins included in this group. Proteins published in the literature are in color.

Figure 6

Group III tree showing all proteins included in this group. Proteins published in the literature are in color.

Figure 7

Group IV tree showing all proteins included in this group. Proteins published in the literature are in color.

Figure 8

Group V tree showing all proteins included in this group. CPKs are in red and CRKs are in green. Subgroups are named as in Figure 10.

Figure 9

Group VI tree showing all proteins included in this group. Proteins published in the literature are in color.

Group I proteins

None of the proteins in group I (Figure 4) has been reported in the literature. Some of them contain domains that give clues to their function, including elongation factors, DNA-, protein- or ATP/GTP-binding proteins and others (Tables 1 and 4).

Group II proteins

Group II includes KCO1, AtPLC1 and the two fimbrins (Figure 5) that have been reported in the literature (Table 1). Two other proteins show similarity to KCO1 and may also be Ca2+-regulated K+ channels. A family of phosphatidylinositol-specific phospholipase Cs has been isolated, but only one of them, AtPLC1, has an EF-hand domain (Figure 2, and see also [29,42]). AtPLC1, a protein isolated as a dehydration- and salt-stress-induced gene, was able to hydrolyze phosphatidylinositol-4,5-bisphosphate and the activity was completely dependent on Ca2+ [42]. AtFIM1 was identified as an EF-hand-containing protein (Figure 2). However, Kovar et al. [51] found that AtFIM1 was Ca2+-independent, and so this may be a non-functional EF-hand, as pointed out by McCurdy and Kim [25]. Four other proteins in MAtDB are similar to AtFIM1 but only one has an EF-hand motif.

Two small subgroups, nucleoside transporters and mitochondrial carrier proteins, also fell into group II (Figures 2, 5). As far as we know, Ca2+-binding studies have not been done for these proteins.

Group III proteins

CBL/SOS3s fall into group III (Figure 6). The first CBL/SOS3 was isolated as a protein involved in salt stress (SOS3) and as a calcineurin B-like protein (CBL1) [23,24]. Ten CBL/SOS3s have been identified [13]. Expression of CBL4 is induced by drought, cold and wounding stress. In animals, calcineurin is a heterodimer composed of a regulatory B subunit and a protein phosphatase catalytic A subunit. CBL/SOS3s show similarity to the B subunit (Figure 2). SOS3 was, however, shown to interact with a protein kinase [53] and a family of interacting protein kinases has been identified [54,55,56]. Both Albrecht et al. [55] and Kim et al. [54] have identified the domain in the kinases required for the interaction of the protein kinase with CBL/SOS3.

AtCP1, which contains three EF-hands (Figure 2), is another NaCl-stress-induced protein that has been shown to bind Ca2+ [35]. A bean homolog of AtCP1 has been shown to be associated with the hypersensitive response [57]. A subgroup of proteins identified in this search show similarity to the protein phosphatase 2A regulatory B subunit. One of these, At5g44090 (AF165429), was reported in the literature [58]. They have one EF-hand domain but contain no other identifiable domains (Figure 2). One other protein of interest in this group, KIC (KCBP-interacting CCD-1-like protein), was identified as a protein that interacts with KCBP (kinesin-like calmodulin-binding protein), a protein known to interact with and be regulated by Ca2+/calmodulin [59,60]. KIC has only one EF-hand (Figure 2) and is similar to a wheat Ca2+-binding protein (CCD-1) [61].

Group IV proteins

Group IV contains the calmodulins (CAMs) and closely related proteins such as CaBP-22, centrin and the TCH gene proteins (Figure 7). CAMs are highly conserved small-molecular-weight acidic proteins of 148 amino acids (listed as 149 because the starting Met is cleaved following translation). The four EF-hands (two pairs connected by a central helix) bind four Ca2+ ions [11,14]. Binding of Ca2+ to CAM results in a conformational change which then allows CAM to interact with target proteins to modulate their activity or function [3,18,62].

Nine Arabidopsis CAMs have been reported in the literature [15,16,17,18,20]. Seven of these are highly conserved, having 148 amino acids (CAMs 1-7) with only 1-4 amino-acid differences between them. CAM6 has not been given a protein identification number. BLAST searches with CAM6 pick CAM7 as the closest sequence. However, at the nucleotide level they are only 86% identical. There are two ESTs that are 83 and 94% identical to CAM6 but only 72 and 86% identical, respectively, to CAM7. CAM8 and CAM9 are divergent CAMs. They have 151 rather than 148 amino acids and vary considerably in the fourth EF-hand domain. Although they complemented a yeast calmodulin (CMD1) mutant, they did not form a complex with a basic amphiphilic helical peptide in the presence of Ca2+, unlike conventional CAMs [18]. As can be seen in Figure 7, they do not fall into the same group as the other CAMs, with CAM9 being more divergent than CAM8. No other EF-hand proteins have 149 amino acids, although others have a few more or fewer (Table 1). Expression studies of the Arabidopsis CAM genes show that they are differentially expressed in different tissues and circumstances. Of CAM1, CAM2, and CAM3, CAM1 was the only one expressed in roots and CAM3 could not be detected in floral stalks; CAM1, CAM2, and CAM3 are inducible by touch stimulation but at different levels and with different kinetics [15]. CAM4, CAM5 and CAM6 were all expressed in leaves, but only CAM4 and CAM5 were detected in siliques [17]. Different Arabidopsis CAM isoforms also differ in their affinity for the same protein [60,63].
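Percent identity values such as those quoted above depend on how identity is counted over an alignment. The sketch below shows one simple convention, assuming two pre-aligned sequences of equal length with "-" marking gaps; it is not necessarily the calculation behind the values reported here, the function name is hypothetical, and the example strings are toy sequences.

```python
def percent_identity(a, b):
    """Percent identity of two aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column counts as
    identical only when both residues match and neither is a gap.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)

if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))    # toy nucleotide alignment
    print(percent_identity("AC-GT", "ACTGT"))  # toy alignment with one gap
```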

Two proteins induced by touch, rain, wind, wounding, and darkness, TCH2 and TCH3, are also in group IV and are related to the CAMs. TCH2 has 161 amino acids with four EF-hands and TCH3 has 324 amino acids with six EF-hands. Another CAM-like protein, CaBP-22, is closely related to the conventional CAMs (Figure 7). It has 191 amino acids, 66% amino-acid sequence identity with CAM (79% in the EF-hand domains) and has been shown to bind Ca2+ [19]. Centrins are a little more distantly related to CAMs (Figure 7). An Arabidopsis centrin gene (ap3.3a) was isolated as a gene rapidly induced after pathogen inoculation [21]. One other EF-hand protein is 65% similar to centrin, suggesting there are two centrin genes in Arabidopsis.

Two proteins reported in the literature fall into a subgroup of CAM-like proteins. A novel EF-hand protein (PM129, Figure 2) was isolated from a cDNA library that over-represents plasma-membrane-associated proteins [64]. The other protein, APC1, is a pollen Ca2+-binding protein that is a member of the pollen allergen family [36]. Another protein, At3g03430, is 89% similar to APC1. They are the smallest of the EF-hand proteins having only 83 amino acids (Figure 2).

Group V proteins

The 34 CPKs and three CPK-related protein kinases (CRKs) make up almost all of group V (Figure 8). The other three proteins in the group do not have any other identifiable domains. CPKs are serine/threonine protein kinases with a CAM-like domain (CLD) usually containing four EF-hands (with two exceptions) (Table 2, Figure 2). These kinases have been called CDPKs; however, we use the most recent designation, namely CPKs [22,65,66]. Three of the eight CRKs in Arabidopsis have one EF-hand domain. However, sequence alignment of the EF-hand regions of CPKs and CRKs revealed that CRKs contain degenerate EF-hand motifs (Figure 2). CPKs are present only in plants and some protozoans. The PlantsP database [65] reports that most of the CPKs and CRKs contain transmembrane and N-myristoylation domains. TargetP predicts that some CPKs and CRKs are targeted to the chloroplast or mitochondria [44]. The cellular localization for most of these protein kinases needs to be confirmed experimentally.

CPKs range from 453 to 646 amino acids (Table 2) with four distinct domains: a variable region at the amino terminus (approximately 22-184 amino acids), a serine/threonine protein kinase domain (approximately 275 amino acids), an autoinhibitory domain (also called the junction region) (approximately 31 amino acids) and the regulatory CLD (approximately 165 amino acids) (Figure 2). The autoinhibitory domain is involved in inhibiting the enzyme activity in the absence of Ca2+ whereas the variable region at the amino terminus may account for their substrate specificity and/or localization [22]. CRKs have a domain organization similar to that of CPKs (Figure 2). Most of the CPKs contain fatty-acylation sites, including those for myristoylation and palmitoylation, which seem to be necessary for targeting to membranes and for protein-protein interactions [67]. The protein kinase domain in CRKs shows strong sequence similarity to the kinase domain in CPKs, but the autoinhibitory domains and CLDs in CRKs show weak sequence similarity to the corresponding domains in CPKs.

CPKs have only basal activity in the absence of Ca2+ because of the autoinhibitory region. Ca2+ binds the EF-hands of the CLD, which results in intramolecular rearrangement and relief of autoinhibition [22,68]. Eight CPK isoforms have been shown to be activated by Ca2+ [66,69,70,71]. The presence of multiple isoforms of CPKs in the Arabidopsis genome implies that they may be involved in specific Ca2+-signaling networks, may respond differentially to changes in oscillation, frequency, magnitude and duration of Ca2+ signal, or may have temporal and spatial patterns of expression and localization. Little is known about the function and substrates of CPKs in Arabidopsis. CPK1 is known to interact with 14-3-3 proteins [72] and is involved in the inactivation of a Ca2+ pump [73]. Expression of CPK10 and 11 is inducible by cold and drought [70,74,75]. An Arabidopsis CPK phosphorylates tonoplast intrinsic protein, α-TIP, a putative water-channel protein [76]. Substrates of CPKs in other plants have been identified and can be used to deduce the function of homologs in Arabidopsis. The PlantsP website database [65] is a valuable source of information on CPKs.

In addition to including the full-length sequences of CPKs and CRKs in the overall phylogenetic analysis, we constructed a phylogenetic tree using the protein sequence of the CLD region of CPKs and CRKs (Figure 10). Similar trees were obtained using either full-length CPKs/CRKs (Figure 8) or the CLD region of the CPKs/CRKs (Figure 10). As shown in Figure 10, the CPKs form five distinct subgroups (I-V). The CRKs are most closely related to subgroup IV. The tree made using full-length sequences (Figure 8) has four subgroups (members of subgroups III and V in Figure 10 fall into one subgroup in Figure 8).

Figure 10

Phylogenetic analysis of CPK, CRK and CCaMK. The EF-hand domains of CPKs, CRKs and other plant and protist protein kinases were aligned in MEGALIGN (DNAstar) and analyzed using a bootstrap method. Numbers are the percentage of bootstrap replicates showing the branch. The accession numbers are listed here in brackets for EtCDPK (CAA96439; 332-487 amino acids), TgTPK4 (AAC02532; 355-501 amino acids), LiCCaMK (AAC49008; 339-520 amino acids) and NtCCaMK (AAD28791; 336-517 amino acids). Et, Eimeria tenella; Li, Lilium longiflorum; Nt, Nicotiana tabacum; Tg, Toxoplasma gondii.

Group VI

Figure 9 shows the remaining proteins that do not fall into one of the other five groups. This group includes the respiratory burst oxidase homolog proteins (Rbohs), ABI1, GDH2, and TPC1. Plant defense responses include production of reactive oxygen species (oxidative burst) [77]. Torres et al. and Keller et al. [26,27] isolated Arabidopsis homologs (Rbohs) of the gp91phox subunit of the neutrophil NADPH oxidase, which generates a similar oxidative burst in neutrophils. Six Rbohs (A-F) have been isolated experimentally and three others have been identified in the Arabidopsis genome (Figures 2 and 9). RbohF (called RbohA in Keller et al. [27]), like animal Rboh enzymes, is an intrinsic plasma membrane protein but, unlike animal Rboh, it has EF-hands that were shown to bind Ca2+ [27]. Both Leung et al. and Meyer et al. isolated the ABI1 gene [33,34]. ABI1 is similar to serine/threonine phosphatase 2C, which in animals is Mg2+- or Mn2+-dependent but not Ca2+-dependent (Figure 2). ABI1 is induced by abscisic acid and was shown to regulate stomatal aperture in leaves and mitotic activity in root meristems [33,34]. A second ABI gene (ABI2) was isolated using ABI1 as a probe [78]. The protein encoded by the cDNA had an eight-residue insertion in the EF-hand domain that does not conform to the EF-hand signature. A similar situation holds for glutamate dehydrogenases. Two genes were isolated: one coded for a dehydrogenase with an EF-hand (GDH2) (Figure 2) and the other for one without (GDH1) [32]. Studies by Furuichi et al. [30] indicate that TPC1 is a two-pore channel that mediates Ca2+ influx. It has two EF-hand-like motifs located in a hydrophilic domain that connects the two transmembrane regions containing the pores [30]. Ca2+ binding was not shown experimentally for ABI1, GDH2, or TPC1.

Three proteins in this group have a domain present in a small GTPase protein (Figure 2 and Gtp in Figure 9). Another three-member group (Pyr in Figure 9) of proteins has a domain for FAD-dependent pyridine nucleotide-disulfide oxidoreductase (FAD_pyr_redox) (Figures 2 and 9).

Conclusions

A plant's adaptively variable behavior or plasticity during its lifetime has been described as 'plant intelligence' [79] and Ca2+ and its sensors are key players in this adaptive behavior. The large number of potential EF-hand-containing proteins indicates how [Ca2+]cyt changes can profoundly affect a wide array of cellular processes. Plants seem to have a large number of EF-hand-containing proteins, some of which have homologs in non-plant systems whereas others are unique to plants. In addition, plants also have unique sets of proteins that interact with Ca2+ sensors [80]. Signals from stresses such as pathogens, drought, cold and salt are mediated by EF-hand proteins [20,23,3,42,57]. Hormones induce EF-hand proteins [46,78] as do pathogen attacks [21,26]. Some EF-hand proteins, such as small GTPases and potassium channels, are involved in signaling [31]. Others appear to be involved in developmental pathways. For example, the S-100 domain proteins are involved in cell growth and differentiation, cell-cycle regulation and metabolic control [40]. The wide variety of domains in EF-hand proteins also shows the diversity of the processes in which Ca2+ is involved. However, many of the EF-hand proteins give little clue as to their function and may lead us in many other directions as their functions are elucidated.

The complexity of the Ca2+ messenger system is increased through the existence of families of proteins. At the first level, calmodulin adds complexity through the number of isoforms present in Arabidopsis. The regulation of expression and the kinetics of interaction of these isoforms with different proteins can lend complexity to cell signaling. A second level of complexity is the number of proteins identified that interact with calmodulin. At least 100 calmodulin-binding proteins (CBPs) have been identified in Arabidopsis [80]. Differential expression of these proteins in cell types, developmental periods and in response to signals adds more complexity to the pathways that can be regulated by CAM.

The CPKs, a very large family of protein kinases, also add complexity to Ca2+ regulation. Although the mechanism of regulation may be similar in each CPK, Arabidopsis CPKs may differ in their affinity for Ca2+ in general and in the presence of their specific substrate in particular. The affinity of a CPK for Ca2+ is many times higher in the presence of substrate than in its absence [22]. The involvement of only specific CPKs in stress responses has already been shown [74]. Identification of the substrates for each CPK and their temporal and spatial expression will be needed for elucidating the pathways of Ca2+ control in plants.

Other families, such as the phosphatases, Rbohs and CBL/SOS3s, also contribute to the complexity of Ca2+ involvement in the regulation of many processes. Characterization of more Ca2+-binding proteins will lead to further understanding of the roles of Ca2+ and of cross-talk among various components of Ca2+ signaling and other messengers in plants.

Materials and methods

Identification of EF-hand-containing proteins

Proteins containing EF-hands were first identified using the InterPro Domain Table at the MIPS A. thaliana database (MAtDB) [39]. The protein sequences of the 219 proteins listed at MAtDB were obtained and analyzed for EF-hands and other domains using InterProScan [40]. Proteins not showing EF-hand domains were eliminated from the list. To identify proteins not listed in the InterPro Domain Table at MAtDB, BLAST searches were done with three different EF-hand-containing proteins: calmodulin, KIC, and a Ca2+-dependent protein kinase (CPK1). BlastP searches were done for KIC and CDPK, and BlastP and TBlastN searches for calmodulin [39]. Sequences for proteins showing sequence similarity to these proteins were also checked by InterProScan, and any protein containing an EF-hand domain but not found on the InterPro Domain Table was added to the list. A literature search for Arabidopsis proteins containing EF-hands was also done using PubMed at NCBI (National Center for Biotechnology Information) [81].
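
The selection logic described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the procedure as it was actually carried out: the function name, the protein identifiers and the toy domain annotations are all hypothetical, and the domain assignments stand in for InterProScan results that would be obtained separately.

```python
EF_HAND = "EF-hand"  # domain label assumed for illustration only


def select_ef_hand_proteins(table_ids, similarity_hit_ids, domains_by_id):
    """Return (kept, added) sets of protein identifiers.

    table_ids          -- identifiers listed in the domain table
    similarity_hit_ids -- identifiers found by sequence-similarity searches
    domains_by_id      -- dict mapping identifier -> set of domain names
                          reported by a domain scan (hypothetical input)
    """
    def has_ef_hand(protein_id):
        return EF_HAND in domains_by_id.get(protein_id, set())

    # Step 1: drop listed proteins in which the scan finds no EF-hand.
    kept = {pid for pid in table_ids if has_ef_hand(pid)}

    # Step 2: add similarity hits that have an EF-hand but were not listed.
    added = {
        pid for pid in similarity_hit_ids
        if pid not in table_ids and has_ef_hand(pid)
    }
    return kept, added


# Toy, made-up inputs (not real Arabidopsis identifiers).
table_ids = {"protA", "protB", "protC"}
similarity_hit_ids = {"protB", "protD", "protE"}
domains_by_id = {
    "protA": {"EF-hand", "protein kinase"},
    "protB": {"EF-hand"},
    "protC": {"protein kinase"},
    "protD": {"EF-hand"},
    "protE": set(),
}

kept, added = select_ef_hand_proteins(table_ids, similarity_hit_ids, domains_by_id)
print(sorted(kept), sorted(added))
```

The final list in this sketch is the union of the two returned sets; proteins found only through the literature search would be appended by hand.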

Identification of domains and organellar targeting signals

Information about domains other than the EF-hand domain was collected from the InterProScan searches done for each protein sequence. Targeting information for each protein was obtained from the MAtDB general report that includes the results of TargetP [39].

Phylogenetic analysis

The full-length sequences of all proteins identified by InterProScan as containing an EF-hand (including those that are identified by only one prediction program) were aligned using MEGALIGN. A heuristic method using PAUP 4.08a generated 100 trees. A majority-rules consensus tree was computed from the 100 trees. For the CPKs and CRKs, a second alignment was done using the CAM-like domains. A bootstrap method of PAUP 4.06a with 100 bootstraps was used to generate the tree.
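
To illustrate what a majority-rule consensus and the associated branch percentages mean, the following Python sketch counts how often each clade appears in a collection of trees and keeps those found in more than half of them. It is a simplified stand-in, not PAUP's implementation: each tree is assumed to be already reduced to a set of clades (each clade a set of taxon names), and the taxon names and the function name are hypothetical.

```python
from collections import Counter


def majority_rule_clades(trees, threshold=0.5):
    """Return {clade: percent support} for clades present in more than
    `threshold` of the input trees.

    trees -- list of trees, each given as a set of clades, where a clade
             is a frozenset of taxon names (assumed representation)
    """
    n_trees = len(trees)
    # Count each clade at most once per tree.
    counts = Counter(clade for tree in trees for clade in set(tree))
    return {
        clade: 100.0 * count / n_trees
        for clade, count in counts.items()
        if count / n_trees > threshold
    }


# Three toy trees over hypothetical taxa A-D.
trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
]

support = majority_rule_clades(trees)
for clade, percent in sorted(support.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(clade), round(percent))
```

With 100 bootstrap replicates, the same count for a clade is directly the percentage reported on the corresponding branch.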