Gene organization and evolutionary history

The human Wiskott-Aldrich syndrome protein (WASP) gene was the first of the WASP and WAVE family genes to be isolated, in 1994, as a mutated gene associated with Wiskott-Aldrich syndrome (WAS), an X-linked recessive disease characterized by immunodeficiency, thrombocytopenia and eczema, clinical features caused by complex defects in lymphocyte and platelet function [1]. Another WASP family member, neural (N-) WASP, was then identified from a proteomic search for mammalian proteins that interact with the Src homology 3 (SH3) domain of growth factor receptor binding protein 2 (Grb2, also known as Ash) [2]. Although expressed ubiquitously, N-WASP is most abundant in the brain, hence its name. The first WAVE protein was identified in humans by our group and another group independently as a WASP-like molecule and was named WAVE and SCAR1, respectively [3, 4]. Currently, it is agreed that mammals possess five genes for the WASP and WAVE family, WASP, N-WASP, WAVE1/SCAR1, WAVE2, and WAVE3 [5-9]. Human WASP and WAVE family genes are located on different chromosomes, with each gene showing a unique expression pattern (Figure 1). The human WASP gene is carried on the X chromosome and is expressed exclusively in hematopoietic cells, which explains the inheritance pattern and the immunodeficiency and platelet deficiency characteristic of WAS. WAVE1 and WAVE3 are strongly enriched in the brain and are moderately expressed in some hematopoietic lineages, whereas WAVE2 appears to be ubiquitous.

Figure 1

Comparison of the domain structures of the WASP and WAVE family proteins from different species. Color coding indicates conserved domains. The percentage amino acid similarity of WH1/EVH1 domains or WHD/SHD domains is shown below each domain. For species abbreviations, see the legend to Figure 2.

Human WASP and WAVE proteins are between 498 and 559 amino acids long and are encoded by 9 to 12 exons. The length of the genes is relatively similar, ranging from 67.1 kb for N-WASP to 131.2 kb for WAVE3, with the exception of WASP, which is a compact 7.6 kb. The restricted expression of WASP in hematopoietic cells is dependent on a 137-bp region upstream of the transcription start site [10]. It is unclear how brain-specific expression of WAVE1 and WAVE3 is regulated, but the proximal promoter region of mouse WAVE1 retains potential recognition motifs for the transcription factor hepatocyte nuclear factor 3β (HNF3β) and putative E2-box sequences that can be recognized by some basic helix-loop-helix transcription factors, such as MyoD and Twist, upstream of the transcription start site [11].

The WASP and WAVE family proteins possess a carboxy-terminal homologous sequence, the VCA region, consisting of the verprolin homology (also known as WASP homology 2 (WH2)) domain, the cofilin homology (also known as central) domain, and the acidic region, through which they bind to and activate the Arp2/3 complex, a major actin nucleator in cells (Figure 1). Besides the VCA region, the WASP subfamily proteins are characterized by the amino-terminal WH1 (WASP homology 1; also known as an Ena-VASP homology 1, EVH1) domain, which functions as a protein-protein interaction domain. In contrast, WAVE subfamily proteins are characterized by the presence of the WHD/SHD domain (WAVE homology domain/SCAR homology domain), which is located at the amino terminus. This domain is highly conserved between species, for even the distantly related Arabidopsis WHD/SHD domain has 74% amino acid similarity to the WHD/SHD domain of human WAVE1. This domain seems to be involved in the formation of the WAVE complex (see later). Using these sequence signatures together with genomic information from various organisms, WASP and WAVE homologs have been discovered in a wide variety of eukaryotic species; WASP and WAVE homologs (one of each) are found in Dictyostelium discoideum (WASP and SCAR) [12, 13], Caenorhabditis elegans (WSP-1 and WVE-1) [14-16], and Drosophila melanogaster (WASP and SCAR) [17, 18]. Budding yeast has only one WASP homolog, Las17/Bee1 [19, 20], and seems to lack WAVEs. In contrast, the plant Arabidopsis thaliana appears to have four WAVE genes, SCAR1-4 [21], but no WASPs.

Given that even plants have WAVE homologs, the evolutionary history of the WASP and WAVE family is likely to extend back to before the divergence of the eukaryotes. Along with the evolution of the actin cytoskeleton, eukaryotic cells must have needed means to control actin polymerization and reorganize the actin cytoskeleton, which presumably led to the development of the WASP/WAVE-Arp2/3 axis of actin-polymerizing mechanisms. Although it is difficult to determine whether the WASP and WAVE subfamilies evolved from a common ancestral gene, Arabidopsis SCARs seem to have evolved independently of the evolution of WASPs and other fungal and metazoan WAVE/SCARs, which is suggested by the alignment of conserved verprolin domain (V) and cofilin homology domain (C) sequences (Figure 2a). More detailed phylogenetic trees can be drawn from the alignment of highly conserved WH1/EVH1 domains of WASPs and the alignment of WHD/SHD domains of WAVEs. Zebrafish homologs of human WASP and N-WASP have been reported recently [22], and a TBLAST search over the Ensembl zebrafish genome (Zv8) revealed at least one homolog of WAVE1, one of WAVE2 and two of WAVE3 (see the legend to Figure 2 for the zebrafish gene accession numbers).

Figure 2

Evolutionary relationships between the WASP and WAVE family proteins. The phylogeny was inferred using the neighbor-joining method. ClustalW was used to align sequences and perform phylogenetic analysis. Any position containing gaps was excluded from the dataset. Trees were drawn by NJplot [89]. Bootstrap values were calculated over 1,000 iterations and values greater than 50% are shown as percentages next to branches. The bar in each figure indicates the proportion of amino acid differences. (a) The phylogenetic tree based on the alignment of combined sequences of V and C regions. WASP and WAVE sequences were retrieved from the NCBI protein database and the V/WH2 domain for each protein was identified by homology search over the Pfam-A database. C regions were identified according to the previously reported consensus sequence [29]. The sequence to be analyzed was generated by joining the identified V sequence and C sequence. (b) The phylogenetic tree based on WH1/EVH1 domain alignment. WH1/EVH1 domains were identified by homology search over the PROSITE database. (c) The phylogenetic tree based on WHD/SHD domain alignment. WHD/SHD domains were identified following the consensus sequence described previously [90]. Species examined are Homo sapiens (Hs), Mus musculus (Mm), Danio rerio (Dr), Drosophila melanogaster (Dm), Caenorhabditis elegans (Ce), Saccharomyces cerevisiae (Sc), Dictyostelium discoideum (Dd) and Arabidopsis thaliana (At). Ensembl protein IDs for the zebrafish sequences used in the analysis are as follows: Dr WASP1, ENSDARP00000039217; Dr WASP2, ENSDARP00000007963; Dr N-WASPa, ENSDARP00000094295; Dr N-WASPb, ENSDARP00000005823; Dr WAVE1, ENSDARP00000079387; Dr WAVE2, ENSDARP00000093195; Dr WAVE3a, ENSDARP00000077123; Dr WAVE3b, ENSDARP00000085962. Two other homologous genes for WAVE were identified in the zebrafish genome, but could not be assigned to homologs of mammalian WAVE1/2/3, so they were omitted from the analysis. These proteins are ENSDARP00000047935 and ENSDARP00000102646.
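
The distance measure named in the legend (the proportion of amino acid differences, computed after excluding every alignment position that contains a gap) and the column resampling behind a bootstrap replicate can be illustrated with a minimal Python sketch. This is not the ClustalW implementation, whose details may differ; the function names are hypothetical and the three toy sequences are invented for illustration, not taken from any WASP or WAVE protein.

```python
import random


def drop_gapped_columns(alignment):
    """Keep only alignment columns in which no sequence has a gap ('-')."""
    columns = zip(*alignment.values())
    kept = [col for col in columns if "-" not in col]
    return {
        name: "".join(col[i] for col in kept)
        for i, name in enumerate(alignment)
    }


def p_distance(seq_a, seq_b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def bootstrap_columns(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


# Toy alignment: hypothetical sequences, for illustration only.
toy = {
    "seqA": "MKT-LLVA",
    "seqB": "MKTALLIA",
    "seqC": "MRT-LLIA",
}

clean = drop_gapped_columns(toy)  # the fourth column is removed
dist_ab = p_distance(clean["seqA"], clean["seqB"])
replicate = bootstrap_columns(clean, random.Random(0))
```

In a full analysis, a pairwise distance matrix of this kind would be passed to a neighbor-joining routine, and the tree would be rebuilt for each of the 1,000 resampled alignments to obtain the bootstrap support values.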

Phylogenetic analyses that include the zebrafish amino acid sequences give us some interesting insights into the evolution of these proteins in vertebrates. First, both ancestral WASP and N-WASP seem to be present in a common ancestor of fish and mammals (Figure 2b). This means that WASP could have acquired its specialized function in the adaptive immune system early in vertebrate evolution, as the adaptive immune system is first seen in the jawed fishes. Second, WAVE is split into three distinct clades, WAVE1-3, as early as the emergence of the vertebrates (Figure 2c). Considering that WAVE1 and probably WAVE3 are involved in brain development in mammals [23-27], WAVE1 and WAVE3 might be the basis for the advent of the central nervous system (CNS).

Characteristic structural features

The WASP and WAVE family proteins share a common domain architecture: a proline-rich stretch followed by the VCA region located at the carboxyl terminus (Figure 1). The VCA region simultaneously binds to two proteins to trigger actin polymerization. The V domain binds to an actin monomer (G-actin) and the CA domain binds to the Arp2/3 complex. The rate-limiting step to initiate actin polymerization is the assembly of a trimeric actin nucleus. The Arp2/3 complex contains two actin-like proteins, Arp2 and Arp3, serving as an actin pseudodimer. Therefore, the VCA region can mimic the assembly of an actin trimer by providing a platform that efficiently brings an actin monomer and the Arp2/3 complex into close proximity, which leads to efficient actin nucleation (Figure 3) [28]. The C domain, which consists of approximately 20 amino acids, forms an amphipathic α-helix whose hydrophobic surface interacts with and activates the Arp2/3 complex [29]. Notably, there are two V domains in tandem in mammalian N-WASP as well as in Drosophila WASP and C. elegans WSP-1, a configuration that is thought to increase their actin-nucleating activity [30]. Recently, Co et al. [31] suggested a novel function for V domains: that they capture elongating ends of actin filaments (barbed ends) to ensure the dynamic attachment of growing barbed ends to the membrane. Thus, the tandem V domains of N-WASP would not only provide efficient actin nucleation, but might also increase the ability of N-WASP to localize and concentrate at the interface between the barbed ends and the membrane.

Figure 3

Multiple regulatory pathways for N-WASP and WAVE2 activation. (a) N-WASP is autoinhibited in a basal state through the interaction between the GBD/CRIB domain and the VCA region. PIP2 and GTP-loaded Cdc42 bind to the B and GBD/CRIB domains, respectively, resulting in synergistic activation of N-WASP. Binding of SH3 domains to N-WASP can independently compete with the autoinhibitory interaction, and thus can activate N-WASP. SH3-domain-containing proteins that interact with and potentially activate N-WASP include cortactin, WISH, Nck, Grb2, Crk, FBP17, CIP4, Toca1, Abi1, endophilin A, and sorting nexin 9 (not all shown on the diagram). Concurrently, the BAR-domain superfamily proteins bend the membrane. (b) WAVE proteins exist in cells as a heteropentameric protein complex as indicated. WAVE2 has been shown to translocate to the membrane via interactions with phosphatidylinositol-(3,4,5)-trisphosphate (PIP3) and IRSp53. The affinity of WAVE2 for IRSp53 is enhanced when GTP-loaded Rac binds to the RCB/IMD domain of IRSp53. IRSp53 is also able to enhance the ability of WAVE2 to stimulate Arp2/3-mediated actin polymerization [91]. This pathway via IRSp53 is an indirect activation by Rac, as it is suggested that Rac can activate the WAVE complex through direct interaction with Sra1. The direct pathway was shown in a recent paper but needs more experimental evidence to be widely accepted (hence marked by a question mark in the figure).

The amino-terminal sequence of WASP subfamily proteins is different from that of WAVEs. The amino terminus of WASPs has the WH1/EVH1 domain following a basic region and a GTPase-binding domain (GBD; also known as the CDC42/Rac-interactive binding (CRIB) domain). The WH1/EVH1 domain binds to WASP-interacting protein (WIP) family proteins, which include WIP, CR16 (corticosteroids and regional expression-16), and WICH/WIRE (WIP- and CR16-homologous protein/WIP-related) in mammals [32-34]. In cells, most WASP proteins and N-WASP proteins appear to form a stable one-to-one complex with the WIP-family proteins, which seem to protect WASP and N-WASP proteins from proteasomal degradation [35-37]. NMR studies suggest that the WIP ligands wrap around the N-WASP WH1/EVH1 domain and that the interacting surface of WH1/EVH1 is a hotspot for mutations in WAS patients, suggesting that disruption of WASP-WIP binding and resulting WASP degradation underlies the loss of WASP function and defective actin cytoskeleton morphology of immune cells in WAS [38]. GBD/CRIB domains are critical for the control of WASP and N-WASP activity because they bind to and inhibit the VCA region. The hydrophobic cleft of GBD/CRIB domains forms an intramolecular interaction with the hydrophobic face of the amphipathic helix of the C domain, thereby exerting an autoinhibitory control on VCA activity [39]. This autoinhibition is released by the competitive binding of GTP-bound Cdc42 to the GBD/CRIB domain, leading to activation of the Arp2/3 complex. Phosphatidylinositol-(4,5)-bisphosphate (PIP2) binds to the basic region amino-terminal to the GBD/CRIB domain, and synergizes with Cdc42 to activate WASPs and N-WASPs.

The amino-terminal feature of WAVE is the presence of the WHD/SHD domain followed by a stretch of basic residues (Figure 1). In the cell, the WAVE proteins are constitutively incorporated into a heteropentameric complex, the WAVE complex, whose components seem to be conserved among species ranging from plants to humans. The other members of this complex are Sra1/CYFIP1 (and the homologous PIR121/CYFIP2), Nap1 (also known as Kette in Drosophila), Abi1/2/3 (Abelson-interactor), and HSPC300/Brick1 [40, 41]. Lack of any of these components destabilizes the WAVE complex, leading to proteasomal degradation of the whole complex [42-44]. Biochemical studies suggest that direct stoichiometric association of the WHD/SHD domain with Abi and HSPC300 appears to contribute to the formation of the WAVE complex [45]. All the known WHD/SHD domains contain conserved coiled-coil motifs spanning at least 36 amino acids. These motifs are thought to associate tightly with other coiled-coil motifs predicted to exist in Abi and HSPC300.

Localization and function

The localization of the WASP and WAVE family proteins has been extensively studied in cultured cells, revealing that both WASPs and WAVEs are closely associated with the cell membrane through either direct or indirect binding to membrane phosphoinositides. As the Arp2/3 complex with which they interact intrinsically causes the rapid formation of branched actin networks, the common feature of WASP and WAVE function is coupling of the cell membrane to Arp2/3-dependent actin polymerization to achieve coordinated membrane-cytoskeleton dynamics.

Although N-WASP was originally proposed to be a downstream effector of Cdc42 in the formation of filopodia [46], which are spiky actin-based motile structures protruding from the cell periphery, its role in endocytosis is currently the subject of intensive study. Whereas it remains unclear whether N-WASP in endocytosis is also under the control of Cdc42 activity, N-WASP is recruited to the site where the clathrin-coated pit (CCP) forms. This recruitment seems to be mediated through binding of the proline-rich domain of N-WASP to the SH3 domains of EFC (extended Fer-CIP4 homology)/F-BAR (FCH-Bin/Amphiphysin/Rvs) domain-containing proteins, which are thought to be involved in causing curvature of the membrane [47, 48]. N-WASP is thought to accelerate actin polymerization near the invaginating CCPs, providing them with the energy to pinch off from the plasma membrane. The idea that N-WASP may be involved in endocytosis arose originally from the study of Las17, the budding yeast homolog of WASP, which was first identified in a screen for mutants defective in endocytosis [20]. In yeast, Las17 and verprolin 1 (the yeast homolog of WIP) are recruited to CCPs with the proteins Bzz1 and Rvs167, which are now known to be members of the EFC/F-BAR and BAR domain-containing proteins [49, 50].

In contrast, mammalian WASP has been studied in relation to the pathology of WAS. When a T cell is stimulated by antigen on a target cell binding to the T-cell antigen receptor (TCR), a stable contact between the two cells, called an immunological synapse, is formed by the T-cell receptor interaction and by adhesion molecules on both cells. Dynamic filamentous actin (F-actin) rearrangement has been shown to be necessary for the formation of a mature immunological synapse. WASP seems to be involved in the late stage of its formation, as WASP-deficient T cells are able to form a stable immunological synapse in the initial contact with antigen-presenting cells, but are unable to re-establish it once the initial synapse is disturbed [51, 52]. Upon T-cell receptor activation, a signaling cascade is initiated by interaction with cytoplasmic protein tyrosine kinases that phosphorylate the receptor complex component CD3, and a transmembrane protein LAT. Phosphorylated tyrosine residues of these proteins then recruit various adaptor proteins, such as SLP-76, CrkL, Nck, and PSTPIP1, which in turn recruit and concentrate WASP at the immunological synapse to facilitate actin polymerization [53-55]. Apart from T-cell activation, T lymphocytes from WAS patients have been shown to display defects in cell migration in response to the chemokine SDF-1α [56]. Thus, when WASP is defective and actin polymerization fails, T cells are unable to carry out their functions, resulting in immunodeficiency.

The activation of both WASP and N-WASP is tightly linked to their recruitment to the membrane (Figure 3). GTP-bound activated forms of Cdc42 localized at the membrane bind to the GBD/CRIB domain. PI(4,5)P2 is abundant in the plasma membrane and binds to the basic region. The Src family of tyrosine kinases phosphorylates tyrosine residues near the GBD/CRIB domain. All these events are thought to loosen the intramolecular interactions between the GBD and VCA domains, thereby activating the WASPs [9]. The EFC/F-BAR/BAR domain-containing proteins are anchored on the membrane via their affinity for acidic phospholipids, and many of them contain SH3 domains that can bind to the proline-rich domains of WASP/N-WASP. This interaction also seems to activate WASP/N-WASP, but as yet, the mechanism is unclear (see the Figure 3 legend for examples of proteins with N-WASP-activating SH3 domains).

WAVEs localize to the leading edges of lamellipodia, the flat protrusions that cells extend in the direction of cell movement [57]. Lamellipodia are filled with dense networks of branched actin filaments. This actin architecture is generated by the activity of the small GTPase Rac, and WAVE was originally identified as a downstream effector for Rac-mediated actin polymerization. Subsequently, WAVEs were found to activate the Arp2/3 complex, and now WAVEs are known to act downstream of Rac to trigger actin polymerization by the Arp2/3 complex. In this regard, WAVEs are essential for cell motility, as this is accomplished by cycles of lamellipodial extension and substrate adhesion. The localization of WAVEs to the edges of the lamellipodia is regulated by a similar but not identical mechanism to N-WASP localization (Figure 3). Through its basic domain, WAVE2 preferentially binds to and is recruited to the membrane by PI(3,4,5)P3 rather than PI(4,5)P2 [58]. Rac seems to recruit WAVEs to the membrane by at least two cooperative mechanisms. First, GTP-loaded forms of Rac directly bind to the WAVE complex component Sra1 [59]. This interaction presumably recruits WAVEs to the membrane in a Rac activity-dependent manner. Second, the proline-rich domain of mammalian WAVEs binds to the SH3 domain of membrane-associated IRSp53, which belongs to the RCB (Rac binding)/IMD (IRSp53-MIM homology domain) domain-containing proteins, another class of membrane-associated protein families with similar properties to the EFC/F-BAR proteins. The RCB/IMD domain simultaneously binds to activated Rac, which contributes to the Rac-dependent localization of WAVEs [60-63]. Interestingly, WAVE2 has much stronger affinity for IRSp53 than have WAVE1 and WAVE3 [60]. Therefore, the interaction with IRSp53 is likely to contribute specifically to the localization of WAVE2 at lamellipodial tips.

In a multicellular context, WAVEs also function in cell-cell adhesion. In cultured epithelial cells, WAVEs localize at the cell-cell boundaries and are necessary for maintaining the integrity of the actin cytoskeleton at cell-cell junctions [64]. Genetic studies in multicellular organisms support this observation in cultured cells. The developmental defects observed in C. elegans embryos mutant for the WAVE homolog wve-1 suggest that the protein WVE-1 is required for epidermal cell-cell junction remodeling and for the remodeling of intestinal epithelium to modulate apical expansion of the gut lumen [16]. In Drosophila, SCAR/WAVE is required for fusion of myoblasts to form muscle cells, which is driven by remodeling of the actin cytoskeleton at cell-cell junctions [65]. In Arabidopsis mutant for SCAR complex genes and the Arp2/3 complex genes, the pavement cells of the epidermis are abnormally shaped and show occasional intercellular gaps [66, 67]. These studies clearly demonstrate the role of WAVEs in cell-cell junction formation and/or maintenance, although the molecular mechanism of action of WAVEs in cell adhesion is still not clearly understood.

The activating mechanism of the heteropentameric WAVE complex remains controversial. Consistent with the notion that WAVEs lack the GBD/CRIB domain by which the VCA region would be autoinhibited, many studies have reported that the WAVE complex reconstituted in vitro is constitutively active [9]. However, the in vivo WAVE complex biochemically purified from tissue homogenates appears to be basically inhibited [40, 68]. Recently, Ismail et al. [69] accurately reconstituted the human WAVE1 complex with purified components and showed that this reconstituted complex is inhibited. They also demonstrated that a similarly constructed Drosophila SCAR complex is inhibited, suggesting that the inhibited state is likely to be the default state. They then showed that these reconstituted complexes could be activated by active Rac. Thus, our current knowledge supports a model in which the WAVE complex is normally inhibited in cells. Yet, the precise mechanism of how Rac activates the WAVE complex is still unclear. There are other levels of regulation as well. For example, phosphorylation of WAVE1 by cyclin-dependent kinase 5 (Cdk5) suppresses Arp2/3-complex activation by WAVE1 during spine morphogenesis of neurons [26]. WAVE2 is also phosphorylated by extracellular signal-regulated kinase 2 (ERK2) or by c-Abl or casein kinase 2 (CK2), and its actin-polymerizing activity appears to be controlled by these kinases [70-72]. Degradation of WAVEs appears to be controlled by the vinexin family of adaptor proteins, but as yet, the physiological significance of this is unknown [73, 74].

Frontiers

With a wealth of information now in hand about the molecular interactions and biochemical activities of the WASP and WAVE family proteins, one of the main issues to be addressed is how WASPs and WAVEs and their associated proteins work together to shape various and complex actin architectures. For example, N-WASP is essential for the formation of distinct cellular architectures such as endocytic vesicles, filopodia and podosomes/invadopodia [9]. How does N-WASP form these structures separately yet with a similar molecular action? One of the clues to solving this question exists in recently identified classes of membrane-deforming proteins, which bind directly to phospholipids and can deform membranes into curved surfaces [75, 76]. These proteins are classified into three structural families: the BAR domain, the EFC/F-BAR domain and the RCB/IMD domain. Most of these proteins have SH3 domains that interact with WASP and WAVE proteins. Thus, membrane-deforming proteins recruit WASPs and WAVEs to the membrane and concurrently may modulate the membrane curvature to shape unique membrane-cytoskeleton architectures. The EFC/F-BAR-containing protein FBP17, for instance, facilitates endocytosis through coordination of membrane invagination and N-WASP activation [48]. The linkage of WAVEs to membrane deformation remains to be examined.

Another unanswered question is how WASP and WAVE proteins function in tissue morphogenesis. To construct multicellular organs, the actin cytoskeleton underlying the adhesive junctions that connect neighboring cells must be plastic and be able to be remodeled in response to morphogenetic factors during organ development. In Drosophila epithelial cells, WASP is required for adherens junction stability, probably through a role in mediating E-cadherin endocytosis [77]. In mammalian cells, WAVEs are required for the maintenance and remodeling of the junctional actin cytoskeleton [64, 78]. Interestingly, studies in C. elegans embryos showed differential localization of WVE-1 in different epithelial tissues undergoing morphogenesis [16]. Therefore, WASPs and WAVEs seem to play distinct roles in the formation and modification of cell-cell contacts. However, how the activity of WASPs and WAVEs at the sites of cell-cell contact is regulated and coordinated by morphogenetic signals during development is largely unknown and thus needs to be investigated.

Recently, novel classes of WASP/WAVE-like proteins were identified by a database search based on similarity to the characteristic VCA segment [79-81]. These include WHAMM and WASH in humans, and JMY in mouse. Although their physiological roles remain elusive, their existence clearly indicates that there are expanding signaling networks surrounding the WASP/WAVE-Arp2/3 complex in cells.

As the WASPs and WAVEs have an important role in cell motility, their dysregulation results in aberrant cell-motility phenotypes, such as those discussed above for WAS. In a quite different context, cancer invasiveness and metastasis are promoted by enhanced cell motility caused by aberrant upregulation of WAVEs [82]. WAVE2 appears to be associated with several types of human cancers, although why and how WAVE2 could be a factor in cancer progression is enigmatic [83-88]. Thus, better understanding of WAVE functioning in cancer pathology as well as in normal cell physiology could lead to novel cancer therapeutics.