Background

In contrast to the centralized and highly structured nervous systems of bilaterians, some animals (cnidarians and ctenophores) have more simply organized networks, and still others (sponges and placozoans) appear to lack a nervous system entirely [1]. To the extent that these early branching animal phyla (the so-called 'basal metazoa') have retained early metazoan characters, their study can inform our understanding of the early evolution of the nervous system. Although early metazoan phylogeny remains controversial [2-5], among the living phyla sponges were likely the first animal group to diverge, followed by the subsequent branching of placozoans, and then cnidarians/bilaterians. (The placement of ctenophores remains contentious [3, 6].) Both sponges [7] and placozoans (that is, Trichoplax adhaerens) [8] appear to lack a defined neuronal cell type, although evidence for putative sponge neurons has been put forward [9], and the genes corresponding to postsynaptic scaffolding have been identified in a demosponge [10]. In contrast, cnidarians (hydra, anemones, corals, jellyfish) all have clearly defined neurons [11], and neural networks of varying complexity (see, for example, [12-20]). The differences between early branching phyla are traditionally thought to represent the evolutionary progression of the nervous system in the first animals, but molecular evidence supporting such gradual evolution has been lacking. Comparative analysis of nervous system patterning genes in diverse animal phyla with and without nervous systems provides an avenue for understanding the early evolution of this fundamental animal feature.

Genes of the LIM homeobox (Lhx) family perform fundamental roles in tissue-specific differentiation and body patterning during development in both vertebrates and invertebrates [21, 22] (summarized in Additional file 1, Table S1). These genes comprise a family of DNA-binding proteins with six subfamilies; each subfamily member is represented once in Caenorhabditis elegans and Drosophila melanogaster and twice in mammalian species [23]. Lhx proteins are composed of two N-terminal LIM domains (named after the founding members LIN-11, Islet-1, and MEC-3) and a helix-turn-helix forming homeodomain that binds regulatory DNA surrounding target genes [22, 24]. The zinc-finger forming LIM domains are essential for protein function in several subfamilies and are thought to regulate DNA binding by the homeodomain by interacting with other nuclear proteins [23]. The diverse functions of Lhx proteins include the development of kidney, pancreas, eyes, and limbs in vertebrates (by the Lhx1/5, Lhx3/4, Islet, Apterous, and Lmx subfamilies), the patterning of wings and imaginal disc precursor tissues in flies (by Apterous and Arrowhead), and the formation of the vulva in C. elegans (LIN-11 or Lhx1/5 family) [23]. Lhx genes mediate these developmental functions by specifying cellular identities and their loss of function can result in human disease [25, 26].

While Lhx proteins perform a diverse array of developmental functions, all members of the Lhx family are prominent in specifying the fates of motorneurons, sensory neurons, and interneurons [23]. More specifically, in both vertebrates and Drosophila, motorneuron subtype identity is determined by a combinatorial code of Lhx genes and a particular Lhx gene defines interneuron subtype identity, suggesting that these genes played such roles in the common ancestor of bilaterians [23, 27-29]. Lmx proteins specify serotonergic neurons [30, 31]; Lmx genes are also implicated in dopaminergic neural fates [32]; Lhx8 and islet specify cholinergic fate [33-37]; GABAergic fates are specified by Lhx7 and Lhx6 [36, 38, 39]. Many Lhx genes are involved in the development of various types of sensory neurons, such as photosensory, thermosensory, olfactory, chemosensory, or mechanosensory neurons (see, for example, [40-44]).

Classic studies in Hydra, a hydrozoan cnidarian, and other cnidarians showed that the adult nervous system is composed of regionalized and overlapping populations of cells expressing various neurotransmitters and neuropeptides [12-19]. Recently, the anatomy of the nervous system over developmental time has been studied in the anthozoan starlet sea anemone, Nematostella vectensis [20], revealing neural complexity comparable to that of Hydra. Are cnidarian neuronal subpopulations patterned in a manner similar to those in bilaterians, for example, using combinatorial expression of Lhx genes? If so, are these patterning mechanisms in place in placozoans and sponges despite the lack of nervous systems in these phyla?

LIM homeobox genes have been reported in the genomes of N. vectensis [45] and the demosponge Amphimedon queenslandica [46, 47]. Using the recently sequenced genomes of N. vectensis [48], Hydra magnipapillata [49], T. adhaerens [2], and A. queenslandica (Srivastava et al.: The genome of the haplosclerid demosponge Amphimedon queenslandica and the evolution of animal complexity, submitted), we trace the evolution of the LIM homeobox family. We then report the expression patterns of several Lhx gene families during embryonic development in N. vectensis and A. queenslandica. The territories of expression of these genes broadly overlap those of known neuronal subpopulations in the sea anemone, and putative photosensory cells in the sponge.

Results

Origin and early diversification of the LIM homeobox protein family

Genes with the LIM-LIM homeobox domain composition were found in all the animal genomes queried in this study. However, no putative Lhx proteins were predicted in the genome of Monosiga brevicollis, a unicellular choanoflagellate (the sister group to animals). This, together with the absence of LIM-LIM homeobox proteins in plants, fungi and other eukaryotes suggests that the combination of LIM domains and homeodomains is unique to animals.

The Nematostella genome encodes six Lhx proteins, which each fall into one of the six known subfamilies (Figure 1). In addition to the three Lhx genes classified into Islet, Lhx1/5 (LIN-11), and Lhx6/8 (Arrowhead) groups previously [45], we identified orthologs of the Lhx3/4, Lhx2/9 (Apterous) and Lmx groups in Nematostella (as found in [47]). The Lmx gene appears to have only one LIM domain, contrary to the usual two LIM domains followed by a homeodomain composition known from bilaterian Lhx genes (Table 1). As is the case with Nematostella, members of all six Lhx subfamilies are represented in the Trichoplax genome. This implies that the six Lhx subfamilies were already established in the common ancestor of cnidarians, placozoans, and bilaterians. While the putative Trichoplax Lhx6/8 (Arrowhead) ortholog encodes only two LIM domains but no homeodomain, it can nevertheless be robustly classified as a member of the Arrowhead/Lhx6/8 subfamily.

Figure 1

Phylogeny of LIM homeobox genes. The maximum likelihood tree based on an alignment of two LIM domains and the homeodomain is shown here with support values from Neighbor-joining/Likelihood/Bayesian analyses shown for the major nodes (relationships within the major classes were well supported only for vertebrate sequences). Neighbor-joining and Likelihood bootstrap values above 50% are shown, as are Bayesian posterior probabilities above 0.95. Full trees from each analysis are shown in Additional file 1. Aq = Amphimedon queenslandica (blue); Ce = Caenorhabditis elegans; Dm = Drosophila melanogaster; Dr = Danio rerio; Hm = Hydra magnipapillata (orange); Hs = Homo sapiens; Nv = Nematostella vectensis (green); Rn = Rattus norvegicus; Sp = Strongylocentrotus purpuratus; Ta = Trichoplax adhaerens (red); Xt = Xenopus tropicalis.

Table 1 Summary of domain structures and expression evidence for LIM homeobox genes found in four early animal genomes

Only four Lhx genes were identified in Hydra, each orthologous to a different Lhx subfamily (Arrowhead, Apterous, Lmx, Lhx1/5) suggesting that members of the other subfamilies (Islet, Lhx3/4) have been lost along the lineage leading to Hydra, after its divergence from anthozoans (Figure 1). The Hydra member of the Arrowhead subfamily appears to be missing the first LIM domain (Table 1). The Amphimedon complement of Lhx proteins consists of members of the Islet, Lhx3/4 and Lhx1/5 families [46, 47]. Given the poor support for the relationships of Lhx subfamilies to each other, we cannot distinguish between two scenarios: first, that three Lhx subfamilies were lost in the Amphimedon lineage, and second, that the common ancestor of all animals may have only had three Lhx genes, with ancestral (and sponge) genes most resembling specific daughter families because of asymmetric evolutionary rates of gene duplicates [47].
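The subfamily complements described above can be tallied in a few lines. The following is only an illustrative sketch: the dictionary simply transcribes the presence/absence calls stated in the text, and the names `SUBFAMILIES`, `COMPLEMENT` and `missing_subfamilies` are hypothetical helpers, not part of the analysis pipeline used in this study.

```python
# Presence/absence of the six Lhx subfamilies, transcribed from the text.
# All identifiers here are illustrative assumptions, not the authors' code.
SUBFAMILIES = [
    "Islet",
    "Lhx1/5",
    "Lhx3/4",
    "Lhx2/9 (Apterous)",
    "Lhx6/8 (Arrowhead)",
    "Lmx",
]

COMPLEMENT = {
    "Nematostella": set(SUBFAMILIES),
    "Trichoplax": set(SUBFAMILIES),
    "Hydra": {"Lhx6/8 (Arrowhead)", "Lhx2/9 (Apterous)", "Lmx", "Lhx1/5"},
    "Amphimedon": {"Islet", "Lhx3/4", "Lhx1/5"},
}


def missing_subfamilies(species):
    """Return the subfamilies not found in a genome, in SUBFAMILIES order."""
    present = COMPLEMENT[species]
    return [name for name in SUBFAMILIES if name not in present]


if __name__ == "__main__":
    for species in COMPLEMENT:
        print(species, missing_subfamilies(species))
```

A tally of this kind records only absence from a genome; it cannot distinguish lineage-specific loss from later origin of a subfamily, which is why the two scenarios for Amphimedon remain unresolved.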

Though there is poor support in the tree (Figure 1) for the Lhx1/5 and Lhx3/4 groups, Nematostella, Trichoplax, Hydra, and Amphimedon genes have been assigned to these subfamilies because these classifications are the most likely scenario. It is often difficult to classify genes from early-branching animal phyla into clear bilaterian subfamilies [47, 50] and the inability to find good bootstrap support for the Lhx1/5 and Lhx3/4 groups may be a result of high levels of divergence between the early animal sequences relative to their bilaterian counterparts. Indeed, in an Lhx tree constructed without Trichoplax or Hydra sequences, assignment of Nematostella and Amphimedon genes to specific subfamilies was well supported [47]. Also, given that Nematostella and Trichoplax have genes that can be confidently placed in each one of the other subfamilies (Arrowhead, Islet, Apterous, Lmx), it is likely that the tree in Figure 1 has recovered the correct placements of the Nematostella and Trichoplax Lhx1/5 and Lhx3/4 proteins.

Synteny and intron conservation of Lhx genes

Of the six putative Lhx genes in the Trichoplax genome, three are present as part of a tandem cluster on scaffold_2 that also includes another LIM-LIM domain containing gene (Figure 2a) (Additional file 1, Supplemental Section 2). This fourth member of the tandem cluster can be classified as a member of the Lmo family using phylogenetic methods (Figure 1). The three Lhx genes in the cluster belong to the Lmx, Arrowhead and Lhx3/4 subfamilies, and a fourth Lhx gene (of the Apterous subfamily) is present further downstream on scaffold_2. The classification of these proteins into distinct Lhx subfamilies suggests that these syntenic genes are unlikely to be the result of a recent duplication in the placozoan lineage. Therefore, this syntenic cluster of genes likely represents (that is, is a remnant of) the ancestral genomic context in which the different Lhx subfamilies first evolved. The preservation of this tandem cluster in Trichoplax (with only three Lhx families missing) and its disruption in most other genomes is consistent with the finding that the Trichoplax genome appears to be the least rearranged relative to the inferred ancestral genome [2]. Of the 12 Lhx genes in humans, 7 are located on segments in different human chromosomes, but these segments fall into the same ancestral linkage groups as the tandem cluster of Trichoplax Lhx genes (Additional file 1, Table S10) [51]. This suggests that the tandem Lhx gene cluster in Trichoplax descended from the same ancestral genomic context that gave rise to modern bilaterian Lhx genes.

Figure 2

Synteny and intron conservation of LIM homeobox genes. (a) Four of the six Trichoplax LIM homeobox genes are present on one scaffold, three of these are present in tandem. This tandem cluster also contains a gene coding for a protein of the LIM only (Lmo) class. This scaffold is in the same putative ancestral linkage group as human chromosome segments that contain 6 of the 12 human LIM homeobox genes. (b) Two introns that interrupt the homeodomain in the Lmx class proteins are well conserved across animals, but one has been lost in both Nematostella and C. elegans. Introns are represented with square brackets with the enclosed number indicating the phase of the intron.

Introns are found in diverse bilaterian homeobox-containing genes at over 20 different positions that interrupt the homeobox [52]. In the bilaterian members of the Lmx subfamily, the homeodomains are interrupted by two conserved introns (Figure 2b). The first of these introns is found to be present in the cnidarian and placozoan orthologs of Lmx as well, but the second one has been lost in Nematostella (though it is present in Hydra and Trichoplax Lmx genes).

Normal and atypical Lhx genes are expressed in Nematostella, Hydra, Trichoplax, and Amphimedon

All four Hydra Lhx genes were successfully amplified from the cDNA of adults (some of which may have been reproducing asexually) (Table 1). The apterous gene model in Hydra was found to be incorrect as 5' rapid amplification of complementary DNA ends (RACE) determined the expression of another exon containing the second LIM domain that was found to be encoded in the genomic sequence upstream of the predicted gene model. However, 5' RACE verified that the Hydra arrowhead gene model, which also predicted only one LIM domain, is correct. Lmx and Lhx1/5 orthologs in Hydra also appeared to be missing a LIM domain, but lowering the e-value threshold in the National Center for Biotechnology Information (NCBI) Conserved Domain Search tool [53] identified an additional N-terminal LIM domain in the Hydra Lmx gene model and an additional C-terminal LIM domain in the Hydra Lhx1/5 prediction. The expression of both LIM domains and the homeodomain in the Lmx and Lhx1/5 orthologs was confirmed through molecular cloning and sequencing analysis. These findings suggest that Hydra Lhx protein LIM domains have an accelerated rate of evolution (resulting in decreased affinity to the position specific weight matrices that define conserved domains), consistent with the overall high rate of protein sequence evolution in the Hydra lineage.

Of the six Trichoplax Lhx orthologs, apterous, arrowhead and Lhx1/5 were found to be expressed in the animals in laboratory culture (the Lmo gene in the tandem cluster on scaffold_2 is expressed as well). Five of the six Nematostella genes were also amplified from cDNA of animals at various developmental stages, including the Lmx-like gene that is missing one LIM domain (Table 1). Thus, Lhx genes with atypical domain compositions predicted in the genomes of Nematostella, Trichoplax and Hydra are found to be expressed in those configurations (no atypical forms were found in Amphimedon). This finding is similar to those in other families such as the Hedgehog ligand, where early animal phyla are found to encode proteins with domain compositions not seen in homologous sequences in bilaterians [54, 55]. However, since these configurations are not shared between phyla (for example, Hydra and Trichoplax Arrowhead proteins have different missing domains), they most likely resulted from independent evolution along these lineages.

Nematostella Lhx genes are expressed in discrete regions of developing embryos

The mRNA for the Lhx6/8 (arrowhead) ortholog in Nematostella first appears faintly in the aboral ectoderm in the early planula and subsequently its expression resolves to mark ectodermal cells in the apical tuft in late planula stages (Figure 3a-c). This mRNA is absent in juvenile polyp stages (Figure 3d). Lhx1/5 (lin-11) expression in Nematostella begins in the early planula in endoderm cells that will form the endoderm around the pharynx (Figure 3e-g). This expression persists in late planula and juvenile polyp stages in discrete clusters of cells in a ring around the pharyngeal endoderm (Figure 3g,h). The expression of this gene around the pharynx appears to be radial, with no apparent asymmetries (Figure 3h'). A third Lhx gene, the Lmx ortholog, starts out with strong expression in the oral ectoderm in the early planula, and over time its expression spreads to the pharynx and the endodermal tissue that will make the directive mesenteries (that is, the pair of endodermal infoldings that are the first to appear) (Figure 3i-k). In juvenile polyps, Lmx mRNA has strong expression in the pharyngeal endoderm and ectoderm and weak expression in the directive mesenteries (Figure 3l). The Lhx2/9 (apterous) ortholog in Nematostella has speckled expression throughout the endoderm in early planula stages, but is found in a few cells of the aboral region of the pharynx in the late planula stage (Figure 3m-o). Juvenile polyps express Lhx2/9 in the aboral end of the pharynx and in the directive mesenteries (Figure 3p,p'). The islet gene of Nematostella is expressed in the pharynx as it starts to form and its mRNA is found in directive mesenteries and aboral endoderm of later stages (Figure 3q-t).

Figure 3

LIM homeobox gene expression during Nematostella development. (a-d) The arrowhead (Lhx6/8) ortholog is first expressed in the apical tuft of the late planula (c) but disappears in the juvenile polyp (d). (e-h) The lin-11 (Lhx1/5) ortholog is first expressed in the putative pharyngeal endoderm in the early planula and later resolves into an endodermal ring around the pharynx (g' = oral view of g; h' = cross-section through h). (i-l) The Lmx ortholog is first transcribed in the oral ectoderm of the early planula and then spreads into the pharynx and directive mesenteries (l' = oral view of l). (m-p) The apterous (Lhx2/9) ortholog is expressed in the planula endoderm in a speckled pattern and later its expression spreads to the end of the pharynx and throughout the directive mesenteries (p' = lateral view of p). (q-t) The islet ortholog starts out in the putative pharyngeal endoderm and over time spreads into the directive mesenteries. This gene is transcribed in cells of the planula body wall endoderm, and in the polyp stage it shows restricted expression in the aboral endoderm (t' = lateral view of t).

Amphimedon Lhx genes are expressed during embryogenesis

In Amphimedon, Lhx3/4 (lim-3) is expressed in the inner cell mass with transiently higher expression levels under the photoreceptor pigment ring as it develops (Figure 4a-e). When pigment cells have coalesced into a spot, Lhx1/5 (lin-11) appears to be expressed in the outer cell layer of the embryo, with higher levels of expression observed in cells around the pigment spot (Figure 4f,g). Lhx1/5 expression remains associated with the pigment ring as it forms and dramatically increases in the inner cell mass, especially at the anterior end (Figure 4h,i). Both of these genes appear to be ubiquitously expressed in a relatively uniform manner in the larva just prior to hatching (Figure 4e,j). The islet gene appears to be ubiquitously expressed during Amphimedon development (data not shown).

Figure 4

LIM homeobox gene expression during Amphimedon development. (a, c, f, h) Whole-mount micrographs; (b, d, e, g, i, j) micrographs of sectioned embryos. (a-e) The Lhx3/4 ortholog is expressed in the inner cell mass during late gastrulation, when pigment cells form a spot (a,b) and then a ring (c,d). A stronger expression domain appears transiently under the photoreceptor ring when it is forming (arrowheads in d). Expression is ubiquitous in the prehatched larva, with higher expression levels in the subepithelial layer (e). (f-j) The Lhx1/5 ortholog appears to be expressed in the outer layer at the pigment spot stage, especially around the spot (f,g). When the pigment ring forms (h,i), the gene is highly expressed in the inner cell mass, especially inside the developing ring and at the anterior end. A strong expression domain also appears in the micromeres surrounding the developing pigment ring (arrowheads in i). Expression seems to be ubiquitous in the larva before it hatches (j).

Discussion

The six Lhx subfamilies originally identified in flies, nematodes, and vertebrates are all represented in the Trichoplax and Nematostella genomes, indicating that the diversification of the Lhx family by gene duplication had already occurred by the time of the last common bilaterian-cnidarian-placozoan ancestor. In Trichoplax, four of the six Lhx genes are colocalized to a region of a few hundred kb in the genome. This implies that the diversification of the Lhx family took place by tandem (or local) gene duplication, and that some of these linkages have been retained in the Trichoplax lineage. This is analogous to the diversification of the Hox cluster, which arose by tandem duplication in the bilaterian lineage and is preserved in multiple extant lineages. For the Lhx cluster, only Trichoplax preserves remnants of the ancestral organization.

The Amphimedon genome contains three Lhx subfamily members (Lhx1/5, Lhx3/4, and Islet) but we cannot resolve whether these three represent the ancestral metazoan Lhx complement, with Lmx, Arrowhead, and Apterous arising by duplication from within these families in the placozoan-cnidarian-bilaterian lineage, or if the sponge lost these subfamilies. Interestingly, the three Lhx gene subfamilies found on the same scaffold as the Lhx3/4 gene in the placozoan genome are missing from the sponge genome. From the phylogenetic tree, we cannot reject the possibility that these three genes arose after the divergence of sponges, from an initial duplication of the Lhx3/4 gene. Analysis of Lhx genes in other sponges may resolve this issue.

In contrast to Trichoplax and Nematostella, the Hydra genome lacks members of the Lhx3/4 and Islet subfamilies, which were evidently lost in the Hydra lineage. The arrowhead gene in Hydra has an atypical structure, lacking one of the two LIM domains characteristic of the family. Although such domain structures have not been reported in bilaterians, independently evolved atypical domain structures are also observed in Nematostella and Trichoplax Lhx genes. Nevertheless, such genes are expressed, suggesting that they retain some function and are not simply pseudogenes. Some Lhx proteins show long branch lengths on phylogenetic trees, suggesting that these subfamily members may be experiencing reduced constraint and/or positive selection.

In diverse bilaterians, the LIM homeobox 'code' is conserved in the sense that neural types are patterned by combinatorial expression of Lhx and other transcription factors; however, the same types are not generated by the same Lhx combinations in different species [23]. In Nematostella, the expression of Lhx genes during embryonic development also appears to correlate with neural territories, although we have not shown that these genes are expressed in neurons. Three different LIM homeobox genes are expressed in the three major neuralized regions: the apical tuft of the planula, and the oral and pharyngeal nerve rings in the polyp (Figure 5) [20]. DOPA-β-monooxygenase, the enzyme involved in epinephrine and norepinephrine synthesis, and anthoRFamide mRNA mark the oral nerve ring, a region that is found to express the Nematostella Lmx ortholog. Over the course of development, Lmx expression spreads into the pharynx and directive mesenteries, mirroring the changes in DOPA-β-monooxygenase expression. The Lhx6/8 (arrowhead) ortholog is expressed transiently in the apical tuft, a region found to have GABAergic sensory cells. The Lhx1/5 ortholog marks clusters of cells in an endodermal ring at the end of the pharynx, a region that contains a ring of GABA-positive neurons. In a recent paper, Yasuoka et al. [56] found that this gene is expressed around the blastopore during gastrulation, and suggested that this gene had an ancestral role as a blastoporal organizer. However, we have not detected any Nematostella Lhx1/5 expression at this stage.

Figure 5

Schematic diagrams of Nematostella developmental stages showing combinatorial expression of LIM homeobox genes and overlap with known functionally different neural types. Neurons with putatively different functions emerge over the course of embryonic development (as assayed by neurotransmitter antibodies and in situ hybridization to detect neuropeptide or neurotransmitter synthesis enzyme mRNA) [20]. LIM homeobox genes have dynamic expression patterns that overlap with each other, as well as with territories of different neural types. The oral nerve ring (marked by 3,4-dihydroxyphenylalanine (DOPA)-β-monooxygenase and RFamide), the pharyngeal nerve ring (marked by γ-aminobutyric acid (GABA)) and the apical tuft (marked by GABA) correspond to Lmx, Lhx1/5 and arrowhead expression, respectively. DOPA-β-monooxygenase expression over developmental time is mirrored by Lmx expression. Two-color stripes show expression of two neural markers in the same region.

Our data provide circumstantial evidence supporting the hypothesis that Lhx genes in Nematostella are involved in combinatorially specifying neuronal identity, as they are in bilaterians, based on the coincidence of Lhx expression territories and regions where distinct neural populations are found. The hypothesis predicts that Lhx genes should be expressed in neurons themselves, which has yet to be shown. There are other predictions. For example, although only one functional neural type (adrenergic) has been found in Nematostella mesenteries thus far, based on the combinatorial expression of islet, apterous and Lmx in this specialized endodermal tissue we predict that there should be functional differentiation of neurons in this region relative to the pharynx, which has only Lmx expression. Though only the Lmx mRNA is found in the oral nerve ring and pharynx, the oral nerve ring contains both RFamide-producing and adrenergic neurons, and a part of the pharynx contains both adrenergic and FMRFamide-expressing neurons. Evidently Lhx gene expression is not necessary for neuronal specification in Nematostella, however, since none of the five Lhx genes assayed here have been found to be expressed in tentacles, although tentacle tips contain spirocysts, GABAergic, and RFamide-expressing cells. Thus, it is likely that other transcription factors in addition to Lhx genes are involved in cospecifying functionally different neurons in Nematostella (as is the case with the specification of bilaterian neuronal identity). Indeed, other transcription factors are known to be expressed in the neural territories where cnidarian Lhx genes are found. For example, PaxB (orthologous to bilaterian Pax2/5/8) is expressed in an endodermal ring at the base of the pharynx [57], corresponding to the location of the pharyngeal nerve ring.

Similar comparisons of the expression of Lhx family members and the many documented neural populations in Hydra [16-19] will be invaluable in understanding the evolution of neural patterning mechanisms. In this study, we found that the four Lhx genes encoded by the Hydra genome are expressed in adults. We also found that all six Lhx subfamilies are present in the Trichoplax genome, and that three subfamilies are expressed in animals in laboratory cultures (Table 1). Trichoplax notably has no described nervous system, and only four to five recognized cell types [8, 58]. Further characterization of Lhx genes in Trichoplax could illuminate the ancestral function of these genes, or alternate derived functions if the placozoan-cnidarian-bilaterian ancestor had a nervous system that was lost in the placozoan lineage.

Our observation of patterned expression of Lhx1/5 and Lhx3/4 subfamily members during Amphimedon larval development is consistent with a scenario in which Lhx subfamily members were expressed in defined territories in the last common metazoan ancestor. Although Amphimedon has no defined neurons, we do observe correlation between Lhx gene expression and sensory cells. Both Lhx1/5 and Lhx3/4 are expressed around the larval pigment ring where photosensory cells form. As the two genes are highly expressed in different but overlapping territories in this region, it is tempting to speculate that the sponge Lhx genes are specifying cell identity in a combinatorial manner, as in bilaterian animals. If we further assume that the nervous system is a eumetazoan synapomorphy, originating after the divergence of sponge and placozoan lineages, this hypothesis would imply that the ancestral repertoire of three to six metazoan Lhx genes was co-opted into differentiating neural cell types in the first simple nervous systems. Although inferring the original role for these genes in the very first metazoans is difficult, gene expression patterns in Amphimedon suggest a number of possibilities, including a role in the development of non-neural sensory cells. The hypothesized linkage between Lhx gene expression and nervous system patterning does not exclude other roles for these genes in early metazoans. For example, shared expression of Lhx1/5 in bilaterian gastrulation, the cnidarian blastopore [56], and the sponge pigment spot suggests a possible organizing role during development. Likewise, the expression of Lhx3/4 in protochordate endoderm [59, 60] and Amphimedon inner cell mass points to a potential ancestral role in germ layer formation.

Conclusions

Through sequence analysis we have shown that the Lhx transcription factor family was already established, and had duplicated and diversified, in the last common metazoan ancestor. We find that Lhx genes are expressed in defined, overlapping territories in the sea anemone Nematostella. Combined with (1) the neural differentiation observed in these regions and (2) the well-established role of Lhx genes in the combinatorial control of neural identity in bilaterians, this observation further suggests the hypothesis that Lhx genes may play a homologous role in specifying neural identity in non-bilaterians. In this scenario Lhx gene expression would be causally linked to the structure of the cnidarian nerve net, whose complexity has long been established in Hydra [12-19] and more recently in Nematostella [20]. Alternatively, the Lhx-neural identity linkage is a bilaterian synapomorphy, and our observed correlations reflect convergent evolution and/or non-homologous processes of neural specification in cnidarians and bilaterians. Early branching animal lineages share a large repertoire of patterning genes with bilaterians, but lack the overt bilaterian differentiation of body axes. We hypothesize that the genes function in defining the molecularly distinguished cell types that various studies are beginning to recognize in cnidarians and sponges [10, 20, 61, 62].

Methods

Animal culture, RNA extraction and cDNA synthesis

N. vectensis adults (descendants of the CH2 and CH6 cross) were maintained and spawned as described in Fritzenwanker and Technau [63]. H. magnipapillata were cultured in Hydra medium, consisting of 1% seawater. T. adhaerens of the Grell strain collected in the Red Sea were cultured in bowls or Petri dishes filled with filtered artificial seawater at room temperature. Every 2 weeks the cultures were fed with 3 to 5 ml of Rhodomonas salina, and salinity and pH were maintained at 32 to 35 ppt (parts per thousand) and 8.0 to 8.4, respectively.

Nematostella embryos (collected at various time points), Hydra adults (including animals that were undergoing the process of budding) and Trichoplax from laboratory cultures (animals were starved for 24 h before collection) were collected, frozen in liquid nitrogen, and stored at -80°C. RNA was then extracted using the standard TRIzol (Invitrogen, Carlsbad, USA) protocol. cDNA was made using the SuperScript III First-Strand Synthesis System for reverse transcription polymerase chain reaction (RT-PCR) (Invitrogen, Carlsbad, USA). cDNA for 5' and 3' RACE was prepared using the FirstChoice RLM-RACE Kit (Ambion, Austin, USA).

Amphimedon embryo and larval collection, RNA extraction, and cDNA synthesis were performed as previously described [64].

Identification of LIM homeobox genes in cnidarians, placozoans and sponges

Several known LIM homeobox (Lhx) sequences from human, mouse and D. melanogaster genomes were searched using BLAST against the predicted gene models for the genomes of N. vectensis http://jgi.doe.gov/nematostella [48], H. magnipapillata http://hydrazome.metazome.net/cgi-bin/gbrowse/hydra [49], T. adhaerens http://jgi.doe.gov/trichoplax [2], A. queenslandica (Srivastava et al.: The genome of the haplosclerid demosponge Amphimedon queenslandica and the evolution of animal complexity, submitted) and M. brevicollis http://jgi.doe.gov/monosiga [65]. Gene models that retrieved known LIM homeobox proteins when searched by BLAST against the database of non-redundant proteins, and that contained LIM and homeobox domains, were considered to be putative Lhx genes in these animals.
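The two selection criteria above (a known Lhx protein as a BLAST hit, plus the presence of LIM and homeobox domains) can be illustrated with a minimal sketch. The record type, field names, and example gene models below are hypothetical and are not taken from the study; the sketch only shows the filtering logic, not the pipeline actually used.

```python
from dataclasses import dataclass


@dataclass
class GeneModel:
    """Hypothetical record for one predicted gene model (illustration only)."""
    model_id: str
    hits_known_lhx: bool   # model retrieves a known Lhx protein by BLAST (assumed precomputed)
    domains: tuple         # domain names in N- to C-terminal order (assumed precomputed)


def is_putative_lhx(model: GeneModel) -> bool:
    """Apply both criteria: an Lhx BLAST hit and LIM plus homeobox domains."""
    has_lim = "LIM" in model.domains
    has_homeobox = "Homeobox" in model.domains
    return model.hits_known_lhx and has_lim and has_homeobox


# Made-up example models, not real gene predictions.
models = [
    GeneModel("model_A", True, ("LIM", "LIM", "Homeobox")),
    GeneModel("model_B", True, ("LIM", "Homeobox")),
    GeneModel("model_C", True, ("LIM", "LIM")),
    GeneModel("model_D", False, ("Homeobox",)),
]

putative = [m.model_id for m in models if is_putative_lhx(m)]
print(putative)
```

Note that a model with a single LIM domain (model_B here) still passes this filter, which is why such candidates need the additional verification described under "Verification of gene models".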

Verification of gene models

Primers were designed using Primer3 http://frodo.wi.mit.edu to amplify Nematostella, Hydra and Trichoplax Lhx genes using TaKaRa reagents (TaKaRa Bio Inc., Shiga, Japan) (Additional file 1, Tables S2-6). Cloning was performed using the Zero Blunt TOPO PCR Cloning Kit for Sequencing (Invitrogen, Carlsbad, USA) and minipreps were performed using the standard Qiagen (Valencia, USA) protocol. Sequence concordance was analyzed using Sequencher 4.5 (Gene Codes Corporation, Ann Arbor, USA) and sequenced cDNAs were BLASTed against the genome sequence for verification followed by a Conserved Domain Search to confirm Lhx gene identity [66].

Three of the four putative LIM homeobox predicted proteins in Hydra and one in Nematostella contained only one LIM domain, though all known LIM homeobox proteins have two N-terminal LIM domains, followed by the C-terminal homeobox (Table 1). Genomic regions 1 kb upstream of these predicted gene models were analyzed for LIM domains by translating in three frames. 5' RLM-RACE PCR was performed (Ambion FirstChoice RLM-RACE Kit) to verify gene model predictions for potential upstream LIM domains in Hydra (see Additional file 1, Table S6 for primers used). The predicted gene models were also analyzed by lowering the e-value threshold in conserved domain searches http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml [66].
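Translating an upstream genomic region in three frames, as described above, can be sketched as follows. This is an illustrative example that assumes the standard genetic code and forward-strand translation only; the function names and the short test sequence are hypothetical, and the tools actually used for this step are not specified in the text.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int) -> str:
    """Translate a DNA sequence starting at offset `frame` (0, 1 or 2).

    Stop codons are written as '*'; codons with ambiguous bases become 'X'.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(seq: str) -> list:
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]


# Toy sequence for illustration; not a real genomic region.
frames = three_frame_translation("ATGTGCGAATAA")
print(frames)  # ['MCE*', 'CAN', 'VRI']
```

Each translated frame could then be passed to a domain search to look for an upstream LIM domain missed by the gene model.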

In Trichoplax, one scaffold contained conflicting and overlapping gene predictions of Lhx genes (Additional file 1, Table S8). Some of these models appeared to have atypical domain composition, such as two LIM domains without a homeobox, while others overlapped in location or represented different gene model predictions for the same locus. To determine the accuracy of these hypothetical proteins, primers were designed to amplify all the predicted gene models by RT-PCR (Additional file 1, Table S3).
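Flagging gene predictions that overlap on a scaffold is a simple interval comparison. The sketch below assumes each prediction is reduced to a hypothetical (identifier, start, end) tuple with inclusive coordinates; the identifiers and coordinates are made up and do not correspond to the models in Table S8.

```python
def find_overlaps(models):
    """Return pairs of model identifiers whose coordinate ranges overlap.

    `models` is a list of (model_id, start, end) tuples on one scaffold,
    with inclusive coordinates and start <= end.
    """
    ordered = sorted(models, key=lambda m: m[1])
    overlaps = []
    for i, (id_a, _start_a, end_a) in enumerate(ordered):
        for id_b, start_b, _end_b in ordered[i + 1:]:
            if start_b > end_a:
                # Sorted by start, so no later model can overlap model A.
                break
            overlaps.append((id_a, id_b))
    return overlaps


# Made-up coordinates for illustration only.
predictions = [
    ("pred_1", 100, 900),
    ("pred_2", 850, 1500),
    ("pred_3", 2000, 2600),
    ("pred_4", 400, 700),
]
overlapping_pairs = find_overlaps(predictions)
print(overlapping_pairs)  # [('pred_1', 'pred_4'), ('pred_1', 'pred_2')]
```

Pairs reported this way are candidates for experimental verification; the comparison itself cannot decide which prediction is correct.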

Phylogenetic analyses and identification of introns

LIM homeobox gene sequences from Nematostella, Hydra, Trichoplax and Amphimedon were aligned with Lhx genes from other animals known to fall into different subfamilies using CLUSTALW [67, 68]. The alignments were trimmed using GBlocks [69] and curated manually (both LIM domains and the homeodomain were used where available). Neighbor joining analyses were performed using Phylip [70] with default parameters and 500 bootstrap replicates. Maximum likelihood analyses were performed using PhyML [71] with the WAG model of amino acid evolution, 4 substitution rate categories, the proportion of invariable sites and the γ distribution parameter estimated from the dataset, and 100 bootstrap replicates. Bayesian analyses were performed using MrBayes [72, 73]; 2 chains were started and allowed to run for over 1 million generations, 1 tree was sampled every 100 generations, and the first 1,000 trees were discarded as burn-in. Orthologous Lhx genes from different species were aligned for each Lhx subfamily and conserved introns identified as described in Putnam et al. [48].
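The Bayesian sampling settings imply a particular burn-in fraction, which the short calculation below makes explicit. It assumes exactly 1,000,000 generations (the text states "over 1 million", so the true number of sampled trees is somewhat higher and the fraction somewhat lower) and ignores the starting tree.

```python
# Values from the text; the generation count is a lower bound (assumption: exactly 1,000,000).
generations = 1_000_000
sample_every = 100
burn_in_trees = 1_000

sampled_trees = generations // sample_every        # trees written per run
retained_trees = sampled_trees - burn_in_trees     # trees kept after burn-in
burn_in_fraction = burn_in_trees / sampled_trees   # share of samples discarded

print(sampled_trees, retained_trees, burn_in_fraction)  # 10000 9000 0.1
```

Under these assumptions roughly the first 10% of sampled trees are discarded, leaving about 9,000 trees to summarize.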

Probe synthesis and in situ hybridization

Clones of Nematostella, Hydra, Trichoplax, and Amphimedon Lhx genes made using primers listed in Additional file 1, Tables S2-6 were used for probe synthesis. Digoxigenin (DIG)-labeled antisense and sense RNA probes corresponding to the putative Lhx genes in Nematostella were synthesized using labeling mix and T7/T3/Sp6 RNA polymerases from Roche Applied Science (Indianapolis, USA). Nematostella embryos at various stages were collected and fixed and in situ hybridization performed as described in Finnerty et al. [74]. DIG-labeled RNA probes were used at a concentration of 2 ng/μl for hybridization ranging from 12 to 48 h. Amphimedon in situ hybridizations were performed as described in Larroux et al. [64].
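The probe working concentration of 2 ng/μl follows the usual dilution relation C1 × V1 = C2 × V2. The sketch below shows the arithmetic; the stock concentration (50 ng/μl) and hybridization volume (500 μl) are hypothetical values chosen for illustration and are not reported in the text.

```python
def probe_volume_ul(stock_ng_per_ul: float,
                    final_ng_per_ul: float,
                    final_volume_ul: float) -> float:
    """Volume of probe stock needed to reach the final concentration (C1*V1 = C2*V2)."""
    if stock_ng_per_ul < final_ng_per_ul:
        raise ValueError("Stock is more dilute than the requested final concentration.")
    return final_ng_per_ul * final_volume_ul / stock_ng_per_ul


# Hypothetical stock concentration and volume; only the 2 ng/ul target comes from the text.
volume = probe_volume_ul(stock_ng_per_ul=50.0, final_ng_per_ul=2.0, final_volume_ul=500.0)
print(volume)  # 20.0
```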