Background

Apoptosis, or programmed cell death, is one of the basic processes regulating cell number during ontogenesis. Early models described apoptosis as a relatively simple process of a few steps, starting with initiation by a pro-apoptotic signal and ending with endonucleolytic cleavage of genomic DNA and membrane blebbing. The identification of novel molecular mechanisms involved in this process suggests that the overall picture is far more complex and that the fate of a cell depends directly on the balance between pro- and anti-apoptotic factors. Apoptosis may be induced by a number of events, including genome damage, oncogene activation, stimulation of various receptors, disturbed calcium homeostasis, free radicals or other damaging factors such as radiation [1, 2]. Ultimately, a limited number of apoptosis execution mechanisms is modulated by a wide range of signals [1].

Cell survival depends on the balance of signals that induce or inhibit both proliferation and apoptosis. Several key regulators of apoptosis have already been identified (e.g. Bcl-2, IAP, survivin) [3]. Among such regulators, two groups of proteins constitute myeloid regulators of apoptosis: HS-1-associated protein X-1 (HAX1) and the myeloid leukemia factors (MLF1, MLF2). Although these proteins have been the subject of several experimental studies, the mechanisms of their activity are only partially understood.

HS-1-associated protein X-1 (HAX1) is a multifunctional protein first identified in a yeast two-hybrid system as a partner of HS-1 (hematopoietic lineage cell-specific protein I) [4]. So far, HAX1 has been detected in various cellular compartments and is known to be involved in a number of different processes. High levels of HAX1 are observed in mitochondria and the nuclear matrix, although the protein has also been found in the endoplasmic reticulum, the apical membrane of hepatocytes and the nuclear envelope [5]. HAX1 is expressed ubiquitously across tissues, most strongly in skeletal muscle and heart [6], and in a number of cancer tissues [7, 8]. Although HAX1 is known to be multifunctional, the exact mechanism of its action remains unexplained [9].

Several observations currently link HAX1 to apoptosis. As reported by Vafiadaki et al., the putative anti-apoptotic role of HAX1 may result from inhibition of caspase 9. HAX1 represses post-mitochondrial caspase 9 activation and cell death during hypoxia and subsequent re-oxygenation, and its overexpression protects cardiac myocytes from apoptosis [10]. The C-terminal part of HAX1 interacts with caspase 9 at the region corresponding to residues 175-206 of the human ortholog (GenBank identifier gi|158562115), although both the N-terminal and C-terminal domains are required for full anti-apoptotic function [11, 12]. HAX1 also interacts with other members of the caspase cascade. Caspase 3 (CASP3) cleaves HAX1 at residue Asp127, while HAX1 overexpression inhibits CASP3 catalytic activity and blocks the initiation of apoptosis [13].

Several previous studies suggested that HAX1 is involved in calcium homeostasis [6, 10]. Phospholamban (PLN), a transmembrane regulator of cardiac contractility and calcium homeostasis, binds to an internal part of HAX1 (residues 203-245 of the human ortholog, gi|158562115; compare Figure 1B). The HAX1-PLN complex plays an important role in cardiac cell survival, and the presence of PLN enhances the anti-apoptotic potential of HAX1. PLN regulates the activity of the sarcoplasmic reticulum Ca2+-ATPase pump (SERCA2a), the regulator of calcium homeostasis in the heart [6]. Formation of the HAX1-PLN complex is modulated: the interaction is reduced either by phosphorylation of PLN or by an increased concentration of Ca2+ [6]. Additionally, HAX1 bound to PLN is redistributed from mitochondria to the endoplasmic reticulum (ER) [6]. ER localization is also crucial for the interaction of HAX1 with polycystic kidney disease 2 (PKD2), an apoptosis-regulating protein involved in Ca2+ signaling in kidney cells. Here, HAX1 acts as an adaptor between PKD2 and cortactin, a key regulator of PKD2 function [14].

Figure 1

A. Overview of the domain composition of HAX1 and MLF proteins. B. Interaction sites within the human HAX1 protein. C. Mutation sites within HAX1 (X - nonsense mutation, fs - frameshift).

HAX1 was previously described as a protein that binds hairpin-structured RNA [15]. This observation was originally based on its interaction with the 3'UTR of vimentin mRNA [15]. Sarnowska et al. also suggested that HAX1 plays a role in mRNA destabilization through binding to the mRNA of Polβ (polymerase beta).

Because of its multifunctional activity and its many interactions, most of which are involved in the regulation of apoptosis, HAX1 plays an important role in several diseases and in carcinogenesis. One of the most important functions of HAX1 is the regulation of myeloid homeostasis. The protein is crucial for maintaining the inner mitochondrial membrane potential and for protecting myeloid cells from apoptosis [16, 17]. Congenital deficiency of HAX1 leads, via mitochondrion-dependent deregulation of apoptosis, to Kostmann syndrome [17], an infantile genetic agranulocytosis. Notably, at a later stage Kostmann syndrome can progress to acute myeloid leukemia [17].

Data on the activity of myeloid leukemia factor 1 (MLF1) are far more limited. MLF1 was first identified in an oncogenic fusion gene with nucleophosmin (NPM-MLF1). This fusion gene is generated by the t(3;5)(q25.1;q34) translocation [18-20] and is linked with both acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS). MLF1 is expressed in a number of tissues, including skeletal muscle, heart, brain and hematopoietic stem cells. Despite several reports on the MLF family, the mechanisms of its activity have not been established. Previous analyses contain rather inconsistent data, suggesting a variable range of actions. The MLF1 protein is mainly targeted to the cytoplasm and is potentially involved in the regulation of mitosis (cell cycle exit) and cell differentiation [19, 21]. Several lines of evidence suggest that MLF1 also participates in the induction of apoptosis when fused with NPM [20, 22]. As a component of the NPM-MLF1 complex, MLF1 is known to be involved in various pathways of carcinogenesis. It negatively regulates cell cycle progression, inhibiting cell growth through the accumulation of the tumor suppressor protein p53 (TP53) [23]. Additionally, MLF1 in the hematopoietic lineage is able to switch erythroleukemic cells to the monoblastoid phenotype. It can also deregulate pluripotent progenitor cells and contribute to ineffective hematopoiesis and transformation into leukemia [20].

As with MLF1, chromosomal translocations and deletions of the myeloid leukemia factor 2 (MLF2) locus have been observed in acute leukemia [24], but no direct studies have been performed to identify the specific mechanism of its action. The human homologs share 64% sequence similarity (gi|194328680 and gi|54695734, MLF1 and MLF2 respectively), but no experimental studies on MLF2 have been published recently. The lack of data on MLFs is reflected in the major secondary database, Sanger's Protein Families (Pfam). Although the sequences of MLF1 (Q2TLE5_MLF1_HUMAN) and MLF2 (Q15773_MLF2_HUMAN) are deposited under the Pfam-A accession PF10248, both the title of the entry and the description of the domain refer to the non-homologous MLF1-interacting protein (MLF1IP) (Q71F23|CENPU_HUMAN), which makes the assignment of the MLF proteins misleading. A comparison of MLF and MLF1IP proteins showed only marginal similarity (expected value above 2).

Identification of the various pro- and anti-apoptotic pathways is a challenging problem. So far, no significant homology between the two protein families discussed here and other proteins of similar function has been reported. Several previous studies described inconsistent data on the structure of HAX1. Sharp et al. postulated a transmembrane helix in the C-terminal region of the open reading frame (positions 261-273) [25], while other researchers suggested that the motif is shorter than an ordinary transmembrane helix [4]. Suzuki et al. reported that the N-terminal region of HAX1 shows weak similarity to the NIP3 protein, which is up-regulated under oxidative stress in cardiac myocytes and is involved in apoptosis that causes mitochondrial defects [26]. Two further regions, comprising residues 37-56 and 74-89 of human HAX1, were reported to be homologous to the anti-apoptotic protein BCL2 (B-cell lymphoma 2) and to form the so-called BCL2-homology domains BH1 and BH2, respectively [11, 16, 26]. A detailed revision of the bioinformatic evidence presented in the articles cited above shows that the original reports applied bioinformatic methods inappropriately and that their results could not be reproduced. In summary, no clear data on the homology of either HAX1 or the MLFs have been reported so far.

We have previously applied bioinformatic methods of fold recognition and distant homology detection to analyze genes participating in important biological processes. These include the ALK kinase regulator NIPA [3], a family of transcription factors homologous to TFCP2 that regulates globin transcription [27], and several other human and viral genes [28, 29]. In this work we present a bioinformatic approach to elucidate the molecular function of HAX1 and MLF1 homologs.

Methods

Sequences of HAX1 and MLF1 homologs were collected with PSI-BLAST (default settings) [30] from the NR (non-redundant) database of the NCBI (National Center for Biotechnology Information; http://www.ncbi.nlm.nih.gov). The sequences of each protein were clustered at 70% sequence identity with CD-HIT [31] in order to select a representative collection of protein sequences with preserved functional motifs. The final sequence collections were aligned with ClustalW [32] and PCMA [33], and conserved regions were selected on the basis of the alignment. The conserved sequence regions of both families were subjected to iterative PSI-BLAST searches at NCBI resources. Sequences from each protein family were submitted to the Protein Structure Prediction MetaServer (http://bioinfo.pl/meta; [34]), a server that combines a number of secondary structure prediction (PsiPred [35]), fold recognition (3D-PSSM [36], INUB [37], SAM_T06 [38]) and homology modeling methods (FFAS03 [39], MetaBasic [40]). The multiple sequence alignment of the HAX1 and MLF protein families was corrected manually on the basis of secondary structure predictions, the location of critical residues and the physicochemical properties of amino acids. Homology models of the HAX1 parvalbumin-like domain were calculated with Modeller v. 9, as implemented in Discovery Studio 2.1 (Accelrys Inc.), based on the 1RJP PDB structure [41]. Structures were visualized with PyMOL (http://pymol.sourceforge.net/).
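The redundancy-reduction step can be illustrated with a minimal sketch of longest-first greedy clustering at a 70% identity threshold. This is not the CD-HIT implementation: the identity measure below uses Python's difflib as a crude stand-in for a real sequence alignment, and the function names (approx_identity, greedy_cluster) are hypothetical, chosen only for this illustration.

```python
from difflib import SequenceMatcher


def approx_identity(a: str, b: str) -> float:
    """Crude identity proxy: matched residues / length of the shorter sequence.

    difflib is not a biological aligner; it only stands in for one here.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_cluster(seqs: dict[str, str], threshold: float = 0.70) -> dict[str, list[str]]:
    """Longest-first greedy clustering; returns {representative_id: [member_ids]}."""
    clusters: dict[str, list[str]] = {}
    # Longest sequences are processed first, so they become representatives.
    for seq_id in sorted(seqs, key=lambda k: len(seqs[k]), reverse=True):
        for rep_id in clusters:
            if approx_identity(seqs[seq_id], seqs[rep_id]) >= threshold:
                clusters[rep_id].append(seq_id)  # joins the first matching cluster
                break
        else:
            clusters[seq_id] = [seq_id]  # no representative is close enough
    return clusters
```

Keeping only the dictionary keys of the result gives one representative per cluster, which is the kind of non-redundant set that is then passed on to multiple sequence alignment.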

Results and discussion

Identification of protein domains

Sequence searches with the full-length HAX1 and MLF sequences detected no significant similarity in the NR database (NCBI). Similar searches using only the most conserved C-terminal part of HAX1 as a query revealed a significant match to the C-terminal domain of the Myeloid Leukemia Factor family (Figure 1A). Comparison of the similar regions of both protein families, together with the results of secondary structure prediction, confirmed the initial homology searches. Secondary structure prediction for both protein families revealed a common pattern of secondary structure elements (three β-strands and an α-helix at the carboxyl terminus of the proteins; Figure 2). The conserved domain is located at positions 209-287, 149-234 and 115-200 of human HAX1 (gi|158562115), MLF1 (gi|194328680) and MLF2 (gi|54695734), respectively. The most conserved region, used as the initial query, spans the potential β-sheet. In the MLF family, the last common α-helix is preceded by an additional strand and a marginally predicted α-helix. The final multiple sequence alignment of the C-terminal parts of the HAX1 and MLF protein families shown in Figure 2 was obtained by manual correction of the automatic alignments (ClustalW, PCMA).
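The domain boundaries quoted above are 1-based and inclusive, which is easy to get wrong when slicing sequences programmatically. The short sketch below, with hypothetical helper names, shows the conversion to Python's 0-based slicing and computes the length of each reported range. The sequences themselves are not reproduced here.

```python
# Domain boundaries (1-based, inclusive) as reported in the text.
DOMAINS = {
    "HAX1": (209, 287),
    "MLF1": (149, 234),
    "MLF2": (115, 200),
}


def domain_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive residue range."""
    return end - start + 1


def extract_domain(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive range out of a 0-based Python string."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range lies outside the sequence")
    return sequence[start - 1:end]


lengths = {name: domain_length(*span) for name, span in DOMAINS.items()}
```

With the ranges given in the text, the conserved domain spans 79 residues in HAX1 and 86 residues in both MLF1 and MLF2.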

Figure 2

Multiple sequence alignment of HAX1 and MLF C-terminal domains. Sequences are labeled with their GenBank identifier (gi) and source organism. Numbers in brackets refer to the positions of the presented sequence fragments; numbers in square brackets indicate the number of residues removed for clarity of the alignment. Predicted secondary structure elements (Psipred) are coded with letters (H - α-helix, E - extended).

The high conservation of the C-terminal domain of these proteins suggests its importance for their function in apoptosis regulation. Several similarities can be observed within the C-terminal regions of the two protein families. As shown in Figure 2, a common distribution of secondary structure elements, high conservation of several key residues and similar properties of corresponding amino acids support their homology and suggest a common fold for the two families. The preserved distribution of charged and hydrophobic amino acids and the presence of characteristic glycine residues between the β-strands of both protein families point to similarity of the predicted domain structure. Detailed analyses revealed no assignment to any known record in protein family databases (Pfam, SMART).
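Residue conservation of the kind discussed above can be quantified column by column. The following sketch computes, for each alignment column, the fraction of sequences that share the most frequent residue. It is a deliberately simple measure chosen for illustration, not the scoring used to prepare Figure 2, and the function name is hypothetical.

```python
from collections import Counter


def column_conservation(alignment: list[str], gap: str = "-") -> list[float]:
    """Fraction of sequences sharing the most common residue in each column.

    Gaps count toward the column size but never as the consensus residue.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = Counter(c for c in column if c != gap)
        top = residues.most_common(1)[0][1] if residues else 0
        scores.append(top / len(column))
    return scores
```

Columns scoring close to 1.0 correspond to invariant positions, such as the characteristic glycines mentioned above. A measure based on physicochemical residue classes would be needed to capture the conserved charged and hydrophobic patterns.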

Although a number of protein structure prediction methods were applied via the Protein Structure Prediction MetaServer, no significant similarity to known protein structures was detected, which may indicate a new fold type. Since no homologous proteins of experimentally determined structure were found, no fold could be assigned to the C-terminal domain of HAX1 and MLFs.

Analysis of the taxonomic distribution of homologous sequences revealed a further similarity between the two protein families: both HAX1 and MLFs are well conserved throughout Vertebrata but, surprisingly, additional homologues of both families were found in insect genomes.
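A similarity in taxonomic distribution can be expressed numerically by comparing presence/absence profiles of the two families across genomes. The sketch below uses the Jaccard index for this purpose; this is an illustrative choice rather than a method used in this study, and any genome labels passed to it are the caller's own data.

```python
def jaccard(profile_a: set[str], profile_b: set[str]) -> float:
    """Genomes containing both families divided by genomes containing either."""
    union = profile_a | profile_b
    return len(profile_a & profile_b) / len(union) if union else 0.0
```

A value of 1.0 means that the two families occur in exactly the same set of genomes, while values near 0 indicate largely independent distributions.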

Fold recognition of the N-terminal calcium binding domain

Notably, HAX1 homologs in insects are less divergent from the remaining sequences of the protein family (the whole group of sequences that exhibit homology and are likely to perform a similar function). The application of several threading and profile-profile methods via the Protein Structure Prediction MetaServer to fragments of insect HAX1 sequences showed marginal similarity to the EF-hand calcium-binding motifs of parvalbumin in the N-terminal region of HAX1 (gi|122937737:57-171; supported by MetaBasic, INUB and 3D-PSSM; 3D-Jury score: 60.20).

Parvalbumin belongs to the superfamily of calcium-binding proteins [42]. The overall function of this family is related to the transport and storage of calcium ions. Parvalbumin is present mainly in muscle and neurons, where it is responsible for muscle relaxation and calcium transport from myofibrils to the sarcoplasmic reticulum. It may also affect cellular properties such as extension of the G1 cell cycle phase, increase of cell motility and extension of the mitotic rate in ovarian carcinoma cells [42]. Structurally, parvalbumin is composed of six α-helices forming three helix-loop-helix motifs known as EF-hand motifs. Hydrophobic surfaces of the helices form the core of this fold. In each motif, two adjacent amphipathic helices flank a loop of variable length that coordinates a calcium ion mainly through the carboxylate groups of aspartic and glutamic acid side chains. The six helices form three pairs; the acidic residues (Asp, Glu) in the loops of the second and third helix pairs are involved in ion binding. The loop of the first helix pair is unable to bind calcium ions because it lacks the critical amino acids [43, 44].
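The dependence of calcium binding on acidic loop residues can be illustrated with a simple sequence scan. The sketch below searches for 12-residue windows that match a loose EF-hand loop consensus. The pattern is a simplifying assumption made for this example (Asp at position 1, Asp/Asn/Ser at 3, a polar or small residue at 5, Gly at 6, Asp/Glu at 12) and is far less sensitive than the profile and threading methods used in this study; the function name is hypothetical.

```python
import re

# Simplified 12-residue EF-hand loop consensus (an assumption of this sketch):
# Asp at 1, Asp/Asn/Ser at 3, polar or small residue at 5, Gly at 6, Asp/Glu at 12.
# The lookahead allows overlapping windows to be reported.
EF_LOOP = re.compile(r"(?=(D.[DNS].[DENSTG]G.{5}[DE]))")


def find_ef_loops(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, loop) for every window matching the loose consensus."""
    return [(m.start() + 1, m.group(1)) for m in EF_LOOP.finditer(sequence.upper())]
```

A loop that has lost its acidic residues at the key positions, as described above for the first helix pair of parvalbumin, would not be reported by such a scan. Divergent sites, for example loops carrying long insertions, would be missed as well.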

Manual adjustment of the multiple sequence alignment of insect HAX1 with parvalbumin sequences confirms the overall similarity of the two protein families. The final alignment of insect HAX1 sequences with the EF-hand motifs superimposed is shown in Figure 3. The predicted secondary structure of HAX1 was highly consistent with the six observed helices of parvalbumin (three consecutive EF-hand motifs). Analysis of the biochemical properties of the aligned residues pointed to a conserved pattern of hydrophobic helix surfaces that build the fold. Surprisingly, the distribution of acidic residues critical for calcium binding in the HAX1 sequences suggested that the first and second EF-hand motifs form calcium-binding sites, whereas in parvalbumin such sites are located within the second and third motifs [43, 44]. The acidic residues of the third motif, located between the fifth and sixth helices, are in fact not preserved in insect HAX1. Visualization of the HAX1 models revealed that only in the second EF-hand motif are the critical amino acids positioned so that they can bind calcium ions in the manner of a typical EF-hand motif (as in parvalbumin). The role of the potential site within the first EF-hand motif remains highly speculative (Additional file 1).

Figure 3

Multiple sequence alignment of insect HAX1 and parvalbumin (PVALB) calcium-binding domains. Sequences are labeled with their GenBank identifier (gi) and source organism. Numbers in brackets refer to the positions of the presented sequence fragments. The observed (Protein Data Bank entries 1RTP_1, 1RJV_A) and predicted (Psipred) secondary structure elements are coded with letters (H - α-helix, E - β-strand). Calcium-binding sites within parvalbumin proteins are marked with x; putative calcium-binding residues of HAX1 are marked with asterisks (*).

Fold recognition for insect HAX1 strongly suggests that this protein initially evolved as a regulator of calcium homeostasis. In comparison with insect HAX1 proteins, only a single EF-hand motif consisting of two α-helices is preserved in vertebrate HAX1 proteins. The detailed alignment of vertebrate HAX1 proteins with the calcium-binding motif recognized in insects suggested the presence of a single active calcium-binding site (Figure 3). This motif is preserved throughout the whole clade (Insecta), as are the hydrophobic residues within the helices and the pattern of acidic binding-site residues within the loops. However, the active site in vertebrate HAX1 proteins has diverged from the typical EF-hand motif, with an additional 20-28 amino acids within the calcium-binding loop region (Figure 4). Confirmation of the motif's functionality therefore requires detailed analyses. Sequence conservation and secondary structure predictions failed to identify any known potential functional motif. Previous studies suggested that the EF-hand remains functional only in a multimeric arrangement [45]. Although we identified only a single motif of the classic three-EF-hand domain in vertebrate HAX1, our prediction can still be correct, since calcium binding could occur through a paired EF-hand arrangement formed upon HAX1 dimerization [46]. In summary, given the current data, the presence of an active EF-hand motif in vertebrate HAX1 remains speculative, and detailed experimental studies are needed to confirm this hypothesis.

Figure 4

Multiple sequence alignment of the putative calcium-binding motifs of insect and vertebrate HAX1 proteins. Sequences are labeled with their GenBank identifier (gi) and source organism. Numbers in brackets refer to the positions of the presented sequence fragments. Predicted secondary structure elements (Psipred) are coded with letters (H - α-helix, E - β-strand). Putative calcium-binding residues of vertebrate HAX1 are marked with x.

Location of HAX1 interaction sites

According to current knowledge, the specific function of HAX1 is the net result of its various interactions. HAX1 was first identified as a binding partner of HS1 (hematopoietic lineage cell-specific protein I), a substrate of Src family tyrosine kinases. HS1 is expressed specifically in hematopoietic cells and is implicated in signal transduction in B cells. The complex forms between the N-terminal part of HS1 (residues 27-66) and the C-terminal part of HAX1. This probably alters signal transduction from the Ag receptor (B lymphocyte antigen receptor) to intracellular organelles. When associated with HAX1, HS1 accumulates in the mitochondrial membrane [26]. In addition, a number of other interactions with cellular and viral proteins have been described [25, 47-51]. Such a broad spectrum of structurally unrelated interacting proteins suggests that HAX1 is involved in intracellular signaling and acts as a hub protein that binds various intracellular molecules. Most of the known interaction sites within HAX1 are located in the two regions defined above as putative functional domains (all known interaction sites are shown in Figure 1B). This further supports the hypothesis that HAX1 has two functional domains separated by an additional unstructured protein-binding region.

Apoptosis regulation

Several studies indicated that HAX1 is involved in the regulation of apoptosis [10, 13]. Previous analyses linked apoptosis with interactions of HAX1 with proteins of the caspase cascade (caspases 3 and 9). Both the N-terminal and C-terminal domains are required for full anti-apoptotic function [11, 12]. Caspase 3 interacts with a region within the predicted calcium-binding site (it cleaves HAX1 at residue Asp127). When overexpressed, HAX1 inhibits CASP3 catalytic activity and blocks the initiation of apoptosis [13]. The C-terminal part of HAX1 interacts with caspase 9 at the region corresponding to residues 175-206 of the human ortholog (gi|158562115). HAX1 represses post-mitochondrial caspase 9 activation and cell death during hypoxia-reoxygenation, and HAX1 overexpression protects cardiac myocytes from apoptosis [10]. The location of the HAX1 interaction sites for both caspases within the functional regions described in this report partially supports the involvement of both regions in apoptosis and in the proper function of HAX1.

Additionally, several point and frameshift mutations within HAX1 have been described. These genetic alterations, which are associated with neutrophil depletion [52], mainly affect the functional regions described here (compare Figure 1C).

RNA binding properties of HAX1

Previous data indicate that mRNA binding occurs within the C-terminal part of the protein and only in the dimeric form of HAX1. The overall mechanism of HAX1 binding to RNA is still not clear. According to assays performed with HAX1 deletion mutants, a potential nucleic acid binding region is located within the carboxyl terminus of HAX1.

According to secondary structure predictions, the C-terminal region contains three conserved fragments that appear to adopt an extended conformation and are likely to form a β-sheet. This local conformation is also conserved in MLFs. Although some RNA-binding folds consist mainly of β-sheet, our analyses did not reveal any distant similarity of HAX1 or MLFs to known RNA-binding proteins. There are two main explanations: either HAX1 is a novel RNA-binding protein, or its C-terminal fragment forms a platform that brings together other proteins, some of which possess RNA-binding activity [9].

Conclusions

Although the mechanisms of HAX1 and MLF1 activity are not well described, state-of-the-art methods of protein structural bioinformatics revealed several common features of their biology. Three distinct points can be made about the molecular evolution of the HAX1 and MLF1 families.

The C-terminal domains of the HAX1 and MLF1 families show sequence homology, which suggests the existence of a novel protein structural domain potentially involved in binding RNA (or other proteins).

The similar taxonomic distribution of homologous sequences and the conservation of the C-terminal domain in both families emphasize its importance for the proteins' common function in apoptosis regulation. Notably, in a number of genomes HAX1 and MLFs are present only together. This might suggest either a physical interaction between their protein products or a novel negative feedback mechanism in the regulation of related biological processes.

Additionally, the N-terminal domain of the HAX1 family was described. HAX1 homologues in insects showed marginal similarity to the EF-hand calcium-binding motifs of parvalbumin, with the second EF-hand motif identified as a calcium-binding site. In vertebrate HAX1, only a single EF-hand motif is preserved, consisting of two α-helices with an additional 20-28 amino acids within the calcium-binding loop region. The structural and active-site analyses gave new insights into the mechanisms of the HAX1 and MLF families in apoptosis and suggested a possible role of HAX1 in calcium binding; however, these analyses require further experimental verification.

In this report we describe the application of several methods of homology search and protein structure prediction to draft hypotheses on the function of HAX1. In light of these new data, additional experimental work is needed to confirm and complete the picture of HAX1 and MLF biology.