Structure-function analysis of Sedolisins: evolution of tripeptidyl peptidase and endopeptidase subfamilies in fungi
Sedolisins are acid proteases that are related to the basic subtilisins. They have been identified in all three superkingdoms but are not ubiquitous, although fungi that secrete acids as part of their lifestyle can have up to six paralogs. Both TriPeptidyl Peptidase (TPP) and endopeptidase activity have been identified and it has been suggested that these correspond to separate subfamilies.
We studied eukaryotic sedolisins by computational analysis. A maximum likelihood tree shows one major clade containing non-fungal sequences only and two major as well as two minor clades containing only fungal sequences. One of the major fungal clades contains all known TPPs whereas the other contains characterized endosedolisins. We identified four Cluster Specific Inserts (CSIs) in endosedolisins, of which CSIs 1, 3 and 4 appear as solvent exposed according to structure modeling. Part of CSI2 is exposed but a short stretch forms a novel and partially buried α-helix that induces a conformational change near the binding pocket. We also identified a total of 15 specificity determining positions (SDPs) of which five, identified in two independent analyses, form highly connected SDP sub-networks. Modeling of virtual mutants suggests a key role for the W307A or F307A substitution. The remaining four key SDPs physically interact at the interface of the catalytic domain and the enzyme’s prosegment. Modeling of virtual mutants suggests these SDPs are indeed required to compensate the conformational change induced by CSI2 and the A307. One of the two small fungal clades concerns a subfamily that contains 213 sequences, is mostly similar to the major TPP subfamily but differs, interestingly, in position 307, showing mostly isoleucine and threonine.
Analysis confirms there are at least two sedolisin subfamilies in fungi: TPPs and endopeptidases, and suggests a third subfamily with unknown characteristics. Sequence and functional diversification was centered around buried SDP307 and resulted in a conformational change of the pocket. Mutual Information network analysis forms a useful instrument in the corroboration of predicted SDPs.
Keywords: Functional redundancy and diversification; Structure-function analysis; Protein superfamily; Mutual Information; Protease; Subtilisin
CDP: Cluster Determining Position
CSI: Cluster Specific Insert
MSA: Multiple Sequence Alignment
SDN: Specificity Determining Network
SDP: Specificity Determining Position
Proteases are ubiquitous enzymes that can be classified in many ways
Proteases or peptidases degrade proteins by hydrolysis of peptide bonds. They are involved in various biological processes such as cell death, nutrition and infections. MEROPS, the peptidase database, classifies proteases based on the catalytic mechanism into the types of asparagine, aspartic, cysteine, glutamic, metallo, serine and threonine proteases. Further hierarchical classification into clans and families is based on homology and structure similarity. The remainder of the proteases fall into five clans of mixed catalytic type, clans that are further organized in homologous families, and a class of proteases with unknown catalytic mechanism. Proteases can also be classified based on other characteristics. A major distinction can be made between endo- and exopeptidases, where the latter include aminopeptidases, carboxypeptidases, dipeptidyl-peptidases and tripeptidyl-peptidases (TPP) as well as dipeptidases and peptidyl-dipeptidases.
Sedolisins are acid proteases related to the basic subtilisins
Serine proteases are proteases in which a serine serves as the nucleophilic amino acid in the catalytic site. The catalytic site is most often formed by a triad which can differ among the different unrelated superfamilies. Currently 12 clans or superfamilies with 55 families have been assigned by MEROPS. One of the most important clans is the SB clan that contains the common subtilisins (S8), which include the kexins, and the rather rare sedolisins (S53, for review see ). Sedolisins have been described in prokaryotes and eukaryotes. Interestingly, prokaryotic and eukaryotic sedolisins are very distant, often showing less than 25% sequence similarity. Despite a large difference in optimal activity pH, there is ample evidence the two subfamilies form a superfamily. Note that superfamilies can be hierarchically organized into many different subfamilies with many different, sometimes unknown, functional characteristics. In 2001 the first structure of a sedolisin, endosedolisin PSCP from Pseudomonas sp. 101, complexed with inhibitor iodotyrostatin, was resolved (PDB code 1GA4), shortly followed by a structure of kumamolysin from Bacillus sp. MN-32 (PDB codes 1T1E for precursor and 1GT9 for mature peptidase). Although sequence similarity between sedolisins and subtilisins is low, structural alignments clearly indicate they are homologous since they have similar folds. The basic subtilisins have a triad that consists of the serine, a histidine and an aspartate, whereas the acid sedolisins have a homologous serine, a homologous glutamate that replaces the histidine as well as a non-homologous aspartate. Also the oxyanion aspartate appears as homologous. Sedolisins also have a calcium binding site, albeit at a different position than subtilisins.
The best studied sedolisin is human lysosomal CLN2 since mutant forms are involved in the fatal classical late-infantile neuronal ceroid lipofuscinosis or Batten disease. The structure (PDB codes 3EDY for precursor and 3EE6 for mature peptidase) of this tripeptidyl aminopeptidase has been determined and a number of publications describe the effect of many mutations found [12, 13, 14]. Of particular interest is W542 which has been shown to be required for activity. The W542L mutant was shown to be retained in the ER, which suggests misfolding. In addition, W290L and W307L showed largely reduced activities.
Sedolisins have a large prosegment that appears to have various functions
The processing of subtilisins and sedolisins is similar. Both have a large and similar prosegment that appears to be able to form an independent domain that seems to be involved in correct folding of the core or the catalytic domain [16, 17]. The prosegment and catalytic domain are separated by a short propeptide or linker that is removed during zymogen activation at low pH. In general it has been shown that prosegments assist in refolding as well as targeting (for review see ). For human TPP it has been shown that prosegment and catalytic domains have multiple molecular interactions including salt bridges and hydrogen bonds, covering 15% of the solvent accessible surface of the catalytic domain. It has been shown that the prosegment of human TPP also functions as an inhibitor. Secretome analysis of, for instance, Botrytis cinerea has shown that certain paralogs consist of the core part of the enzyme only.
Fungal sedolisins can have endo- or tripeptidyl-peptidase activity
Other characterized eukaryotic sedolisins are of fungal origin. Scytalidolisin, grifolisin and aorsin were the first fungal enzymes characterized as sedolisins. More recently, four homologs from Aspergillus fumigatus were characterized. SED_A was, similarly to aorsin from Aspergillus oryzae, characterized as an endosedolisin, whereas SED_B, SED_C and SED_D were shown to have TPP activity. Endo and TPP activity have been described for subtilisins. The authors furthermore suggested that endosedolisins cluster in a different clade than TPPs, suggesting that gene duplication has resulted in functional diversification. A recent genome paper on the fungal plant pathogens B. cinerea and Sclerotinia sclerotiorum showed that acid secreting fungi, such as the phytopathogens B. cinerea and S. sclerotiorum but also the saprophytic Aspergilli, show relatively few subtilisins and many sedolisins, as compared to non-acid secreting fungi such as Gibberella zeae. Interestingly, yeasts from Saccharomyces and Schizosaccharomyces completely lack homologs. This also suggests functional diversification has occurred. Here we study the functional redundancy and diversification of fungal sedolisins by computational analysis. We reconstructed a phylogenetic tree that, together with the underlying multiple sequence alignment (MSA), was used for the identification of cluster specific inserts (CSIs) and specificity determining positions (SDPs), which form the sequence characteristics that can explain functional diversifications. Modeling of wild type and mutant sequences was performed in order to show how functional diversification into endosedolisins and TPP has likely occurred, demonstrating important roles for part of CSI2 and the position homologous to human TPP W307.
Sedolisin sequence identification
A first HMMER profile, built from the holotype sequences of the MEROPS database, was used to search a database containing the complete proteomes of 56 fungi and 186 non-fungal eukaryotes complemented with the PDB and Swissprot databases, yielding 230 sequences of sedolisin homologs. A sequence hallmark scrutiny for the presence of catalytic site and oxyanion residues was performed using MEROPS Batch BLAST, followed by a structural scrutiny, finally yielding a total of 204 high fidelity sequences. These were aligned using MAFFT’s iterative refinement method and the resulting MSA was manually corrected using as criteria that secondary structure elements (taking as reference 3EE6) should be represented by each sequence, combined with entropy minimization. The resulting MSA was used to construct a preliminary maximum likelihood tree using PHYML. The preliminary tree has three clearly separated, major clades and the three corresponding sub-MSAs were used to construct subfamily specific HMMER profiles. These were used to iteratively screen HMMER’s Reference Proteomes dataset restricted to eukaryotes using the procedure as described by the recent superfamily classification software HMMERCTTER, resulting in a total of 2203 sequences.
MSA and phylogeny
Identification of specificity determining positions (SDPs)
We identified Cluster Determining Positions (CDPs) using SDPfox. MISTIC was used to determine levels of Mutual Information (MI) between positions or columns of the MSA. Initially, CDPs are accepted as SDPs when they have at least two direct connections with other CDPs, using MISTIC’s default z-score cut-off of 6.5. CDPs with a single direct connection are considered putative SDPs (pSDPs) and require additional evidence in order to become accepted as SDPs. Cytoscape was used to identify and draw sub-networks of directly connected SDPs. Sequence logos were made using WebLogo.
Tertiary structures of sedolisins were obtained from the Protein Data Bank. 3EE6, corresponding to mature human TPP, was used as reference. Models were made using I-TASSER using either default settings or using 3EE6 Chain A as the reference model. The SED_A dimer was made by structural alignment of the SED_A monomer model to both the 3EE6 A and B chains. Visualization was performed using VMD, which included structural alignment using the STAMP extension. The pocket predictions for 3EE6 and the SED_A and SED_B models were performed with the software Fpocket using the default parameters.
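The acceptance rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the function names `classify_cdps` and `sub_networks`, the edge-list input format and the toy positions and z-scores are all hypothetical, and a connection is assumed to count when its z-score is strictly above the cut-off.

```python
from collections import defaultdict

Z_CUTOFF = 6.5  # MISTIC default z-score cut-off quoted in the text


def classify_cdps(cdps, mi_edges, cutoff=Z_CUTOFF):
    """Label each CDP as 'SDP' (>= 2 links), 'pSDP' (1 link) or 'CDP' (no link)."""
    cdps = set(cdps)
    neighbours = defaultdict(set)
    for pos_a, pos_b, z in mi_edges:
        # Only links between two CDPs that exceed the cut-off are counted.
        if z > cutoff and pos_a in cdps and pos_b in cdps and pos_a != pos_b:
            neighbours[pos_a].add(pos_b)
            neighbours[pos_b].add(pos_a)
    labels = {}
    for pos in sorted(cdps):
        degree = len(neighbours[pos])
        labels[pos] = "SDP" if degree >= 2 else "pSDP" if degree == 1 else "CDP"
    return labels


def sub_networks(labels, mi_edges, cutoff=Z_CUTOFF):
    """Group accepted SDPs into sub-networks of directly connected positions."""
    sdps = {pos for pos, label in labels.items() if label == "SDP"}
    adjacency = {pos: set() for pos in sdps}
    for pos_a, pos_b, z in mi_edges:
        if z > cutoff and pos_a in sdps and pos_b in sdps and pos_a != pos_b:
            adjacency[pos_a].add(pos_b)
            adjacency[pos_b].add(pos_a)
    seen, components = set(), []
    for start in sorted(sdps):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over one connected component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical toy input: positions and z-scores are invented for illustration.
toy_cdps = [10, 20, 30, 40, 50]
toy_edges = [(10, 20, 9.1), (10, 30, 7.0), (20, 30, 6.4), (30, 40, 8.2), (50, 99, 12.0)]
toy_labels = classify_cdps(toy_cdps, toy_edges)
toy_networks = sub_networks(toy_labels, toy_edges)
```

In this toy input, positions 10 and 30 each keep two links above the cut-off and form one sub-network, positions 20 and 40 keep a single link, and position 50 only links to a position that is not a CDP.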
Datamining, multiple sequence alignment and phylogeny
In order to perform structure-function analysis of eukaryote sedolisins we set out to obtain a representative collection of sequences, while trying to avoid the inclusion of sequences corresponding to pseudogenes or derived from incorrect gene models. High sensitivity was obtained by applying HMMER iteratively, whereas specificity was obtained by an initial sequence scrutiny and using strict cut-off thresholds using a HMMERCTTER  procedure, for details see materials and methods. A total of 2203 sequences were aligned and an excerpt of the final MSA is shown in Fig. 1. In general, eukaryotic sedolisins are largely conserved, including the prosegment part. Interestingly, of the three disulfide bridges identified in the resolved structure 3EE6, only the second appears to be conserved among eukaryotes. A trimmed MSA, lacking low quality sub-alignments, was used to reconstruct a maximum likelihood tree using FastTree 2  with 1000 bootstraps.
Cluster specific inserts
The MSA demonstrates the presence of four Cluster Specific Inserts (CSIs) we identified in the Hypo-endosedolisins. Fig. 2c shows the distribution of insert length on the phylogeny thereby demonstrating an intricate clustering pattern. CSI2 has about 25 amino acids and is present in all Hypo-endosedolisins whereas CSI4 has about 40 amino acids and is found only in the large subclade of the Hypo-endosedolisin clade. Both show moderate levels of conservation (Fig. 2d). CSI3 is present in all Hypo-endosedolisins as well as in certain Hypo-TPPs and some non-fungal sedolisins. CSI1 is present in all sedolisins but is longer in the Hypo-endosedolisins. CSIs 1 and 3 show no clear conservation.
We used SDPfox to identify CDPs, positions that contribute significantly to the underlying clustering. We identified 26 CDPs between the Hypo-endo and the Hypo-TPP clusters. Then we performed an analysis of mutual information between positions using MISTIC. Mutual information expresses levels of covariation and high levels suggest co-evolution. CDPs might result from genetic drift but CDPs that show high levels of interaction are more likely SDPs. We envisage that the functional characteristics of phylogenetically well separated subfamilies, such as the Hypo-endo and Hypo-TPP sedolisins, are the result of the interaction of multiple positions that have somehow co-evolved. As such, possibly one or more sub-networks of directly connected CDPs exist. We consider all CDPs that connect directly to at least two other CDPs with a score higher than MISTIC’s default threshold of 6.5 as SDPs. CDPs with a single connection are initially considered as pSDPs. Eventual sub-networks of directly connected SDPs are considered Specificity Determining Networks (SDNs) that not only substantiate that CDPs are SDPs but also show which positions have co-evolved towards a certain diversification.
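To make the notion of covariation concrete, the following minimal sketch computes plain mutual information (in bits) between two alignment columns. It is not MISTIC's procedure: MISTIC applies corrections and reports z-scores, which are not reproduced here. The function name `mutual_information`, the gap handling and the toy columns are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b, gap="-"):
    """Plain mutual information (bits) between two equally long MSA columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must come from the same alignment")
    # Assumption: sequences with a gap in either column are ignored.
    pairs = [(a, b) for a, b in zip(col_a, col_b) if a != gap and b != gap]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    freq_a = Counter(a for a, _ in pairs)
    freq_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) written with raw counts to limit rounding.
        mi += p_ab * math.log2(count * n / (freq_a[a] * freq_b[b]))
    return mi


# Hypothetical columns: one residue per sequence, invented for illustration.
column_x = "AAAAWWWW"
column_covarying = "LLLLKKKK"    # changes together with column_x
column_independent = "LKLKLKLK"  # varies independently of column_x

mi_covarying = mutual_information(column_x, column_covarying)
mi_independent = mutual_information(column_x, column_independent)
```

Two columns that always change together reach the maximum for a two-state column (1 bit), whereas independently varying columns score 0 bits.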
SDP349 also occurs in the vicinity of SDP346 and connects directly to SDP346 (SDN1 and SDN2) and SDP89 (SDN2). Alanine, common in Hypo-TPP is slightly less hydrophobic than the predominant leucine of Hypo-endosedolisins. SDP340 is also in close range of SDP346 and connects directly to SDP89 in SDN2. It shows predominantly a polar glutamine in the Hypo-Endo and a hydrophobic valine in the Hypo-TPP clade. Interestingly, SDP340 is found next to position 341 that corresponds with one of two cysteines that are absent in non-fungal sedolisins and strictly conserved in fungal sedolisins (See Fig. 1). Since strictly conserved cysteine pairs often correspond with disulfide bridges we checked their orientation in the models of SED_A and SED_B. Although in both the SED_A (Additional file 4) and the SED_B model they are modeled at positions that seem to favor a disulfide bridge, this is not modeled. Nevertheless, the virtual C402A / C452A double mutant of SED_A also reverts to the structure lacking helix 9b (Fig. 4c). All together, this suggests that both SDN1 and SDN2 are related to the predicted structural changes discussed in the previous section.
SDP307, part of helix 9, is a position that, according to the MI analysis, interacts with SDP89 in both SDN1 and SDN2, as well as with SDP346 in SDN2. Although in the structure of human TPP 3EE6 W307 is located at over 9 Å from the local SDP network described above, in the model of SED_A, its counterpart alanine is found at 3.7 Å (See Fig. 7b). W307 is buried in 3EE6 and present in most other non-fungal sequences and represented by an aromatic residue in the Hypo-TPP clade. Substitution of a buried aromatic residue by the small alanine will most likely result in a conformational change. We envisaged that the substitution might be related to the conformational changes identified between 3EE6 and SED_B on the one hand, and SED_A on the other. We made structural models of virtual mutants, exchanging SED_A for SED_B residues. The model of virtual mutant A307F in SED_A suggests the loss of helices 9, 9b and 10. Compensation should, according to the above train of thought, come from SDPs 89, 343, 346 and 349, which directly connect to SDP307. Hence, we modeled the quintuple A343F-F92L-E404L-K407Q-L410S SED_A mutant. The obtained model resembles 3EE6 and SED_B since the principal helices H9 and 10 are modeled nearly identically (Fig. 4d and e).
The other SDPs show a lower level of connectivity and might involve secondary compensations. In SDN2, SDP73 connects to all key SDPs except SDP349. SDP73 is mostly S in the Hypo-Endo and H/N in the Hypo-TPP clade. Structural analysis reveals that this SDP is located in the hinge region preceding the helix that contains SDP89. SDP262 connects to SDP89 and SDP343 and is a hydrophobic residue in Hypo-TPP and a P in Hypo-Endo. Its position does not indicate an important role in structure (i.e. the P does not induce a turn). SDP340 connects to SDP89 and SDP349 and is a Q in Hypo-Endo, whereas it is mostly V in Hypo-TPP, which, combined with its closeness, suggests this is yet another mutation that has co-evolved in order to further compensate changes in the folding induced by the key SDPs and CSI2. SDP285 from SDN1 seems another important position, as it connects to many SDPs among which SDP89 and SDP307. In Hypo-TPP there seems to be a low level of prevalence whereas in Hypo-Endo it is predominantly a Q. The remainder of the SDPs and pSDPs possibly also fulfill additional supporting roles in Hypo-Endo rather than Hypo-TPP, given their high conservation in Hypo-Endo only.
We studied eukaryotic sedolisins by computational analysis of protein sequences. The protein sequences were obtained from a large set of EBI’s Complete Proteomes among which many are of fungal origin. Since it has been suggested that acidification by certain fungi is related to expansion of the sedolisin family in these fungi, the main attention of this study was directed at the evolutionary history of fungal sedolisins.
Sequences were aligned by MAFFT’s iterative refinement method and the resulting MSA was manually improved in order to correct poorly aligned residues, guided by hallmark residues and secondary structure conservation. Correction was likely required due to large taxonomic distances and the presence of CSIs which tend to disturb the alignment process. Prior to tree reconstruction, the MSA was subjected to trimming, guided by maintaining particularly β-sheets, in order to remove these CSIs and unreliably aligned subsequences. The clustering pattern of the CSIs largely corresponds with the phylogenetic clustering (See Fig. 2c), which confirms the topology of the tree is basically correct, as is also indicated by bootstrap support (See Fig. 2a). The Hypo-TPP clade shows a Felsenstein bootstrap support of 0.66, which is considered low. Felsenstein bootstrap yields a clear underestimation of statistical support: when any of the 971 leaves is differently placed, it results in a rejection of support. The recently proposed normalized bootstrap is over 0.95, strongly supporting this topology . In addition, the maximum likelihood tree obtained by PHYML is also nearly identical (Additional file 1). Interestingly, the tree contains a single clade with 216 non-fungal sequences and four fungal clades containing 971, 785, 213 and 18 sequences each. This corresponds to the fact that many fungi have various paralogs. Since the fungal sequences are clearly separated from the non-fungal eukaryotic sequences it appears the evolutionary rate of fungal sedolisins has been higher than that of other sedolisins. A process of functional redundancy, caused by gene duplications, and a resulting functional diversification, corresponds with an increased evolutionary rate.
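The difference between the two support measures can be illustrated with a small sketch. It assumes that the normalized bootstrap mentioned above is a transfer-distance based support, in which a replicate tree contributes partial support when only a few leaves are misplaced; that reading, the function names and the toy bipartitions are assumptions for illustration and are not taken from the analysis itself.

```python
def transfer_distance(split, other, taxa):
    """Minimum number of taxa to move so that two bipartitions become identical."""
    diff = len(split ^ other)           # mismatch against one side of `other`
    return min(diff, len(taxa) - diff)  # ... or against its complement


def felsenstein_support(split, replicate_trees, taxa):
    """Fraction of replicate trees that contain exactly the same bipartition."""
    hits = sum(
        any(transfer_distance(split, other, taxa) == 0 for other in tree)
        for tree in replicate_trees
    )
    return hits / len(replicate_trees)


def transfer_support(split, replicate_trees, taxa):
    """Support that decreases gradually with the number of misplaced taxa."""
    p = min(len(split), len(taxa) - len(split))  # size of the smaller side
    if p < 2:
        raise ValueError("support is only defined for internal branches")
    total = 0
    for tree in replicate_trees:
        best = min(transfer_distance(split, other, taxa) for other in tree)
        total += min(best, p - 1)  # p - 1 is the largest meaningful distance
    return 1.0 - total / (len(replicate_trees) * (p - 1))


# Hypothetical example: ten taxa, one clade of interest, four replicate trees,
# each replicate reduced to the single bipartition closest to the clade.
taxa = frozenset("abcdefghij")
clade = frozenset("abcde")
replicates = [
    [frozenset("abcde")],   # clade recovered exactly
    [frozenset("abcd")],    # one leaf missing from the clade
    [frozenset("abcdef")],  # one extra leaf in the clade
    [frozenset("abcde")],   # clade recovered exactly
]

strict = felsenstein_support(clade, replicates, taxa)
normalized = transfer_support(clade, replicates, taxa)
```

In this toy case a single misplaced leaf in two of the four replicates halves the strict support (0.5), while the transfer-based value stays high (0.875), which mirrors the argument made for the large Hypo-TPP clade.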
It is possible that the hypothesized disulfide bridge C402-C452 in the SED_A model has played a crucial role in the evolution of fungal sedolisins. First, we consider the absence of the bridge in the SED_A model as a modeling artifact. The fact that two cysteines are modeled within 5 Å distance combined with the fact that they are strictly conserved among fungal sedolisins, is not likely a mere coincidence. Then, it can be envisaged that such a disulfide bridge stabilizes the enzyme in the oxidative extracellular environment. A more stable enzyme is more robust and allows for accelerated evolution . The predicted bridge appears to be required for the conformational change near the binding pocket as is shown by the virtual mutant (Fig. 4c). Furthermore it is striking that C402 of SED_A corresponds with position 341 in 3EE6, very close to many of the key SDPs. An obvious requirement for endosedolisin diversification would be the presence of at least part of CSI2 since the presence of the disulfide bond in SED_B does not affect the conformation, suggesting its presence is not sufficient for the conformational change in SED_A. Another possible explanation for the accelerated evolutionary rate in fungi, also related with the novel disulfide bridge, is given by the fact that fungal sedolisins are predicted to be secreted, hence act in a highly variable environment, whereas human TPP is lysosomal, which puts a rather high functional constraint.
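The proximity argument can be expressed as a simple distance check between the two cysteine side chains of a model. The sketch below is a minimal illustration: the 5 Å limit is the one quoted in the text, while the function names and the coordinates are hypothetical and do not come from the SED_A model.

```python
import math

MAX_DISTANCE = 5.0  # Å, the proximity limit quoted in the text


def atom_distance(atom_a, atom_b):
    """Euclidean distance between two atoms given as (x, y, z) tuples in Å."""
    return math.dist(atom_a, atom_b)


def is_candidate_bridge(sulfur_a, sulfur_b, max_distance=MAX_DISTANCE):
    """True when two cysteine sulfur atoms are close enough to suggest a bridge."""
    return atom_distance(sulfur_a, sulfur_b) <= max_distance


# Hypothetical sulfur coordinates, invented for illustration only.
near_pair = ((0.0, 0.0, 0.0), (1.0, 2.0, 2.0))  # 3 Å apart
far_pair = ((0.0, 0.0, 0.0), (6.0, 8.0, 0.0))   # 10 Å apart

near_is_candidate = is_candidate_bridge(*near_pair)
far_is_candidate = is_candidate_bridge(*far_pair)
```

Such a check only flags candidates; whether a bridge actually forms also depends on side-chain geometry, which this sketch does not evaluate.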
The CSIs form a major obstacle when studying diversification of fungal sedolisins. Not only does their presence negatively affect the alignment process, it will also affect folding, making modeling less reliable. The residue quality (resQ) scores provided by I-TASSER give an indication of the reliability of the model and confirm that the CSI regions are unreliably modeled whereas in general confidence of the models is good (e.g. see for model of SED_A in Fig. 4a and b). The fact that the loops are likely incorrectly modeled does however not imply that their approximate location is incorrect and we envisage their solvent exposed location does not severely affect the folding of the core. This is supported by the dimer modeling of SED_A without CSI1 that, as part of the propeptide, is supposedly removed during zymogen activation (Additional file 2). It is difficult to envisage or explain how CSIs 1, 3 and 4 are related with the hypothesized diversification towards endosedolisin and TPP but the fact that CSI2 seems to instigate a conformational change and that removing CSI2 from SED_A yields a model that resembles the fold of 3EE6 (Fig. 4) is intriguing. Furthermore, CSI2 and CSI4 are conserved among all and most Hypo-endosedolisins respectively (Fig. 2 b d) and are located on opposite sides of the predicted pocket cavity of Hypo-endosedolisins. As such they might affect the exact conformation of the pocket. They also may have some importance in substrate binding or contain retention signals. Interestingly CSI2 is the only CSI that corresponds perfectly with the clustering into Hypo-endo and Hypo-TPP. CSI3 and CSI1 show no clear conservation and are found near to the calcium binding site and in the prosegment respectively and there are no clues regarding their functions.
A number of approaches and software packages to predict SDPs exist, although it must be noted that most often CDPs are identified. CDPs are most likely the result of positive selection but neutral evolution, or even phylogenetic reconstruction artifacts, can also result in CDP identifications. Diverge should theoretically take this into account by using likelihood models. A recent version of Evolutionary Trace includes mutual information in order to substantiate its predictions. We combined SDPfox, to identify CDPs, with MISTIC, to confirm CDPs as SDPs. In a recent paper we showed that this combination identified a number of known target positions in the evolution of truncated hemoglobins, which verifies the applicability of the method. The major disadvantage of MI is the requirement of large datasets. Then, given the high noise that neutral evolution combined with epistasis can provide, robust MI determination requires many corrections or adjustments, such as provided by MISTIC. However, MISTIC’s output consists of z-scores that tend to increase with the number of sequences. We used the default cut-off threshold of 6.5, which was determined using sets of about 400 sequence clusters. The CDP network we obtained from the Hypo-Endo / Hypo-TPP dataset contained 533 sequence clusters, for which a more stringent threshold should be applied. Unfortunately, there is no easy method for the determination of that threshold. The separate Hypo-Endo and Hypo-TPP analyses had 214 and 300 sequence clusters respectively, for which the cut-off should likely be applied at a lower threshold. Using the 6.5 threshold, we identified SDN1 using the Hypo-TPP dataset and SDN2 and SDN3 using the Hypo-Endo dataset. Figure 6 clearly shows not only that SDN1 is similar to SDN2, sharing what we refer to as five key SDPs, but also that three of these show the highest levels of connectivity to other SDPs in the same networks.
Hence, we used a high threshold, identified similar networks using independent datasets yielding a number of highly connected SDPs. Basically, the method we used here is therefore specific rather than sensitive and as such we are particularly confident in the five key SDPs.
We have no explanation for SDN3, which shares the highly connected SDP228 with SDN1. SDN2 and SDN3 only connect when applying a very low threshold. All together, out of the 26 CDPs we selected only 15 SDPs and 1 supportive pSDP, besides 5 pSDPs for which no corroborating evidence was found. Hence, this further corroborates that the method suffers from poor sensitivity rather than poor specificity.
The advantage of using SDNs is not only that it allows for a substantiation of SDP prediction; it also lies in the fact that it shows co-evolving partners, allowing a more elaborate explanation and improved predictions and understanding of the complete process of functional diversification. As stated before, based on the tree topology we envisaged a single functional diversification towards, based on biochemical evidence, endosedolisin and TPP activity. Although there are many uncertainties since it seems many factors, including SDPs but also at least one of four CSIs, are involved, most results seem to converge around the conformational change of helix 9 identified in the SED_A model. The endosedolisin activity seems not only caused by the additional helix 9b, coded by CSI2, but is also related to the interaction with the chaperone-like prosegment, as shown by the apparently important interaction between the aliphatic sidechain of K346 and F89, further strengthened by SDP73. Models of virtual mutants directed at the central role of SDP307 residing in helix 9, the compensations of additional key SDPs 89, 343, 346 and 349, the predicted disulfide bridge and also the four CSIs (Fig. 4) suggest all are involved in that conformational change. Then, if we consider the conservation pattern of Trp/Phe of SDP307 in the Hypo-TPP clade, we should actually consider that position 307 is an SDP in the TPP clade. Thus, although the evolutionary processes have taken place in two clades and are therefore strictly independent, they seem to converge at a central role for SDP307. Correspondingly, W307 has been shown to be important for activity in human TPP and the same position is identified as the major difference of the clade containing 213 uncharacterized sedolisins.
Since this clade contains at least both asco- and basidiomycete sequences, it is unlikely the result of a neutral mutation and more likely the result of yet another, as yet unknown, functional diversification.
Although SDPs, mutant analysis and literature confirm each other, future validation of the proposed functional differences requires molecular dynamics and/or wet-lab experiments. Given the sheer amount of characteristics (i.e. CSIs and SDPs) involved, this will be difficult to achieve. As such, maybe a more feasible approach is to obtain a structure of SED_A or another characterized endosedolisin such as aorsin.
The remaining question concerns how this change in conformation is related to the difference in activity. TPP activity can be envisaged to require a less spacious binding cleft, as can be seen by comparing Fig. 4d and e, in concordance with the fact that SDP307 is occupied by a large hydrophobic residue in the Hypo-TPP and NF clades and by a small hydrophobic residue in Hypo-Endo.
Fungal sedolisins have undergone a process of birth and death evolution or, more exactly, a process of functional redundancy and diversification. Diversification has resulted in TPP and endopeptidase subfamilies, as was suggested by Reichard and coworkers. In the endopeptidase subfamily, a CSI seems to be involved in a conformational change, likely made possible by the predicted disulfide bridge identified in all fungal sedolisins. A network of SDPs, particularly the local network of SDPs 340–343–346–89 and the more distant SDP307, appears to be involved in the hypothesized diversification towards endo activity whereas other SDPs are linked to TPP activity. The interaction between SDP89 and SDP346 corresponds with the chaperone-like character of the large prosegment of sedolisins. Besides the corroboration of predicted SDPs, MI network analysis confirms the generally accepted idea that evolution towards two specificities is independent and shows that MI analysis, albeit powerful, should be performed with care.
We thank Matías Irazoqui for performing preliminary analyses.
FO is a Consejo Nacional de Investigaciones Científicas y Técnicas doctoral fellow, AtH is a Consejo Nacional de Investigaciones Científicas y Técnicas career investigator. Project funding is from PIP11420100100286 from the Consejo Nacional de Investigaciones Científicas y Técnicas and PICT2013–2296 from the Fondo para la Investigación Científica y Tecnológica.
Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files.
FO performed all the analyses. AtH designed the research, assisted by FO. AtH and FO wrote the manuscript. Both authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
The authors FO and AtH declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.