Abstract
Brain-Derived Neurotrophic Factor (BDNF) is an essential mediator of brain assembly, development, and maturation. BDNF has been implicated in a variety of brain disorders such as neurodevelopmental disorders (e.g., autism spectrum disorder), neuropsychiatric disorders (e.g., anxiety, depression, PTSD, and schizophrenia), and various neurodegenerative disorders (e.g., Parkinson’s, Alzheimer’s, etc.). To better understand the role of BDNF in disease, we sought to define the evolution of BDNF within Mammalia. We conducted sequence alignment and phylogenetic reconstruction of BDNF across a diverse selection of >160 mammalian species spanning ~177 million years of evolution. The selective evolutionary change was examined via several independent computational models of codon evolution including FEL (pervasive diversifying selection), MEME (episodic selection), and BGM (structural coevolution of sites within a single molecule). We report strict purifying selection in the main functional domain of BDNF (NGF domain, essentially comprising the mature BDNF protein). Additionally, we discover six sites in our homologous alignment which are under episodic selection in early regulatory regions (i.e. the prodomain) and 23 pairs of coevolving sites that are distributed across the entirety of BDNF. Coevolving BDNF sites exhibited complex spatial relationships and geometric features including triangular relations, acyclic graph networks, double-linked sites, and triple-linked sites, although the most notable pattern to emerge was that changes in the mature region of BDNF tended to coevolve along with sites in the prodomain. Thus, we propose that the discovery of both local and distal sites of coevolution likely reflects ‘evolutionary fine-tuning’ of BDNF’s underlying regulation and function in mammals. 
This tracks with the observation that BDNF’s mature domain (which encodes mature BDNF protein) is largely conserved, while the prodomain (which is linked to regulation and its own unique functionality) exhibits more pervasive and diversifying evolutionary selection. That said, the fact that negative purifying selection also occurs in BDNF’s prodomain highlights that this region contains critical sites of sensitivity, which partially explains its disease relevance (via Val66Met and other prodomain variants). Taken together, these computational evolutionary analyses provide important context as to the origins and sensitivity of genetic changes within BDNF that may help to deconvolute the role of BDNF polymorphisms in human brain disorders.
Introduction
Brain-derived neurotrophic factor (BDNF) is one of the most ubiquitously studied molecules in modern neuroscience [1]. BDNF is a neurotrophin that binds with high affinity to its cognate tyrosine kinase receptor, TrkB [2], to elicit rapid induction of synaptic plasticity [3,4,5] and neuronal spine remodeling [6, 7]. Additionally, BDNF has been implicated in a variety of brain disorders [1], including depression [8,9,10], PTSD [11,12,13,14], schizophrenia [9, 15,16,17], Parkinson’s disease [18, 19], and autism spectrum disorders [20,21,22] amongst many more. BDNF has correspondingly been the primary target, or an ancillary factor, of many novel therapeutics including small molecule mimetics [23, 24] and existing drugs (e.g., antidepressants [25, 26]). Yet, nascent research has provided the humbling reminder that much remains to be discovered about BDNF. In recent years, new BDNF ligands have been discovered [27], new receptor interactions unveiled [27, 28], and mechanisms of behavioral function unlocked [7]. This is a timely reminder that while BDNF has remained a seminal molecule of interest across the broader neuroscience literature, much remains to be discovered about its origins, evolution, function, and disease relevance.
A primer of the molecular biology of BDNF and its functional topology
BDNF is encoded by the BDNF gene [29], whose expression is regulated in humans by an antisense gene (BDNF-AS) that can form RNA-duplexes to attenuate translation [30]. Thus, the natural antisense for BDNF is capable of directly downregulating endogenous expression on demand [31]. The BDNF gene in humans comprises 11 exons [30] and can produce at least 17 detectable transcript isoforms [29]. Different transcripts are induced in response to activity and/or cellular states, allowing the BDNF gene to adjust to environmental stimuli and potential selection pressures. However, all transcripts ultimately yield a singular preproBDNF protein that (prior to intracellular processing, cleavage, and transport) can be partitioned into three domains [11, 29]: a signal peptide, a prodomain, and the mature domain. The signal peptide is only 18 amino acid residues long and its functionality is ambiguously defined; the majority of BDNF’s functional outputs reflect sequence specificity in the prodomain and mature domain. The BDNF prodomain encodes binding sites for intracellular transport of both BDNF mRNA [32] and BDNF protein [33], and contains numerous posttranslational modification sites [29]. The BDNF prodomain is also the location of a widely studied single nucleotide polymorphism (SNP) in neuroscience (Val66Met, or rs6265) [1], and of the furin consensus sequence (Arg 125) for cleavage to its mature form (including by plasmin [34]). The prodomain is composed of 110 amino acids within the N-terminus, and must be processed via proteases to generate mature BDNF [5]. The mature domain of BDNF is almost exclusively composed of the nerve growth factor (NGF) domain and is responsible for the canonical trophic actions associated with BDNF (e.g., long-term potentiation, rapid-acting antidepressant effects, etc.).
Following intracellular handling, processing, and transport, the preproBDNF isoform is cleaved to yield the mature BDNF peptide (which only contains the mature NGF domain). For many years the prodomain was thought to be degraded following the facilitation of BDNF trafficking. However, recent work has shown that the cleaved prodomain can be secreted and bind as a ligand to novel receptors (e.g., SorCS2) [27]. Thus, the BDNF prodomain can influence brain circuits as well as behavior [7]. A comprehensive, detailed analysis of the BDNF gene, its protein, and its regulation is provided in [29].
The conservation of BDNF and neurotrophins: a signal that evolution is important
One of the interesting curiosities surrounding BDNF is its relationship to other neurotrophic (NT) growth factors, comprising NGF, NT-3, and NT-4. Specifically, all neurotrophins retain some intercalated functionality. Neurotrophins also share some commonalities in structure (pre-, pro-, and mature-domains) [29], posttranslational modification potential (e.g., glycosylation [35]), as well as catalytic processing, trafficking, and composition [36]. Specifically, neurotrophins share approximately 50% sequence homology [29], and a comparison of domains and motifs reveals that each comprises a prototypic NGF domain as the principal component of the mature pro-growth peptide (see PFAM database [37]). While each neurotrophin elicits functionality via binding to cognate receptors, neurotrophins also exhibit cross-affinity amongst neurotrophin receptors [38] presumably due to their high rates of structural homology. Not surprisingly then, there is some redundancy in the trophic effects of neurotrophins, yet each still maintains nuanced functionality which remains specific to each factor during central nervous system development [39]. Differences in the evolution and temporal dynamics of regulatory sequences, which target gene products to specific destinations within cell-compartments (e.g., dendrites) [40] or to processing routes (e.g., the activity-dependent release pathway) which alter secretory dynamics and/or bioavailability [41], likely contribute to both similarities and differences between neurotrophins. However, almost nothing is known about how the BDNF prodomain has evolutionarily adapted to specifically regulate BDNF dynamics. 
While evolution has almost certainly shaped the sequences, structure, and function of BDNF, the modeling of such remains relatively unexplored but could provide important insight into the phylogenetic evolutionary history of BDNF, its selection pressure sensitivity across lineages, and quantitative metrics of evolutionary change across species.
Purpose of this Study
Here, we use computational methods to explore the molecular evolution of BDNF. To reconstruct phylogenetic trees of BDNF, we utilized sequence alignments of over 160 mammalian species (all available mammalian sequences) to determine the genomic attributes of BDNF evolution that are specific to Mammalia. Constraining the analysis to Mammalia restricts it to the sequences that have the most direct evolutionary relevance to humans. Notably, we sought to identify sites in BDNF that are subject to pervasive diversifying selection (i.e., acting consistently across the entire phylogeny; FEL) or to pervasive/episodic diversifying selection (i.e., where episodic selection acts only on a single lineage or subset of lineages; MEME). Likewise, utilizing multiple models for the inference of selective pressure and the evaluation of evolutionary change, we identify novel sites within the BDNF prodomain and mature peptide coding regions that are susceptible to synonymous and nonsynonymous changes. Additionally, we investigate which sites in BDNF may be coevolving (BGM). Taken together, these computational evolutionary analyses provide an important context as to the origins and sensitivity of genetic changes within the BDNF gene, which may be important for providing insight into genetic risk factors linked to disease in humans.
Results
We find that unique evolutionary pressures have shaped the BDNF gene across time. These forces have mostly operated through strict purifying selection. Of note, BDNF elicits tight regulation and specific functionality that can be separated from other neurotrophins, yet these growth factors remain closely related in their structure and sequence, especially in the conserved NGF domain.
Evolutionary history of mammalian BDNF
Prior to conducting our primary evolutionary analysis, we ported our mammalian species into a platform (timetree.org, see refs. [42, 43]) to examine the epoch events that may have influenced the analysis described here. This was an important pre-analysis step to frame the age of our genomes, and the broad-stroke evolutionary pressures that these species have been exposed to (which, in theory, could contribute to subsequent purifying selection and coevolution analyses). As expected, this revealed BDNF as an ancient gene that has been preserved throughout the mammalian lineage and has both survived and been shaped under all major evolutionary events of the past ~177 million years (data not shown). We identified several examples of species-level evolutionary epochs that cross-referenced with major earth events (e.g., bottleneck events) that have historically been believed to drive evolutionary adaptation. This included major geologic periods that are cross-referenced against earth impacts, oxygenation changes across time, atmospheric carbon dioxide concentrations, and solar luminosity. This indicates that even under extreme evolutionary pressures, the BDNF gene has exhibited (relatively speaking) very specific adaptation events (see results below) over millions of years within Mammalia. This tracks with the idea that “old genes” tend to be highly conserved, evolve more slowly, and therefore are more likely to exhibit both specific and selective changes as opposed to more dramatic permutations (e.g., gene duplications, etc.).
Predominant purifying selection in BDNF
A common approach to understanding the evolutionary forces that have shaped proteins is to measure the omega ratio ω, computed from the nonsynonymous (β or dN) and synonymous (α or dS) substitution rates, with ω = β/α for each site in a particular gene of interest [44]. We distinguish two types of change at each codon site: synonymous changes, which keep the same amino acid coded for at a particular site, and nonsynonymous changes, which change the amino acid coded for at a particular site. Nonsynonymous changes can have strong influences on the structural, functional, and fitness measures of an organism. This is in contrast to synonymous changes, which leave the amino acid at a particular site unchanged but can confer weak fitness effects through the emergent properties of codon usage bias, mRNA structural stability, translation, and tRNA availability. However, synonymous changes are typically understood to represent neutral selection acting on coding sequences and provide a baseline rate against which nonsynonymous evolutionary rates can be compared. The omega ratio ω of relative rates of nonsynonymous and synonymous substitutions is a common measure in evolutionary biology of the selective pressure acting on protein-coding sequences. These estimates indicate the type of selection (positive, with ω > 1; negative, with ω < 1; or neutral, with ω = 1) that has acted upon any given set of protein-coding sequences.
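The interpretation of ω described above can be illustrated with a minimal Python sketch. This is not the FEL implementation, which assesses each site statistically rather than from point estimates; the function name, the tolerance parameter, and the example rates are hypothetical and chosen only for illustration.

```python
import math

def classify_site(alpha, beta, tol=1e-9):
    """Return (omega, label) for a synonymous rate alpha and a nonsynonymous rate beta.

    Illustrative only: real analyses test significance instead of
    thresholding a point estimate of omega.
    """
    if alpha < 0 or beta < 0:
        raise ValueError("substitution rates must be non-negative")
    if alpha == 0:
        # omega is unbounded when only nonsynonymous change is observed,
        # and undefined when no change is observed at all
        omega = math.inf if beta > 0 else math.nan
    else:
        omega = beta / alpha
    if math.isnan(omega):
        return omega, "undefined"
    if abs(omega - 1.0) <= tol:
        return omega, "neutral"
    return omega, "positive" if omega > 1 else "negative"

# Hypothetical per-site rates (alpha, beta)
for alpha, beta in [(1.0, 0.2), (0.5, 1.5), (0.7, 0.7)]:
    print(alpha, beta, classify_site(alpha, beta))
```

A site with β well below α, as in the first hypothetical example, is the pattern expected under purifying selection.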
FEL analysis is a sensitive measure of negative (purifying) selection. With it, we observe predominant purifying selection (over 66% of sites; 174 of 261; Table S1) in our recombination-free alignment for BDNF. The dN/dS estimates for the entire alignment were plotted including 95% lower- and upper-bound estimates (see Fig. 1 or Table S1). Overwhelmingly, the mature NGF domain of BDNF exhibited evidence of greater pervasive negative purifying selection relative to the prodomain region of BDNF. Thus, over the evolutionary history of Mammalia, negative selection has predominantly occurred in the regions of BDNF that encode the functional mature protein that binds TrkB to elicit neurotrophic effects. The mature domain of BDNF has exhibited remarkable conservation across numerous epochs that have been defined by rapid evolutionary adaptation in other genes and taxa.
Specific sites that are evolving non-neutrally
To examine specific sites for episodic adaptive evolutionary selection, we utilized an algorithm known as MEME. MEME is fundamentally similar to FEL (described above) except that it applies a more sensitive method that detects both pervasive (persistent) selection and episodic selection (transient selection occurring on only one branch, or a subset of branches, in the phylogenetic tree), whereas FEL detects only pervasive selection, which occurs across all branches of the tree. Essentially, because selection may affect only a subset of the lineages (i.e., species), MEME provides a more granular and sensitive method of detecting selection (whereas FEL is better geared towards broad changes). This analysis revealed that only 2.3% of all sites (6 of 261; see Table 1) exhibit evidence for episodic diversifying selection (i.e., positive selection) in at least one branch within the phylogeny. Spatially, these sites occur outside of the NGF functional region of BDNF. This result is particularly relevant because MEME is a sensitive measure of episodic selection. The statistically significant alignment sites were 26, 27, 30, 38, 249, and 254. For comparison, these sites were realigned to the respective human sites, with indel (insertion/deletion) events accounting for any discrepancy in site numbers. Mapped to the human BDNF coordinate system, they correspond to sites 26, 27, 29, 36, 238, and 240, respectively.
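The conversion from alignment sites to human coordinates can be sketched as follows. This is a minimal illustration under the assumption that the human reference row of the alignment is available as a gapped string; the function name and the toy sequence are hypothetical and are not taken from the study's alignment.

```python
def alignment_to_reference(aligned_ref, gap="-"):
    """Map 1-based alignment columns to 1-based ungapped reference positions.

    Columns where the reference row has a gap (insertions present only in
    other species) have no reference coordinate and are left out of the map.
    """
    mapping = {}
    position = 0
    for column, residue in enumerate(aligned_ref, start=1):
        if residue != gap:
            position += 1
            mapping[column] = position
    return mapping

# Toy gapped reference row: column 4 is an insertion relative to the reference
toy_ref = "MTI-LFLT"
col_to_ref = alignment_to_reference(toy_ref)
print(col_to_ref[5])      # 4
print(4 in col_to_ref)    # False
```

Every gap in the reference row upstream of a site shifts its reference coordinate down by one, which is how an alignment site such as 30 can correspond to a lower human site number.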
Evidence of coevolutionary forces
To examine the coevolution of sites, i.e., whether one particular amino acid was evolving in tandem with another, we subjected our protein-coding gene sequences to the BGM algorithm, which leverages Bayesian graphical models [45]. The BGM algorithm infers substitution history through the use of maximum-likelihood analyses for ancestral sequences and maps these to the phylogenetic tree, which allows for the detection of correlated patterns of substitution [45]. For our BGM analysis, we find evidence for 23 pairs of coevolving sites. This suggests interaction dynamics in the tertiary (three-dimensional, folded) structure of the BDNF protein (see relevant sites in Fig. 2). Alternatively, coevolving sites may reflect other fitness consequences, e.g., compensation for maladaptive changes that may have occurred in another part of the protein sequence. When we review these sites, we notice that several pairs (see Fig. 2) occur at alignment sites that correspond to the human BDNF coordinate system (Table 1). These include pair-sites of (89, 184), (94, 155), (103, 233), and (135, 154). Of note, several other sites also display interesting geometric features including triangular relations [(81, 93), (93, 98), and (81, 98)], an acyclic graph network of site connections [(70, 74), (74, 94), (94, 155) and (25, 49), (49, 85), (49, 86)], more complex double-linked coevolutionary sites [(39, 103) and (103, 233)], and triple-linked coevolutionary sites [(30, 119), (33, 119), and (91, 119)]. Additionally, three-dimensional reconstruction (here focusing on a specific heterodimer configuration of BDNF and NT-4 as an example of a spatial protein–protein interaction) highlights that coevolving sites, as well as positively evolving sites, are likely to have been fine-tuned over time to help support BDNF’s cognate functionality (see Fig. 3).
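The geometric features described above can be recovered by treating coevolving pairs as edges of an undirected graph. The sketch below is an illustration only and is not part of the BGM analysis itself: it uses just the 16 distinct pairs enumerated in the text (not all 23 reported pairs), and the helper names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# The distinct coevolving site pairs enumerated in the text (16 of the 23 reported)
PAIRS = [
    (89, 184), (94, 155), (103, 233), (135, 154),
    (81, 93), (93, 98), (81, 98),
    (70, 74), (74, 94),
    (25, 49), (49, 85), (49, 86),
    (39, 103),
    (30, 119), (33, 119), (91, 119),
]

def build_graph(pairs):
    """Build an undirected adjacency map from site pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def find_triangles(graph):
    """Return sorted site triples in which every pair of sites is linked."""
    triangles = set()
    for node, neighbours in graph.items():
        for u, v in combinations(sorted(neighbours), 2):
            if v in graph[u]:
                triangles.add(tuple(sorted((node, u, v))))
    return sorted(triangles)

graph = build_graph(PAIRS)
triangles = find_triangles(graph)
# Sites linked to three or more partners
hubs = sorted(site for site, partners in graph.items() if len(partners) >= 3)

print(triangles)  # [(81, 93, 98)]
print(hubs)       # [49, 119]
```

On the listed pairs, the only triangle is the (81, 93, 98) relation, and sites 49 and 119 are the only sites with three partners each.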
Our FEL purifying sites are not mapped onto a structural configuration because of the pervasive nature of negative selection acting on BDNF within mammals.
Discussion
In this study, we explore the evolutionary history of the BDNF gene in Mammalia. The BDNF gene is implicated in a number of human diseases including a variety of brain disorders such as neurodevelopmental disorders (e.g., autism spectrum disorder), neuropsychiatric disorders (e.g., depression, PTSD, and schizophrenia), and some neurodegenerative disorders [1]. By using orthologous BDNF sequences within the Mammalia taxonomic group, our results indicate that unique site-specific changes within BDNF have evolved over time. We performed a number of comparative evolutionary analyses to tease out signals from our orthologous gene collection in BDNF. Of note, the BDNF gene elicits tight regulation and specific functionality that can be separated from other neurotrophins, yet these growth factors remain closely related in their structure and sequence and conservation of the NGF functional domain. In the NGF domain, we observe a high degree of conservation (via purifying selection) across species, owing to the functional importance of this region in protein–protein interactions. This work additionally provides broad comparative insights into the evolutionary history of the BDNF gene family. Our MEME method identified novel substitutions (see Table 1) in regions of BDNF that may provide significant areas of interest for designing molecular therapeutic approaches, and their potential broader significance are outlined in further detail below.
Predominant purifying selection across BDNF in Mammalia
Over time, evolution drives the divergence of genetic sequences, but what can we learn from the direct comparison of the sequences of the BDNF gene in Mammalia? By comparing the BDNF products of orthologous sequences in different species, we observe the accumulation of mutations at different sites with varying degrees of insight into both BDNF functionality (see [29] for site annotation) and potential disease [1]. These are summarized in full within Table S1 and Table 1. Coding sequences with highly constrained structures are expected to fix nonsynonymous mutations at a slower rate owing to the maladaptive nature of such changes, as we observe with the FEL negatively selected sites across BDNF. Additionally, we observe a high degree of negative (purifying) selection across the main functional domain (NGF) of BDNF. While structures for the NGF domain in most species under analysis do not exist, based on our findings we expect a highly conserved tertiary structure. Based on the high degree of purifying selection observed across BDNF, we hypothesize that BDNF plays a critical role in the underlying network of genes governing homeostasis and normal organismal development. This may be because BDNF is particularly useful for the phylogenetic branch in question (i.e., mammals). This interpretation is also consistent with the observation that BDNF is essential for normative development and that its loss is lethal in non-conditional full knock-out mammalian models.
Non-neutral positive diversifying evolution sites in the BDNF gene
It has been described that BDNF plays a particular role as a foundational gene for brain development [11]. Despite a significant level of purifying selection shaping the evolutionary history of BDNF (Fig. 1), we observe several novel statistically significant sites under positive episodic diversifying selection across the BDNF gene (see Table 1). Traditionally, the evolution of this variety consists of amino acid diversifying events that may promote phylogenetic adaptation and/or functionality. These results are entirely novel—they have not been previously reported (to the best of our knowledge) and MEME is an established and sensitive method for the analysis of episodic diversifying sites. Thus, the very specific and limited sites within BDNF to exhibit such patterns is a highly promising result from which to further disentangle BDNF's complex functionality and disease linkage. We would encourage biologists to consider these sites as those that may contain important adaptive functions within the BDNF gene. However, where our results fall within the context of a core protein–protein interaction network of required genes for neural cellular diversity and development is yet to be determined. We do note that at least one identified site (238) overlaps with potential posttranslational modifications to the human BDNF peptide (specifically, a disulfide bridge; see UniProt and [29]). This supports the idea that non-neutral positive diversifying sites within BDNF are not spurious and likely reflect specialized, regulatory, or functional capacities that may have yet to be annotated in full. Given that this manuscript is devoted to the analysis of BDNF’s evolution in mammals, we highlight the potential importance of these sites but emphasize that their importance remains a hypothesis that should be tested in well-defined experiments under controlled laboratory conditions.
Discovery of proximal and distal coevolving site-pairs in the BDNF gene
Another novel, and potentially important, series of findings in this manuscript was the presence of numerous sites that exhibit coevolution. In fact, we observe a significant number of coevolving sites within the BDNF gene (see Fig. 2 and Table 2), and these too reflect an entirely novel aspect of BDNF biology that has not previously been reported. Evidence of coevolving sites is not limited to a particular domain (e.g., prodomain vs mature) nor to specific motifs. Instead, coevolving pairs seem to be distributed across the entirety of BDNF with, perhaps unsurprisingly, an increased density of interactions early in the prodomain region. However, we also note that there are coevolving sites in the mature NGF domain which are “linked” to early domain sites. Importantly, these relationships may confer strong epistatic interactions shaping the continued evolution of this critically important gene. The new evidence for coevolution may point to the importance of these sites in shaping the early regulatory or main functional (NGF) domain of BDNF. These residues may form important interactions for the functional integrity of BDNF and, importantly, the highly specific pairs which span the BDNF prodomain and its mature region point to a new mechanism by which the BDNF prodomain may have coregulated the mature domain (or vice versa). Alternatively, these coevolving pairs may be part of a network of residues occupying a shifted fitness landscape in order to accommodate new or species-specific functional requirements.
Potential structural implications of evolving sites
In considering our observation of both diversifying selection and coevolving sites in the BDNF gene, we considered the potential implications this may have at a protein structural level in three-dimensional space (see Fig. 3). While protein structural impacts from evolution remain poorly understood and cannot be completely experimentally disentangled in a confirmatory sense, the implications fall upon our understanding of basic BDNF neurobiology. Here we note that our BGM and FEL analyses implicate the prodomain—the primary topological region of BDNF known for polymorphic variability (e.g., Val66Met and Gln75His) that is often linked to disease [1, 11, 29], and our 3D modeling suggests that two of our coevolving sites appear to be associated with looping structures that could have important yet to be discovered functionality. In this regard, we predict that the evolutionary changes described here are likely to reflect some form of specialization and/or divergence in function and/or interaction partners at different points of BDNF’s evolutionary history in mammals. Thus, further work may unveil yet more novel sites that could provide further insight into the origins of BDNF’s diverse functionality and its role in disease.
Limitations of our computational evolutionary analysis
This analysis focused on BDNF sequences contained in the taxonomic group Mammalia in lieu of examining a more inclusive dataset for BDNF containing sequences from all of Gnathostomata (jawed vertebrates) or extension into invertebrate clades which may contain BDNF or BDNF-like analog genes. Our results are applicable to mammals, which are our intentional taxonomic group of study, but we nonetheless emphasize that our results do not capture the entirety of BDNF’s evolutionary history (e.g., there could be more to learn about BDNF from birds, lizards, fish, and higher-order taxonomic groups which we do not evaluate here). In addition, we do not explore the patterns or mutational processes occurring outside of coding-sequence evolution which include complex structure and dynamics of non-coding regions in the BDNF gene. Therefore, evolutionary temporality is important in the context and interpretation of our results because Mammalia represents only a portion of the long evolutionary history of BDNF. Although we failed to find evidence for recombination in our dataset, species where we may find evidence for recombination may have been precluded from our analysis due to our decision to focus on mammalian BDNF evolution. Further, a limitation of the current analysis is owed to the presence of indel events, especially in the early region of the alignment but which also occur in other spatially distributed regions of the BDNF gene. These indel events are not currently modeled in existing codon substitution models but may represent an additional pathway of evolutionary change. Nonetheless, the prominence of indels in our observations indicates that several regions of BDNF may evolve significantly through indel events across species. 
Lastly, although there is a risk that the “gappy” nature of the early region of our multiple sequence alignment may be a computational artifact of the alignment procedure, based on all other outputs we believe that our results are reasonably interpreted and have subsequently tolerated these potential effects.
Future directions: understanding the remainder of the neurotrophin family
We hypothesize that the similarities between neurotrophins reflect conserved evolutionary selection for motifs and domains which support common functionality in neurotrophic factors between sites and lineages. While we note significant isotropy in mature peptide sequences for these factors, anisotropic pressures likely influenced the prodomain sequences of neurotrophins leading to alterations in processing, trafficking, regulation, and secretion. As such, we also predict differences in the evolutionary fate of other neurotrophins which also exhibit compartmentalized functionality due to similar alterations within their prodomains (i.e., similar results may be reasonably anticipated in NGF, NT-3, and NT-4).
Conclusion
To sum up, our research modeled the natural evolutionary history of changes in BDNF across >160 mammalian genomes. Conservatively, this analysis spans ~177 million years of evolution—and going deeper could yet reveal more information on the ontogenesis of BDNF and its topological structure (and, consequently, function). Notably, we observed strict purifying selection in the main functional domain of the BDNF gene in mammals and discovered 6 specific sites in our homologous alignment that are under episodic selection in the early regulatory region of BDNF (i.e., the prodomain) and in the terminal region of BDNF. We also make the case for spatial coevolution within this gene, with 23 pair-sites that have evolved together. In sum, these data go above and beyond the common trope that “BDNF is highly conserved” by defining exactly where and how the mammalian BDNF has evolved. Thus, we confirm the widespread belief that the BDNF prodomain is more prone to change than the mature BDNF protein, having important implications for how we think about and consider genetic variation in BDNF and its linkage to disease.
Methods
Data retrieval
For this study, we queried the NCBI Ortholog database via https://www.ncbi.nlm.nih.gov/kis/ortholog/627/?scope=7776. As we are interested in mammalian BDNF evolution, we limited our search to species from this taxonomic group (mammals, Mammalia). This returned 162 full gene transcripts and protein sequences. We downloaded all available files: RefSeq protein sequences, RefSeq transcript sequences, and tabular data (CSV, metadata). Table 3 lists the species included in this analysis together with their NCBI accessions; the same information is available for download on GitHub:
- AnalysisOfOrthologousCollections/BDNF_orthologs.csv at main · aglucaci/AnalysisOfOrthologousCollections · GitHub
Data cleaning
We used the protein sequences and full gene transcripts to derive coding sequences (CDS) via a custom script (scripts/codons.py). However, this process produced errors for 20 “PREDICTED” protein sequences that contained invalid characters, such as an incorrect “X” or other unresolved amino acids, and these sequences were subsequently excluded from the analysis. This step removes low-quality protein sequences, which may otherwise inflate rates of nonsynonymous change.
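The general idea of deriving a CDS from a transcript and its protein can be sketched as below. This is a hypothetical illustration and not the authors' scripts/codons.py, whose contents are not shown here: it assumes the standard genetic code, scans the transcript for a window whose translation matches the protein exactly, and rejects proteins containing an unresolved “X”. All function names are assumptions.

```python
from itertools import product

# Standard genetic code, with bases in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(nt):
    """Translate an in-frame nucleotide string; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3)
    )

def extract_cds(transcript, protein):
    """Return the transcript window that encodes `protein`, or None.

    Proteins that are empty or contain unresolved residues ('X') are
    rejected, mirroring the exclusion of low-quality sequences.
    """
    transcript = transcript.upper().replace("U", "T")
    protein = protein.upper().rstrip("*")
    if not protein or "X" in protein:
        return None
    length = 3 * len(protein)
    for start in range(len(transcript) - length + 1):
        window = transcript[start:start + length]
        if translate(window) == protein:
            return window
    return None

# Toy transcript: 5' flank + coding region + stop codon and 3' flank
toy_transcript = "GGATGACCATCCTTTAAGG"
print(extract_cds(toy_transcript, "MTIL"))  # ATGACCATCCTT
print(extract_cds(toy_transcript, "MXIL"))  # None
```

A sequence for which no matching window exists returns None and would be dropped, which is one plausible way a "no match" filter could behave.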
Analysis of orthologous collections (AOC): alignment, recombination detection, tree inference, and selection algorithms
The analysis of orthologous collections (AOC) application is designed for comprehensive protein-coding molecular sequence analysis (https://github.com/aglucaci/AnalysisOfOrthologousCollections). It accomplishes this through a series of comparative evolutionary methods. AOC allows for the inclusion of recombination detection; recombination is a powerful force in shaping gene evolution and affects the interpretation of analytic results. AOC also allows for lineage assignment and annotation, which enables between-group comparisons of selective pressures. The application currently accepts two input files: an unaligned protein sequence fasta file and an unaligned transcript sequence fasta file for the same gene. Typically, these can be retrieved from public databases such as NCBI Orthologs, although other methods of data compilation are also acceptable. In addition, the application is easily modifiable to accept a single CDS input, if that data is available.
If protein and transcript files are provided, a custom script (scripts/codons.py) is executed and returns coding sequences where available. Note that this script is currently set to use the standard genetic code; it will need to be modified for alternate codon tables. The script also removes “low-quality” sequences for which no match is found (see the Data cleaning section above).
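The idea of recovering a CDS from a transcript and its protein can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the contents of scripts/codons.py: the names `translate` and `find_cds` are hypothetical, the standard genetic code is hard-coded, and the search simply scans every start position for a window whose translation equals the protein.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate an in-frame nucleotide string; unknown codons become '?'."""
    codons = (nucleotides[i:i + 3] for i in range(0, len(nucleotides) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "?") for codon in codons)


def find_cds(transcript, protein):
    """Return the first transcript window that translates to `protein`.

    Returns None when no window matches, which is the case in which a
    sequence would be treated as low quality and dropped.
    """
    transcript = transcript.upper().replace("U", "T")
    protein = protein.upper().rstrip("*")
    if not protein:
        return None
    cds_length = 3 * len(protein)
    for start in range(len(transcript) - cds_length + 1):
        candidate = transcript[start:start + cds_length]
        if translate(candidate) == protein:
            return candidate
    return None
```

Supporting an alternate codon table in this sketch would only require swapping the `AMINO_ACIDS` string for the corresponding translation table.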
Step 1. Alignment. We used the HyPhy [46] codon-aware multiple sequence alignment procedure (https://github.com/veg/hyphy-analyses/tree/master/codon-msa). The alignment was reference-based, using the human BDNF coding sequence from NM_001709.5 (Homo sapiens brain-derived neurotrophic factor (BDNF), transcript variant 4, mRNA) as the reference. Our alignment procedure retained 126 unique in-frame sequences.
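One plausible reading of “unique in-frame sequences” is sketched below. It is an assumption about the filtering criteria rather than a description of the codon-msa procedure itself: the hypothetical function `unique_in_frame` keeps a sequence only if its length is a multiple of three and no identical sequence has already been kept.

```python
def unique_in_frame(records):
    """Keep the first record for each distinct in-frame sequence.

    records: dict of {name: nucleotide sequence}.
    Assumption: "in-frame" means the length is a multiple of three, and
    "unique" means no identical sequence was retained earlier.
    """
    seen = set()
    kept = {}
    for name, sequence in records.items():
        sequence = sequence.upper()
        if len(sequence) % 3 != 0:
            continue  # not a whole number of codons
        if sequence in seen:
            continue  # exact duplicate of a sequence already kept
        seen.add(sequence)
        kept[name] = sequence
    return kept
```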
Step 2. Recombination detection. This was performed manually via RDP v5 [47]; see the “Recombination detection” section below for additional details. A recombination-free file is placed in the following folder: results/BDNF/Recombinants. We did not detect recombination in our dataset.
Step 3. Tree inference and selection analyses. For the recombination-free fasta file, we perform maximum-likelihood phylogenetic inference via IQ-TREE [48]. Next, the recombination-free alignment and the unrooted phylogenetic tree are evaluated through a standard suite of molecular evolutionary methods. This set of selection analyses includes the methods listed below. For the sake of brevity, some of these results are not shown (most were not statistically significant or not relevant to the evolutionary results presented here).
-
FEL: locates codon sites with evidence of pervasive positive diversifying or negative selection [44].
-
BUSTEDS: tests for gene-wide episodic selection [49].
-
MEME: locates codon sites with evidence of episodic positive diversifying selection [50].
-
aBSREL: tests whether positive selection has occurred on a proportion of branches [51].
-
SLAC: performs substitution mapping [44].
-
BGM: identifies groups of sites that are apparently coevolving [45].
-
RELAX: compares gene-wide selection pressure between the query clade and background sequences [52].
-
CFEL (Contrast-FEL): compares site-by-site selection pressure between query and background sequences [53].
-
FMM: examines model fit by permitting multiple instantaneous substitutions [54].
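A suite like the one listed above is typically run by invoking the HyPhy command line once per method on the same alignment and tree. The sketch below only builds the command lines and is not the AOC pipeline itself: the function name `hyphy_commands`, the file names, and the output layout are hypothetical, and the `--alignment`, `--tree`, and `--output` arguments are assumed to be accepted by every method keyword passed in, which may vary between methods and HyPhy versions.

```python
def hyphy_commands(alignment, tree, methods, output_dir="results"):
    """Build one HyPhy command line per selection method.

    alignment, tree: paths to the recombination-free alignment and the
        unrooted tree (hypothetical file names in the example below).
    methods: HyPhy analysis keywords, e.g. ["fel", "meme", "slac"].
    Returns a list of argument lists suitable for subprocess.run.
    """
    commands = []
    for method in methods:
        output = f"{output_dir}/{method.upper()}.json"
        commands.append([
            "hyphy", method,
            "--alignment", alignment,
            "--tree", tree,
            "--output", output,
        ])
    return commands


# Example: inspect the commands before running them.
for command in hyphy_commands("BDNF.fasta", "BDNF.treefile", ["fel", "meme"]):
    print(" ".join(command))
```

Each argument list can then be passed to `subprocess.run(command, check=True)` on a machine where HyPhy is installed. Methods that need extra options, such as branch labels for RELAX, would require additional arguments that this sketch does not cover.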
Step 4A. Lineage assignment and tree annotation. For the unrooted phylogenetic tree, we perform lineage discovery via NCBI and the Python package ete3 toolkit, assigning lineages to K taxonomic groups (by default, K = 20). Here, the aim is to have a broad representation of taxonomic groups, rather than the species being heavily clustered into a single group. As a reasonable approximation, we aim for <40% of species to be assigned to any one particular taxonomic group.
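The <40% balance criterion can be illustrated with a small sketch. This is not the AOC implementation and it does not query NCBI or ete3: it assumes each species already has a taxonomic path ordered from broadest to narrowest group, and the names `largest_group_fraction` and `choose_rank` are hypothetical. The parameter K is not modeled here.

```python
from collections import Counter


def largest_group_fraction(assignments):
    """Fraction of species that fall into the most populated group."""
    counts = Counter(assignments.values())
    return max(counts.values()) / len(assignments)


def choose_rank(lineages, max_fraction=0.40):
    """Pick the broadest rank at which no group reaches `max_fraction`.

    lineages: {species: [taxon at rank 0 (broadest), taxon at rank 1, ...]}
        Each path is assumed to be non-empty; shorter paths reuse their
        last (narrowest) taxon at deeper ranks.
    Returns (rank, {species: taxon}) or None if no rank is balanced.
    """
    if not lineages:
        return None
    depth = max(len(path) for path in lineages.values())
    for rank in range(depth):
        assignments = {
            species: path[min(rank, len(path) - 1)]
            for species, path in lineages.items()
        }
        if largest_group_fraction(assignments) < max_fraction:
            return rank, assignments
    return None
```

With all species labeled only as Mammalia the single group holds 100% of the species, so the search moves to a narrower rank until the largest group falls below the threshold.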
Step 4B. We perform tree labeling via the hyphy-analyses/Label-Trees (REF, link) method, resulting in one annotated tree per lineage designation. For the purpose of this study, we consider only the following five lineages for additional analyses (Artiodactyla, Carnivora, Chiroptera, Glires, Primates), as they are the most populated lineages.
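HyPhy reads branch labels written in curly braces after a node name in the Newick string. The sketch below shows that annotation format only; it is a simplified stand-in for the Label-Trees method, not a reimplementation. The function `label_tips` is hypothetical and labels tip names only, whereas a complete labeling would also need a rule for internal branches.

```python
import re


def label_tips(newick, tip_names, label):
    """Append "{label}" to each listed tip name in a Newick string.

    A tip name is matched only when it directly follows "(" or "," and is
    directly followed by ":", "," or ")", so that a name which is a prefix
    of another name is not labeled by mistake.
    """
    labeled = newick
    for name in tip_names:
        pattern = r"(?<=[(,])" + re.escape(name) + r"(?=[:,)])"
        labeled = re.sub(
            pattern,
            lambda match: match.group(0) + "{" + label + "}",
            labeled,
        )
    return labeled


tree = "((Homo_sapiens:0.1,Pan_troglodytes:0.1):0.2,Mus_musculus:0.3);"
print(label_tips(tree, ["Homo_sapiens", "Pan_troglodytes"], "Primates"))
```

Running the function once per lineage yields one annotated tree per lineage designation, mirroring the output described in this step.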
Step 5. Selection analyses on lineages. Here, the recombination-free fasta file and the set of annotated phylogenetic trees (where labeling was performed in Step 4B) are provided for analysis with the RELAX and Contrast-FEL methods.
Recombination detection
Recombination was tested manually via RDP v5.5 with modified settings, as follows:
-
We included the following algorithms/analyses: RDP [55], GENECONV [56], Chimaera [57], MaxChi [58], BootScan [59] (Primary and Secondary Scan), SiScan [60] (Primary and Secondary Scan), and 3Seq [61].
-
Recombination events are “accepted” in cases where three or more methods are in agreement.
-
We slightly modified the default parameters, enabling the following options:
-
Require topological evidence.
-
Polish breakpoints.
-
Check alignment consistency.
-
Sequences are linear.
-
List events detected by >2 methods.
-
We manually recheck all of the events via “Recheck all identified events with all methods”.
-
We manually accept events detected by >2 methods.
-
The resulting alignment was saved as a distributed alignment (with recombinant regions separated).
Recombination was not detected within our human reference-based alignment. Therefore, we used the single recombination-free alignment for all analyses.
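The acceptance rule above (an event counts only when three or more methods agree) can be expressed as a short filter. This is an illustrative sketch that operates on a hand-built mapping and not on RDP output files; the function name `accepted_events` and the event identifiers are hypothetical.

```python
def accepted_events(events, min_methods=3):
    """Keep recombination events supported by at least `min_methods` methods.

    events: {event_id: iterable of method names that detected the event}
    Returns {event_id: sorted list of distinct supporting methods}.
    """
    accepted = {}
    for event_id, methods in events.items():
        distinct = set(methods)
        if len(distinct) >= min_methods:
            accepted[event_id] = sorted(distinct)
    return accepted


# Hypothetical example: only event 1 has support from three methods.
candidates = {
    1: ["RDP", "GENECONV", "MaxChi"],
    2: ["BootScan", "SiScan"],
    3: ["3Seq"],
}
print(accepted_events(candidates))
```

An empty result corresponds to the outcome reported here, in which no recombination event met the threshold.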
Data availability
The AOC application is freely available via a dedicated GitHub repository at: https://github.com/aglucaci/AnalysisOfOrthologousCollections. Raw data for this study are available on GitHub: https://github.com/aglucaci/AnalysisOfOrthologousCollections/tree/main/data/BDNF. Full results for this study, including all HyPhy selection analyses as JSON-formatted result files, are available on GitHub: https://github.com/aglucaci/AnalysisOfOrthologousCollections/tree/main/results/BDNF.
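For readers who want to inspect the JSON result files, the sketch below shows one way to summarize site-level calls from a FEL result that has already been loaded with `json.load`. The layout it relies on (an "MLE" object with "headers" and a "content" entry keyed by partition "0", and columns named "alpha", "beta", and "p-value") is an assumption about the HyPhy FEL output and should be checked against the actual files. The function name `classify_fel_sites` and the 0.1 threshold are likewise illustrative and are not taken from this study.

```python
def classify_fel_sites(result, p_threshold=0.1):
    """Classify codon sites from a FEL-style result dictionary.

    Assumed layout: result["MLE"]["headers"] is a list of [name, description]
    pairs and result["MLE"]["content"]["0"] is a list of per-site rows.
    Returns {1-based site index: "positive" or "negative"} for sites whose
    p-value is at or below `p_threshold`.
    """
    headers = [entry[0] for entry in result["MLE"]["headers"]]
    alpha_col = headers.index("alpha")    # synonymous rate
    beta_col = headers.index("beta")      # nonsynonymous rate
    p_col = headers.index("p-value")
    calls = {}
    for site, row in enumerate(result["MLE"]["content"]["0"], start=1):
        if row[p_col] > p_threshold:
            continue
        if row[beta_col] > row[alpha_col]:
            calls[site] = "positive"
        elif row[beta_col] < row[alpha_col]:
            calls[site] = "negative"
    return calls
```

A result file would be loaded with `json.load(open(path))` before being passed to the function; the path to use depends on the file names in the results folder.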
References
Notaras M, Hill R, van den Buuse M. The BDNF gene Val66Met polymorphism as a modifier of psychiatric disorder susceptibility: progress and controversy. Mol Psychiatry. 2015;20:916–30.
Nagappan G, Lu B. Activity-dependent modulation of the BDNF receptor TrkB: mechanisms and implications. Trends Neurosci. 2005;28:464–71.
Black IB. Trophic regulation of synaptic plasticity. J Neurobiol. 1999;41:108–18.
Yoshii A, Constantine‐Paton M. Postsynaptic BDNF‐TrkB signaling in synapse maturation, plasticity, and disease. Dev Neurobiol. 2010;70:304–22.
Sakuragi S, Tominaga-Yoshino K, Ogura A. Involvement of TrkB-and p75 NTR-signaling pathways in two contrasting forms of long-lasting synaptic plasticity. Sci Rep. 2013;3:1–7.
Horch HW, Kruttgen A, Portbury S, Katz LC. Destabilization of cortical dendrites and spines by BDNF. Neuron. 1999;23:353–64.
Giza JI, Kim J, Meyer H, Anastasia A, Dincheva I, Zheng C, et al. The BDNF Val66Met prodomain disassembles dendritic spines altering fear extinction circuitry and behavior. Neuron. 2018;99:163–78.
Martinowich K, Manji H, Lu B. New insights into BDNF function in depression and anxiety. Nat Neurosci. 2007;10:1089–93.
Angelucci F, Brene S, Mathe A. BDNF in schizophrenia, depression and corresponding animal models. Mol Psychiatry. 2005;10:345–52.
Kim YK, Lee H, Won S, Park E, Lee H, Lee B, et al. Low plasma BDNF is associated with suicidal behavior in major depression. Prog Neuropsychopharmacol Biol Psychiatry. 2007;31:78–85.
Notaras M, van den Buuse M. Neurobiology of BDNF in fear memory, sensitivity to stress, and stress-related disorders. Mol Psychiatry. 2020;25:2251–74.
Pivac N, Kozaric-Kovacic D, Grubisic-Ilic M, Nedic G, Rakos I, Nikolac M, et al. The association between brain-derived neurotrophic factor Val66Met variants and psychotic symptoms in posttraumatic stress disorder. World J Biol Psychiatry. 2012;13:306–11.
Pitts BL, Whealin J, Harpaz-Rotem I, Duman S, Krystal J, Southwick S, et al. BDNF Val66Met polymorphism and posttraumatic stress symptoms in US military veterans: Protective effect of physical exercise. Psychoneuroendocrinol. 2019;100:198–202.
Zhang L, Benedek D, Fullerton C, Forsten R, Naifeh J, Li X, et al. PTSD risk is associated with BDNF Val66Met and BDNF overexpression. Mol Psychiatry. 2014;19:8–10.
Notaras M, Hill R, van den Buuse M. A role for the BDNF gene Val66Met polymorphism in schizophrenia? A comprehensive review. Neurosci Biobehav Rev. 2015;51:15–30.
Gratacòs M, Gonzalez J, Mercader J, de Cid R, Urretavizcaya M, Estivill X. Brain-derived neurotrophic factor Val66Met and psychiatric disorders: meta-analysis of case-control studies confirm association to substance-related disorders, eating disorders, and schizophrenia. Biol Psychiatry. 2007;61:911–22.
Zakharyan R, Boyajyan A, Arakelyan A, Gevorgyan A, Mrazek F, Petrek M. Functional variants of the genes involved in neurodevelopment and susceptibility to schizophrenia in an Armenian population. Hum Immunol. 2011;72:746–8.
Howells D, Porritt M, Wong J, Batchelor P, Kalnins R, Hughes A, Donnan J, et al. Reduced BDNF mRNA expression in the Parkinson’s disease substantia nigra. Exp Neurol. 2000;166:127–35.
Palasz E, Wysocka A, Gasiorowska A, Chalimoniuk M, Niewiadomski W, Niewiadomski G. BDNF as a promising therapeutic agent in Parkinson’s disease. Int J Mol Sci. 2020;21:1170.
Correia C, Coutinho A, Sequeira A, Sousa I, Venda L, Almeida J, et al. Increased BDNF levels and NTRK2 gene association suggest a disruption of BDNF/TrkB signaling in autism. Genes Brain Behav. 2010;9:841–8.
Ricci S, Businaro R, Ippoliti F, Lo Vasco V, Massoni F, Onofri E, et al. Altered cytokine and BDNF levels in autism spectrum disorder. Neurotox Res. 2013;24:491–501.
Tsai S-J. Is autism caused by early hyperactivity of brain-derived neurotrophic factor? Med Hypotheses. 2005;65:79–82.
Massa SM, Yang T, Xie Y, Shi J, Bilgen M, Joyce J, et al. Small molecule BDNF mimetics activate TrkB signaling and prevent neuronal degeneration in rodents. J Clin Investig. 2010;120:1774–85.
Kingwell K. BDNF copycats. Nat Rev Drug Discov. 2010;9:433.
Chen B, Dowlatshahi D, MacQueen G, Wang J, Young L. Increased hippocampal BDNF immunoreactivity in subjects treated with antidepressant medication. Biol Psychiatry. 2001;50:260–5.
Björkholm C, Monteggia L. BDNF–a key transducer of antidepressant effects. Neuropharmacol. 2016;102:72–79.
Anastasia A, Deinhardt K, Chao M, Will N, Irmady K, Lee F, et al. Val66Met polymorphism of BDNF alters prodomain structure to induce neuronal growth cone retraction. Nat Comm. 2013;4:1–13.
Glerup S, Bolcho U, Molgaard S, Boggild S, Vaegter C, Smith A, et al. SorCS2 is required for BDNF-dependent plasticity in the hippocampus. Mol Psychiatry. 2016;21:1740–51.
Notaras M, van den Buuse M. Brain-derived neurotrophic factor (BDNF): novel insights into regulation and genetic variation. Neuroscientist. 2019;25:434–54.
Pruunsild P, Kazantseva A, Aid T, Palm K, Timmusk T. Dissecting the human BDNF locus: bidirectional transcription, complex splicing, and multiple promoters. Genomics. 2007;90:397–406.
Lipovich L, Dachet F, Cai J, Bagla S, Balan K, Jia H, et al. Activity-dependent human brain coding/noncoding gene regulatory networks. Genetics. 2012;192:1133–48.
Chiaruttini C, Vicario A, Li Z, Baj G, Braiuca P, Wu Y, et al. Dendritic trafficking of BDNF mRNA is mediated by translin and blocked by the G196A (Val66Met) mutation. Proc Nat Acad Sci USA. 2009;106:16481–6.
Chen Z-Y, Ieraci A, Teng H, Dall H, Meng C, Herrera D, Nykjaer A, et al. Sortilin controls intracellular sorting of brain-derived neurotrophic factor to the regulated secretory pathway. J Neurosci. 2005;25:6156–66.
Gray K, Ellis V. Activation of pro-BDNF by the pericellular serine protease plasmin. FEBS Lett. 2008;582:907–10.
del Carmen Cardenas-Aguayo M, Kazim S, Grundke-Iqbal I, Iqbal K. Neurogenic and neurotrophic effects of BDNF peptides in mouse hippocampal primary neuronal cell cultures. PLoS ONE. 2013;8:e53596.
Maisonpierre PC, Le Beau M, Espinosa R, Ip N, Belluscio L, de la Monte S, et al. Human and rat brain-derived neurotrophic factor and neurotrophin-3: gene structures, distributions, and chromosomal localizations. Genomics. 1991;10:558–68.
Finn RD, Coggill P, Eberhardt R, Eddy S, Mistry J, Mitchell A, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44(D1):D279–85.
Rodriguez-Tebar A, Dechant G, Barde Y. Binding of brain-derived neurotrophic factor to the nerve growth factor receptor. Neuron. 1990;4:487–92.
Castellani V, Bolz J. Opposing roles for neurotrophin-3 in targeting and collateral formation of distinct sets of developing cortical neurons. Development. 1999;126:3335–45.
Kuczewski N, Porcher C, Lessmann V, Medina I, Gaiarsa JL. Activity-dependent dendritic release of BDNF and biological consequences. Mol Neurobiol. 2009;39:37–49.
Chen Z-Y, Patel P, Sant G, Meng C, Teng K, Hempstead B, et al. Variant brain-derived neurotrophic factor (BDNF)(Met66) alters the intracellular trafficking and activity-dependent secretion of wild-type BDNF in neurosecretory cells and cortical neurons. J Neurosci. 2004;24:4401–11.
Kumar S, Stecher G, Suleski M, Hedges SB. TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol. 2017;34:1812–9.
Hedges SB, Dudley J, Kumar S. TimeTree: a public knowledge-base of divergence times among organisms. Bioinformatics. 2006;22:2971–2.
Kosakovsky Pond SL, Frost SDW. Not so different after all: a comparison of methods for detecting amino acid sites under selection. Mol Biol Evol. 2005;22:1208–22.
Poon AFY, Lewis FI, Frost SDW, Kosakovsky Pond SL. Spidermonkey: rapid detection of co-evolving sites using Bayesian graphical models. Bioinformatics. 2008;24:1949–50.
Kosakovsky Pond SL, Poon AFY, Velazquez R, Weaver S, Hepler NL, Murrell B, et al. HyPhy 2.5—A customizable platform for evolutionary hypothesis testing using phylogenies. Mol Biol Evol. 2020;37:295–9.
Martin DP, Murrell B, Golden M, Khoosal A, Muhire B. RDP4: detection and analysis of recombination patterns in virus genomes. Virus Evol. 2015;1:vev003.
Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020;37:1530–4.
Wisotsky SR, Kosakovsky Pond SL, Shank SD, Muse SV. Synonymous site-to-site substitution rate variation dramatically inflates false positive rates of selection analyses: ignore at your own peril. Mol Biol Evol. 2020;37:2430–9.
Murrell B, Wertheim JO, Moola S, Weighill T, Scheffler K, Kosakovsky Pond SL. Detecting individual sites subject to episodic diversifying selection. PLoS Genet. 2012;8:e1002764.
Smith MD, Wertheim JO, Weaver S, Murrell B, Scheffler K, Kosakovsky Pond SL. Less is more: an adaptive branch-site random effects model for efficient detection of episodic diversifying selection. Mol Biol Evol. 2015;32:1342–53.
Wertheim JO, Murrell B, Smith MD, Kosakovsky Pond SL, Scheffler K. RELAX: detecting relaxed selection in a phylogenetic framework. Mol Biol Evol. 2015;32:820–32.
Kosakovsky Pond SL, Wisotsky SR, Escalante A, Magalis BR, Weaver S. Contrast-FEL-A test for differences in selective pressures at individual sites among clades and sets of branches. Mol Biol Evol. 2021;38:1184–98.
Lucaci AG, Wisotsky SR, Shank SD, Weaver S, Kosakovsky Pond SL. Extra base hits: widespread empirical support for instantaneous multiple-nucleotide changes. PLoS ONE. 2021;16:e0248337.
Martin D, Rybicki E. RDP: detection of recombination amongst aligned sequences. Bioinformatics. 2000;16:562–3.
Padidam M, Sawyer S, Fauquet CM. Possible emergence of new geminiviruses by frequent recombination. Virology. 1999;265:218–25.
Posada D, Crandall KA. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci USA. 2001;98:13757–62.
Maynard Smith J. Analyzing the mosaic structure of genes. J Mol Evol. 1992;34:126–9.
Martin DP, Posada D, Crandall KA, Williamson C. A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints. AIDS Res Hum Retroviruses. 2005;21:98–102.
Gibbs MJ, Armstrong JS, Gibbs AJ. Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics. 2000;16:573–82.
Lam HM, Ratmann O, Boni MF. Improved algorithmic complexity for the 3SEQ recombination detection algorithm. Mol Biol Evol. 2018;35:247–51.
Author information
Contributions
M.J.N. and A.G.L. contributed equally, with final co-authorship order determined via a competitive thumb wrestle tournament. M.J.N. was supported by an NHMRC CJ Martin Fellowship for stem cell training at the Center for Neurogenetics located at Weill Cornell Medical College of Cornell University.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lucaci, A.G., Notaras, M.J., Kosakovsky Pond, S.L. et al. The evolution of BDNF is defined by strict purifying selection and prodomain spatial coevolution, but what does it mean for human brain disease?. Transl Psychiatry 12, 258 (2022). https://doi.org/10.1038/s41398-022-02021-w