INTRODUCTION. PARKINSON’S DISEASE: CLINICAL PICTURE AND PATHOGENESIS

Parkinson’s disease (PD) is a steadily progressive neurodegenerative disease characterized by a tetrad of motor symptoms: resting tremor (primarily of the hands), muscle rigidity, bradykinesia, and postural instability. Among neurodegenerative diseases, PD is second in frequency only to Alzheimer’s disease; among people over 80 years of age, symptoms of parkinsonism are observed in at least 2% of the population worldwide [1]. It is customary to distinguish between familial and sporadic (idiopathic) forms of PD, even though these forms are virtually indistinguishable phenotypically. Familial forms account for 5 to 10% of all PD cases, although this number might be an underestimate: taking into account some features of inheritance, the proportion of familial PD forms can reach 20%.

The development of motor disorders in PD is due to the selective death of dopaminergic neurons in the substantia nigra pars compacta and a significant decrease in the level of dopamine in the striatum, which disrupts the normal functioning of the basal ganglia and impairs the control of motor behavior. In most cases, neuronal death is preceded by the appearance of the so-called Lewy neurites and Lewy bodies, cytoplasmic inclusions consisting mostly of fibrillar alpha-synuclein, ubiquitin, microtubule-associated tau protein, and some other proteins. Lewy bodies and Lewy neurites are also found in other brain structures (dorsal motor nucleus, thalamus, amygdala, raphe nucleus, olfactory bulbs, cerebral cortex); in this regard, PD should not be considered a disease limited to only one type of neurons and a few brain structures [2, 3]. Lewy body-type inclusions have also been found in the peripheral nervous system, e.g., in the neurons of the intestinal submucosa. It is likely that these inclusions can form at an early stage of the pathological process in various parts of the nervous system and gradually spread to other regions of the brain, thus underlying the steady progression of the disease. Notably, at its initial stages, this process does not affect the dopaminergic system of the substantia nigra and striatum or impair motor behavior. The prodromal stage of the disease is associated with sleep disturbances, olfactory impairments, and dysfunction of the intestine and genitourinary system; however, these changes are not specific and cannot serve as criteria for diagnosing PD. Motor disorders are observed when a significant part (at least 50%) of dopaminergic neurons in the substantia nigra have died and dopamine levels in the striatum have decreased by 70-80% [4].

The main approaches to PD therapy, especially those using the dopamine precursor levodopa, aim to compensate for the decrease in the dopamine level in the striatum. At the early stages of treatment, dopamine agonists and inhibitors of monoamine oxidase B and catechol-O-methyltransferase can be used. At the later stages, the side effects of levodopa are managed with certain drugs, such as amantadine (dyskinesia) or apomorphine (development of obsessive-compulsive disorders). In any case, the therapy is symptomatic and does not address the causes of neurodegeneration [5].

MONOGENIC FORMS OF PARKINSON’S DISEASE

The exact genetic nature of the familial form of PD was first demonstrated in 1997, when the p.Ala53Thr missense mutation in the alpha-synuclein gene (SNCA) was detected in a large Italian family with autosomal dominant inheritance of PD over four generations, as well as in three unrelated Greek families, in which the disease was observed in two and three generations [6]. Later, several other pathogenetically significant missense mutations in SNCA were identified (p.Ala30Pro, p.Glu46Lys, p.Gly51Asp, p.Ala53Glu). It was also shown that mutations that change the gene dosage (duplications, triplications, quadruplications) without disturbing the protein structure can lead to the development of PD [7]. The frequency of mutations in SNCA is low: approximately 0.2% and 1-2% in patients with sporadic and familial forms of the disease, respectively. These mutations serve as a link between the disease forms and indicate the fundamental importance of structural and functional disorders of alpha-synuclein in the pathogenesis of PD [8].

The number of identified genes involved in the familial form of PD has steadily increased, and to date, more than 10 genes have been found, mutations in which unequivocally result in Mendelian PD. Table 1 provides a brief description of genes whose pathogenetic role has been proven and repeatedly confirmed in various studies. However, the discussion about the role of these and other genes in the etiopathogenesis of PD continues.

Table 1 Main genes for the familial forms of PD with proven Mendelian inheritance

One of the most striking examples of such discussion is the reassessment of the role of the ubiquitin carboxy-terminal hydrolase UCHL1 (PARK5) gene (not included in Table 1) in the pathogenesis of PD. Mutations in this gene were first identified in 1998, when the cosegregating missense mutation p.Ile193Met was detected in a German family with late-onset PD [9]. However, since then, no familial PD cases associated with a mutation in the UCHL1 gene have been identified [10, 11].

At the same time, a number of polymorphic variants of the UCHL1 gene have been identified, one of which (missense mutation p.Ser18Tyr) has been extensively studied in patients with sporadic PD. Although the results of these association studies were contradictory, a meta-analysis of data from more than 6500 patients using the dominant, recessive, and additive models revealed no association of this polymorphism with the development of PD [12]. It should be emphasized that this meta-analysis was first carried out in Caucasian patients and then in a Japanese population [13]. A further meta-analysis [14] also excluded an influence of the p.Ser18Tyr polymorphism on the risk of PD development in Asian subjects, both in the entire analyzed sample and in ethnically selected subsets.

Analysis of transgenic mice with mutations in the ubiquitin carboxy-terminal hydrolase gene yielded conflicting results. Thus, mice with an intragenic deletion in the UCHL1 gene homologue demonstrated a decrease in the level of monoubiquitin and formation of protein inclusions in vivo, but no signs of neurodegeneration in the substantia nigra [15]. At the same time, transgenic mice carrying the mutant UCHL1 p.Ile193Met gene displayed neurodegeneration in the substantia nigra and impaired spontaneous motor activity at the age of 20 months [16], which was accompanied by the formation of protein inclusions and impaired alpha-synuclein metabolism [17]. In the case of the p.Ser18Tyr polymorphism, the 18Tyr variant (unlike the wild-type protein) displayed antioxidant activity both in vitro in neuronal cell cultures and in vivo when the transgene was introduced into the substantia nigra [18, 19], which reduced the risk of neurodegeneration in carriers of the 18Tyr variant.

Summarizing the above, genetic and molecular biology data fail to provide a clear answer to the question about the role of the UCHL1 gene in the pathogenesis of PD, but so far, its involvement in the disease development seems unlikely.

The second gene not included in Table 1 is the GBA gene coding for glucocerebrosidase (lysosomal hydrolase). Mutations in this gene were first described in patients with Gaucher disease, a systemic disease with impaired hematopoiesis, increased risk of fractures, and pronounced neurological disturbances. Currently, more than 300 pathogenetically significant mutations in GBA have been described. There is no doubt that heterozygous, homozygous, or compound heterozygous mutations in this gene play an important role in the PD development [20-23].

However, even among patients homozygous for mutations in GBA, the symptoms of parkinsonism develop only in some of the mutation carriers, regardless of the presence of the clinical phenotype of Gaucher disease. Thus, only about 9% of individuals homozygous for the frequent and relatively mild Asn370Ser mutation (leading to type 1 Gaucher disease without pronounced neurological disturbances) developed parkinsonism [20]. About 40% of Gaucher disease patients displaying parkinsonism had this mutation in the compound heterozygous state; therefore, it was detected in 49% of patients with this phenotype.

Heterozygous carriers of mutations in the GBA gene also demonstrated an increased risk of developing PD: approximately 10% of them developed the disease, and the penetrance of these mutations was estimated as 30% at the age of 80 years [23, 24]. It is most likely that all Gaucher disease-causing mutations increase the risk of developing PD, and the likelihood of developing parkinsonism correlates with the severity of the Gaucher disease clinical picture observed in the carriers of these mutations. To date, more than 130 pathogenetically significant mutations in the GBA gene have been identified in patients with PD, with different mutations dominating in different ethnic groups. Thus, in Ashkenazi Jews, PD is mainly associated with the p.Arg496His, p.Asn370Ser, and 84insGC mutations, while in Caucasians it is associated with the p.Asn370Ser, p.Leu444Pro, p.Arg120Trp, IVS2 + 1G>A, p.His255Gln, p.Asp409His, p.Glu326Lys, and p.Thr369Met mutations. The frequency of GBA mutant variants also differs between populations, which may affect the assessment of the cumulative risk of developing GBA-associated PD in different ethnic groups. In general, mutations in the GBA gene increase the risk of developing PD, although their penetrance is significantly below 100%.

Mutations in the gene for leucine-rich repeat kinase 2 (LRRK2, or dardarin) also have reduced penetrance. The LRRK2 gene has historically been regarded as a gene for the autosomal dominant form of PD with a penetrance close to 100%. However, out of more than 100 mutations described in this gene, only six (p.Gly2019Ser, p.Arg1441Cys/Gly/His, p.Tyr1699Cys, p.Ile2020Thr) show clear familial cosegregation with the disease, and even for these the penetrance does not reach 100% [25]. Thus, the penetrance of the p.Gly2019Ser mutation is 85% at the age of 80 years [26], and this value differs between ethnic groups. The penetrance of mutations in codon 1441 is even lower [27] and differs between the missense variants at this codon. Although the penetrance of the remaining two mutations (p.Tyr1699Cys and p.Ile2020Thr) could not be estimated accurately because of their low frequency, it is evidently incomplete.

Therefore, the presence of mutations with pathogenetically proven significance in the LRRK2 gene indicates an increased risk of developing PD; however, an accurate assessment of this risk based only on the fact of mutation presence is impossible. This makes it difficult to assess the contribution of mutations in LRRK2 to the risk of developing familial forms of PD, and the estimates available in the literature (according to which mutations in the LRRK2 gene are found in 5% of patients with familial forms of PD and in 1% of patients with idiopathic PD) should be considered as approximate [28].

The situation with the parkin gene (PRKN) is extremely interesting. Biallelic (homozygous or compound heterozygous) loss-of-function mutations in this gene lead to the development of the autosomal recessive juvenile form of PD, which is characterized by a median age of clinical onset of 31 years. In 16% of patients, the first signs of the disease are observed before the age of 20 years [29]. It is believed that 10 to 20% of patients with an onset age below 40 years carry biallelic mutations in the PRKN gene, although the data for various populations differ considerably [30-32]. The vast majority of these mutations are large deletions/duplications resulting from unequal homologous crossing over. They span several exons of the PRKN gene and lead to a reading frame shift with the formation of a non-functional variant of the parkin protein. However, such mutations in the heterozygous state were also found in patients with sporadic PD [33-35]. Deletions/duplications of the PRKN gene were also detected in about 5% of healthy individuals examined in population studies [36], i.e., their frequency exceeds the population frequency of mutations in the dardarin gene. In this regard, they can be considered either as autosomal dominant mutations leading to a form of PD with the classic onset age of over 55-60 years, or as risk factors for the disease development (by analogy with mutations in the GBA gene). A potential pathogenetic role of these mutations in the heterozygous state was confirmed by functional analysis of the brain nigrostriatal system using fluorodopa [37]. Heterozygous mutation carriers demonstrated a decrease in the dopamine level in the caudate nucleus and putamen, which, however, was less pronounced than in patients with idiopathic PD. This decrease in the dopamine level caused a compensatory activation of the right dorsal premotor cortex area and rostral supplementary motor area [37, 38], which slowed down the development of motor disturbances in heterozygous carriers of parkin gene mutations against the background of impaired dopamine metabolism. A similar picture was observed in heterozygous carriers of mutations in the PINK1 gene, but in both cases, no prospective studies assessing the development of PD in carriers of heterozygous mutations have been conducted yet [39]. On the other hand, no statistically significant differences in the frequency of heterozygous deletions of the parkin gene were found in the CNV (DNA copy number variation) analysis of large (more than 2000 people) cohorts of PD patients and healthy individuals [40].

In general, it is believed that monogenic forms of PD are observed in 5-10% of patients [41]. The main contribution to the development of autosomal dominant and autosomal recessive monogenic PD forms is made by the LRRK2 and PRKN genes, respectively. It should be emphasized that very few large-scale analyses of multiple genes have been published. Thus, a meta-analysis covering all major PD genes in Brazilian patients [42] revealed that the most frequent point mutations were in the LRRK2 gene (2.5% of patients, with the p.Gly2019Ser mutation found in 2.2% of cases). Mutations in the parkin gene were found in 8.3% of patients [42]. In Ireland, the analysis was limited to mutations in the PRKN, DJ1, and PINK1 genes in patients with early-onset PD, and only mutations in the PRKN gene were detected (in 6.9% of the examined patients) [31]. The same three genes were analyzed in patients with early-onset PD from Central Europe (mainly Poland); mutations in the PRKN gene were found in 3.1% of the patients [30]. Often, mutational screening is limited to the p.Gly2019Ser mutation, which has been studied in a large number of world populations. In most of them, its frequency in both familial and sporadic PD did not exceed 3-5% [43].

These data indicate a low frequency of pathogenetically significant mutations in the major genes for the familial PD forms, pronounced interpopulation heterogeneity, and the need to continue the analysis of the mutation spectrum in PD with the inclusion of a wider panel of genes, as well as to search for new genes for the monogenic PD forms [44, 45]. Some authors have proposed their own variants of a gene panel for familial PD, but these panels are quite different (Figure 1). As a result, currently available information is insufficient for implementing algorithms for genetic testing for the risk of developing PD. A number of commercial panels have been developed, which differ in the number and repertoire of genes; however, according to experts, none of them is sufficiently informative without additional data describing the family history and phenotypic characteristics of the tested individual. Meanwhile, the minimum panel of five genes common to all panels (PRKN, LRRK2, SNCA, PINK1, PARK7) does not differ in its efficiency from the maximum panel of 43 genes [46].

Figure 1

Lack of full consensus on the genes included in the panels for testing for the monogenic forms of PD. Blue boxes, genes for the monogenic PD forms proposed by different authors [7, 41, 47, 48]; yellow box, genes included in each panel are shown in red; genes included in at least two panels are shown in green

GENOME-WIDE ASSOCIATION STUDIES IN IDIOPATHIC PARKINSON’S DISEASE

The obvious importance of genetic factors in the pathogenesis of PD, as well as the existence of familial and idiopathic forms of PD with a clear predominance of the latter, became the basis for genome-wide association studies (GWASs) of the idiopathic forms of the disease. The first studies, carried out in 2005-2006 on relatively small cohorts of patients using a limited set of DNA markers, failed to produce reliable results [49, 50]. However, they became the first steps toward developing GWAS technologies for studying neurodegenerative diseases. The first meta-analysis of GWAS data was a trial run that used the results of these two studies [51]. It revealed three single-nucleotide DNA markers associated with the development of PD regardless of the meta-analysis strategy used, but none of these markers reached the level of genome-wide significance (p < 5 × 10⁻⁸). At the same time, it became obvious that the informative value of the analysis would increase with a significant increase in the size of the analyzed populations (from hundreds of subjects in the GWASs of 2005-2006 to tens of thousands in later studies) and an increase in the density of the analyzed DNA markers.
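The genome-wide significance threshold mentioned above is conventionally motivated by a Bonferroni correction. The minimal sketch below assumes a family-wise error rate of 0.05 and roughly one million independent common variants; both numbers and the function names are illustrative assumptions and are not taken from the cited studies.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_genome_wide_significant(p_value: float,
                               alpha: float = 0.05,
                               n_tests: int = 1_000_000) -> bool:
    """True if p_value passes the (assumed) genome-wide threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumed inputs: alpha = 0.05, about one million independent tests.
threshold = bonferroni_threshold(0.05, 1_000_000)
print(f"threshold = {threshold:.1e}")
print(is_genome_wide_significant(1e-6))  # a suggestive but not significant marker
print(is_genome_wide_significant(1e-9))  # a genome-wide significant marker
```

Under these assumptions a marker with p = 10⁻⁶ would be reported as suggestive only, which is consistent with early small studies finding associations that did not reach genome-wide significance.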

The overall technological development of the GWAS methodology (increase in the information content of microarrays, improvement of statistical analysis methods) has made it possible to genotype tens of thousands of PD patients and to conduct association analysis for both the disease itself and its endophenotypes (age of clinical onset, nature of motor disturbances, presence of concomitant psychoemotional disorders, rate of disease progression). According to the GWAS Catalog (https://www.ebi.ac.uk/), more than 70 association studies have been carried out to date; they revealed an association of PD with more than 500 single-nucleotide DNA markers at a significance level better than 10⁻⁶ (https://www.ebi.ac.uk/gwas/efotraits/MONDO_0005180). However, only a few genes had several DNA markers each. Taking into account the number of markers associated with the disease and the significance of this association, there are about six protein-coding genes that determine the multifactorial risk of developing PD and/or affect the clinical course of the disease (Table 2).

Table 2 Protein-coding genes and DNA markers most reliably associated with the risk of developing idiopathic PD according to GWASs

Two of these genes, SNCA and LRRK2, determine the development of monogenic forms of PD; they also were found to be most strongly associated with the idiopathic form of the disease in terms of the number of associated DNA markers and the relative risk they pose. Mutations in the gene for the microtubule-associated tau protein (MAPT) are also linked to the development of monogenic neurodegenerative diseases, but the clinical manifestations of these mutations are associated primarily with the frontotemporal dementia and essential tremor. Muscle disorders characteristic of PD are secondary to the symptoms of dementia/tremor, and the associations revealed by GWASs reflect the genetic proximity of a wide range of neurodegenerative diseases [52-54].

The role of the other three genes associated with the development of PD has been studied insufficiently, but the data on their functional activity suggest their association with the etiopathogenesis of PD.

The TMEM175 gene encodes a transmembrane lysosomal protein, an ion channel for potassium ions and protons [55]. Modeling of the deficiency of this protein in a neuronal cell culture in vitro showed that a decrease in the activity of TMEM175 led to impaired control of lysosomal pH, which, in turn, reduced the activity of lysosomal enzymes (including GBA). TMEM175 deficiency promoted alpha-synuclein phosphorylation when monomeric alpha-synuclein was present and, in the presence of exogenously provided preformed alpha-synuclein fibrils, was accompanied by the formation of insoluble inclusions containing hyperphosphorylated alpha-synuclein [56]. Two coding variants, TMEM175 p.M393T and p.Q65P, were associated with PD [57]. Deficiency of TMEM175 in mice caused the loss of dopaminergic neurons and impairment of motor function [58].

The LAMP3 gene, which is associated with the development of PD, encodes a lysosomal membrane-associated protein and affects the processes of lysosomal autophagy and apoptosis. It may also be connected to lysosomal dysfunction, but its role in the functioning of neuronal cells has been studied very poorly [59-61].

The role in the pathogenesis of PD of the BST1/CD157 protein, a glycoprotein from the ADP-ribosyl cyclase superfamily associated primarily with autoimmune and hematological diseases and with tumors, has been poorly studied as well. However, recent data indicate an important role of this protein in the regulation of oxytocin metabolism and the development of behavioral disorders [62-64].

Most of the DNA markers associated with the development of PD have little effect on the risk of developing the disease: for the majority of markers, the risk is increased or decreased by a factor of no more than 1.5, as measured by the odds ratio (OR). The meta-analysis conducted in 2019 [65] of all GWAS data published to date [7.8 million polymorphic DNA markers (including imputed ones), 1.4 million control DNA samples, 37,700 patients with PD, 18,600 first-degree relatives of patients with PD] identified a total of 90 independent DNA markers belonging to 78 genomic regions associated with the disease development. Thirty-eight of these markers had not been previously described and were identified owing to the large sample size, which made it possible for weakly associated markers to reach the level of genome-wide significance (p values of about 10⁻⁸). For these new DNA markers, the polygenic risk score (PRS) for developing PD associated with genome polymorphism was found to be less than 36% (if the estimated incidence in a population of people of 80 years and older is 2%). At the same time, it must be remembered that at present there is no generally accepted methodology for assessing multilocus risk scores; this assessment may change significantly when standards for calculating PRS are introduced into practice [66].
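To make the two quantities in this paragraph concrete, the sketch below computes an odds ratio from a 2×2 table of carriers and non-carriers and a simple additive risk score as the sum of risk-allele dosages weighted by ln(OR). This additive form is only one commonly used formulation and is an assumption here, since, as noted above, there is no generally accepted PRS methodology; all counts, dosages, and odds ratios in the example are hypothetical.

```python
import math


def odds_ratio(case_carriers: int, case_noncarriers: int,
               control_carriers: int, control_noncarriers: int) -> float:
    """Odds ratio from a 2x2 table of allele carriers in cases and controls."""
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


def additive_risk_score(dosages: list[int], odds_ratios: list[float]) -> float:
    """Sum over markers of risk-allele dosage (0, 1, or 2) times ln(OR)."""
    if len(dosages) != len(odds_ratios):
        raise ValueError("dosages and odds_ratios must have the same length")
    return sum(d * math.log(o) for d, o in zip(dosages, odds_ratios))


# Hypothetical marker: 30 carriers / 70 non-carriers among cases,
# 20 carriers / 70 non-carriers among controls.
print(odds_ratio(30, 70, 20, 70))

# Hypothetical individual genotyped at three markers with modest effects.
score = additive_risk_score([2, 1, 0], [1.2, 0.8, 1.5])
print(round(score, 4))
```

Because each marker contributes only ln(OR) per allele and most odds ratios lie close to 1, a score of this kind separates individuals mainly when many markers are combined.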

Hence, at present, the PRS for developing PD is low for both monogenic forms and multifactorial variants of the disease. One of the possible reasons is underestimation of the role of rare mutations or polymorphisms in the disease development. Continued large-scale sequencing of the genomes and exomes of PD patients will allow identification of new loci and genes associated with the development of PD. The key role in these studies will belong to the integration of the primary data into large international databases, combined with functional annotation programs for such rare variants. The first example of such a database is Gene4PD [47], which integrates the results of all published studies on the pathogenetically significant genetic variants of PD and allows the identified variants to be annotated on the same platform using data on the nature and incidence of a genetic variant, functional annotation of variant-associated proteins, and results of transcriptome and methylome studies. Eventually, based on the analysis of variants associated with the development of PD, all genes containing these variants will be classified as candidate genes of high, medium, or low certainty.

As mentioned above, in a number of cases (mutations in the parkin, PINK1, and alpha-synuclein genes), the cause of the disease can be mutations or polymorphisms of the CNV type, associated with deletion or duplication of genome fragments larger than 50 bp. Mutations of this type can make an important contribution to the overall genome variability [67], and including them in the analysis of genetic risks of multifactorial diseases can significantly change our estimates of PRS. One of the CNV markers affecting the risk of developing PD is associated with the X-linked dystonia/parkinsonism syndrome. This disease is caused by the insertion of a SINE-VNTR-Alu (SVA)-type complex retrotransposon into intron 32 of the TAF1 gene, which encodes the TATA box-binding protein-associated factor 1. This insertion disrupts the expression of the TAF1 gene, and the effect is most pronounced for the neuronal mRNA variant. The disease phenotype and the mRNA level are influenced by the number of monomers (CCCTCT)n in the VNTR repeat. This number varies from 34 to 52 and affects both the comparative severity of dystonia/parkinsonism and the age of clinical onset of the disease [68-70].
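The number of monomers in a VNTR of this kind can be counted directly from a sequence read. The sketch below is a simplified illustration, not the method used in the cited studies: the function name, the toy flanking sequences, and the hexamer passed as `motif` are assumptions, and real repeat sizing would also have to deal with sequencing errors and interrupted repeats.

```python
import re


def longest_tandem_repeat(sequence: str, motif: str) -> int:
    """Largest number of consecutive motif copies found in a left-to-right scan."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    best = 0
    for match in pattern.finditer(sequence.upper()):
        # Each match is an uninterrupted run of whole motif copies.
        best = max(best, len(match.group(0)) // len(motif))
    return best


# Toy read: hypothetical flanks around 35 copies of an assumed hexamer.
motif = "CCCTCT"
read = "ACGT" + motif * 35 + "GATTACA"
print(longest_tandem_repeat(read, motif))
```

Passing the motif as a parameter keeps the function usable for other repeat units.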

It must be emphasized that in this case, the mutation does not involve the protein-coding gene region, and its identification and the confirmation of its pathogenetic role required a complex of genome and transcriptome studies using genome-editing techniques. On the one hand, this highlights the difficulty of searching for pathogenetically significant mutations of this type and at least partly explains the relatively low genetic risk estimate obtained from the analysis of single-nucleotide DNA markers only. On the other hand, this indicates the importance of searching for epigenetic factors associated with the disease development: identification of certain changes in the epigenome will allow a better understanding of the genetic factors that cause them.

EPIGENETIC FACTORS IN THE DEVELOPMENT OF PARKINSON’S DISEASE

In recent years, interest in the role of epigenetic mechanisms and factors in the pathogenesis of neurodegenerative diseases has significantly increased [71, 72]. DNA methylation, histone modification, and expression of various noncoding RNAs (microRNAs, small interfering RNAs, long non-coding RNAs) have been actively analyzed. The epigenome is considered not only as the cause of the disease, but also as a possible therapeutic target.

However, when considering epigenetic mechanisms of PD, it is necessary to take into account the influence of factors limiting the studies.

The first factor is that the vast majority of epigenetic studies are not focused (for obvious reasons) on the substantia nigra and striatum epigenome and are aimed at the analysis of peripheral tissues, primarily blood. Although some studies were conducted using autopsy material, the studied cohorts of patients, who were at the late PD stages (stages four and five according to the Hoehn and Yahr scale), were small, and the patients presented various concomitant diseases, which were, in fact, the cause of death [73, 74]. The epigenome is actively studied in model objects (cell lines, including induced pluripotent cells and their derivatives, genetic and drug-induced animal models of PD), but the transfer of these data to humans requires caution and accuracy.

The second factor is the long asymptomatic course of the disease, while all epigenetic studies are carried out in patients who already have severe motor impairments. At best, the studied cohorts are formed of patients at the first or early second stage (according to the Hoehn and Yahr scale) and include only patients who were diagnosed with PD before the start of treatment. This may be important, since epigenetic markers (for example, miRNA expression levels) can respond to therapeutic interventions [75]. If epigenetic changes are detected, the question arises about their causal relationship with the disease, as changes in the epigenome can also be caused by damage to dopaminergic neurons. Nevertheless, such changes are extremely important, since they can be considered as markers of the course of the pathological process. Identification of causal relationships in this case will also require analysis of various disease models, although it should be mentioned again that the transfer of data from a model to humans requires caution.

The first epigenetic factor studied in PD is DNA methylation. It has been shown that PD is associated with a global decrease in the level of DNA methylation both in the brain (including the striatum and substantia nigra) and in peripheral tissues [76]. This decrease is due to the interaction between DNA methyltransferase 1 (DNMT1) and alpha-synuclein in the nervous tissue, resulting in DNMT1 accumulation in the cytoplasm [77]. It was also shown that DNMT1 binds to the alpha-synuclein gene. SNCA intron 1 contains binding sites that determine differential methylation of this gene. At the same time, no association was found between polymorphic variants of the DNMT1 gene and the risk of developing PD [78]. A number of genes have been identified whose methylation levels change in PD. For example, methylation of the genes for the dopamine transporter (DAT), catechol-O-methyltransferase (COMT), prion protein (PRNP), and mitochondrial proteins (LARS2, MIR1977, and DDAH2) is reduced [79]. However, the results obtained in the analysis of methylation are poorly reproducible. GWASs of global methylation in the brain tissues of PD patients and healthy individuals conducted over the past 10 years revealed only one gene whose methylation level changed, the cytochrome P450 gene (CYP2E1) [80]. The low reproducibility of the results may be due to the influence of environmental factors that might affect both the risk of developing PD and differential methylation, which should be taken into account when conducting methylation analysis. These factors include smoking, coffee consumption, and exposure to pesticides and heavy metals [80]. It is also important to conduct comprehensive studies that include not only analysis of methylation profiles, but also identification of differentially expressed genes and their annotation. For example, Henderson et al. [81] identified a number of genome regions differentially methylated in peripheral blood lymphocytes in PD patients and found differentially expressed genes associated with these genome regions, although in most cases, differential methylation and differential expression correlated poorly with each other.

Therefore, it is currently difficult to draw a definitive conclusion about the contribution of DNA methylation to the risk of developing PD, which requires additional large-scale, long-term studies in large cohorts of patients that should be characterized in detail both clinically and in terms of lifestyle, with a focus on patients at the early stages of PD. The same is true for another actively analyzed epigenetic marker – the level of microRNA (miRNA) expression.

The first attempts to analyze changes in miRNA expression in the nervous tissue in PD were made in 2015-2016 and yielded very conflicting results. The sets of identified differentially expressed miRNAs did not overlap, which may be due to the use of samples from different brain regions (substantia nigra and prefrontal cortex), different miRNA analysis methods (sequencing, real-time PCR, NanoString technology), or different clinical and morphological characteristics of patients. At the same time, it should be noted that the reported characteristics of individuals in these studies were extremely insufficient; for example, none of the articles provided information on the stage and clinical form of PD [73, 82, 83]. A total of 99 miRNAs differentially expressed in the substantia nigra in PD have been identified to date, with 60 miRNAs being upregulated and 39 miRNAs downregulated [84]. These studies have shown that the miRNA sets expressed in the brain in PD change significantly, and such changes can cause certain abnormalities in the expression of genes involved in the metabolic pathways associated with the pathogenesis of PD.

Of particular interest are miRNAs whose targets are genes for the monogenic forms of PD; such miRNAs can be considered as a link between the idiopathic and familial forms of the disease. For example, miR-7 was found to have a binding site in the 3′-untranslated region of the alpha-synuclein gene; it blocked translation of alpha-synuclein mRNA and reduced the level of this protein both in vitro and in vivo. Blocking miR-7 activity in mice resulted in the death of dopaminergic neurons in the substantia nigra, a decrease in dopamine levels, and accumulation of alpha-synuclein in the nervous tissue [74]. However, the general picture is more complicated, as expression of alpha-synuclein is controlled by several other miRNAs, such as miR-153, miR-203a-3p, miR-30b, miR-34b/c, miR-214, and miR-433. A similar situation is observed for other genes for the familial forms of PD. Thus, expression of the parkin gene is regulated by at least four miRNAs (miR-103a-3p, miR-146a, miR-181a, miR-218), the dardarin gene by two miRNAs (miR-205, miR-599), and the DJ-1 gene by two miRNAs (miR-494, miR-4639). As a result, miRNAs are involved in the regulation (both positive and negative) of a large number of metabolic processes associated with PD development, such as mitochondrial dysfunction and oxidative stress, autophagy, apoptosis, inflammation, and neurotrophin expression [84, 85].

miRNAs may also be a link between the monogenic forms of PD, idiopathic cases, and exposure to environmental factors, e.g., such known risk factors as pesticides [86]. It has been shown that pesticides cause changes in the miRNA profiles in various body tissues, and these changes largely overlap for different pesticides. For example, rotenone, paraquat, organophosphates, and atrazine alter expression of miR-34 [87], which is involved in the regulation of alpha-synuclein expression. Another relationship between genetic factors and miRNAs may be based on the regulation of miRNA gene expression at the level of transcription and transcript stability. Very little is known about this process, but it has been shown that individual polymorphic sites in the human genome can affect the binding of miRNAs to them and thereby modulate expression of the target genes. Thus, the minor allele of the rs10024743 single-nucleotide polymorphism in the miR-34-binding site located in the 3′-untranslated region of the alpha-synuclein gene reduced the expression of alpha-synuclein more than twofold [88]. The rs66737902 polymorphism identified in the miR-138-2-3p-binding site in the 3′-untranslated region of the dardarin gene (LRRK2) was also found to be associated with the risk of developing PD. The authors believe that disruption of miR-138-2-3p binding to the dardarin mRNA leads to changes in its level and to development of the pathological process [89].

An important role in the formation of regulatory miRNA networks is likely played by long noncoding RNAs (lncRNAs) – RNAs over 200 nucleotides in length that do not contain open reading frames and are not translated into proteins. These RNAs can be synthesized on the sense and antisense DNA strands and can be encoded in gene introns and intergenic regions. The exact number of such RNA transcripts and of the genes encoding them is unknown; however, it is believed to be comparable to, and possibly to exceed, the number of protein-coding genes [90].

Several lncRNAs associated with biological processes leading to PD have been described. Thus, in mice with drug (6-hydroxydopamine)-induced damage to the substantia nigra, lncRNA H19 activates expression of the tyrosine hydroxylase gene and the Wnt/beta-catenin signaling pathway through the binding of miR-301b-3p, and thereby improves survival of dopaminergic neurons in the substantia nigra. Conversely, a decrease in the H19 RNA level promotes neurodegeneration [91]. A number of lncRNAs (MALAT1, UCA1, SNHG14) regulate alpha-synuclein metabolism and formation of its aggregates [92, 93]. Many lncRNAs influence several neurodegeneration-associated processes at once. Thus, MALAT1, in addition to regulating alpha-synuclein metabolism, is involved in the modulation of neuroinflammation and activates expression of dardarin (LRRK2), thereby promoting apoptosis and autophagy in dopaminergic neurons. Blocking expression of MALAT1 improves survival of dopaminergic neurons [92]. The activity of dardarin can be controlled by another lncRNA, NEAT1. It forms intranuclear inclusions (paraspeckles) that contain a number of cellular proteins, including dardarin. This reduces the level of active dardarin in the cell and thereby increases cell resistance to oxidative stress [94].

Epigenetic regulatory interactions are not limited to the three mechanisms described above, but the other mechanisms are far less studied. Still, a possible role in the pathogenesis of PD has been shown for histone modifications [95], circular RNAs (circRNAs), and competing endogenous RNAs (ceRNAs) [96]. Analysis of the epigenetic profile in PD should be expanded to include new models of the disease, both cellular and organismal.

CONCLUSION

Over the past 25 years, the identification of genetic and epigenetic factors associated with PD pathogenesis has laid the groundwork for building the “Cologne Cathedral” or “Sagrada Familia” of the PD concept. The latter is the more apt comparison: as with the Sagrada Familia, the PD building has not yet been fully constructed, but its main contours are already visible.

The foundation for the PD “building” is formed, first of all, by the genes for the main monogenic PD forms. SNCA, PRKN, and LRRK2 are the cornerstones on which the first mechanisms of disease etiopathogenesis were proposed, based on the formation of alpha-synuclein aggregates, mitochondrial dysfunction, and disruption of proteasomal protein degradation. Further construction involved identification of new genes for the monogenic PD forms and added lysosomal dysfunction and disruption of vesicular transport to the main pathological mechanisms of PD. This led to the concept of a single continuum of interacting cellular processes, in which disruption of any functional element leads to the selective death of dopaminergic neurons. The concept remains relevant for the idiopathic form of the disease as well: GWASs revealed the SNCA and LRRK2 genes as the main PD-associated loci. At the same time, it is obvious that the “walls of the cathedral” have not been completed, as not all genes associated with the familial and sporadic forms of the disease have been identified. However, the major construction has been finished, and identification of new genes will not bring down this already erected building.

However, for any building, the connection between its parts is very important. In an organism, this connection is achieved through epigenetic interactions that ensure the functioning of an entire ensemble of genes performing a given biological function. The research community has taken the first successful steps in this direction, but much remains to be done in the epigenetic “finishing” of the building. This will require the development of fundamentally new approaches both in studying molecular events associated with the epigenome and in data analysis using machine learning and artificial intelligence methods [97, 98].