Background

Parkinson disease (PD) is the fastest growing neurological disorder and affects approximately 1% of the world’s population over the age of 60 [1]. The neuropathological hallmark of PD is death of dopaminergic neurons in the substantia nigra, which leads to the diagnostic motor signs of the disease (e.g., bradykinesia, rigidity, and resting tremor). In most cases, however, a protracted period of significant neuron death, beginning in the prodromal stage, precedes clinical presentation of motor signs [2,3,4]. Prodromal and early-stage PD is associated with a multitude of heterogeneous non-motor signs and symptoms, such as sleep and vision disturbances, olfactory and gastrointestinal dysfunction, anxiety, and early-onset cranial sensorimotor impairments, that likely have a variety of systemic pathologies [5]. Moreover, there is no specific biomarker test to diagnose PD at an early stage, before the cardinal motor features of the disease appear; consequently, diagnosis and treatment are delayed.

Early-stage PD is difficult to investigate in humans due to inconsistencies in symptom manifestation, age of onset, and environmental factors. Biomarker identification research has generally focused on cerebrospinal fluid (CSF), yet CSF collection via lumbar puncture is invasive, expensive, and painful, and requires skilled healthcare providers to collect specimens. In contrast, blood collected via phlebotomy is a ubiquitous, relatively non-invasive source of potential biomarkers that is inexpensive, routine, and clinically applicable. Several recent studies of whole-blood transcriptome data have demonstrated consistent PD-specific changes in neutrophil gene expression and lymphocyte cell counts linked to the motor progression of PD [6, 7]. Data also suggest that PD gene expression signals such as phosphorylated α-synuclein, DJ-1, and oxidative stress markers are detectable in blood and plasma of clinically ill patients [8,9,10]. These studies validate the use of peripheral whole blood gene expression in biomarker discovery and in the potential development of diagnostics and prospective therapeutics. Despite these promising findings, identification and understanding of early-stage PD genetic biomarkers within whole blood samples remain limited.

Most cases of PD are idiopathic; however, 5–10% of cases are monogenic familial forms such as those due to mutations in LRRK2, PRKN, DJ1, SNCA, and PINK1 [11,12,13]. Large genome-wide association studies have shown that these specific genes are also implicated in idiopathic PD pathophysiology [14] and are involved in a set of molecular pathways that trigger an early-onset pathology sequence that is indistinguishable from sporadic forms [12]. Mitochondria within the CNS are subject to the immense metabolic demands of neuronal activity. Mitochondrial stress, abnormal mitophagy, and lysosomal dysfunction lead to the release of damage-associated molecules that can activate an innate immune response, as seen in genetic murine models of PD. For example, PINK1 (PARK6, phosphatase and tensin homolog (PTEN)-induced putative kinase 1) is involved in mitochondrial quality control and protects cells from stress-induced mitochondrial dysfunction. Altered mitophagy due to Pink1 deficiency is likely involved in multiple CNS disorders, including PD, Alzheimer’s disease, and glaucoma [15]. In general, loss of Pink1 is also associated with increased generation of pro-inflammatory cytokines and chemokines within plasma, sera, CSF, and blood linked to neuron death [16], as well as increased cytosolic mitochondrial DNA (mtDNA) and induction of type-I interferon responses and apoptosis [17].

Genetic rodent models of early-stage PD provide insight into the underlying genetics of idiopathic disease, provide experimental control to link genes to behavioral dysfunction associated with disease progression, and identify targets for the development of treatments [18]. The Pink1−/− rat parallels human idiopathic PD progression, including early-stage behavioral changes due to sensorimotor and cranial motor dysfunction (e.g., vocal communication) [19, 20]. Work over the past decade has demonstrated that Pink1−/− rats develop early motor and non-motor deficits as soon as two months of age [21, 22]. This includes alterations in early-affected peripheral head and neck muscles and nerves involved in communication, as well as decreased norepinephrine, abnormal α-synuclein aggregation, and increased inflammation in multiple brainstem regions, including those associated with sensorimotor vocal function (e.g., periaqueductal gray; vocal modulator) [22,23,24,25,26,27,28,29,30]. Rats communicate during social interactions by producing ultrasonic vocalizations (USVs), in part by contraction and adduction of the thyroarytenoid (TA) muscle [31, 32]. At 2–3 months of age, in the prodromal period, Pink1−/− rats show differences in acoustical parameters compared to wild-type rats [22, 29]. Recent research has shown that loss of functional Pink1 in the TA muscle leads to increased inflammatory and cell death pathways, including the TNF-α/NF-κB signaling pathway [28]. Identifying key disease-related genes and biological pathways is important for identifying treatment targets for early signs, including communication dysfunction. Yet, there is a need to establish translatability between biomarker identification within tissue types and an easily accessible, comprehensive sample type.

While previous cranial motor behavior and tissue-specific genetic studies have been done in the Pink1−/− rat model, whole blood gene expression has not been evaluated [25, 28, 29, 33]. By identifying whole blood gene expression profiles in early-stage PD, we can develop a transcriptomic signature capable of detecting PD in prodromal stages. The purpose of the current study was to identify dysregulated gene pathways within the blood of young Pink1−/− rats (3 months of age), develop genetic biomarkers or signatures that appear during the early-stage of disease, prior to the onset of hallmark limb motor signs, and evaluate whether they can predict early-stage, cranial motor-based vocalization outcomes. Here, we tested the specific hypothesis that loss of Pink1 alters inflammation gene expression in whole blood, resulting in the upregulation of genetic pathways that begin in early-stage disease and are bioinformatically correlated to vocal communication acoustic parameters.

Results

Overall DEG analysis

There were 101 differentially expressed genes (DEG) identified in this study, including 16 downregulated and 85 upregulated in Pink1−/− compared to WT male rats. Pink1 was confirmed to be significantly downregulated in the dataset (log(Fold Change) = − 7.49 × 1010). Within the entire DEG dataset, the top enriched pathways from the KEGG 2021 Human library included mitophagy, ubiquitin mediated proteolysis, apoptosis, PPAR signaling, and others including Parkinson disease (Table 1).

Table 1 Gene enrichment KEGG analysis of the DEG dataset

Downregulated gene pathways

Downregulated pathways identified using WikiPathways (Table 2) included those related to mitochondrial long-chain fatty acid beta-oxidation (Acadl), type II interferon signaling (Nos2), and PD (Pink1). Within the significantly downregulated DEG dataset, gene enrichment analysis for GO Biological Process (Table 3) identified nucleotide biosynthesis (Pink1, Nos2), apoptotic processes (Pink1, Lig4), and immune responses (Nos2, Lcn2, Mid2). Drug compounds identified to reverse downregulated gene transcription (Table 4) included multiple interferon-like compounds (IFNA-MCF7, BT20, SKBR3, MDAB231, HS578T).

Table 2 Downregulated gene pathways
Table 3 Gene enrichment GO Biological Process analysis of the significantly downregulated genes
Table 4 Drug repurposing using LINCS L1000 and downregulated gene targets

Upregulated gene pathways

There were notably more upregulated than downregulated DEG; this is consistent with DEG expressed in the brain and thyroarytenoid muscle of the Pink1−/− rat [25, 28]. The gene enrichment for the gene ontology (GO) biological processes (Table 5) included a long list of cellular activities (cell cycle, ubiquitin activity, protein modifiers) as well as multiple blood-specific processes. These findings were mirrored in the WikiPathways analysis, in which fatty acid biosynthesis (Acsl1, Ech1), cell cycle control (Cdkn2c, Gadd45a), p53 signaling (Pidd1, Gadd45a), and PPAR signaling (Acsl1, Ubc) were significantly upregulated (Table 6). Finally, drugs that were matched to reverse upregulation included EPG-MCF7, MCSF-MCF7 (apoptosis resistance), HBEGF-SKBR3 (anti-EGFR monoclonal antibodies), and IFNG-MCF7 (interferon) (Table 7).

Table 5 Gene enrichment GO biological process analysis of the significantly upregulated genes
Table 6 Gene enrichment WikiPathways analysis of significantly upregulated genes
Table 7 Drug repurposing using LINCS L1000 and upregulated gene targets

Top 1000 genes

To expand the number of pathways and capture significant biological changes in both directions, the top 1000 up- and downregulated genes were used in a separate gene enrichment analysis. Additional file 1: File S1 lists the significant WikiPathways and GO Biological Process terms for the up- and downregulated gene lists. Upregulated pathways included heme biosynthesis (Urod, Uros, Hmbs, Cpox, Ppox) and ferroptosis (Prnp; Steap3; Map1lc3a; Tfrc; Acsl1; Lpcat3; Alox15; Slc11a2; Slc3a2). Downregulated pathways included mitochondrial fatty acid synthesis (Mecr; Oxsm; Hsd17b12) and mitochondrial long-chain fatty acid beta-oxidation (Pecr; Acadl; Cpt2; Eci1).

When the top 1000 up- and downregulated genes were entered into STRING, 96 interacting genes were identified (Additional file 2: File S2). These are replotted in Fig. 1 and demonstrate enrichment for PD and prion disease.

Fig. 1 STRING protein–protein interaction map with 97 interconnected genes. Nodes indicate proteins and edges indicate protein interactions; line width reflects strength of evidence. Significant enrichment for Parkinson disease (purple) and prion disease (pink), including overlapping genes, was identified

WGCNA

Additional file 3: File S3 includes the sortable output files, P-values, correlations, and list of genes in the top modules. There were 4 significant modules (ME): Red, Yellow, Midnightblue, and Purple. The top module was Red and included Pink1. To determine the genes, and their functions, that interact with Pink1 within this whole blood RNA dataset, the 248 genes in Red were entered into the gene enrichment analysis tool to evaluate this specific gene list against preexisting data sets. Several areas of enrichment were identified, including protein catabolic process, ion homeostasis, and protein destabilization. The Yellow module (335 genes) significantly correlated to frequency modulated USV duration (length) and bandwidth (frequency range) and was enriched for mitochondrial gene expression. Both the Red and Yellow modules correlated to open field number of entries (movement into the open field indicating less anxiety; increased exploration). Of the other significant modules, Midnightblue (89 genes) demonstrated enrichment in iron ion homeostasis, macrophage activation, and immune processes, and Purple (152 genes) showed enrichment in multiple cellular processes.

Discussion

The general understanding of inherited, early-onset monogenic forms of PD is limited, yet necessary to provide insight into the polygenic nature of idiopathic PD as well as the development of candidate biomarkers which may be useful in early-stage diagnosis. Whole blood collection is a relatively non-invasive source of potential biomarkers that is inexpensive, easy to obtain, and has translatable clinical relevance. We hypothesized that loss of Pink1 alters inflammation gene expression in whole blood, resulting in the upregulation of genetic pathways that begin in early-stage disease. Further, we hypothesized that these dysregulated genetic pathways are bioinformatically correlated to behavioral outcomes including motor, anxiety, and vocal communication (cranial motor) acoustic parameters. The Pink1−/− rat was selected because it exhibits parallels to human prodromal PD features, including early and progressive changes to vocalization and anxiety with a gradual onset of limb motor dysfunction in adults [29]. The present study identified several dysregulated genes and biological pathways within the blood of young Pink1−/− rats. These data suggest that the earliest PD signs, independent of nigrostriatal dopamine loss, are bioinformatically correlated to blood pathway data.

Loss of Pink1 results in dysregulation of ribosomal protein and RNA processing gene expression

Consistent with previous sequencing studies on brain and TA muscle, there were notably more upregulated than downregulated DEG in Pink1−/− rats. The most significantly downregulated genes were Rpl12 (ribosomal protein L12) and Lyzl1 (lysozyme like protein 1). Lyzl1 is a protein coding gene that was recently identified in a microarray study as a new locus associated with dementia in PD [34]. Some ribosomal proteins, such as ribosomal protein S15, have been linked to neurodegeneration in LRRK2 overexpression human dopamine neuron models [35]. The most significantly upregulated gene in whole blood was Celf3 (CUGBP Elav-Like Family Member 3). Celf3 codes for an RNA-binding protein (RBP) involved in various aspects of RNA processing, including nucleic acid binding and pre-mRNA alternative splicing. Dysregulation of RBPs has been implicated in neurodegenerative diseases including Alzheimer’s disease (AD), amyotrophic lateral sclerosis (ALS), and PD. RBPs have been found in inclusion bodies in some of these diseases, providing insight into the misfolding of proteins and subsequent protein aggregation [36,37,38].

Interferon signaling is altered in whole blood of Pink1 −/− rats

Loss of Pink1 results in increased production of proinflammatory cytokines and chemokines, including tumor necrosis factor-α (TNF-α), interleukin-6 (IL-6), and interleukin-1β (IL-1β), as well as the interferons (IFNs) IFN-β1 and IFN-γ, within the blood and brain, resulting in inflammation and loss of dopaminergic neurons in both a Pink1−/− mouse model and PINK1-associated PD patients [16]. IFNs exhibit antiproliferative, proapoptotic, antiangiogenic, and immunomodulatory functions. In cellular models of PD, loss of Pink1 also increased cytosolic dsDNA derived from mitochondria, which resulted in elevated type-I IFN responses and correlated with apoptotic markers and cell death [17].

Type-I IFNs (IFN-Is; IFN-α and IFN-β) play a critical role in innate immune responses by activating classical proinflammatory signaling pathways that result in the production of major inflammatory cytokines: TNF-α, IL-6, and IL-1β. IFN-Is have been shown to regulate neuroinflammation in the central nervous system (CNS) and contribute to degeneration and disease progression in several in vivo and in vitro models of PD [39, 40]. Mouse models of PD and postmortem studies of PD brains have confirmed that mRNA expression of IFN-Is is upregulated in PD. Conversely, IFN-β deficiency causes mitochondrial dysfunction in primary cortical neuron cultures, and Ifnb−/− mice show defects in the nigrostriatal dopaminergic pathway as well as widespread α-synuclein accumulation [41]. In addition to Type-I IFNs, Type-II IFNs have also been implicated in PD pathophysiology. For instance, IFN-γ is elevated in the serum and brain of patients with PD and correlates with disease severity [42, 43].

Several genes identified in this dataset included those related to type I interferon signaling (Ifit1bl; interferon-induced protein with tetratricopeptide repeats 1B-like), type II interferon signaling (Nos2; nitric oxide synthase 2), and apoptosis (Dedd2; death effector domain containing 2). Interferon-stimulated genes involved in chromatin remodeling (Gadd45a; Supt4h1; Esco2; Pelp1; Bap1; Tada2a) and ATP-binding proteins (Lig4; Slc22a5; Entpd4; Prnp; Vps25; Kifc1; Bub1; Uhrf1; Pidd1; Nek2l1; Abca4; Kif18b; Ckb) were upregulated in our dataset. Further, unpublished data from our lab show that at 12 months of age, Pink1−/− rats have significantly more up- and downregulated genes compared to age-matched WT rats (upregulated, n = 553; downregulated, n = 1561). In this unpublished data set, numerous interferon-stimulated genes (ISGs) were upregulated, including Rnasel, Fas, Casp4, Irf1, and Ifitm1. The second most significantly upregulated gene is Ifit1 (interferon-induced protein with tetratricopeptide repeats 1; P = 5.53 × 10⁻²⁷). In addition, receptors for IFN-α (Ifnar2) and IFN-γ (Ifngr1) were also upregulated in the dataset. Therefore, these data suggest that inflammation and interferon signaling begin early in the whole blood (3 months of age) and progress as Pink1−/− rats age. Further work will use bioinformatics to correlate these pathways to behavioral data at 12 months of age.

Drug compounds identified to reverse up- and downregulated gene transcription (Tables 4 and 7) included multiple interferon-like compounds (IFNA-MCF7, BT20, SKBR3, MDAB231, HS578T, and IFNG-MCF7).

From this dataset, we hypothesize that the lack of Pink1 may cause an early disruption in interferon signaling that may result in the downstream overproduction of proinflammatory cytokines (TNF-α, IL-1β, IL-6) that will worsen over time. Targeting interferon signaling with drug compounds may be a potential therapeutic intervention to halt or prevent the further production of harmful proinflammatory cytokines that contribute to neuroinflammation and the death of neurons and should be studied in future work.

Major prion protein gene expression is upregulated

STRING analysis showed enrichment for prion disease and PD (Fig. 1). One of the most interesting genes identified in this dataset was Prnp, which was significantly upregulated in Pink1−/− whole blood as early as 3 months of age. Prnp encodes major prion protein (PrP), which is primarily active in the brain and associated with several prion and prion-like diseases. These data further support the prion hypothesis for PD, which has been proposed due to the prion-like misfolding and aggregation of α-synuclein [44,45,46,47,48,49]. In a prion disease cell model, PINK1/Parkin signaling, specifically PINK1, was required for mitophagy of damaged mitochondria, and its activation attenuated prion-induced neuronal apoptosis [49]. To our knowledge, this is the first monogenic PD animal model to report significant genetic changes in the Prnp gene. In this dataset, Prnp was a significant gene in numerous GO Biological Processes identified through gene enrichment of the significantly upregulated genes, including dendritic spine maintenance, apoptotic processes, negative regulation of interleukin-17 production, T-cell receptor signaling, and calcium-mediated signaling. There were only four drug compounds identified to reverse upregulated gene transcription, and two of them, MCSF-MCF7 and IFNG-MCF7, included Prnp as a significant gene.

Tuba1c, a previously identified significant gene in Pink1 −/− rats, is upregulated in whole blood

Tuba1c upregulation has recently been identified in several of our RNA-sequencing datasets, including the thyroarytenoid (TA) vocal fold muscle and brainstem [29]. Previously, Tuba1c was identified as a key gene correlated to vocalization acoustic parameters at 2 months of age. In this study, Tuba1c was once again significantly upregulated in whole blood of Pink1−/− rats; it is also an interconnected gene in the STRING analysis. In a recent proteomics study, differential expression of Tuba1c protein was identified in the plasma of rotenone-exposed rats [50]. Here, Tuba1c was also identified as a significant gene in the gene enrichment KEGG pathway analysis, including apoptosis, pathways of neurodegeneration, ALS, and Parkinson disease (Table 1).

Bioinformatics analysis highlighted gene pathways that significantly correlate to behavioral outcomes in Pink1 −/− rats

Another goal of this study was to use bioinformatics to highlight biological gene pathways within the whole blood and determine whether they are significantly correlated to anxiety, motor, or ultrasonic vocalization behavioral outcomes in Pink1−/− rats. WGCNA enrichment analysis resulted in four significant modules, two of which (Red and Yellow) were significantly correlated to behavioral outcomes. The Red module, which contained the Pink1 gene, correlated to open field number of entries (movement into the open field indicating less anxiety; increased exploration) (Table 8). The Red module demonstrated enrichment in the largest number of biological processes, including cell division, chromatin organization, regulation of autophagy, cellular response to ATP and reactive oxygen species, and regulation of the cell cycle. The Yellow module significantly correlated to frequency modulated USV duration (length) and bandwidth (frequency range) as well as open field number of entries. Enrichment of the Yellow module included mitochondrial gene expression, lipid transport across blood–brain barrier, tRNA processing, and IL-7 signaling.

Table 8 WGCNA modules and correlated behavioral variables

Limitations

The Pink1−/− rat is a useful model to study aspects of early-stage PD, including cranial sensorimotor dysfunction in the absence of nigrostriatal dopamine loss [20]. However, it should be noted that there have been inconsistencies between research groups reporting the number of nigral neurons as well as striatal dopamine concentrations in older Pink1−/− male rats [51]. Genetic drift, variation among cohorts of animals, differences in experimental and housing paradigms, and other laboratory variables are all common factors in producing inconsistent results and should be considered when drawing conclusions. The Pink1−/− rat is a progressive model, and much of the neurological quantification has been done in late adulthood (8–12 months); when emphasizing prodromal behavior and neurochemistry, future studies should target younger prodromal ages (i.e., 2–3 months). Likewise, it would be interesting to compare Pink1−/− transcription, with a focus on mitochondrial dysfunction, to other PD models, including other genetic, overexpression (synuclein), and neurotoxin lesion models. In this study, due to the smaller sample sizes used in RNA-seq, it is not possible to correlate findings to behavioral outputs. The utility of these findings is the focus on annotated protein coding mRNA and biological pathways for future study. Future work could involve the analysis of non-coding RNA as well as comparison to existing human datasets [52]. While this study focuses on mRNA that encodes protein, non-coding RNA can regulate cellular functions and signal transduction [53]. This study provides targets for additional studies that focus on particular biological gene networks and their manipulation. An additional limitation of this work is that females were not included, and future work should evaluate sex as a biological variable.

Conclusions

Neuroimaging and CSF biomarkers may be useful in research settings, but due to the ease, availability, and low cost of phlebotomy, whole blood biomarkers are among the most promising and practical methods to screen large populations for an occult, yet common and devastating disease with accelerating incidence. Whole blood genetic biomarkers of PD hold promise to screen large populations for PD risk factors. They may also inform prognosis as well as monitor response to future disease-modifying treatments of PD applied in the early stage of disease, prior to manifestation of the hallmark motor signs that currently form the basis of diagnosis. Using validated, monogenic rat models, we can study the influence of Parkinsonian genes and their networks and provide data that are translatable to humans. PD has many different identified genes and pathways, including mitochondrial dysfunction, deranged immune responses, oxidative stress, and prion protein. This study demonstrates that we can identify a PD signature prior to development of a clinical motor phenotype and predict progression of ultrasonic vocalization parameters. Thus, using bioinformatics and whole blood sampling, it may be possible to identify genetic signatures in humans that correlate to vocalization dysfunction and target these gene signatures therapeutically for the treatment of vocal deficits in PD.

Methods

Animals and experimental design

A total of 4 male Long-Evans rats with homozygous Pink1 knockout and 4 male wild-type (WT) Long-Evans control rats (Inotiv, Chicago, IL, USA), aged 3 months, were used in this study. All rats arrived at 4 weeks old and were pair-housed (same genotype) in standard polycarbonate cages (17 cm × 28 cm × 12 cm) with corncob bedding. Food and water were provided ad libitum. Following arrival, all rats were immediately placed on a 12:12-h reverse light cycle, as rats are nocturnal. All behavioral testing occurred under partial red-light illumination. Rats were acclimated to study procedures and experimenter handling prior to all behavioral testing. All rats were weighed upon arrival and weekly thereafter using a digital scale to monitor overall health. All procedures and protocols (M006329) were approved by the University of Wisconsin-Madison School of Medicine and Public Health Animal Care and Use Committee and were conducted in accordance with the NIH Guide for the Care and Use of Laboratory Animals (National Institutes of Health, Bethesda, MD, USA).

Behavior

Corresponding rat behavioral data used in this study included open field (time in center [sec], number of entries, total movement [cm]), cylinder limb motor (number of rears and lands, hindlimb and forelimb movements), and ultrasonic vocalizations (total number of calls, duration [msec], bandwidth [kHz], intensity [dB], and peak frequency [kHz]). These measures were used in the gene statistical correlation analysis, discussed below, and were previously published by Lechner et al. [29].

Whole blood collection and RNA processing

Whole blood samples were collected from the body trunk during euthanasia by rapid decapitation via guillotine under heavy isoflurane anesthesia. Approximately 400 µL of trunk blood was immediately transferred to a sterile 2 mL microcentrifuge tube that contained 1.3 mL of RNAlater™ RNA Stabilization Solution (Invitrogen, Carlsbad, CA, USA) and inverted several times. RNA extraction was then performed using the RiboPure™ kit, per the manufacturer’s instructions (Invitrogen). Briefly, the supernatant was removed, and blood cells were lysed with lysis and sodium acetate solution. An acid-phenol chloroform extraction was performed, and the RNA was purified through a filter cartridge and eluted. Total RNA was measured using a Nanodrop system (Thermo Scientific, Wilmington, DE, USA) as well as with an Agilent RNA 6000 Pico kit (Eukaryote Total RNA Pico) and the Agilent 2100 bioanalyzer (Agilent Technologies, Santa Clara, CA, USA).

RNA sequencing

All RNA sequencing procedures followed ENCODE guidelines and were performed at the University of Wisconsin Biotechnology Center’s Next Generation Sequencing Facility [25, 28] using the Illumina® HiSeq 2000 high-throughput sequencing system (Illumina Inc., San Diego, CA, USA). The Illumina RiboZero Plus Kit (Cat. 20040526) with rRNA and globin reduction was used to remove cytoplasmic and mitochondrial rRNA, and a sequencing library was generated. Libraries were quantified using the Qubit DNA HS kit, diluted 1:100, and assayed on an Agilent DNA1000 chip. Adaptor sequences, contamination, and low-quality reads were removed. Reads were mapped to the annotated Rattus norvegicus genome in Ensembl. As described in Kelm-Nelson and Gammie (2020), technical quality was determined using multiple parameters [25].

Gene expression analyses

Differential gene expression analysis was performed with the generalized linear model (glm) approach in the edgeR Bioconductor package, v. 3.9. The P-value cutoff for significance was set to 0.05. All raw data were uploaded to the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE213543; GSE213543). The RSEM approach for normalizing RNA-seq data was used. The edgeR glm data are provided in Additional file 4: Table S1. Statistically significant differentially expressed genes (DEG) were ranked according to P-value and FDR, and sorted by up- or downregulation, GO function, biological process, and component (Additional file 5: Table S2).
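The cutoff and sorting steps described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used edgeR in R. The gene names and numbers are made up for illustration, the function names are hypothetical, and the sketch assumes that the reported FDR is a Benjamini-Hochberg adjustment of the P-values.

```python
# Illustrative sketch only: made-up genes and values, not data from this study.

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])  # indices, smallest P first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def split_deg(records, p_cutoff=0.05):
    """Split (gene, logFC, P) records into up- and downregulated gene lists.

    Only genes with P below the cutoff are kept; each list is ranked by P.
    """
    significant = sorted((r for r in records if r[2] < p_cutoff),
                         key=lambda r: r[2])
    up = [gene for gene, logfc, _ in significant if logfc > 0]
    down = [gene for gene, logfc, _ in significant if logfc < 0]
    return up, down


# Hypothetical input: (gene, log fold change, P-value).
records = [
    ("geneA", 2.1, 0.001),
    ("geneB", -1.4, 0.03),
    ("geneC", 0.2, 0.60),
    ("geneD", -3.0, 0.004),
]
up, down = split_deg(records)
fdr = benjamini_hochberg([p for _, _, p in records])
```

Here a positive log fold change is read as upregulation in the knockout relative to WT, which is an assumption about how the contrast was set up.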

Enrichr pathway gene enrichment analysis (KEGG) was used to determine gene enrichment in the DEG list, and the WikiPathway 2021 and GO Biological Process libraries were used on the top 1000 upregulated and top 1000 downregulated genes, analyzed as separate lists (Additional file 6: Table S3). Additionally, the top overall 1000 genes were entered into STRINGdb (v 2.0; Search Tool for the Retrieval of Interacting Genes/Proteins, http://string.embl.de/) to identify protein–protein interactions [54, 55]. The top connected genes were replotted with enrichment.
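The idea behind gene-set enrichment of this kind is an over-representation test: given a gene list, is a pathway hit more often than chance would predict? The sketch below shows that basic calculation as a one-sided hypergeometric tail probability. It is a simplified illustration and does not reproduce Enrichr's full scoring or its multiple-testing correction; the function name and the example numbers are hypothetical.

```python
from math import comb


def over_representation_p(overlap, pathway_size, list_size, background_size):
    """One-sided hypergeometric P-value for pathway over-representation.

    Returns P(X >= overlap), where X is the number of pathway genes found
    when list_size genes are drawn without replacement from a background of
    background_size genes, of which pathway_size belong to the pathway.
    """
    total = comb(background_size, list_size)
    upper = min(pathway_size, list_size)
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 5 of 100 listed genes fall in a 50-gene pathway,
# against an assumed background of 20,000 genes.
p_value = over_representation_p(overlap=5, pathway_size=50,
                                list_size=100, background_size=20000)
```

The background size strongly affects the result, so it should match the set of genes that could actually have been detected in the experiment.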

Weighted gene co-expression network analysis (WGCNA) and behavior

WGCNA was used to construct co-expression networks and modules from the whole blood gene expression dataset and to relate them to the open field, cylinder, and USV acoustical datasets previously published in 2021 [29]. Data were log2(x + 1) transformed, low-expression genes were removed, and WGCNA was run on the remaining 13,360 genes using R Statistical Software [56]. Correlations were raised to a soft-thresholding power β of 12. Unsupervised hierarchical clustering for WGCNA used the default settings together with the following: a minimum module size of 30 genes, the signed TOMType, the deepSplit parameter set to 2, and the mergeCutHeight parameter set to 0.25. Searchable networks were created.
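The transformation and soft-thresholding steps described above can be sketched in a few lines of Python. This is an illustration of the underlying arithmetic and not the WGCNA R package itself. The function names are hypothetical, and because the network type is not stated in the text, both the unsigned form |r|^β and the signed form ((1 + r)/2)^β are shown. The last helper gives the Student t statistic commonly used to assess a module-trait correlation for n samples.

```python
import math


def log2_plus_one(values):
    """Apply the log2(x + 1) transform to a list of expression values."""
    return [math.log2(v + 1) for v in values]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(r, beta=12, signed=False):
    """Soft-thresholded adjacency for a correlation r.

    Unsigned: |r| ** beta. Signed: ((1 + r) / 2) ** beta.
    Which form applies to this study is an assumption.
    """
    if signed:
        return ((1 + r) / 2) ** beta
    return abs(r) ** beta


def correlation_t_statistic(r, n):
    """Student t statistic for a correlation r from n samples (n - 2 df)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical counts for two genes across 8 samples (4 knockout, 4 WT).
gene_1 = log2_plus_one([0, 3, 7, 15, 31, 63, 127, 255])
gene_2 = log2_plus_one([1, 2, 9, 12, 40, 50, 140, 230])
r = pearson(gene_1, gene_2)
weight = adjacency(r, beta=12)
```

Raising correlations to a high power such as 12 pushes weak correlations toward zero while leaving strong ones largely intact, which is why soft thresholding emphasizes tightly co-expressed genes without imposing a hard cutoff.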