Introduction

Cardiovascular disease and osteoporosis each contribute to a significant disease burden worldwide. It is estimated that roughly 17.9 million people died from cardiovascular disease in 2016, which accounts for 31% of all deaths globally1. The estimated number of fragility fractures in the EU in 2010 was 3.5 million, and it is predicted to rise by 28%, reaching 4.5 million, by 20252. Osteoporosis and atherosclerosis share a similar pathophysiological mechanism that involves inflammatory cytokines and oxidized lipids3. The two diseases also share several risk factors, such as oestrogen deficiency, vitamin D abnormalities, dyslipidaemia, dietary calcium, dietary saturated fat and oxidative stress, as well as genetic biomarkers, such as osteoprotegerin, apolipoprotein E and the matrix gla protein3,4. Plasma lipids such as cholesterol, which has been shown to be associated with atherosclerosis, have also been shown to be associated with bone health5,6. In one of our previous studies, we identified three lipidome-based molecular triglycerides jointly associated with early traits of both osteoporosis and atherosclerosis7. Statin, a cholesterol-lowering drug for individuals who are at a high risk of cardiovascular disease, has also been shown to improve bone density8. Despite the strong evidence supporting the comorbidity hypothesis, studies investigating the underlying molecular mechanisms of the diseases using omics data, such as transcriptomics, are scarce.

Several transcriptomic studies have been previously carried out on osteoporosis9,10 and atherosclerosis11,12 independently. However, multivariate transcriptomics studies investigating the joint association of genes with both osteoporosis and atherosclerosis are lacking. While such multivariate transcriptomics studies are important, the analysis of a large number of genes in such high-dimensional data requires a multitude of tests and, consequently, multiple-testing correction, leading to the exclusion of genes with suggestive significance. However, a system-level bioinformatics approach largely reduces the multiple testing and increases the statistical power. We have recently used such an approach to identify lipidome-based molecular lipids jointly associated with early traits of osteoporosis and atherosclerosis7.

The objective of the present study was to perform a system-level analysis of genome-wide expression data in whole blood, used as a proxy for difficult-to-acquire samples such as bone or artery wall tissue, to identify gene co-expression modules, enriched pathways in the identified modules and the most important member genes jointly and significantly associated with the early traits of both osteoporosis and atherosclerosis. To achieve the stated goals, we used weighted gene co-expression network analysis (WGCNA)13, multivariate analysis of variance (MANOVA) and pathway analysis to identify gene modules and enriched pathways jointly associated with early traits of both osteoporosis and atherosclerosis.

Material and methods

Study participants

This study was based on the Young Finns Study (YFS), a prospective multi-centre follow-up study assessing cardiovascular risk factors from childhood to adulthood14. The study was initiated in 1980 with 3,596 children and adolescents aged 3–18 years. The participants were randomly selected from the areas of five university hospitals in Finland (Turku, Tampere, Helsinki, Kuopio, and Oulu) and have been followed up for over 40 years. The study was approved by the ethical committee of the Hospital District of Southwest Finland on 20 June 2017 (ETMK:68/1801/2017). All participants gave their written informed consent, and the studies were conducted in accordance with the Declaration of Helsinki. Data protection will be handled according to current regulations. The present study is based on 1,032 participants, aged 30–45 years, from the 2007 follow-up (Table 1), with four atherosclerotic and six osteoporotic traits as summarized in Table 2. The gene expression levels of the study participants were profiled based on the 2011 follow-up.

Table 1 Population characteristics of the Young Finns Study cohort. Data are expressed as means (± SD) or proportions (%).
Table 2 Early traits of osteoporosis and atherosclerosis with their descriptive statistics among the study participants, expressed as mean ± SD.

Assessment of early traits of atherosclerosis

Carotid and bulbus intima-media thickness (IMT) measurements were used as the early traits of atherosclerosis7. An ultrasound imaging device with a high-resolution system (Sequoia 512, Acuson) and 13.0 MHz linear array transducers was used for IMT measurements by trained sonographers following a standardized protocol. The image was focused on the posterior (far) wall, and images were recorded from the angle showing the greatest distance between the lumen–intima interface and the media–adventitia interface. A scan including the beginning of the carotid bifurcation and the common carotid artery was recorded and stored in digital format on optical discs for subsequent off-line analysis. All scans were analysed by one reader blinded to the participants’ details. The best-quality end-diastolic frame was selected. Several measurements of the common carotid far wall were taken approximately 10 mm proximally to derive the maximal carotid IMT. To assess the reproducibility of IMT measurements, we re-examined 60 participants 3 months after the initial visit (2.5% random sample). The between-visit coefficient of variation of IMT measurements was 6.4%. To assess the reproducibility of the IMT image analysis, 113 scans were re-analysed by a second observer, and the coefficient of variation was 5.2%. The mean and maximum carotid and bulbus IMTs were used in this study.

Assessment of early traits of osteoporosis

The assessment of early traits of osteoporosis was based on peripheral quantitative computed tomography (pQCT) bone measurements from both the distal and diaphysis sites of the radius and tibia, as described elsewhere7. The tomographic slices were taken from the shaft (a cortical-rich bone site) and distal part (a trabecular-rich bone site) of the weight-bearing tibia (30% and 5% from the distal endplate of the tibia, respectively), and of the non-weight-bearing radius (30% and 4% from the distal endplate of the radius, respectively) according to our standard procedures15. For the shaft regions, the analysed bone traits were total area (ToA, mm2), cortical area (CoA, mm2) and cortical density (CoD, mg/cm3). For the distal parts of the radius and tibia, the measured bone traits were ToA (mm2), CoA (mm2) and trabecular density (TrD, mg/cm3). The in vivo precision of the used pQCT-measured traits ranged from 0.5% (CoD of the radial shaft) to 4.4% (CoA of the distal radius). Mineral content was calculated as 0.2 x (area/100) x density. The measured bone traits are shown in Table 2.

Health and lifestyle data

The physical activity index, calculated as metabolic equivalents (METs) by combining information on the frequency, intensity and duration of physical activity, including leisure-time physical activity and commuting to the workplace (MET h/wk), was used to represent the physical activity of the participants. One MET corresponds to the energy consumption of one kilocalorie per kilogram of weight per one hour at rest16. Alcohol consumption was assessed from the participants’ reports on their alcohol consumption expressed in units (i.e., 14 g of alcohol) during the previous week17.

Blood transcriptomic analysis

RNA isolation was performed from whole-blood samples collected from study participants during the 2011 follow-up. Expression levels were analysed with Illumina HumanHT-12 version 4 Expression BeadChip (Illumina Inc.), containing 47,231 expression and 770 control probes. Samples with fewer than 6,000 significantly detected expression probes (detection p-value < 0.01) were discarded. Raw Illumina summary probe-level data was exported from Beadstudio and processed in R (http://www.r-project.org/) using a nonparametric background correction, followed by quantile normalization with control and expression probes, with the neqc function in the limma package18 and a log2 transformation. Nine samples were excluded due to sex mismatch between the recorded sex and predicted sex based on RPS4Y1-2 and XIST mRNA levels on the Y and X chromosomes, respectively. After quality control, expression data was available for 1,654 samples, including 4 technical replicates, which were used to examine batch effects and subsequently excluded before further analysis.

Biostatistical analysis

Signed weighted gene co-expression network analysis (WGCNA) implemented with R statistical software13 was used to identify groups of densely interconnected genes, hereafter referred to as gene modules. The analysis pipeline is illustrated in Fig. 1. The module generation method involved the calculation of Pearson’s correlation (r) for all pairwise comparisons of genes across all participants, resulting in a correlation matrix. The correlation matrix was raised to the power of 10 to generate an adjacency matrix in order to minimize noise and emphasize stronger correlations. The power was chosen using the power function implemented in the WGCNA package, to the effect that it transforms the correlation matrix to an approximately scale-free topology based on the assumption that most of the real-world biological networks are scale-free (Figure S1). The resulting adjacency matrix was used to generate a Topological Overlap Matrix (TOM) in order to incorporate network topology information in the definition of the co-expression of genes. The TOM is a similarity matrix of genes. This was transformed into a dissimilarity matrix, 1-TOM. Average linkage hierarchical clustering of the dissimilarity matrix was performed to generate a hierarchical clustering tree of genes. Next, gene modules were identified with a dynamic tree-cutting algorithm. The quality of the identified gene modules was further assessed by analysing the correlation between gene significance (GS) and module membership (MM). GS is defined as the correlation between the module's member genes and the study traits. MM is defined as the correlation between the summary expression profile of a module and its member genes. An ideal module is one where GS and MM are highly correlated, which suggests that the genes that are highly correlated with the biological marker of interest are also the important member of the analysed module. The first principal component of the expression profiles of the member genes in a module was used as a summary expression profile of the module. Pearson’s correlation coefficients (r) were calculated between the summary expression profiles of the identified gene modules and the early traits of osteoporosis and atherosclerosis. Gene modules that were significantly correlated (Bonferroni-adjusted p-value (p.adj) < 0.05) with early traits of both osteoporosis and atherosclerosis were considered as candidate modules for testing joint statistical association with the studied traits of both diseases in multivariate statistical analysis.

Figure 1
figure 1

Weighted gene co-expression network analysis pipeline (Figure adapted from Fig. 1 in7). 1) Gene expression data analysed using Illumina HumanHT-12 version 4 Expression BeadChip. 2) Correlation matrix based on the pairwise correlations (Pearson) of the gene expression data. 3) Hierarchical clustering of the dissimilarity matrix generated from the correlation matrix. 4) Identification of gene modules based on clustering. 5) Summarization of the gene expression profile in the modules by calculating their first principal component. 6) Correlation between the modules’ summary expression profiles (representative of modules) and a set of early traits of osteoporosis and atherosclerosis. 7) Examination of the correlation between module membership and the gene significance of the gene module in the selected modules as a quality check for the modules. 8) Multivariate analysis of variance (MANOVA) of the gene modules and their member genes. 9) Pathway analysis of the member genes of the significant gene modules.

A multivariate analysis of variance (MANOVA) test implemented in the car R package was performed to test for a joint statistical association between the candidate gene modules and the studied traits of both osteoporosis and atherosclerosis. The MANOVA test was performed with three different models: model 1 for investigating a joint association between gene modules and early traits of both osteoporosis and atherosclerosis with no covariates; model 2, adjusted with age, sex and body mass index (BMI); and model 3, which was the same as model 2 but additionally adjusted for smoking, alcohol consumption and variables related to physical activity. The early traits, one for osteoporosis and another for atherosclerosis, that had the most significant correlation with the module’s summary expression profile (Fig. 2) were chosen for the test. Biological pathway (groups of biologically related genes) analyses based on the Gene Ontology (GO)19, the Kyoto Encyclopedia of Genes and Genomes (KEGG)20 and the Disease Ontology (DO)21 were performed for the genes involved in the significant gene modules using clusterProfiler and DOSE R/Bioconductor packages22,23. In order to simplify the interpretation, significantly enriched pathways across all of the significant gene modules were clustered using the “dendextend” R package based on the similarity of member genes24.

Figure 2
figure 2

Relationships between gene co-expression modules (x-axis) and early traits (y-axis). The rows correspond to the different gene modules (named by colour) and their summary expression profile; for example, brown represents the summary expression profile for the module brown. The columns correspond to the measured early traits of osteoporosis (pQCT traits) and atherosclerosis (ultrasound traits) used in the biostatistical analyses. The values in the cells represent Pearson's correlation coefficients (r), with the associated Bonferroni-adjusted p-values in parentheses below the coefficient. The modules are named by colour, and the correlation coefficients have a colour-coding shown in the colour scale (between − 1 and + 1) on the right side of the figure. Acronyms: DRToBMC, distal radius total bone mineral content; DRTrD, distal radius trabecular bone mineral density; RSCoBMC, radial shaft cortical bone mineral content; DTToBMC, distal tibia total bone mineral content; DTTrD, distal tibia trabecular bone mineral density; TSCoBMC, tibia shaft cortical bone mineral content; CIMTavg, average carotid intima-media thickness; CIMTmax, maximum carotid intima-media thickness; BIMTavg, average bulbus intima-media thickness; BIMTmax, maximum bulbus intima-media thickness.

All statistical analyses and data processing were performed using the statistical package R version 3.4.325. Findings with p.adj < 0.05 were reported as significant. Considering the exploratory nature of this study, p.adj < 0.25 were reported as suggestively significant. Similar reporting approach has been used by others26.

Results

Study population characteristics

The present study is based on 1,032 participants, aged 30–45 years (56% women), from the Young Finns study 2007 follow-up. The clinical and other detailed characteristics of the study population are shown in Table 1. Number of diseases are based on self-reports15. The measured early traits of osteoporosis and atherosclerosis are shown in Table 2. The statistical analyses were based on four atherosclerotic and six osteoporotic traits. The gene expression levels of the study participants were profiled in 2011 follow-up.

Identification of gene modules

The hierarchical clustering of the dissimilarity matrix defined 38 modules, containing 12–5,697 highly correlated genes. Following the standard practice in WGCNA analysis, the gene modules were named according to colour for downstream analysis, as shown in Fig. 2.

Gene co-expression module trait relationships and identification of the most significant gene modules

Pearson’s correlation coefficients (r) between the modules’ summary expression profiles and the studied ten early traits were analysed. Six modules (brown, honeydew, darkseagreen, lightcoral, green and tan) were found to be significantly correlated with several of the early traits of both osteoporosis and atherosclerosis (Fig. 2). Constituent genes of the six modules are listed in Tables S1S6. Two modules, darkseagreen and lightcoral, were significantly correlated with all of the studied early traits of both osteoporosis and atherosclerosis. Within the lightcoral module, the correlation values (r) between the module’s summary expression profile and different study traits varied between 0.17 and 0.81, and the Bonferroni adjusted p-values (p.adj) for these correlations were between 2 × 10–6 and 2 × 10–240. Within the darkseagreen module, the corresponding r was 0.14–0.23 and p.adj ≤ 2 × 10–4 for all traits. Similarly, the module honeydew was significantly but inversely correlated with all of the studied traits of atherosclerosis and with all but one (i.e. DTTrD) early trait of osteoporosis (r varied between -0.12 and -0.19, p.adj ≤ 0.005 for all traits). The brown module was significantly associated with the carotid-IMT-related variables CIMTavg (r = -0.11, p.adj = 0.01) and CIMTmax (r = -0.11, p.adj = 0.01). The same module was also significantly correlated with three of the pQCT bone measurements, the highest correlation being with TSCoBMC (r = -0.12, p.adj = 0.005). Similarly, the green and tan modules were significantly associated with several early traits of both osteoporosis (r ≥ 0.11, p.adj < 0.05) and atherosclerosis (r ≥ 0.11, p.adj < 0.05). All six significant gene modules (Fig. 2) had a significant correlation between GS and MM with regard to the early traits of both atherosclerosis and osteoporosis, with r ≥ 0.34 and a p-value of \(\le 0.05\) (Figure S2S7).

Multivariate analysis of the six significant gene co-expression modules with early traits of osteoporosis and atherosclerosis

The results from the MANOVA test with the three different models (1–3) described in Sect. 2.6 are presented in Table 3. The Pillai’s Trace statistic represents the magnitude of the effect of the module’s summary expression profile on the early traits of osteoporosis and atherosclerosis. The value of the statistic ranges from 0 to 1, with higher values meaning higher effect. The F-value represents the predictive ability of a model: a higher F-value suggests higher significance of a model. All six gene modules were jointly and significantly associated with early traits of both osteoporosis and atherosclerosis in model 1 (p.adj \(\le\) 3.7 × 10−5). Two of the six significant modules, honeydew and green, were jointly and significantly associated with early traits of both osteoporosis and atherosclerosis in models 2 and 3 (p.adj \(\le\) 0.03).

Table 3 Multivariate analysis of variance (MANOVA) test results from association analyses between the summary expression profiles of the six significant gene modules and early traits of both osteoporosis and atherosclerosis, using three differently adjusted models (1–3), as described in the table footnote*. The most significant gene modules across all the three tested models are in bold.

Multivariate analysis of the member genes of the six significant gene modules with early traits of osteoporosis and atherosclerosis

According to model 1, there were 261 genes significantly and jointly associated with early traits of both osteoporosis (TSCoBMC) and atherosclerosis (CIMTavg) in the brown module (Table S7). The most significant gene (p.adj = 1.2 × 10−10) in the brown module with model 1 is solute carrier family 16 member 10 (SLC16A10) (Table 4). Models 2 and 3 identified Zinc Finger Protein 594 (ZNF594) as the most significant gene in the brown model, with a suggestive significance level (p.adj = 0.08).

Table 4 Multivariate analysis of variance (MANOVA) test results from association analyses between member genes of the six significant gene modules and early traits of both osteoporosis and atherosclerosis using three differently adjusted models (1–3), as described in the table footnote*. The topmost ranking genes based on the Bonferroni-adjusted p-value for each of the three models are presented.

According to model 1, the honeydew module had 31 genes significantly and jointly associated with early traits of both osteoporosis (DRToBMC) and atherosclerosis (BIMTmax) (Table S8), the most significant (p.adj = 7.4 × 10−16) being Nitric Oxide Synthase Interacting Protein (NOSIP) (Table 4). NOSIP was also the most significant gene of the honeydew module with models 2 (p.adj = 0.03) and 3 (p.adj = 0.09).

The darkseagreen module had 25 genes significantly and jointly associated with early traits of both osteoporosis (RSCoBMC) and atherosclerosis (CIMTavg) in model 1 (Table S9), the most significant (p.adj = 6.8 × 10−24) being Myeloperoxidase (MPO) (Table 4). Models 2 and 3 identified Protein Tyrosine Phosphatase Non-Receptor Type 20 (PTPN20) as the most significant gene in the darkseagreen module, with a p.adj of 0.04 in model 2 and p.adj of 0.13 in model 3.

Only 13 genes in the lightcoral module were significantly and jointly associated with early traits of both osteoporosis (RSCoBMC) and atherosclerosis (BIMTmax) in model 1 (Table S10) and none in models 2 and 3 (Table 4). The most significant gene (p.adj = 1.8 × 10−253) in model 1, LOC100133662, is uncharacterized.

The green module had 18 genes significantly and jointly associated with early traits of both osteoporosis (DRToBMC) and atherosclerosis (BIMTavg) in model 1 (Table S11), the most significant (p.adj = 3.3 × 10−14) being MAP7 Domain Containing 2 (MAP7D2) (Table 4). Model 2 identified Glucoside Xylosyltransferase 2 (GXYLT2) as the most significant gene, with a p.adj of 0.18, and model 3 identified Tripartite Motif Containing 63 (TRIM63) as the most significant gene, with a p.adj of 0.07.

The tan module had 132 genes significantly and jointly associated with early traits of both osteoporosis (DRToBMC) and atherosclerosis (CIMTavg) in model 1 (Table S12), the most significant (p.adj = 2.1 × 10−61) being an uncharacterized gene (HS.412918) (Table 4). Model 2 identified chromosome 17 open reading frame 28 (C17ORF28) as the most significant gene in the tan module, with a p.adj of 0.23.

Pathway analysis of gene modules shared by early traits of osteoporosis and atherosclerosis

Biological pathways significantly enriched (p.adj < 0.05) across all six gene modules were clustered into four groups based on the similarity of member genes (Fig. 3). The largest cluster, coded with green in Fig. 3, mostly contained pathways related to diseases (including atherosclerosis and mouth disease) and the immune response. The second largest cluster, coded with blue in Fig. 3, contained pathways related to the immune response. The third and fourth clusters (coded with orange and grey, respectively, in Fig. 3) contained pathways respectively related to RNA metabolism and olfactory receptors.

Figure 3
figure 3

Biological pathways shared by early traits of osteoporosis and atherosclerosis. Dendrogram plot representing the hierarchical clustering of biological pathways significantly enriched (Bonferroni-adjusted p-value < 0.05) in member genes of the six significant joint gene modules (brown4, honeydew1, darkseagreen4, lightcoral, green and tan). The four clusters, from the largest to smallest in size, are represented by green, blue, orange and grey colours. Biological pathways were defined as sets of genes derived from three different knowledge bases: the Gene Ontology (GO), the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Disease Ontology (DO).

Member genes in the brown module were significantly (p.adj < 0.05) enriched with nine GO terms (four biological processes, one molecular function and four cellular components) and two KEGG pathways (Table S13). Eight of the nine pathways belonged to the orange cluster, representing RNA-metabolism-related pathways (Fig. 3). Genes in the honeydew module were significantly enriched with 4 GO pathways (2 biological processes and 2 cellular components) (Table S14), all of which were related to the immune response represented by the green cluster (Fig. 3). Genes from the darkseagreen module were significantly enriched with 27 GO pathways (17 biological processes, 4 molecular functions and 6 cellular components), 2 KEGG pathways and 15 DO pathways (Table S15). Most of these pathways were related to the immune response and diseases (represented by the green cluster in Fig. 3). The lightcoral module contains genes that were significantly enriched with 8 GO pathways (1 biological processes, 3 molecular functions and 4 cellular components), 1 KEGG pathway and 1 DO pathway (Table S16). All the pathways in the lightcoral module belonged to the green cluster (Fig. 3). The green module was enriched with 6 GO pathways (4 biological processes and 2 molecular functions) and 1 KEGG pathway (Table S17). The pathways are related to olfactory receptors and were represented in the grey cluster (Fig. 3).

Discussion

We performed a system-level analysis of genome-wide gene expression data to identify whole blood transcriptomic modules and the enriched biological pathways shared by early traits of osteoporosis (pQCT bone measurements) and atherosclerosis (ultrasound carotid IMT). Genome-wide expression levels from whole blood represent the average gene expression levels across all cells in whole blood and, therefore, facilitate the examination of the general expression pattern associated with a phenotype being studied. The whole blood gene expression profile recapitulates the biological processes in bone marrow, as immune cells within blood migrate back and forth between blood and bone marrow27 and are known to influence bone homeostasis28. Furthermore, a whole-blood-based approach can provide biomarkers that are easily assessable and non-invasive as opposed to bone and vascular tissue.

We identified six gene modules jointly and significantly associated with the studied traits of both diseases. Two of the six gene modules, green and honeydew, were jointly and significantly associated with the studied traits of both diseases after adjustment with age, sex, body mass index, smoking habit, alcohol consumption and physical exercise covariates. Another two modules, darkseagreen and tan, were jointly associated with the studied traits of both diseases after otherwise similar statistical adjustments, but with suggestive significance. A biological pathway analysis of the six gene modules identified several statistically significant biological pathways jointly associated with the studied traits of both diseases. Detailed gene-level analysis of the modules identified the three most significant genes (NOSIP from honeydew1 module, GXYLT2 and TRIM63 from green module) jointly associated with the early traits of the diseases. The three genes have been identified by previous studies as being independently associated with osteoporosis29,30,31 and/or atherosclerosis32,33, thus validating our findings. However, to the best of our knowledge, this is the first study to report their joint association with early traits of both the diseases.

The biological processes (detection of chemical stimulus, sensory perception of smell and detection of chemical stimulus involved in the sensory perception of smell), molecular function (olfactory receptor activity), and the KEGG (olfactory transduction) pathway enriched in the green module mostly contained olfactory receptor genes that belong to the G-protein-coupled receptor gene family. Even though olfactory receptor genes are traditionally known to be responsible for detecting odours and initiating the signalling cascade, various studies have shown their diverse physiological functions in other tissues, such as the testes34, lungs35, the brain36 and the heart37. Studies have indicated that olfactory receptors play a critical role in lipid metabolism38 and regulate heart function39. Bone morphogenetic proteins act as a regulator during bone formation and repair40. Additionally, bone morphogenetic proteins are suggested to promote the survival of olfactory receptors’ neurons41. Therefore, we speculate that olfactory receptors have an important role in bone remodelling. The green module was also enriched with a receptor ligand activity (molecular function) pathway that contains genes with anti-inflammatory and anti-atherogenic properties, such as Adiponectin (ADIPOQ)42. Adiponectin prevents atherosclerosis by decreasing oxidative stress, total cholesterol, triglycerides and low-density lipoprotein-cholesterol43. Adiponectin also increases bone mass through the activation of osteoblasts and suppression of osteoclasts44. Homophilic cell adhesion via plasma membrane adhesion molecules, a biological process enriched in the green module, contained genes such as the polio virus receptor (PVR). PVR plays an important role in inflammatory process during atherosclerosis via leukocytes movement across the endothelium45. Furthermore, studies indicate that PVR-mediated signalling inhibits osteoclast formation46.

The biological process, T-cell activation, enriched in the honeydew module plays an important role in the development of both osteoporosis47 and atherosclerosis48. Similarly, the other enriched biological process, positive regulation of lymphocyte activation, is in line with existing literature, as B-lymphocytes are known to regulate the RANK-RANKL-OPG pathway that plays a role in basal bone homeostasis, osteoclast formation and the regulation of bone resorption49. Both biological processes are also important for atherosclerosis because the disease involves immune cells such as macrophages and T-lymphocytes50.

The module darkseagreen was enriched with 15 biological pathways based on the Disease Ontology (DO), a knowledge base of human diseases. Three of the DO pathways (coronary artery disease, acute myocardial infarction and atherosclerosis) are related to atherosclerosis, and one of the DO pathways, mouth disease, is related to bone disease51. There were 17 biological pathways related to biological processes, most of which were related to the immune system, which is central to both bone and vascular health.

The most significant genes jointly associated with early traits of both osteoporosis and atherosclerosis were NOSIP (module: honeydew1), GXYLT2 (module: green) and TRIM63 (module: green). NOSIP, expressed mostly in musculoskeletal muscle and also in bone-derived and whole blood cells, is related to the metabolism of nitric oxide, which plays a crucial role in the pathogenesis of both osteoporosis29 and atherosclerosis32. A moderate induction of nitric oxide promotes bone resorption, while constitutive production of nitric oxide induces the proliferation of osteoblast-like cells. GXYLT2 is expressed in whole blood (immune cells) and several other tissues as well. The gene encodes for xylosyltransferase, which plays a role in the biosynthesis of glycosaminoglycan chains, an important constituent of proteoglycans. Proteoglycans are among several extracellular matrix molecules that congregate in atherosclerosis lesions33 and regulate the osteolytic process30. The third significant gene, TRIM63 (Tripartite Motif Containing 63), also known as MURF1 (muscle-specific ring finger protein 1), is mostly expressed in skeletal muscle but moderately also in other tissues, such as whole blood (immune cells) and osteoblastic cells, when induced by glucocorticoids. The expression of TRIM63 has been shown to promote osteoblastic cell differentiation and suppress the proliferation of osteoblastic cells31. While TRIM63 is known to play a role in the regulation of cardiac hypertrophy52, its role in atherosclerosis is unclear. All three genes are related to the immune system, which plays an important role in the development of both osteoporosis and atherosclerosis. TRIM63 is related to the innate immune system. NOSIP is involved in the metabolism of nitric oxide, which is a key player in the immune system. GXYLT2 plays a role in transferase activity, which also affects immune function.

There were certain limitations to the study. The study was based on a relatively young population cohort with an early phase of cardiovascular disease and osteoporosis and very few clinically diagnosed cases. Therefore, the early traits of the diseases used in the study were positively correlated, as shown in our previous study7. A positive correlation between the early traits of osteoporosis and atherosclerosis, such as increasing bone density with an increase in CIMT, is counterintuitive for the comorbidity hypothesis. However, we and others53 suggest that the positive correlation might be reflecting the shared biological mechanisms between bone and vascular tissue during normal growth and development. The study was thus focused on identifying transcriptomic biomarkers from whole blood that are associated with early traits of bone and vascular health. We speculate that the direction of the association between these two sets of traits representing osteoporotic and atherosclerotic comorbidity and, consequently, also the association with the identified transcriptomic biomarkers, may change systematically due to unhealthy lifestyle choices, similarly to the phenomenon known as “decoherence”54. The joint association between the significant modules (green and honeydew) and the early traits of the diseases is weak, albeit significant, perhaps due to the relatively young and healthy cohort, as described above. We, however, believe that the results show a suggestive joint association between the modules and traits of the diseases that warrants further research in a case–control setting that includes participants with clinically diagnosed osteoporosis and atherosclerosis. Another limitation of the study was the time difference (four years) between the measurement of the early traits of the diseases and the transcriptomic profile of the study participants. The study was based on the assumption that there is no substantial change in bone and carotid artery measurements over a four-year period among a healthy young population. Also, all study participants are of European origin. Further research in a case–control setting with populations of different ethnicities are needed. This study was based on microarray technology as RNA-Seq technology was still too expensive at the time of follow-up (year 2011) for a large epidemiological study such as the one in this study.

Summary and conclusion

There is a lack of omics-based studies investigating osteoporosis and atherosclerosis comorbidity in the literature despite strong and clear indications from several studies that the diseases are comorbid. We performed system-level analysis of joint associations between early traits of both diseases and transcriptomics modules. The study identified six genome-wide gene co-expression modules, and several enriched biological pathways within the modules, that are significantly and jointly associated with early traits of both the diseases, supporting our comorbidity hypothesis. Detailed analysis of the gene co-expression modules identified three genes (NOSIP, GXYLT2 and TRIM63) that might play an important role in developing dual-purpose prevention methods.