Background

Asthma is a chronic inflammatory airway disease influenced by genetic and environmental factors, characterized by variable and recurring symptoms (wheezing, coughing, chest tightness, shortness of breath), airflow obstruction and bronchospasm [1]. Asthma affects ~300 million people worldwide, leading to ~250,000 deaths annually [2]. The majority of the asthmatic population is characterized by the development of an inflammatory response following allergen exposure (allergic asthma). The airway epithelium plays a critical role in allergen sensitization in asthma. Allergens and allergen products activate specific receptors in the surface of the epithelial cells, including Toll-like receptors (TLRs). The activated epithelium releases inflammatory cytokines that enhance dendritic cell migration to lymph nodes, recruitment of innate immune and Th2 cells to the airways, and activation of Th2 cells [3, 4].

While no cure exists to date, glucocorticoids (GCs) are commonly prescribed to control asthma symptoms [5, 6]. Glucocorticoids bind to the glucocorticoid receptor (GCR), which interacts and represses (trans-repression) the activity of transcription factors that regulate the expression of inflammatory genes (e.g., transcription factors AP1 and NFKB1). The activated GCR also binds to glucocorticoid response elements (GRE) in the DNA and promotes the expression of genes with anti-inflammatory actions (e.g., GILZ MKP-1) [7].

Although the effects of GCs on the airway epithelium are well known, the mechanisms mediating GCR actions are not as well understood. A significant percentage of the asthmatic population requires the maximum doses of inhaled GCs and suffers the associated side effects and a small fraction of asthmatics are resistant to GCs [8]. Accordingly, increased insight into the molecular mechanisms mediating GC anti-inflammatory actions will assist in the development of therapies that minimize resistance and side effects as well as increase our understanding of the disease. In addition, an understanding of individual responses to GC treatment will assist in establishing treatment regimens tailored for personal susceptibility to GCs.

We previously investigated the effects of a synthetic GC, fluticasone propionate (Flovent), on the transcriptional profile of the airway epithelium of asthmatics [9]. In that study the aim was to identify genes regulated by GCs in the airways, and therefore the focus was on the most significantly altered genes. We demonstrated the GC-dependent regulation of a number of genes, including CLCA1, POSTN, SERPINB2 and FKBP5[9]. To gain further insight into the processes governing the actions of GCs in the airways, we performed an unbiased analysis to interrogate the microarray dataset from a systems-based perspective. Our approach was primarily that of a hypothesis generating experiment, where two complementary network approaches were applied to mine different aspects of cellular networks. The first method used co-transcriptional information to characterize the functional pathways activated in the airways. The second method combined protein-protein interaction (PPI) networks with gene expression changes to identify pathways altered following Flovent treatment. Subsequently, combining results from both networks provided a weight of evidence approach to elucidate mechanistic processes in the action of Flovent in asthmatics. These results were further supported by promoter enrichment analysis to highlight a putative role of PPAR in suppressing the TLR-mediated interferon pathway following GC treatment. The combination of these different approaches demonstrated how multiple analysis methods can be combined in an in silico experiment to generate novel hypotheses. Taken together, these efforts provided a concerted investigation into the action of GCs in the airway epithelium of asthmatics.

Results

Statistical analysis

Figure 1 shows an overview of the analysis workflow followed in this study. Linear model analysis of differences due to Flovent treatment revealed 1,492 genes that were differentially expressed in Flovent versus placebo treated subjects (P < 0.05, Additional file 1: Table S1). The P-values were not corrected for multiple testing and therefore the list of genes is referred to as ‘RAW’. This list included many genes with a purported role in regulation of the immune response in asthma: interleukins (IL2 IL27 and IL33), chemokines (CXCL1 CXCL17 CCL5 and CCL27), mucins (MUC2 MUC13 and MUCL1), TLR7 and NOS2, among others. Functional enrichment analysis resulted in 13 significantly enriched KEGG pathways and 132 GO biological processes (P < 0.05) (Additional file 2: Table S2). Additional file 3: Figure S1 shows the 10 KEGG pathways that resulted after filtering (see Methods), including the top ones ‘Allograft rejection’, ‘Antigen processing and presentation’ and ‘Glutathione metabolism’. Results for GO enrichment included: ‘antigen processing and presentation of peptide antigen via MHC class I’, ‘antigen processing and presentation’ and ‘positive regulation of B cell proliferation’ as the top ranking biological processes. Although the list of uncorrected P-values (RAW) contained many genes and pathways related to asthma, the results possessed a potential high false positive rate (16,015 x 0.05 = 800 theoretical false positives). After correction for multiple testing by the methods of Benjamini and Hochberg [10], 115 genes remained significantly altered with q < 0.05 (Additional file 1: Table S1). This list of genes is referred to as ‘FDR’, and subsequent KEGG pathway enrichment resulted in five pathways overrepresented; ‘Steroid hormone biosynthesis’, ‘Sulfur metabolism’, ‘Primary bile acid biosynthesis’, ‘Endocytosis’ and ‘Retinol metabolism’ (Additional file 2: Table S2 and Additional file 3: Figure S1). GO enrichment analysis returned 123 significant terms including ‘negative regulation of B cell activation’ and ‘B cell mediated immunity’ and other biological processes related to asthma. However, only a few genes were associated with these selected terms (1–2 genes with P ~ 1E-03).

Figure 1
figure 1

Overview of the analysis workflow. This figure shows an overview of the analysis workflow followed in this study as described in the text. From the initial microarray data (orange), three different approaches were pursued (light green). (I) Statistical analysis (II) Co-transcriptional network analysis and (III) Interactome network analysis. Each of these paths ends with a functional analysis (blue), which was combined into a common pathways analysis at the end of the pipeline (purple). Promoter analysis (lavender) was performed afterwards as a validation step for the conclusions from the common pathways analysis. LM = linear models. RAW = unadjusted P-values (not corrected for multiple hypothesis testing).

Analysis of the co-expression network

A co-transcriptional network was constructed from the microarray data using the ARACNE method [11]. The network was built based upon within subject differences in transcript levels between the first sample (pre-treatment) and the second sample (after Flovent treatment or placebo). The inferred network contained 7,676 nodes and 49,499 edges (Additional file 3: Figure S2), and the degree distribution followed a power-law with γ = 1.69 (Additional file 3: Figure S3). The MCODE method was applied to identify clusters of genes with similar expression profiles in the inferred network [12]. This method was originally designed to detect complexes in PPI networks, but can also be used to detect clusters in co-transcriptional networks. MCODE retrieved 109 modules, with module 1 (M1) being the highest ranked (i.e., greatest connectivity in the network). Module M1 contained 150 highly correlated genes (average Spearman ρ = 0.8074; Figure 2). Pathway analysis resulted in 19 significantly enriched KEGG pathways (P < 0.05, Additional file 2: Table S2), which were reduced to 11 after pathway filtering (Additional file 3: Figure S1). Most of the pathways enriched in M1 are directly related to biological processes in asthma, including ‘Asthma’, ‘Cell adhesion molecules (CAMs)’, ‘Antigen processing and presentation’, ‘Cytokine-cytokine receptor interaction’, ‘Toll-like receptor signaling pathway’, ‘PPAR signaling pathway’, ‘Leukocyte transendothelial migration’ and ‘Chemokine signaling pathway’. GO enrichment analysis revealed 313 significantly enriched biological processes (P < 0.05), with many categories related to asthma pathogenesis located top in the ranking (i.e., had lowest P-values). In particular, the most significant term was ‘inflammatory response’ with 28 associated genes (P = 1.12x10-19), including chemokine ligands (CXCL3 CCL4 CCL18), chemokine receptor (CCR1), complement components (C1QA C1QB C1QC C2), integrins (ITGB2), interleukins (IL1B) and lymphocyte antigens (LY86). Other top-ranked terms included ‘leukocyte migration’, ‘regulation of T cell activation’, ‘neutrophil chemotaxis’ and ‘regulation of interleukin 2 production’ (Additional file 2: Table S2).

Figure 2
figure 2

MCODE module M1. Selected KEGG and GO enriched terms in the MCODE module M1. This module contains the greatest number of genes of all modules identified via MCODE from the co-transcriptional network and has the highest MCODE score (i.e., connectivity density). A red label indicates a gene annotated in GO (biological process) with transcription factor activity. This module contains no differentially expressed genes identified at P < 0.05.

The remaining modules generally contained less information, with the exception of modules M5, M6 and M9. Module M5 contained the greatest number of differentially expressed genes (Additional file 3: Figure S4, 26 differentially expressed genes at P < 0.05, and 22 genes with q < 0.05), which were both negatively and positively correlated (absolute mean Spearman ρ = 0.665; Additional file 3: Figure S5). Only three KEGG pathways were overrepresented, including ‘Steroid hormone biosynthesis’ (Additional file 2: Table S2). GO functional analysis revealed 40 biological processes; however, 35 of them consisted of only a single gene association. Overrepresented terms included ‘steroid metabolic process’, ‘thiamin transport’, ‘positive regulation of epithelial cell proliferation’ and ‘progesterone receptor signaling pathway’ (Additional file 2: Table S2). Module M6 contained 39 correlated genes (average Spearman ρ = 0.6730; Additional file 3: Figures S5 and S6) and appeared to be functionally related to module M1. Some of the genes in module M6 participate in pathways enriched in module M1, including TLR4 ('Toll-like receptor signaling pathway'), PPARG and NFAM1 ('PPAR signaling pathway'). Module M9 had 7 correlated genes (average Spearman ρ = 0.8055) genes, including JUN, JUNB and FOS, which are part of the AP1 transcription factor that regulates the expression of inflammatory genes, and is a target of the activated GCR. The enriched KEGG pathways that passed the filtering were ‘B-cell receptor signaling pathway’, ‘MAPK signaling pathway’, ‘T cell receptor signaling pathway’ and ‘Toll-like receptor signaling pathway’ (Additional file 3: Figure S7; Additional file 2: Table S2). The top-ten ranked GO terms included ‘response to cytokine stimulus’ and ‘SMAD protein signal transduction’ (Additional file 2: Table S2).

Analysis of the human interactome

Protein-protein interaction (PPI) networks represent the skeleton on which signaling pathways are constructed and can be used to identify modules of differentially expressed genes related to the same or overlapping signaling pathways [13]. The human interactome was obtained from the BioGRID database, and the BioNet software was used to identify modules significantly enriched in differentially expressed genes [14, 15]. BioNet detected a subnetwork containing 32 genes, including 19 upregulated and 4 downregulated (P < 0.05, 19 with q < 0.05) by Flovent treatment (Figure 3). KEGG pathway enrichment analysis of this module highlighted 9 significant pathways including ‘Toll-like receptor signaling pathway’, ‘Steroid hormone biosynthesis’, ‘PPAR signaling pathway’, ‘ErbB signaling pathway’, and ‘Adipocytokine signaling pathway’ (Additional file 3: Figure S1; Additional file 2: Table S2). GO analysis identified 207 processes, with ‘alpha-beta T cell activation during immune response’ as the top-third term, although it only had a single associated gene. Some other significant biological processes not in the top rank have a direct relation to asthma pathogenesis, including ‘negative regulation of T-helper 2 type immune response’, ‘response to cytokine stimulus’, ‘regulation of inflammatory response’, ‘SMAD protein signaling transduction’ and ‘cellular response to hormone stimulus’ (Additional file 2: Table S2). However, these terms also show a low number (1–2) of associated genes.

Figure 3
figure 3

BioNet module. BioNet module annotated with selected KEGG pathways and GO terms obtained from functional enrichment analysis. Differentially expressed genes (P < 0.05) are indicated with a red circle (up-regulated in response to Flovent treatment) or a blue circle (down-regulated in response to Flovent treatment). A red label indicates a gene annotated in GO (biological process) with transcription factor activity. Grey circles indicate genes not expressed (based on XIST filtering - see Methods), which are connected to other genes through gray dashed lines.

Analysis of common pathways

Networks were constructed using the two different approaches of co-transcriptional and PPI-based analyses. To identify cellular processes active in the airways of asthmatics that are targeted by GC treatment, we examined common pathways found in the different modules. We focused on common pathways between the highest ranked MCODE module (M1) and the PPI module (BIONET). The pathways, along with the number of genes in each pathway, are shown in the heatmap in Figure 4. The only common pathways were ‘Toll-like receptor signaling pathway’ and ‘PPAR signaling pathway’. To determine whether these two pathways appeared in common by chance, we performed a simulation with random lists of genes of the same size as the BIONET and M1 modules. For each random list, we performed enrichment analysis and determined the common pathways. By repeating this process 1,000 times and computing the number of common pathways similar to those detected in our analysis of the Flovent dataset, we estimated the probability of observing common pathways by chance. The simulation resulted in a probability of random appearance as common pathways of P = 0.002 for ‘Toll-like receptor signaling pathway’. ‘PPAR signaling pathway’ did not appear as a common pathway in any of the simulations (P = 0). These results demonstrate the robustness of the common pathways approach.

Figure 4
figure 4

Common pathways analysis. Analysis of common pathways highlighted pathways present in the BioNet and selected MCODE modules. This approach resulted in Toll-like receptor signaling and PPAR signaling pathways as the pathways in common with other modules. See Additional file 3: Figure S1 for a full summary of the functional enrichment analysis.

Promoter enrichment analysis

The analysis of common pathways highlighted the TLR and PPAR pathways as targets of GC actions. Consequently, some of the genes regulated by these pathways should be differentially expressed following Flovent treatment. To explore this possibility we analyzed the promoter region of differentially expressed genes (q < 0.05) to look for transcription factor binding sites (TFBS) associated with TLR and PPAR pathways. The method Pscan was used to find enriched TFBS from the JASPAR and TRANSFAC databases [16]. The top enriched motifs in the JASPAR database were IRF2 and IRF1 (P = 4.1E-4 and P = 4.1E-3), whereas the top motif in the TRANSFAC database was V$IRSE_01 (P = 4.1E-3) (Additional file 4: Table S4). Motifs V$IRF1 (P = 0.027), V$IRF2 (P = 0.041) and V$IRF7 (P = 0.07) were also significantly enriched in the TRANSFAC database. ISRE (Interferon Stimulated Response Element) motifs are involved in the regulation of gene expression in response to interferon type 1 (IFNA-alpha/beta) activation [17]. IRF motifs are the target of Interferon Regulatory Factors (IRFs), which function as transcriptional co-activators of interferon-regulated genes [18, 19]. Interestingly, type 1 interferons (alpha and beta) are produced after activation of TLRs (especially TLR4 and TLR7). The TFBS results also showed enrichment of PPARG/RXRA binding motifs from the JASPAR database (PPARG:::RXRA, P = 0.0182) and the TRANSFAC database (V$PPARG_01; P = 0.084).

The enrichment motifs determined by Pscan provided the possibility that a transcription factor could bind to the promoter of genes. Cscan (http://159.149.109.9/cscan/) was used in order to obtain some experimental evidence for binding of transcription factors to the promoter region of the list of differentially expressed genes [20]. Cscan compiles ChIP-seq data from published experiments assessing the binding of several transcription factors and epigenetic markers in different cell lines for human and mice. Results from Cscan identified GCR as the most significantly enriched transcription factor in six different cell lines subjected to variable doses of a glucocorticoid receptor agonist (Additional file 4: Table S4). STAT1 and RXRA binding were detected at the respective ranks of 10 and 12 (Bonferroni corrected P-values: q = 0.0009 and q = 0.0018, respectively). Binding for PPARs or IFR9 could not be assessed due to a lack of available experimental data.

Interferon regulatory network

Promoter analysis suggested the type 1 interferon pathway as the most important target of GC actions. To further validate this finding we queried the database Interferome (http://www.interferome.org), which maintains a curated list of known interferon-regulated genes extracted from studies on which control and interferon-treated samples were analyzed. We found that 20 out of 114 genes in the FDR list (17%) have been previously identified as being regulated by interferons (Additional file 5: Table S5). This results in an associated probability of rejecting the null (no enrichment) hypothesis based on the hypergeometric distribution of P = 0.0505, indicating that the FDR list is enriched in interferon-regulated genes. Additional file 5: Table S5 contains the number of datasets in which regulation by a certain type of interferon was found. Type 1 interferons regulated 17 out of 20 genes in a total of 72 datasets, whereas type 2 interferons were the only source of regulation for three genes in a total of 3 datasets. Taken together, these results suggest that GCs target the type 1 interferon pathway in the epithelium of asthmatics.

Discussion

Classical statistical and functional enrichment analysis of the Flovent/placebo dataset provided little insight into the biological processes associated with the treated asthmatics. Results from the uncorrected tests contained many genes known to be associated with asthma, and some enriched pathways and GO terms linked these genes to inflammatory responses and metabolism of GCs. However, the high number of potential false positives hampered the results and correction for multiple testing eliminated most of these genes from the final list. Genes selected after multiple testing correction (FDR) identified the SERPINB2 and FKBP5 genes, which were previously validated [9]. However, the FDR approach failed to identify CLCA1, another gene experimentally confirmed [9], although it was significantly altered based on uncorrected P-values. This effect could partially be attributed to the reduced statistical power associated with the use of multiple testing correction methods compared to the uncorrected analysis [21]. It has been well demonstrated that the excessive penalty imposed on P-values following FDR adjustment increases the number of false negatives [22]. On the other hand, POSTN, which was also experimentally validated, had a P > 0.05 and therefore was not differentially expressed in our analysis. This comparison highlights some of the limitations associated with statistical analyses that rely on a single source of evidence to derive conclusions on biological function, and argues for the need for integrative approaches to data analysis.

Network analysis offers some advantages over classical analysis by being able to incorporate additional information from multiple sources [21, 23, 24]. We applied a weight-of-evidence approach by using multiple methods that mine different aspects of the cellular processes, which converged to ‘Toll-like receptor signaling pathway’ and ‘PPAR signaling pathway’ as the most relevant pathways related to GC treatment (Figure 4). Toll-like receptors (TLRs) regulate inflammatory responses by inducing the expression of inflammatory cytokines upon binding of viral or bacterial proteins and mediate the signaling pathways that regulate innate and Th2 responses in the epithelium [3, 25, 26]. Toll-like receptors, including TLR4 (found in module M6) and TLR7 (found in module M1) mediate the production of interferons alpha and beta (type 1 interferons) [27]. This important link between TLR and the interferon 1 pathway, not evident from the pathway enrichment analysis, was revealed by the promoter enrichment analysis.

Type 1 interferons act in an autocrine or paracrine manner and bind to interferon-alpha receptors in the membrane. This activates the Janus kinase and signal transducers and activators of transcription (JAK/STAT) pathway, which regulates the expression of inflammatory genes [17]. Phosphorylation of STAT1 by JAK1 activates STAT1, which forms a heterodimer with STAT2 and recruits IRF9 to form the IFN-stimulated gene factor 3 (ISGF3) transcriptional complex [18, 19]. The analysis of the human interactome identified STAT1 and IRF9, which are downregulated (P < 0.05) by Flovent (Figure 3). In addition, promoter analysis identified ISRE as the top enriched transcription factor motif in the list of differentially expressed genes (q < 0.05). This suggests a mechanism in which GCs regulate the type 1 interferon pathway by downregulating the proteins involved in interferon signal transduction. However, although both STAT1 and IRF9 are suggested as differentially expressed (P < 0.05) and exhibited correlated expression profiles (r = 0.771, P = 3.7E-07), their correlation with treatment is less clear (Figure 5). Interestingly, GCs are known to regulate the type 1 interferon pathway by suppressing the phosphorylation of STAT1[28], providing an alternative mechanism of regulation.

Figure 5
figure 5

STAT1 and IRF9 expression profiles. STAT1 and IRF9 show correlated expression profiles and are downregulated following Flovent treatment.

STAT1 phosphorylation also mediates the transduction of type 2 interferon (IFN-gamma) signaling by binding to gamma-activated sequence (GAS) elements as an homodimer. Therefore, it is possible that downregulation of STAT1 and inhibition of phosphorylation of the protein represses type 2 interferon signaling. Indeed, GCs are known to inhibit IFN-gamma signaling by downregulating STAT1 mRNA and protein expression in PBMCs [29]. Gene expression for type 2 interferon and its receptor (IFN-gamma receptor, IFNGR) were found in the microarray dataset after filtering non-expressed genes (not shown). However, promoter analysis only identified enrichment of ISRE elements, the binding motif for the ISGF3 transcriptional complex, GAS elements were not detected. In addition, interrogation of the interferome database resulted in a list of interferon-regulated genes that were, in the majority of datasets assayed, regulated by type 1 interferons. Consequently, although a role of type 2 interferons cannot be excluded, our findings suggest that type 1 interferons are the target of GCs in the epithelium of asthmatics.

Peroxisome proliferator-activated receptors (PPAR) are lipid-activated transcription factors that regulate the expression of target genes [30]. PPAR-alpha and -gamma modulate allergic inflammation, and agonists are able to reduce levels of inflammatory cytokines [31, 32]. PPARs exert their activity by forming a heterodimer with the retinoid receptor RXRA, which then binds to co-activator proteins, including PPARGC1A (PGC-1), to regulate gene expression. Transcriptional co-activators amplify the transcription of nuclear receptor regulated target genes [33]. In particular, PPARGC1A can recruit other co-activator proteins with histone acetyltransferase (HAT) activity that open up the chromatin and enhance the expression of target genes [34]. Consequently, an increase in the expression of RXRA and PPARGC1A co-activators can intensify the activity of the PPAR pathway, resulting in an increased repression of STAT1 phosphorylation. Both RXRA and PPARGC1A are present in the BioNet module and are upregulated by Flovent (Figure 3). Promoter analysis also resulted in an enrichment of genes with PPARG:RXRA motifs (P ~ 0.01). This suggests that GCs modulates the activity of the PPAR pathway by upregulating co-transcriptional activators. Supporting this evidence, the expression profiles for RXRA and PPARGC1A are correlated (r = 0.4078, P = 0.022) and show a trend of higher levels in Flovent-treated patients (Additional file 3: Figure S8).

Our results link GCs to type 1 interferon and PPAR pathways. There is previous evidence that connects PPARs to interferons, providing an interesting mechanism integrating GC actions. Activation of the PPAR-alpha (PPARA) pathway was found to suppress STAT1 phosphorylation in rat glia [35]. However, in our study promoter analysis identified PPARG, but not PPARA (P ~ 0.2), motifs enriched in the list of differentially expressed genes. This lack of significance could be explained by the fact that PPARA motifs are missing in the JASPAR database, which retrieved higher significant results for the complex PPARG:RXRA transcription factor motif. In addition, our search strategy looked for motifs in the 1 kb region upstream of the transcription start site, whereas some transcription factors can bind to more distal locations. Alternatively, as our data suggest, GC action on type 1 interferons may be mediated via a PPAR-gamma-dependent process. For example, it has been shown that PPAR-gamma can repress the type 1 interferon pathway by downregulating the production of INF-beta upon TLR4 activation [36]. Treatment with the PPAR-gamma agonist troglitazone and challenge with LPS and poly(I:C) impaired IRF3 binding to the IFN-beta-promoter. Downregulation of IFN-beta prevented activation of the IFN-beta receptor and subsequent STAT1 phosphorylation and ISRE activation [36]. However, in our dataset type 1 interferons were present, but not differentially expressed (average P ~ 0.58 for type 1 interferon genes in the microarray). Interestingly, activation of the PPAR-gamma pathway was previously found to downregulate the expression of IFN-gamma activated genes [37]. These findings highlight the complex nature of PPAR-mediated interferon regulation, which can affect different pathways (type 1 vs. type 2) at multiple regulatory points.

Our findings provide a link between interferon, PPARs and GCs that suggests a model similar to that presented in Figure 6. In this model, allergens activate the TLR signaling, which in turn activates the production of type 1 interferons. The alpha/beta interferons then bind to interferon receptors (IFNAR1/2) that stimulate the phosphorylation of STAT1 and promote the expression of genes with a number of inflammatory effects. Treatment with GCs upregulates PPARGC1A and RXRA coactivator molecules, which consequently enhance the PPAR pathway. Activation of the PPAR pathway inhibits phosphorylation of STAT1 and therefore inhibits the interferon pathway. In addition, GCs may repress the interferon pathway by downregulation of STAT1 and IRF9 transcription factors. In this model, PPARs could be potential mediators of the anti-inflammatory actions of GCs. Both PPAR-alpha and -gamma inhibit airway inflammation in a murine module of asthma [32]. The use of PPAR-gamma agonists has been shown to evidence improvements in lung function of smokers with asthma (improved FEV1; forced expiratory volume in 1 sec), who had previously demonstrated a reduced response to GC treatment [38]. While the mechanism for the observed improvements in lung function was unclear, it was postulated that PPAR-gamma could independently modulate a set of inflammatory genes relative to GCs [39].

Figure 6
figure 6

Model of glucocorticoid (GC) actions in the airways epithelium. Simplified (A) and detailed (B) model of PPAR-mediated anti-inflammatory actions of GC in lung epithelial cells. Following activation by allergens, TLRs activate the type 1 interferon pathway. Interferons induce the expression of inflammatory genes mediated by phosphorylated STAT1 and IRF9 transcription factors. GC treatment represses the interferon pathway via two potential mechanisms. First, upregulation (orange boxes) of RXRA and PPARGC1A enhances the activity of the PPAR pathway, which inhibits STAT1 phosphorylation, repressing interferon signaling. Secondly, down-regulation (blue boxes) of STAT1 and IRF9, reduces the amount of the transcriptional complex responsible for interferon-regulated gene expression. Clear arrowheads indicate phosphorylation. Solid arrowheads indicate activation. A dashed line represents transcriptional regulation (arrow: activation; T: repression).

A link between GCs and the interferon pathways has been previously reported [5, 4042], and interferons are known to affect symptoms in asthmatics [4346]. Interferon-alpha has been associated with severe exacerbation of asthma symptoms [43], which provides a simple mechanistic interpretation of the beneficial effects of GC-mediated repression of the interferon pathway. On the other hand, low doses of interferon-alpha have been associated with therapeutic effects in GC-resistant patients [4446]. It is unclear if these disparate responses are due to differences in interferon dose or different patient phenotype. This clearly highlights the inherent complexity of the underlying regulatory networks and the need of further studies investigating the mechanisms of GCs and their relation with the interferon pathway.

Conclusions

Multifactorial diseases such as asthma challenge our ability to identify distinct mechanisms involved in disease etiology and pathology. The advent of high-throughput approaches has increased the amount of information that can be extracted from a sample set, but linking together disparate datasets remains an elusive goal. Technologies that summarize the principal disease processes affected (e.g., functional enrichment analysis) assist in characterizing the primary functional features, but often fail to suggest detailed information of disease and treatment action. Accordingly, approaches that incorporate information regarding multiple biological aspects from disparate sources can stimulate the generation of hypotheses that will uncover novel mechanisms. Given that asthma is a heterogeneous inflammatory condition, it is of interest to examine for distinct molecular phenotypes. For example, it has been shown that responsiveness to inhaled GCs correlates with the degree of Th2 inflammation. It is possible that distinct subgroups of asthmatics exist with specific shifts in PPAR/TLR pathways that correlate with GC responsiveness or other molecular phenotypes, such as the IL-13 related Th2 sub-phenotypes previously described for the current data set [47, 48]. Further investigations to investigate sub-phenotypical responses in the PPAR/TLR pathways are thus warranted.

With the workflow presented herein, we were able to identify specific pathways as being targets of GCs in the epithelium of asthmatics. Results highlighted the prominent role of interferons in mediating inflammatory processes in asthmatics, and further supported the finding that GCs mediate their activities by repressing the interferon pathway. We found that this repression may be mediated by GC-dependent upregulation of the PPAR pathway. A potential mechanism of direct repression of the interferon pathway by downregulation of key transcription factors is also suggested. If confirmed, these findings will be potentially valuable in the design of new therapies tailored for the GC-resistant asthmatic sub-population.

Materials and methods

Microarray data acquisition and preprocessing

The microarray data set was published previously, and is available in the GEO database (accession number GSE4302) [9]. The dataset consisted of 64 Affymetrix HGU133plus2 arrays measuring expression profiles from airway epithelial brush biopsies obtained from 32 asthmatics in a randomized, blinded parallel group study design with sampling at baseline levels and after 1 week treatment with either placebo or fluticasone propionate (Flovent). Observation 55 was identified as an outlier due to poor technical quality of the microarray data, and was excluded from all analyses. Custom CDFs (Chip Description File) from the BrainArray Project were used to map the array probes to 17,788 probe sets corresponding to individual Entrez Gene entries [49]. Arrays were preprocessed with the RMA method from the Bioconductor package affy[50, 51]. Non-expressed genes were eliminated on the basis of XIST expression, which is not expressed in males [52]. The distribution of intensities for XIST in male and females was computed (Additional file 3: Figure S9), and the background threshold estimated as the median value of the distribution in males (log2 median intensity = 5.32). Genes were eliminated when the intensity in all arrays was below that of the determined threshold value. The resulting microarray data contained 16,015 remaining genes. Baseline correction was performed by computing the log difference of the intensity values from baseline and post-treated arrays prior to statistical analysis.

Statistical analysis

Statistical analysis was performed with the Bioconductor package limma[53]. A linear model was constructed including drug (i.e., Flovent or placebo), sex and age in the design matrix. P values associated with the comparison Flovent vs. placebo were computed. Multiple testing correction was applied using the method of Benjamini and Hochberg [10].

Co-transcriptional network

A co-transcriptional network was constructed from the microarray data using the ARACNE method, which uses mutual information (MI) to test statistical dependency between expression profiles [11]. Due to concerns about the effects of low sample numbers in combination with the inherent biological variability upon the statistical dependency, all individuals in the dataset (Flovent-treated and placebo-treated) were included in the co-transcriptional network inference. To reduce the inter-individual variance from the gene expression profiles, expression values were normalized to the baseline values for each individual (i.e., samples collected prior to initialization of treatment regimen). Mutual information was computed using a Gaussian Kernel estimate. Two critical parameters for the ARACNE method are the kernel width and the MI threshold, which are normally estimated using precomputed calibration curves. However, for this project, custom calibration curves were constructed using the MATLAB scripts provided in the original publication [11]. A calibration curve for kernel width was generated (Additional file 3: Figure S10A) and the optimal kernel width was determined to be 0.2179931, which is 5.3% lower than that estimated by ARACNE (0.230167). The calibration curve for the MI threshold was computed (Additional file 3: Figure S10B) and used during network reconstruction. Next, the optimal P-value cutoff for network reconstruction was determined by computing networks at different P-values over a sufficiently wide range. For each network, the number of nodes, edges and degree distribution was computed (Additional file 3: Figure S11A), and a P-value of 1.0E-8 was chosen, which resulted in a power law degree distribution for the network (Additional file 3: Figures S3 and S11B). Finally, in order to assess the effect of sample ordering upon correlation estimations (MI), a bootstrapped network was computed by setting 100 cycles and the consensus network was calculated with a Bonferroni corrected q < 0.001[54].

Protein-protein interaction network

Information about protein-protein interactions (PPI) was obtained from the BioGRID database [15], which contains experimentally validated data from a range of high-throughput and low-throughput techniques. The database version 3.1.70 contained information on 9,935 genes, which formed a network with 50,531 edges. This network contained multiple edges joining the same pair of genes (i.e., the interaction has several sources of evidence), and therefore for module detection the network was simplified by using the simplify function in the R package igraph (http://igraph.sourceforge.net), to reduce the number of edges per pair of genes to one. The resulting simplified network contained 33,282 edges.

Module detection

For the co-transcriptional network, detection of clusters of densely connected nodes was performed using the MCODE plugin for the Cytoscape software with default parameters [55]. For the BioGRID PPI network, module detection was performed using the Bioconductor package BioNet [14], which enables the determination of an exact solution to the problem of finding connected sub-graphs with low P-values. A Binomial Uniform Mixture (BUM) model was fitted to the distribution of P-values, and scores were derived for each network node at a given FDR (q = 0.05). Finally, these scores were used to detect modules using the exact Heinz method [56].

Functional enrichment analysis

Functional enrichment analysis for KEGG pathways and Gene Ontology (GO) Biological Process (BP) terms (downloaded on 2010/09/07 and 2010/09/04 respectively) was performed using a hypergeometric test as implemented in the Bioconductor package GOstats[57]. For GO, a conditional test that considers the dependence structure of the GO terms was performed [58]. The lists of probe sets from the statistical analysis and modules from the co-transcriptional network were converted into lists of Entrez Gene identifiers prior to enrichment analysis. Enrichment results were considered significant at P < 0.05.

KEGG pathways were restricted to those involved in biological processes. Consequently, disease pathways were discarded (except the KEGG pathway ‘Asthma’). In addition, pathways that were supported with less than two genes in any module or list of differentially expressed genes were discarded. The final list of KEGG pathways was used to filter modules in the co-transcriptional network. Modules that did not contain any genes annotated in the KEGG pathways that passed the filtering were discarded. Pathways were grouped using a hierarchical clustering with Pearson correlation as a distance measure, and the gene counts displayed as a heatmap.

Common pathways analysis

To link the results from the co-transcriptional and PPI network analyses, common pathways were selected when present simultaneously in the BIONET module and the MCODE module M1 from the co-transcriptional network. To assess the possibility that the selected common pathways appeared by chance in our dataset, we performed simulations with random lists of genes. First, background lists of genes were generated by randomly selecting genes from the total number of genes (universe) used for enrichment analysis. The number of genes in each simulated list was the same as the target list (i.e., the same number of genes as found in the BIONET and MCODE module M1). A KEGG enrichment analysis was then performed for each list as described above, and the common pathways were computed. Finally, the common pathways in the simulation were compared to those obtained with the real dataset and the number of positive hits n recorded. This process was repeated N = 1000 times and the probability of a pathway appearing by chance was computed as the number of times it appeared as common in the simulated datasets divided by the number of simulations (P = n / N).

Promoter enrichment analysis

Promoter enrichment analysis was performed to identify transcription factor binding motifs enriched in the promoter regions of sets of genes by using the software Pscan[16]. Entrez Gene identifiers were converted into REFSEQ ids and these were loaded into the Pscan web server. JASPAR and TRANSFAC databases of transcription factor motifs were selected to analyze the region from −950 to +50 bases relative to the transcription start site (TSS). Transcription factor binding assessment was performed to identify experimental evidence (based on published ChIP-seq data) of binding of transcription factors to the promoter region of the differentially expressed genes. Entrez Gene identifiers were converted into REFSEQ ids, which were loaded into the Cscan web server [20]. A region between −1,000 upstream of the TSS and the transcribed region of the gene was selected to identify transcription factor binding events.

Interferon regulated genes

The database Interferome (http://www.interferome.org) was used to identify interferon target genes [59]. This database catalogs genes regulated in microarray and other experiments where comparisons between control and interferon treated samples were performed. Entrez Gene identifiers were converted into EnsEMBL gene identifiers and loaded into the Interferome web server. Enrichment of interferon-regulated genes was computed using the hypergeometric test implemented in the R software. The parameters were set with 20,900 as total number of genes in the human genome (based on data from the EnsEMBL database release 64) and 2,000 as the number of interferon-regulated genes (as stated in the Interferome web site). Since the number of interferon-regulated genes is an estimate, the enrichment calculation is only approximate.