Background

MicroRNAs (miRNAs) are short single stranded, non-coding RNAs that regulate protein-coding mRNAs [14]. Mature miRNAs cause either target mRNA degradation or translational repression [4] by inducing cleavage or inhibiting translation in the 3'-untranslated regions (UTRs) of the target mRNA [2, 3]. In spite of the continuous attempts to identify miRNAs and to elucidate their basic mechanisms of action, little is understood about their biological functions.

Because of the regulatory role of miRNAs [5] and lack of direct functional annotation to miRNAs, current functional enrichment methods for miRNAs rely instead on their target genes' functional annotations [68]. If the target genes of a specific miRNA are significantly enriched with a set of Gene Ontology (GO) terms, it is reasonable to infer that the miRNA is also involved in the same GO annotations. As only few experimentally validated targets are available, current methods of target gene's annotation-based inference of miRNA function rely on target prediction algorithms such as TargetScan [9, 10] and Pictar [11].

Many studies on miRNAs have used this "predicted target-genes’ functional annotation-based" miRNA function prediction strategy. Gaidatzis et al. [12] applied a log-likelihood test for functional enrichment analysis for KEGG pathways. Gusev [13] used hypergeometric distributions for GO and pathway-based enrichment analysis. Xu and Wong [14] applied hypergeometric distribution test to detect significant over-representation of miRNA cluster targets in BioCarta pathways. Similar methods using GO, KEGG and BioCarta pathways were implemented in miRGator [15] and SigTerms [16], applying hypergeometric distributions to evaluate functional enrichment.

The target links from miRNAs to genes, however, show very uneven distributions. So do the links from genes to GO terms. One miRNA may regulate more than several hundreds of targets and one gene may be controlled by many miRNAs [17]. In contrast, the current methods that rely only on the predicted target genes' functional annotations are not powerful enough to capture such variability. For instance, if a certain miRNA targeting hundreds of genes is shared by different miRNA clusters, the clusters' functional annotations may become very similar even though they consist of very different miRNA members, just because they share the 'very bush' one. Another limitation of the current methods is that they treat all target genes equally. One should differently weight genes that are targeted by only one member from those that are targeted by all members of a miRNA cluster. In summary, the current functional enrichment methods for miRNA cluster have limitations of not considering the tri-partite network topologies from miRNAs to genes to functional annotations regarding multiplicity and cooperativity, containing more information than simple target gene counts.

For the purpose of illustration, Figure 1(A) and 1(B) exhibit example cases where the same numbers of miRNAs (k = 5) from equal-sized clusters (k = 6) are targeting the same numbers of target genes (k = 6) from equal number of genes (k = 11) that are annotated to a specific GO term, GO:0030282 and GO:0051482, respectively. The numbers of target links between Figure 1(A) and 1(B), however, are differently 8 and 22, respectively. Figure 1(C) and 1(D) exhibit cases where the numbers of miRNAs connected to a specific GO term, GO:0015917 and GO:0030851, are differently 6 and 3, respectively, while the numbers of links (k = 6) are the same. It is clearly demonstrated that the current approach only based on target gene counts is unable to discern the difference in these targeting relations.

Figure 1
figure 1

Indiscernibility example. Calculating target gene-centric (ρ) hypergeometric distribution cannot discern the completely different targeting topologies between (A) and (B) and between (C) and (D), resulting the same p-values (p = 0.30325 and 0.31120), respectively. The target link-centric (τ p-values can discriminate (A) and (B) (i.e., p = 0.62358 and 0.00956, respectively) and the miRNA-centric (μ p-values can discriminate (C) and (D) (i.e., p = 0.00695 and 0.65253, respectively). *p < 0.05, hypergeometric test.

The present study proposes a more generalized conceptual framework to develop and analyze new functional enrichment measures. According to the framework, the traditional "predicted target-genes' functional annotation-based" miRNA function prediction method is regarded as 'target gene-centric' denoted by ρ because it eventually considers only the fraction of the target genes among those that are annotated to a specific GO. Under the proposed framework, we derive 'target link-centric' (τ) and 'miRNA-centric' (μ) measures, considering the numbers of links and miRNAs linked to a specific GO term.

Figure 1 illustrates that while the traditional target gene-centric ρ measure cannot discern (A) and (B) (p = 0.30325) nor (C) and (D) (p = 0.31120), the newly proposed τ and μ measures successfully discern (A) and (B) (i.e., p = 0.62358 and p = 0.00956, respectively) and (C) and (D) (i.e., p = 0.00695 and p = 0.65253, respectively). It is clearly demonstrated that different measures calculated from different viewpoints significantly impact the result of functional enrichment analysis of miRNA clusters. We also propose a rank statistic for the purpose of systematic comparison in terms of the average specificity and functional homogeneity of the significantly enriched term for each GO category, Biological Process (BP), Molecular Functions (MF), and Cellular Components (CC). We show that the proposed miRNA-centric measures identify more specific and functionally homogenous sets of GO annotations for miRNA clusters.

Methods

Dataset: miRNA clusters

We used publicly available co-expressed and differentially expressed miRNA clusters for comparative evaluation of the proposed methods. For co-expressed miRNA clusters, we obtained the data created by Ruepp et al. [18] that show correlated expression patterns across several human diseases. The data can be downloaded from Ruepp et al. [18] (http://genomebiology.com/content/supplementary/gb-2010-11-1-r6-s2.xls). Forty three among the 47 clusters having at least one target gene were used in this study. Differentially expressed miRNA sets consisting of up- or down-regulated genes in six solid tumors were also downloaded [19]. MiRNAs down-regulated in colon cancer had no target gene and hence were excluded in the present study. Supplement Tables S1 and S2 in 'Additional file 1' list the 54 (= 43 + (2 × 6) - 1) miRNA clusters from the two studies with the associated information.

Creating variations of miRNA-mRNA target pairs for comprehensive evaluation

Another input of our analysis is the target gene list of each miRNA that will guide the functional enrichment test based on the gene annotations. Considering that only a few experimentally validated miRNA targets are available, we use miRNA-mRNA target pairs obtained from computational target prediction methods. Prediction algorithms generate a relatively high level of false positives [20] and the degree of overlap between predicted targets from different methods is often poor or null [21]. Given the lack of 'gold standard' for miRNA and target gene pairs, we consider a wide range of variations in miRNA-gene pair relations for comprehensive evaluation. We used miRecords [22] and miRGen [23], which are integrated resources of miRNA-target interactions from 11 established target prediction algorithms and from four most widely used target prediction programs, respectively. We created 21 variations for predicted target pairs by considering the number of positive voters from the included algorithms by miRecords (Table 1, upper panel) and six variations by applying the four programs of miRGen (Table 1, lower panel). Because most of the evaluation results from these variations were largely comparable, the most representative variation #6 in Table 1 was used to describe the overall study results in the following sections. Variation #6 was created by applying the 11 algorithms provided by miRecords, wining more than three positive voters and resulting in 1,569,741 target links from 553 miRNAs to 17,636 genes. As the number of required positive voters is increasing, the numbers of miRNAs, links and genes are decreasing as can be seen in Table 1.

Table 1 Variation for predicted miRNA-gene target pairs

Target gene-, target relation-, and miRNA-centric calculations of hypergeometric distributions

Now we describe the details of the proposed measures in a proposed conceptual framework. Suppose we want to test the functional enrichment of a miRNA cluster with respect to a specific GO term (or annotation). In most previous approaches, one first constructs a corresponding target gene cluster consisting of all the genes targeted by at least one member in the miRNA cluster. Then the numbers of target genes annotated (ρ i ) and not annotated (ρ j ) by the GO term are used in the two by two contingency table along with the numbers of genes not in the target cluster and are either annotated (ρ k ) or not annotated (ρ l ) with the term, as shown in Figure 2(B). Functional enrichment is tested from this contingency table using a hypergeometric distribution. These traditional target gene-centric (ρ) methods are limited in that they consider only the fraction of target genes connected to a specific annotation for each annotation [1214], as already illustrated in Figure 1. To this rather confusing problem, the diagram and contingency tables in Figure 2 provide a conceptual framework to understand and correctly design new functional enrichment measures. The diagram of miRNA, gene and annotation worlds in Figure 2(A) depicts the tri-partite network topology between the three worlds such that one can drive the quartet numbers to create contingency tables for miRNA-centric (τ) and target link-centric (μ) as well as for the target gene-centric (ρ) measures (Figure 2(B)~(D)).

Figure 2
figure 2

Framework for developing three types of miRNA functional enrichment measures. A conceptual framework is constructed to consider the tri-partite network topology. (A) A miRNA cluster under investigation contains the members, μ i and μ j , targeting genes that are associated (ρ i ) and not associated (ρ j ) with a specific GO term of interest through τ i and τ j , respectively. Non-member miRNAs may be associated (μ k ρ k ) or not (μ l ρ l ) with the GO term through τ k and τ l . Counts for (D) miRNA-centric (μ) and (C) target link-centric (τ) as well as (B) target gene-centric (ρ) are listed by two-by-two contingency tables. The closed and broken circles in the miRNA world depict the miRNA cluster under investigation and the subset miRNAs targeting the genes that are associated with a specific GO term of interest.

Under this conceptual framework in Figure 2, subscripts i and k represent positive and subscripts j and l negative connections to the GO term. Subscripts i and j represents connections from inside of and k and l from outside of the targeting miRNA or target gene clusters. The traditional ρ i and ρ j , for example, correspond to the sets of target genes that are annotated (ρ i ) and not annotated (ρ j ) to a specific GO term. ρ k and ρ l denote non-targeted genes that are annotated (ρ k ) and not annotated (ρ l ) to the GO term. We can develop a miRNA-centric measure in the conceptualized three framework in a consistent way. We define μ i and μ j as the miRNAs in the cluster whose target genes are annotated (μ i ) and not annotated (μ j ) to the GO term. As in the case of a gene-centric measure, μ k and μ l correspond to miRNAs outside of the cluster whose target genes are annotated (μ k ) and not annotated (μ k ) to the GO term. Similarly, for a target link-centric measure, we define τ i and τ j as the target links connecting members of the miRNA cluster in μ i and in μ j , respectively, to genes that are connected (ρ i ) and not connected (ρ j ) to a specific GO term. Remaining miRNAs outside the cluster, μ k and μ l , target genes through τ k and τ l that are headed to genes that are connected (ρ k ) and not connected (ρ l ) to the GO term.

To formally define the three measures, let ρ, τ, and μ be the random variables that represent the number of target genes, target links, miRNAs, respectively, which are linked to a specific GO term as explained above. The following three equations, (1), (2), and (3), describe the hypergeometric distributions of ρ, τ, and μ, respectively.

p r o b a b i l i t y ( ρ = ρ i ) = ρ i + ρ k ρ i ρ j + ρ l ρ j ρ i + ρ j + ρ k + ρ l ρ i + ρ j
(1)
p r o b a b i l i t y ( τ = τ i ) = τ i + τ k τ i τ j + τ l τ j τ i + τ j + τ k + τ l τ i + τ j
(2)
p r o b a b i l i t y ( μ = μ i ) = μ i + μ k μ i μ j + μ l μ j μ i + μ j + μ k + μ l μ i + μ j
(3)

Note that for notational convenience, we now used ρ a , τ a , μ a for a ∈ {i, j, k, l}, instead of |ρ a |, etc., to represent the number of members in the corresponding set by abuse of notation. The p-value for the enrichment test from hypergeometric distribution of the random variable ρ is calculated from the cumulative probability of observing at least ρ i out of ρ i + ρ j times. Accordingly, the p-value from each of the three measures can be defined as follows;

p - v a l u e ρ = p r o b a b i l i t y ( ρ ρ i )
p - v a l u e τ = p r o b a b i l i t y ( τ τ i )
p - v a l u e μ = p r o b a b i l i t y ( μ μ i )

These probabilities are computed using the phyper and dhyper functions in R 'stats' package.

Combining P-values

For the purpose of comprehensive evaluation, we create all possible combinations of the three measures and tested each of those at all GO categories and using different miRNA-target gene pair sets. Figure 3 illustrates steps of combining the three types of hypergeometric distributions for ρ, τ and μ. For each of the 54 miRNA clusters, of the 27 variations for miRNA-target gene pairs, of the three GO categories, and of annotations (or GO terms), three p-values, p ρ , p τ and p μ , are first computed. Then, we generate 4 combined p-values by using Fisher's combined p-value method [24].

Figure 3
figure 3

Steps for combining three types of p -values. For a selected GO category and a miRNA-gene target-pair variation, for each GO term, three p-values are computed for ρ, τ, and μ, and then rank normalized. S ρ (n) denotes the set of GO terms whose p-values' ranks in the ρ hypergeometric distribution are less than or equal to n. By applying set operations, four combinations of S ρ (n), S τ (n), S μ (n) are created for further evaluation.

p ρ , τ :  combined p-value of  p ρ  and  p τ p ρ , μ :  combined p-value of  p ρ  and  p μ p τ , μ :  combined p-value of  p τ  and  p μ p ρ , τ , μ :  combined p-value of  p ρ , p τ  and  p μ

We briefly describe how Fisher's combined p-value method can be applied to our proposed measures. Under the null hypothesis of no significant enrichment, the individual p-value for the random variable ρ, τ, or μ follows the uniform distribution on 0[1]. Then the distribution of

Y=ln ( p -value )

is chi-squared with one degree of freedom. We have three p-values from ρ, τ, and μ hypergeometric distributions,

p ρ , p τ  and  p μ ,

and thus we define

Y ρ =ln ( p ρ ) , Y τ =ln ( p τ ) , and  Y μ =ln ( p μ )

Each of the random variables Y ρ , Y τ , and Y μ is under the chi-squared distribution with one degree of freedom. The final four sums of W are then defined as follows:

W 1 = Y ρ + Y τ W 2 = Y ρ + Y μ W 3 = Y τ + Y μ W 4 = Y ρ + Y τ + Y μ

The random variables W1, ..., W4 follow chi-squared distribution with degrees of freedom 2, 2, 2, and 3, respectively. These random variables are used to produce the combined 'overall' p-values. To calculate these p-values, we applied fisherSum function in R 'MADAM' package [25].

The underlying distribution of p-values from each method can be different due to the different characteristics of the measure. To take into account this heterogeneity in the distribution of p-values, we rank-normalized p-values for each GO category as shown in the last step of Figure 3. Specifically, we construct the set S θ (n) of top n significant GO terms having the smallest p-values for each measure θ ∈ {ρ, τ, μ}. Four additional sets of S ρ,τ (n), S ρ,μ (n), S τ,μ (n), and S ρ,τ,μ (n) for the combined measures are also created and used for further evaluation.

Evaluation measures

Average specificities and functional homogeneity index (or semantic similarity density) of the rank normalized term sets S θ (n) for each measure θ ∈{ρ, τ, μ,(ρ, τ), (ρ, μ), (τ, μ), (ρ, τ, μ)} are computed for performance comparison. This is based on the general assumption that for a specific set of GO terms identified by each measure, the more functionally homogenous the set is, the more reliable the measure is. In addition, higher specificities are more desirable because it is more informative to have more specific terms than more general terms in the functional analysis of clusters.

Many studies have shown that Information Content (IC) can quantify the specificity of a cluster [26, 27]. IC measure is based on the fact that less frequently used terms are more specific. The IC of a GO term t is defined as follows:

I C t = - log f r e q t f r e q r o o t
(4)

where root represents the root term for each GO category. freq(t) is defined as follows;

f r e q t = n a n n o t a t e t + c children t n a n n o t a t e c
(5)

where children(t) returns the list of child terms of term t. Thus t becomes a parent term of all members of children t), either directly or indirectly. The functions annotate(t) and n(G) return the list of genes that are annotated to GO term t and the number of the genes in the gene list G, respectively. We use the average IC value of the given term set as a performance measure to compare the specificity.

For functional homogeneity index (or semantic similarity density), we choose a widely used Resnik's measure of semantic similarity [28]. The semantic similarity between two terms is defined as the IC of the lowest common ancestor (LCA) of the two terms and hence is obtained by:

S R e s n i k t A , t B = I C L C A t A , t B
(6)

As an evaluation measure, the average of all pairwise term-to-term Resnik's similarities was applied for S θ (n) for each measure θ ∈ {ρ, τ, μ, (ρ, τ), (ρ, μ), (τ, μ), (ρ, τ, μ)} and defined as semantic similarity density of the set.

GO terms and associated gene sets were downloaded from http://www.geneontology.org/gene-associations/gene_association.goa_human.gz. We excluded GO associations having ND (No biological data) or NR (Not Recorded) evidence codes.

Results

Average specificity and functional homogeneity index distributions

Figure 4 shows the distributions of average IC values and functional homogeneity index for GO BP terms with p-values in top n = 100 ranks in the 'breast/up-regulated miRNA cluster' from Volinia et al. [19] (Supplementary Table S2 in 'Additional file 1'). Most of the highest average IC and functional homogeneity values were obtained by miRNA-centric μ measures throughout the evaluations (see Supplement Fig. S1 series in 'Additional file 1') including the specific example shown in Figure 4. Because of the small numbers of miRNA members and target genes, target variations #5, #10, #11, #15, and #16 in Table 1 had no significant GO terms. Evaluation showed that miRNA-centric μ measure exhibited the best specificity and homogeneity except only for the target variations #12, #19 and #22. The very small numbers of miRNAs (i.e., m = 56, 54, 175, respectively) and target genes (i.e., m = 160, 197, 1206, respectively) from the very strict thresholds may explain the results. These findings are also consistent throughout the evaluation study regardless of different GO categories.

Figure 4
figure 4

Evaluation of functional enrichment measures and their combinations. Distributions of (A) functional homogeneity index (or average IC value) and (B) semantic similarity (or average all pair-wise Resnik's similarity) are exhibited for significantly enriched GO BP terms in the 'breast/up-regulated miRNA cluster' from Volinia et al. [19](see index 1 in Supplement Table S2) by applying target variation #6 in Table 1. MicroRNA-centric measure (μ) outperforms the traditional target gene-centric measure (ρ) and others.

Performance comparison with a varying parameter setting

Figure 5(A) and 5(B) shows the distributions of the average IC values and functional homogeneity values with increasing numbers of rank normalized GO terms n (see Figure 3), as an example for "breast/up-regulated miRNA cluster" from Volinia et al. [19] (index 1 in Supplementary Table S1 in 'Additional file 1') by applying target variation #6 in Table 1, GO BP category. Measures containing miRNA-centric μ (in blue cross) like (ρ, μ) and (τ, μ) consistently outperformed traditional gene-centric ρ (in red circle) at all threshold levels of n. Figure 6 demonstrates the distribution of p-values for all GO BP terms annotated to the miRNA clusters from the dataset of Volinia et al. [19]. Although the interpretation about the p-value distribution is generally tricky and needs to be done carefully, it seems that the p-value distribution for miRNA-centric μ (in green) shows overall better discriminant power than target link-centric τ (in blue) and traditional gene-centric ρ (in red) methods.

Figure 5
figure 5

Evaluation of functional homogeneity and semantic similarity densities across different thresholds. Average (A) information content and (B) all pair-wise semantic similarity values are plotted with increasing numbers of rank normalized GO terms n (see Fig. 3) for "breast/up-regulated miRNA cluster" from Volinia et al. [19] (index 1 in Supplementary Table S1 in 'Additional file 1') by applying target variation #6 in Table 1, GO BP category. Measures containing miRNA-centric μ (in blue) like (ρ, μ) (in pink) and (τ, μ) (in sky blue) consistently outperform traditional gene-centric ρ (in red) measures at all levels.

Figure 6
figure 6

Distribution of p -values for all GO BP terms. Distribution of p-values for all GO BP terms demonstrates that miRNA-centric μ (in green) shows overall better discriminant power than target link-centric τ (in blue) and traditional gene-centric ρ (in red) methods for datasets from Volinia et al. [19].

Examples showing complementary properties

Examples of GO terms determined to be statistically significant by miRNA-centric μ but not by traditional gene-centric ρ method are listed in the upper part of Table 2. Gusev [13] correctly pointed out that it was common for top ranked GO terms to be targeted by every member of the corresponding miRNA cluster. Those that are targeted by all six miRNA members (i.e., μ i = 6) shown in the upper part of Table 2, however, are not statistically significant (p > 0.05) and show poor ranks (>290) by ρ method. But μ method shows statistical significances (p < 0.05) with high ranks (<35) (Table 2). In contrast, those that are targeted by all six miRNA members shown in the middle part of Table 2 show very strong statistical significance (p < 0.001) by ρ method. The very low μ k to μ l ratios (i.e., about 50:1) in the middle part compared to those in the upper part (i.e., about 1:1) of Table 2 clearly explain the poor p-values and ranks (>2500) by μ method. Therefore, Gusev's correct intuition can further be formally analyzed by introducing miRNA-centric μ method. It is demonstrated that our new measure considering μ complements some drawbacks of the traditional gene-centric ρ measure.

Table 2 Comparison of miRNA-centric μ and gene-centric ρ measuresa

The GO terms in the lower part of Table 2 are annotated only to two to five among six mRNA members such that they are far from statistical significance by ρ calculations. The p-values by μ method, however, are even more statistically significant. Complement activation (GO:0006956) in GO BP category was rejected by the traditional ρ method (p = 0.42) but accepted by miRNA-centric μ method (p > 0.001) with ranks of 1251 and 1, respectively. Complement activation indeed has long been well recognized in breast cancer [29, 30]. At least four well-known breast cancer genes including SMAD2, SMAD4, TGFB3 and TGFBR3 are involved in palate development. There are many studies reporting that regulation of growth hormone secretion (GO:0060123) is indeed associated with breast cancer [3133]. For the GO term, negative regulation of activin receptor signaling pathway (GO:0032926), many studies reported that facilitating activin signaling either by Cripto silencing or FLRG silencing inhibits human breast cancer cell growth [34, 35]. Numerous studies have reported that acetyl-CoA carboxylase (ACCα) and fatty acid synthase (FAS), key limiting fatty acid synthesis enzymes involved in coenzyme A metabolic process (GO:0015936), are highly expressed in human breast cancer cell lines and breast carcinomas [3640]. Moreover, pantothenate kinase 3 (PANK3) and Coenzyme A synthase (COASY) are known breast cancer genes.

Discussion

We proposed miRNA-centric μ and target link-centric τ measures that improve functional enrichment analysis of differentially expressed or co-expressed miRNA clusters. We performed comprehensive evaluations of different methods on various settings. It is demonstrated that these new measures complement the conventional target gene-centric ρ measure and miRNA-centric μ method was among the most powerful and reliable.

MicroRNA's intrinsic properties of multiplicity and cooperativity [17] may be correctly modeled by combined hypergeometric distributions. Average IC value for the μ category was consistently the highest among different conditions and measures. It is suggested that the number of miRNAs and their relations associated with a specific GO term of interest is as much important as the number of target mRNAs associated with the GO term. Therefore, applying ρ, τ, and μ hypergeometric distributions for functional annotation of miRNAs are mutually complementary.

The proposed method is based on computationally predicted rather than experimentally validated target relations. Computational prediction has limitations given high level of false positives and negatives. Especially, it is difficult to obtain predicted targets for minor forms of miRNA such as star, -3p, -5p or other recently identified forms of miRNAs. All current computational enrichment analysis methods that use predicted target relations suffer from the same drawback. Combining the proposed three methods may complement with each other in finding and evaluating the correct miRNA-mRNA target relations, and improving functional annotations and enrichment analysis.