Long noncoding RNAs as novel predictors of survival in human cancer: a systematic review and meta-analysis

Serghiou, Stylianos; Kyriakopoulou, Aikaterini; Ioannidis, John P. A.

doi:10.1186/s12943-016-0535-1

Review
Open access
Published: 28 June 2016

Volume 15, article number 50, (2016)

Molecular Cancer

Abstract

Background

Expression of various long noncoding RNAs (lncRNAs) may affect cancer prognosis. Here, we aim to gather and examine all evidence on the potential role of lncRNAs as novel predictors of survival in human cancer.

Methods

We systematically searched PubMed to identify all published studies reporting on the association between any individual lncRNA or group of lncRNAs and prognosis in human cancer (death or other clinical outcomes). Where appropriate, we then performed quantitative synthesis of those results using meta-analytic methods to estimate the effect size of lncRNAs on cancer prognosis. The reliability of those results was then examined using measures of heterogeneity and tests for selective reporting biases.

Results

Three hundred ninety-two studies were screened to eventually identify 111 eligible studies on 127 datasets. In total, these represented 16,754 independent participants pertaining to 53 individual and 6 grouped lncRNAs within a total of 19 cancer sites. Overall, 83 % of the studies we identified addressed overall survival and 32 % of the studies addressed recurrence-free survival. For overall survival, 96 % (88/92) of studies identified a statistically significant association of lncRNA expression to prognosis. Meta-analysis of 6 out of 7 lncRNAs for which three or more studies were available, identified statistically significant associations with overall survival. The lncRNA HOTAIR was by far the most broadly studied lncRNA (n = 29; of 111 studies) and featured a summary hazard ratio (HR) of 2.22 (95 % confidence interval (CI), 1.86–2.65) with modest heterogeneity (I² = 49 %; 95 % CI, 14–79 %). Prominent excess significance was demonstrated across all meta-analyses (p-value = 0.0003), raising the possibility of substantial selective reporting biases.

Conclusions

Multiple lncRNAs have been shown to be strongly associated with prognosis in diverse cancers, but substantial bias cannot be excluded in this field and larger studies are needed to understand whether this prognostic information may eventually be useful.

Background

Non-coding RNAs (ncRNAs) have been proposed in the last decade as regulators of cancer pathways and biomarkers of cancer outcomes [1–4]. Potentially informative biomarkers based on ncRNAs include microRNAs (miRs) [5] and the larger long non-coding RNAs (lncRNAs). Until recently, ncRNAs were disregarded as ‘junk’; although they constitute the large majority of transcribed RNAs, their role in normal development and cellular physiology in health and disease is only now becoming apparent [2, 6, 7].

LncRNAs refer to any ncRNA consisting of more than 200 nucleotides. They are functionally heterogeneous molecules [6, 8], themselves sub-classified into large intergenic non-coding RNAs (lincRNA), transcribed ultraconserved regions (T-UCRs) and many others [2]. Of an estimated putative 140,000 different ncRNAs in total [9], lncRNAs are estimated to constitute proportionally the largest class, with the most comprehensive approach to date confirming 58,648 expressed lncRNAs [10]. Even though the function of lncRNAs is still being debated [11], certain lncRNAs have been implicated in functions related to regulation of gene expression in health and disease [2, 6–8, 12–15]. Well-studied examples include the lncRNA Xist, which initiates X-chromosome inactivation in female cells by recruiting repressive complexes to the X-chromosome under inactivation [16–18] and H19, which has been shown to play a significant role in genomic imprinting [19, 20].

Of particular interest, however, is that lncRNAs are now recognized as major players in tumorigenesis [7–9, 21–23]. In this context, the best-studied lncRNA is HOTAIR (HOmeobox (HOX) Transcript AntIsense RNA), which has been shown to recruit Polycomb Repressive Complex 2 (PRC2) and eventually lead to epigenetic silencing of metastasis suppressor genes [2, 24].

More than 20 meta-analyses studying the role of lncRNAs in cancer prognosis have been published so far, all within the past 2 years. All of these studied a single lncRNA, either in relation to a specific cancer or to any cancer. The two most studied lncRNAs are MALAT1 and HOTAIR, which have been the subject of 10 and 7 meta-analyses respectively. The latest meta-analysis on MALAT1 for all cancer types showed that its upregulation is statistically significantly associated with poor overall survival (pooled hazard ratio [HR], 2.14; 95 % CI, 1.74–2.64) with low between-study heterogeneity (I², 4.3 %; p-value = 0.399), on the basis of 9 studies [25]. The results were similar to the latest meta-analysis of HOTAIR (HR, 2.33; 95 % CI, 1.77–3.09), but with significant between-study heterogeneity (Cochran’s Q-test p-value = 0.016), on the basis of 16 studies [26]. Interestingly, all meta-analyses published so far have been produced by Chinese groups and all identified a statistically significant association between every lncRNA studied and prognosis in cancer. However, no systematic review and meta-analysis to date has identified all lncRNAs studied in the context of cancer and assessed to what extent these might be of prognostic significance.

In this paper, we aimed to examine the potential role of all lncRNAs ever investigated in the context of cancer survival prediction, as novel predictors of survival in human cancer. We utilized a field-wide meta-analysis approach [27] to systematically identify and examine all published papers trying to associate lncRNAs to prognosis in human cancer, and to quantitatively synthesize data directly related to prognosis wherever three or more studies on an lncRNA had been done.

Methods

Systematic review

This report has been structured on the basis of PRISMA [28].

Eligibility criteria

We considered published reports of a prospective or retrospective study design that had explored the association of any single lncRNA or combination of stated lncRNAs with any of the following types of survival analysis: disease-specific survival (DSS, duration of time from the day of diagnosis to the day of death due to cancer); metastasis-free survival (MFS, duration of time from day of diagnosis to the day of diagnosing a metastatic event); overall/cumulative survival (OS, duration of time from day of diagnosis to the day of death due to any cause); progression/event/disease-free survival (PFS, duration of time from day of first treatment to the day evidence of cancer progression is identified or the patient dies of any cause); and recurrence-free survival (RFS, duration of time from day of cure from cancer to the day evidence of cancer progression/recurrence is identified). Survival analyses measuring different types of survival were treated separately at all times. Studies describing the association of individual or groups of lncRNAs with clinicopathologic variables (e.g. Stage, Grade, Distant metastasis, etc.), without specifically examining associations with any of the aforementioned survival analyses, were excluded. We likewise excluded cross-sectional studies and studies concerning genetic alterations of an lncRNA (e.g. polymorphisms or methylation patterns). Any kind of quantitative lncRNA analysis (quantitative real time–PCR, in situ hybridization) was eligible.

For meta-analysis eligibility, a study had to also provide the effect size and confidence interval for the association of an individual or group of lncRNAs with any of the above survival outcomes, or report information through which this effect size and confidence interval could be calculated [29, 30]. Wherever the same cohort had published more than one overlapping analysis, we only used the most encompassing data (for example, the classification of glioma would be preferred over glioblastoma multiforme). Two reviewers (S. Serghiou and A. Kyriakopoulou) identified eligible studies, and any contested articles were adjudicated by a third reviewer (J. P. A. Ioannidis).

Information sources

We systematically searched PubMed (1950 to September, 2015) for studies of any language that analyzed associations between lncRNAs and prognosis in human cancer. Our search strategy was developed in consideration of previous recommendations [30] and used the clinical queries prognosis filter, which has been reported to have an average estimated sensitivity of 92 % for detecting articles related to prognosis [5, 31]. Our search term was: (Prognosis/Broad [filter]) AND ((lncRNA OR “lnc RNA” OR “long noncoding ribonucleic acid” OR “long noncoding RNA” OR “long non-coding ribonucleic acid” OR “long intergenic noncoding RNA” OR “long intergenic non-coding RNA” OR “long non-coding RNA” OR “long ncRNA” OR “lincRNA” OR “linc RNA”) AND (cancer OR carcinoma OR tumor OR neoplas* OR tumour OR malignan* OR metastat* OR metastas* OR leukemia OR leukaemia OR lymphoma OR recurren* OR “lymph node” OR response) AND (Humans[Mesh] AND English[lang])). The search was last updated to include articles published through September 26, 2015.

Study selection

We used the programming language R [32] to remove duplicate records. Title and abstract were screened to identify relevant articles. The full manuscript of the relevant articles was screened against our eligibility criteria. Any uncertainties were resolved by consensus with JPA. Data were collected by two reviewers (SS, AK) and saved in a pre-designed extraction form on Google Sheets. Where information was ambiguous (such as, for example, mentioning multiple types of lncRNA quantification methods but not clarifying which one of those was used to provide the quantities utilized in the survival analysis), this was labelled as ‘unclear’. An attempt was made to contact the authors when information was clearly logically inconsistent, as in for example quoting a hazard ratio (HR) outside the confidence interval (CI), but none replied. In one paper, the lncRNA expression level [33] was subdivided into low versus medium versus high; for this paper we only extracted the comparison between low versus high expression levels.

The following data were extracted for all articles following the CHARMS checklist [34]: title; authors; year of publication; journal of publication; groupings (i.e. whether lncRNAs were studied one by one or in groups); what lncRNAs were studied; whether an agnostic approach to identifying the studied lncRNAs was used (where an agnostic approach would be one assuming no prior knowledge regarding the choice of lncRNA to be studied); cancer site (e.g. brain) and cancer subtype (e.g. glioblastoma multiforme); whether a paper reported clinicopathologic data of its sample and which ones; whether an attempt of associating those clinicopathologic data to lncRNAs was made and for which ones; whether an attempt of associating clinicopathologic data to prognosis was made and for which ones; whether an attempt was made to explain the clinical outcomes using non-clinical studies (in vivo, in vitro); the types of survival analyses used (as above); type of study design (prospective cohort, retrospective cohort, unreported); means of lncRNA quantitative analysis (qRT–PCR, qPCR, in situ hybridization (ISH), other); and whether the paper tried to make any non-clinical associations of the identified lncRNAs to cancer in vitro.

For eligible articles we further extracted: country and city of origin of the study cohort, period of sample recruitment, range of sample ages, mean/median age with confidence interval, the population type (general population, non-general population (e.g. veterans), unreported), stage of cancer upon initial patient presentation, sample size, means of tissue preservation (frozen, paraffin-embedded, both, other), any and what preoperative treatment was given, the total number of lncRNAs studied, the type of metric the paper used to characterize their results (hazard ratio, relative risk, odds ratio, p-value), type of analysis (i.e. univariable or multivariable), lncRNA quantity cut-off and its unit (i.e. the threshold based on which lncRNA expression was deemed upregulated or downregulated by the study), the sample size of each comparison group, the minimum and maximum participant follow-up time, the number of censored participants throughout follow-up and whether this was explicitly stated or read off the Kaplan-Meier curves, the HR and its CI (provided or inferred, e.g. from p-values and HR point estimates), the p-value and whether this was statistically significant at p < 0.05 and whether an attempt to validate the reported results was made, and if so, what type of validation method was used (internal or external). For eligibility for meta-analysis, enough information to extract or calculate the natural logarithm of the hazard ratio and its variance must have been provided.
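As an illustration of how a CI can be inferred from an HR point estimate and its p-value, the sketch below back-calculates the standard error of ln(HR). It assumes the reported p-value is two-sided and comes from a Wald-type test on ln(HR); this is a common convention, not necessarily the exact procedure followed for every extracted study. The function name `ci_from_hr_and_p` is hypothetical.

```python
import math
from statistics import NormalDist


def ci_from_hr_and_p(hr, p_value, level=0.95):
    """Back-calculate SE of ln(HR) and a CI from an HR and a two-sided p-value.

    Assumption: the p-value comes from a Wald-type z test on ln(HR).
    """
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    if hr <= 0.0 or hr == 1.0:
        raise ValueError("hr must be positive and different from 1")
    normal = NormalDist()
    z_obs = normal.inv_cdf(1.0 - p_value / 2.0)   # z statistic implied by p
    log_hr = math.log(hr)
    se = abs(log_hr) / z_obs                      # SE of ln(HR)
    z_crit = normal.inv_cdf(0.5 + level / 2.0)    # about 1.96 for a 95 % CI
    lower = math.exp(log_hr - z_crit * se)
    upper = math.exp(log_hr + z_crit * se)
    return se, lower, upper


# Hypothetical input: HR = 2.0 reported with p = 0.05.
se, lower, upper = ci_from_hr_and_p(2.0, 0.05)
```

With a p-value of exactly 0.05, the lower 95 % limit falls on 1, which is a quick sanity check on any inferred interval.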

Whenever multiple datasets were combined into a single dataset to study a specific lncRNA, we only extracted the summary HR, rather than extracting the HR respective to each constitutive dataset. If multiple datasets were assessed within the same study without being combined into a single dataset, we extracted the HR respective to each dataset, as they represent separate estimates. Where both the log-rank and Breslow tests were reported, only the log-rank was extracted. No cohort was used more than once and effect sizes describing a broader class of cancer (e.g. glioma) were preferred over subclassifications of that (e.g. glioblastoma multiforme). Three studies reported effect sizes that were excluded from further consideration because the quoted HRs contradicted the text [35] or they were either outside the CI or could not have possibly led to the quoted CI [36, 37]; this led to complete exclusion of two out of these three studies [35, 37]. Our database can be freely accessed here: https://goo.gl/EjCDAp.

Risk of bias in individual studies

Risk of bias in individual studies was assessed on the basis of the framework of assessing internal validity of articles dealing with prognosis [30, 38] and recommendations regarding reporting of biomarker studies [39, 40].

Meta-analysis

Summary measures and synthesis of results

We meta-analyzed data on lncRNAs for which three or more estimates of their effect on a specific survival outcome were available. Therefore, meta-analyses were only done for OS and RFS. Effect sizes for OS and RFS were meta-analyzed separately. Our principal summary measure was the summary HR. Standard errors were calculated using: ln (upper limit of CI/lower limit of CI)/(2 × 1.96). Estimates were synthesized using a random-effects model fitted with the restricted maximum-likelihood (REML) method. As previously described [27], four meta-analyses were done for each of: (1) multivariable data, (2) univariable data, (3) multivariable data combined with univariable data whenever multivariable data were unavailable (preferentially multivariable) and (4) univariable data combined with multivariable data whenever univariable data were unavailable (preferentially univariable). Given the similarity between the estimates of all four types of meta-analysis and the importance of multivariable modelling in prognostic studies, this report only quotes the estimates of the ‘preferentially multivariable’ meta-analysis; the rest can be found in Additional file 1: Table S2. For each estimate we provide the effect size and 95 % CI. Heterogeneity was analyzed using the Q and I² statistics and the 95 % CI of I² was also calculated [41, 42]. These analyses were done using R and the package metafor 1.9-8 [43]. Data were combined for each type of lncRNA regardless of cancer type. Wherever an lncRNA had been analyzed three or more times for one or more specific cancer types, a post hoc subgroup analysis per cancer type was done for that lncRNA.
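The synthesis steps described above can be sketched as follows. This is a minimal illustration, not the analysis code of this report: the analyses here were run with metafor in R, whereas the sketch uses the closed-form DerSimonian-Laird estimator of between-study variance as a simpler stand-in. The function names and the example HRs are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of ln(HR) from the limits of a 95 % CI."""
    return math.log(upper / lower) / (2.0 * z)


def random_effects_dl(log_hrs, ses):
    """Pool ln(HR) estimates with the DerSimonian-Laird estimator.

    Stand-in for the model fitted with metafor in R; chosen here only
    because it has a closed form.
    """
    k = len(log_hrs)
    if k < 2 or k != len(ses):
        raise ValueError("need at least two estimates with matching SEs")
    w = [1.0 / s ** 2 for s in ses]                 # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))  # Cochran's Q
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                   # between-study variance
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]     # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return {
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical (HR, lower, upper) triples, not taken from any included study.
example = [(2.0, 1.0, 4.0), (3.0, 1.5, 6.0), (1.5, 0.9, 2.5)]
log_hrs = [math.log(hr) for hr, _, _ in example]
ses = [se_from_ci(lo, hi) for _, lo, hi in example]
summary = random_effects_dl(log_hrs, ses)
```

Working on the ln(HR) scale keeps the estimates approximately normal, which is why the CI limits are log-transformed before the standard error is derived.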

Risk of bias across studies

Risk of publication bias is a significant concern in prognostic studies [30]. We explored excess significance for factors reported by at least 3 studies [44]. Briefly, for every meta-analyzed risk factor we compare the number of observed significant results (O) at α = 0.05, to the number of expected significant results (E), where E = sum of power of each study within a specific meta-analysis. Power was calculated taking as plausible effect for the risk factor the effect seen in the most precise study (lowest standard error). The difference between O and E was assessed using a two-tailed binomial test, with α = 0.1, as previously suggested [45]. O and E were also summed and compared across all meta-analyses.
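The excess significance comparison of O and E can be sketched as below. Several details are assumptions made for illustration: power is computed for a two-sided Wald test at α = 0.05, the binomial test uses E/n as the success probability, and its two-tailed p-value sums the probabilities of all outcomes no more likely than the observed one. All function names are hypothetical.

```python
import math
from statistics import NormalDist

_NORMAL = NormalDist()


def wald_power(theta, se, alpha=0.05):
    """Power of a two-sided Wald test to detect ln(HR) = theta given se."""
    z_crit = _NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(theta) / se
    return _NORMAL.cdf(shift - z_crit) + _NORMAL.cdf(-shift - z_crit)


def binom_two_sided(k, n, p):
    """Exact two-tailed binomial p-value (outcomes no more likely than k)."""
    pmf = [math.comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1.0 + 1e-7)   # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))


def excess_significance(log_hrs, ses, alpha=0.05):
    """Return (O, E, p) for one meta-analysis.

    The plausible effect is taken from the most precise study, i.e. the one
    with the lowest standard error.
    """
    n = len(log_hrs)
    most_precise = min(range(n), key=lambda i: ses[i])
    theta = log_hrs[most_precise]
    z_crit = _NORMAL.inv_cdf(1.0 - alpha / 2.0)
    observed = sum(1 for y, s in zip(log_hrs, ses) if abs(y) / s > z_crit)
    expected = sum(wald_power(theta, s, alpha) for s in ses)
    p_value = binom_two_sided(observed, n, expected / n)
    return observed, expected, p_value


# Hypothetical ln(HR) values and standard errors for three studies.
o, e, p = excess_significance([0.1, 1.0, 1.2], [0.1, 0.4, 0.5])
```

The resulting p-value would then be judged against the α = 0.1 threshold stated above.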

Results

Literature search and description of studies

We initially identified 397 records, from which 286 were excluded (Fig. 1), leaving us with 111 studies eligible for systematic review (Additional file 2), of which 85 were also eligible for meta-analysis. The 111 studies utilized 127 datasets to produce their analyses (four studies utilized two datasets, three studies utilized three datasets and two studies utilized four datasets). No new studies were imported through reference checking.

Of 127 identified datasets, only 2 were reported to represent a prospective cohort; of the rest, 19 were reported to represent a retrospective cohort and there was no relevant information for the remaining 106 datasets. No report specified what type of population their samples came from and for 113/127 datasets we have no information as to what sampling method was used to obtain the sample. For the remaining datasets, consecutive sampling was stated to have been used in 5 and random sampling in 4 datasets; 5 datasets were based on all patients ever seen by the clinic. Sampling method was disproportionately frequently reported for studies coming from the USA (4/9). A total of 94/127 datasets came from Asia (78 from China), followed by Europe (15/127) and America (13/127); there was no reported country of origin for 2 datasets and 3 datasets contained patients from multiple continents; the latter were multi-center cohorts. A total of 16,754 different patients were enrolled within these studies (avoiding double-counting samples that had been used for two or more analyses). Median sample size was 90 (IQR, 82; range, 30–997) and 69/127 datasets contained fewer than 100 participants (50 of these datasets came from China).

Mapping of lncRNA prognostic data

The eligible reports studied 18 types of cancer, the three most studied of which were gastric cancer (n = 16 datasets), lung cancer (n = 15) and colorectal cancer (n = 15) (Table 1). Almost half of the reports studied cancer related to the gastrointestinal tract (57/127 datasets). OS was assessed in 92/111 studies (83 %), RFS in 36 (32 %), DSS in 10 (9 %), MFS in 9 (8 %) and PFS in 6 (5 %). The majority of studies did not appear to choose what lncRNAs to study on the basis of agnostic reports (77 %, 85/111). For 98/127 datasets (77 %), there was no information regarding adjuvant treatment; of the 29 datasets providing information regarding treatment, only 4 indicated that their patients were treated homogeneously. In addition to survival analysis, 68 % (76/111) of the identified studies attempted to further study their chosen lncRNAs in vitro, to corroborate the results of their survival analyses with mechanistic insights into the function of their chosen lncRNAs. Across 66 studies reporting multivariable analyses, 42 adjusted for stage of cancer (or all three components of the TNM staging) and 27 for grade of cancer; only 19/66 studies adjusted for both. Figure 2 displays a microarray of the covariates that have been studied more than three times within multivariable analyses (Additional file 3: Figure S1 displays the complete data microarray). Out of all 66 studies, 20 (30 %) adjusted for the same factors as at least one other paper and the most commonly encountered combination of factors adjusted for was Stage and Lymph Node Metastasis, which was seen in 6/66 studies. The median number of adjustment combinations matching between at least two papers was 1 (IQR, 0).

Table 1 Descriptive statistics of eligible studies

Overall survival

Out of 92 studies reporting on OS, 87 studies (representing 111/127 analyses, as explained in Additional file 4: Table S1) provided effect estimates, out of which two were completely excluded due to reporting inconsistent effect sizes, as indicated in the Methods [35, 37]. The 85 remaining studies provided effect estimates on 53 lncRNAs and 6 multi-lncRNA risk score scales. The most frequently studied lncRNAs within OS analyses were HOTAIR (n = 29 effect estimates), MALAT1 (n = 8), and GAS5, H19 and PVT1 (n = 4 for each). Most individual lncRNAs (42/53) were only studied once (Table 2). Only 7 lncRNAs were studied at least three times in association with OS and for 6 of them more than half of the studies showed statistically significant p-values. These lncRNAs were studied in the context of a median of 4 different types of cancer (IQR, 3). Out of the 52 individual or groups of lncRNAs studied fewer than three times, 44 were always reported to be significantly associated with OS. Overall, of the 92 studies reporting on OS (but not necessarily quoting an effect estimate), 88 (96 %) reported at least one statistically significant result for association with prognosis.

Table 2 Details of the lncRNAs studied

Meta-analysis for overall survival

A meta-analysis of OS was done for all 7 individual or groups of lncRNAs having been studied three or more times (Fig. 3; Table 3; Additional file 1: Table S2). At p-value < 0.0005, 5 lncRNAs were statistically significantly associated with OS in all of our meta-analyses (HOTAIR, MALAT1, 6 lncRNA risk score, PVT1, SChLAP1) and 6/7 were statistically significant in all of our meta-analyses at p-value < 0.05 (H19; Additional file 1: Table S2). An increase in cellular expression of these lncRNAs was statistically significantly associated with a decrease in overall survival; GAS5 was not statistically significantly associated with OS in our meta-analyses. The funnel plot for HOTAIR (Fig. 4), which is the only lncRNA studied 10 or more times, indicates significant small-study effects (p-value = 0.0006), and this may be suggestive of publication bias. The summary effect size for HOTAIR also displays a moderate amount of between-study heterogeneity (I², 48 %; 95 % CI, 14–78 %). The summary effects of HOTAIR on OS in cancers for which it was studied 3 or more times were: colorectal cancer (HR, 4.76; 95 % CI, 2.46–9.21), esophageal cancer (HR, 2.29; 95 % CI, 1.68–3.12) and glioma (HR, 1.71; 95 % CI, 1.25–2.34).

Table 3 The results of our meta-analysis for each lncRNA using ‘primarily multivariable’ data

Other meta-analyses

The only type of survival analysis other than OS studied 3 or more times in relation to a specific lncRNA was MFS for HOTAIR. This was investigated within 4 different studies in relation to 4 different cancers (breast, colorectal, esophageal, head and neck). Meta-analysis of these studies identified a summary HR of 2.54 (95 % CI, 1.62–3.98) with no statistically significant heterogeneity (Q-statistic, 5.16; p-value = 0.16).

Heterogeneity metrics and excess significance

Statistically significant heterogeneity was only observed in HOTAIR analyses, but substantial estimates of I² were common. For HOTAIR and OS, a sensitivity analysis excluding the only study reporting an inverse correlation of HOTAIR to cancer survival [33] generated an HR of 2.30 (95 % CI, 1.97–2.70) with I² = 0 % (95 % CI, 0–59 %); for all other meta-analyses, no single study produced a major change in the I².
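A leave-one-out sensitivity analysis of this kind can be sketched as follows: each study is dropped in turn and the summary HR and I² are recomputed. The sketch again assumes the DerSimonian-Laird estimator as a closed-form stand-in for the model used in this report, and the labels and values are hypothetical.

```python
import math


def pool_dl(log_hrs, ses):
    """Return (summary HR, I² in %) from a DerSimonian-Laird model."""
    w = [1.0 / s ** 2 for s in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))
    df = len(log_hrs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum(w_re)
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return math.exp(pooled), i2


def leave_one_out(labels, log_hrs, ses):
    """Map each study label to the (HR, I²) obtained when it is omitted."""
    if len(labels) < 3:
        raise ValueError("need at least three studies")
    results = {}
    for i, label in enumerate(labels):
        keep = [j for j in range(len(labels)) if j != i]
        results[label] = pool_dl([log_hrs[j] for j in keep],
                                 [ses[j] for j in keep])
    return results


# Hypothetical data: three concordant studies and one with an inverse effect.
ln2 = math.log(2.0)
loo = leave_one_out(["A", "B", "C", "D"], [ln2, ln2, ln2, -ln2], [0.2] * 4)
```

In this toy example, omitting the discordant study "D" removes all estimated heterogeneity, mirroring the pattern described for HOTAIR.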

There was excess significance across the whole field for overall survival: the binomial test yielded a two-tailed p-value of 0.0003, with O = 42 statistically significant results and E = 30 expected statistically significant results across all meta-analyses with 3 or more studies each on OS. Among lncRNAs studied 5 or more times, statistically significant excess significance was documented for HOTAIR (p-value = 0.002), but not for MALAT1 (p-value = 0.46).

Discussion

In this systematic review and meta-analysis we have tried to gather all published papers evaluating the prognostic ability of lncRNAs in cancer. We have identified that a large number of lncRNAs have been evaluated within the context of cancer prognosis. Most of them have been evaluated only once in a published paper. Almost all of the published papers report that lncRNAs are statistically significant predictors of survival. There was often substantial heterogeneity between studies in the strength of the predictive effect. There was also strong evidence for small-study effects and for excess significance. This picture may be due to genuine differences across studies, such as different cancers and populations under study, and different adjustments made in multivariable models. However, it is also highly compatible with the presence of substantial publication bias and other selective reporting bias in this field, resulting in exaggerated effects in mostly small studies (most of which came from China) and in an implausibly high prevalence of nominally significant results.

It is well recognized that published literature on prognosis and the identification of prognostic markers is characterized by poor methodological quality, significant publication bias and wide heterogeneity in aspects of sample selection, such as pre/post-biopsy treatment or tissue preservation methods, and analysis, such as multivariable modelling and determination of cutoff values [30, 46]. As such, meta-analyses of prognostic studies may elicit summary effect sizes that are unrealistic [47]. An evaluation of studies investigating the association of TP53 to risk of death by head and neck squamous cell carcinoma, identified that even though readily available effect sizes would confirm that TP53 is a strongly significant prognostic factor, after standardizing definitions of TP53 status and outcomes across papers and retrieving non-readily available information, this association was completely abrogated [48]. These issues may also apply to the lncRNA literature. No two studies of our dataset were identical in all of lncRNA, cancer site, cut-off value and multivariable modelling, suggesting substantial room for selective reporting of analyses that could be done with very different models and definitions. Moreover, we suspect that publication bias may also be operating in the field.

Of particular interest is the excess significance we identified across the field (p-value = 0.0003). Despite the poor translation of cancer biomarkers into clinical practice [39, 49–51], out of 1575 studies on cancer biomarkers published in 2005, 95.8 % reported statistically significant results and only 1.3 % did not report any kind of statistically significant results [52]. Indeed, as we have shown, this pattern is also prominent in the lncRNA cancer prognosis literature.

One way of reducing the selective reporting biases that have led to the above status quo and thus reducing lack of translatability, is transparency. The need to improve transparency has been mentioned repeatedly [39, 53]. Guidelines have been proposed to improve the reporting of prognostic markers (REMARK) [39, 51], multivariable prediction models (TRIPOD) [54] and genetic risk prediction studies [55]. Wider adoption of these guidelines may increase transparency, but it is unknown whether it will suffice to markedly reduce selective reporting.

In our cohort of studies, the extent of unreported items in Table 1 did not inspire confidence in the transparency and completeness of reporting practices. We also documented minimal use of validation (12/111 studies, 11 %), despite reports stressing the necessity and importance of validation in identifying true effect size for prognostic tools [56, 57]. Furthermore, more than half of the identified studies had a sample size of less than 100. Small studies are known, both theoretically and empirically, to be associated with inflated estimates of effect size [58], not so much because of their limited sample size as because of lower quality standards, publication bias and selective reporting [59], which is why they lead to so-called ‘small-study effects’. Even though these have mostly been studied within the context of randomized-controlled trials, where they have been associated with a larger average effect size and at least double the between-study heterogeneity found in larger studies [60], similar problems may also occur in prognostic study research [43]. The meta-analysis for HOTAIR, which is the most widely studied lncRNA in the context of cancer prognosis, clearly indicates that smaller studies tend to be less precise and report a higher effect size than larger studies. Inflated effects are common in biomarker studies [61], and this may also apply to the results for lncRNAs.

Another interesting point of note is the Chinese provenance of most papers in our collection of eligible studies (78/111, 70 %). In a previous analysis of genetic studies, it was shown that there is a vast Chinese literature, and that papers from China tend to utilize smaller sample sizes yet reach statistical significance far more commonly than other papers [62]. This was attributed to more prominent publication bias against null results or other kinds of selection bias in pursuit of statistically significant results. Discrepancies between the Chinese literature and the rest of the world were also found in published meta-analyses of genomic data [63]. Chinese meta-analyses (1) focused on the results of studies investigating individual candidate genes rather than the results of genome-wide association studies and (2) used nominal significance (i.e. p-value < 0.05) rather than genome-wide p-value thresholds to identify statistically significant results.

Although there has been an explosion in the number of identified potential biomarkers due to high throughput methods, unlike traditional methods of identifying molecules directly relevant to a known cellular event [49], very few have made their way to clinical practice, due to lack of appropriate evidence [50, 64, 65]. An important aspect in ascribing usefulness to a novel biomarker is its ability to add further predictive value, over and above that already achievable using known prognostic factors. Unfortunately, in our sample, despite most multivariable analyses identifying lncRNAs as a statistically significant predictor, only about 30 % of the reported prognostic effects were adjusted for the two classically most relevant predictors of cancer prognosis (i.e. Stage and Grade).

Limitations

Our analysis has several limitations. First, given that this report is only based on the results of a single database (PubMed), it is possible that relevant papers may have been missed. Second, our analysis utilized the Medical Subject Heading (MeSH) ‘Humans’ to limit our search results to those studies conducted in humans. Even though this is accepted practice and has been used previously in similar studies [5], that label is added to papers at the point of indexing, and thus some papers that were published close to our search date (September 26, 2015) and had not been MeSH-labeled yet, would have been missed. We performed an updated search (June 5, 2016) for papers that did not have a Human [MeSH] and had been published before 2015 and found only two small studies [66, 67] that could potentially qualify for inclusion for the outcome of survival. This is a field with prolific literature and a substantial number of papers have continued to appear after our September 2015 search and will probably continue to appear in the near future. Third, our meta-analysis has attempted to combine multiple studies that are known to be heterogeneous in terms of cancer site and provenance of patient populations. Our estimates of heterogeneity metrics have wide 95 % confidence intervals [42]. Fourth, on 51 occasions we had to calculate HRs ourselves based on data provided within the papers, which may not have provided the most accurate estimate of the HR possible, as most of the time these data were extracted from Kaplan-Meier curves. However, this practice has not been shown to yield results significantly different from direct methods of HR estimation [29]. Fifth, even though every effort was made to exclude analyses of the same lncRNA using the same dataset of patients, it is possible that some overlapping data have been included, if their authors have made no hint as to the presence of overlap.

Conclusions

In conclusion, we have gathered a substantial amount of prognostic data on the association between various lncRNAs and survival. Our analysis identified a large number of studies, most of which were published within the last 2 years and most of which have small sample sizes. Even though our systematic review and meta-analyses found that almost all lncRNAs examined are statistically significant predictors of OS, it is very difficult to judge the importance of these associations, given the detection of excess significance, small-study effects and the known difficulties with analyzing prognostic studies. Larger studies, ideally with collaborative teams using standardized approaches to measurement, adjustment, analysis, and reporting, will offer better insights into the prognostic value of lncRNAs.

Abbreviations

RNA, Ribonucleic acid; ncRNAs, Noncoding RNAs; lncRNAs, Long noncoding RNAs; lincRNAs, Large intergenic noncoding RNAs; T-UCRs, Transcribed ultraconserved regions; miR, MicroRNA; HOTAIR, HOmeobox (HOX) Transcript AntIsense RNA; PRC2, Polycomb Repressive Complex 2; HR, Hazard ratio; CI, Confidence interval; IQR, Interquartile range; DSS, Disease-specific survival; MFS, Metastasis-free survival; OS, Overall/cumulative survival; PFS, Progression/event/disease-free survival; RFS, Recurrence-free survival; O, Number of observed events; E, Number of expected events; PCR, Polymerase chain reaction; qPCR, Quantitative PCR; qRT-PCR, Quantitative real-time PCR; RT-qPCR, Real-time quantitative PCR; ISH, In situ hybridization; LNM, Lymph node metastasis; LVM, Lymphovascular metastasis

References

1. Alexander RP, Fang G, Rozowsky J, Snyder M, Gerstein MB. Annotating non-coding regions of the genome. Nat Rev Genet. 2010;11:559–71.
2. Esteller M. Non-coding RNAs in human disease. Nat Rev Genet. 2011;12:861–74.
3. Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, et al. Landscape of transcription in human cells. Nature. 2012;489:101–8.
4. Kornienko AE, Guenzl PM, Barlow DP, Pauler FM. Gene regulation by the act of long non-coding RNA transcription. BMC Biol. 2013;11:1.
5. Nair VS, Maeda LS, Ioannidis JPA. Clinical outcome prediction by microRNAs in human cancer: a systematic review. J Natl Cancer Inst. 2012;104:528–40.
6. Mercer TR, Dinger ME, Mattick JS. Long non-coding RNAs: insights into functions. Nat Rev Genet. 2009;10:155–9.
7. Guttman M, Donaghey J, Carey BW, Garber M, Grenier JK, Munson G, et al. lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature. 2011;477:295–300.
8. Ponting CP, Oliver PL, Reik W. Evolution and functions of long noncoding RNAs. Cell. 2009;136:629–41.
9. Malek E, Jagannathan S, Driscoll JJ. Correlation of long non-coding RNA expression with metastasis, drug resistance and clinical outcome in cancer. Oncotarget. 2014;5:8027–38.
10. Iyer MK, Niknafs YS, Malik R, Singhal U, Sahu A, Hosono Y, et al. The landscape of long noncoding RNAs in the human transcriptome. Nat Genet. 2015;47:199–208.
11. Lee JT. Epigenetic regulation by long noncoding RNAs. Science. 2012;338:1435–9.
12. Rinn JL, Chang HY. Genome regulation by long noncoding RNAs. Annu Rev Biochem. 2012;81:145–66.
13. Khalil AM, Guttman M, Huarte M, Garber M, Raj A, Rivea Morales D, et al. Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc Natl Acad Sci U S A. 2009;106:11667–72.
14. Koziol MJ, Rinn JL. RNA traffic control of chromatin complexes. Curr Opin Genet Dev. 2010;20:142–8.
15. Vance KW, Ponting CP. Transcriptional regulatory functions of nuclear long noncoding RNAs. Trends Genet. 2014;30:348–55.
16. Penny GD, Kay GF, Sheardown SA, Rastan S, Brockdorff N. Requirement for Xist in X chromosome inactivation. Nature. 1996;379:131–7.
17. Wutz A, Rasmussen TP, Jaenisch R. Chromosomal silencing and localization are mediated by different domains of Xist RNA. Nat Genet. 2002;30:167–74.
18. Wutz A, Gribnau J. X inactivation Xplained. Curr Opin Genet Dev. 2007;17:387–93.
19. Forne T, Oswald J, Dean W, Saam JR, Bailleul B, Dandolo L, et al. Loss of the maternal H19 gene induces changes in Igf2 methylation in both cis and trans. Proc Natl Acad Sci U S A. 1997;94:10243–8.
20. Gabory A, Ripoche M-A, Le Digarcher A, Watrin F, Ziyyat A, Forné T, et al. H19 acts as a trans regulator of the imprinted gene network controlling growth in mice. Development. 2009;136:3413–21.
21. Calin GA, Liu C-G, Ferracin M, Hyslop T, Spizzo R, Sevignani C, et al. Ultraconserved regions encoding ncRNAs are altered in human leukemias and carcinomas. Cancer Cell. 2007;12:215–29.
22. Spizzo R, Almeida MI, Colombatti A, Calin GA. Long non-coding RNAs and cancer: a new frontier of translational research? Oncogene. 2012;31:4577–87.
23. Li X, Wu Z, Fu X, Han W. Long Noncoding RNAs: Insights from Biological Features and Functions to Diseases. Med Res Rev. 2013;33:517–53.
24. Gupta RA, Wang KC, Hung T, West RB, Sukumar S, Chang HY. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature. 2010;464:1071–6.
25. Wang J, Xu AM, Zhang JY, He XM, Pan YS, Cheng G, et al. Prognostic significance of long non-coding RNA MALAT-1 in various human carcinomas: a meta-analysis. Genet Mol Res. 2016;15(1).
26. Deng Q, Sun H, He B, Pan Y, Gao T, Chen J, et al. Prognostic Value of Long Non-Coding RNA HOTAIR in Various Cancers. PLoS ONE. 2014;9:e110059.
27. Serghiou S, Patel CJ, Tan YY, Koay P, Ioannidis JPA. Field-wide meta-analyses of observational associations can map selective availability of risk factors and the impact of model specifications. J Clin Epidemiol. 2016;71:58–67.
28. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JPA, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med. 2009;6:e1000100.
29. Parmar MK, Torri V, Stewart L. Extracting summary statistics to perform meta-analyses of the published literature for survival endpoints. Stat Med. 1998;17:2815–34.
30. Altman DG. Systematic reviews of evaluations of prognostic variables. BMJ. 2001;323:224–8.
31. Haynes RB, McKibbon KA, Wilczynski NL, Walter SD, Werre SR, Hedges Team. Optimal search strategies for retrieving scientifically strong studies of treatment from Medline: analytical survey. BMJ. 2005;330:1179.
32. R Development Core Team. R: A Language and Environment for Statistical Computing [Internet]. Vienna, Austria. Available from: http://www.R-project.org.
33. Lu L, Zhu G, Zhang C, Deng Q, Katsaros D, Mayne ST, et al. Association of large noncoding RNA HOTAIR expression and its downstream intergenic CpG island methylation with survival in breast cancer. Breast Cancer Res Treat. 2012;136:875–83.
34. Moons KGM, De Groot JAH, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med. 2014;11:e1001744.
35. Xu Z-Y, Yu Q-M, Du Y-A, Yang L-T, Dong R-Z, Huang L, et al. Knockdown of long non-coding RNA HOTAIR suppresses tumor invasion and reverses epithelial-mesenchymal transition in gastric cancer. Int J Biol Sci. 2013;9:587–97.
36. Wu Z-H, Wang X-L, Tang H-M, Jiang T, Chen J, Lu S, et al. Long non-coding RNA HOTAIR is a powerful predictor of metastasis and poor prognosis and is associated with epithelial-mesenchymal transition in colon cancer. Oncol Rep. 2014;32:395–402.
37. Takahashi Y, Sawada G, Kurashige J, Uchi R, Matsumura T, Ueo H, et al. Amplification of PVT-1 is involved in poor prognosis via apoptosis inhibition in colorectal cancers. Br J Cancer. 2014;110:164–71.
38. Bouwmeester W, Zuithoff NPA, Mallett S, Geerlings MI, Vergouwe Y, Steyerberg EW, et al. Reporting and methods in clinical prediction research: a systematic review. PLoS Med. 2012;9:1–12.
39. McShane LM, Altman DG, Sauerbrei W, Taube SE, Gion M, Clark GM. REporting recommendations for tumour MARKer prognostic studies (REMARK). Br J Cancer. 2005;93:387–91.
40. Henry NL, Hayes DF. Cancer biomarkers. Mol Oncol. 2012;6:140–6.
41. Higgins J, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med. 2002;21:1539–58.
42. Ioannidis JPA, Patsopoulos NA, Evangelou E. Uncertainty in heterogeneity estimates in meta-analyses. BMJ. 2007;335:914–6.
43. Viechtbauer W. Conducting meta-analyses in R with the metafor package. J Stat Softw [Internet]. 2010;36:1–48. Available from: http://www.jstatsoft.org/v36/i03/.
44. Ioannidis JPA, Trikalinos TA. An exploratory test for an excess of significant findings. Clin Trials. 2007;4:245–53.
45. Ioannidis JPA. Clarifications on the application and interpretation of the test for excess significance and its extensions. J Math Psychol. 2013;57:184–7.
46. Simon R, Altman DG. Statistical aspects of prognostic factor studies in oncology. Br J Cancer. 1994;69:979–85.
47. Blettner M, Sauerbrei W, Schlehofer B, Scheuchenpflug T, Friedenreich C. Traditional reviews, meta-analyses and pooled analyses in epidemiology. Int J Epidemiol. 1999;28:1–9.
48. Kyzas PA, Loizou KT, Ioannidis JPA. Selective reporting biases in cancer prognostic factor studies. J Natl Cancer Inst. 2005;97:1043–55.
49. Sideris M, Papagrigoriadis S. Molecular biomarkers and classification models in the evaluation of the prognosis of colorectal cancer. Anticancer Res. 2014;34:2061–8.
50. Weigel MT, Dowsett M. Current and emerging biomarkers in breast cancer: prognosis and prediction. Endocr Relat Cancer. 2010;17:R245–62.
51. Altman DG, McShane LM, Sauerbrei W, Taube SE. Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK): explanation and elaboration. PLoS Med. 2012;9:e1001216.
52. Kyzas PA, Denaxa-Kyza D, Ioannidis JPA. Almost all articles on cancer prognostic markers report statistically significant results. Eur J Cancer. 2007;43:2559–79.
53. Peat G, Riley RD, Croft P, Morley KI, Kyzas PA, Moons KGM, et al. Improving the transparency of prognosis research: the role of reporting, data sharing, registration, and protocols. PLoS Med. 2014;11:e1001671.
54. Moons KGM, Altman DG, Reitsma JB, Ioannidis JPA, Macaskill P, Steyerberg EW, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162:W1–73.
55. Janssens ACJW, Ioannidis JPA, van Duijn CM, Little J, Khoury MJ, GRIPS Group. Strengthening the reporting of Genetic RIsk Prediction Studies: the GRIPS Statement. PLoS Med. 2011;8:e1000420.
56. Altman DG, Vergouwe Y, Royston P, Moons KGM. Prognosis and prognostic research: validating a prognostic model. BMJ. 2009;338:b605.
57. Siontis GCM, Tzoulaki I, Castaldi PJ, Ioannidis JPA. External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination. J Clin Epidemiol. 2015;68:25–34.
58. Ioannidis JPA. Why most discovered true associations are inflated. Epidemiology. 2008;19:640–8.
59. Sterne JA, Gavaghan D, Egger M. Publication and related bias in meta-analysis: power of statistical tests and prevalence in the literature. J Clin Epidemiol. 2000;53:1119–29.
60. IntHout J, Ioannidis JPA, Borm GF, Goeman JJ. Small studies are more heterogeneous than large ones: a meta-meta-analysis. J Clin Epidemiol. 2015;68:860–9.
61. Ioannidis JPA, Panagiotou OA. Comparison of effect sizes associated with biomarkers reported in highly cited individual articles and in subsequent meta-analyses. JAMA. 2011;305:2200–10.
62. Pan Z, Trikalinos TA, Kavvoura FK, Lau J, Ioannidis JPA. Local literature bias in genetic epidemiology: an empirical evaluation of the Chinese literature. PLoS Med. 2005;2:e334.
63. Ioannidis JPA, Chang CQ, Lam TK, Schully SD, Khoury MJ. The geometric increase in meta-analyses from China in the genomic era. PLoS ONE. 2013;8:e65602.
64. Hayes DF, Bast RC, Desch CE, Fritsche H Jr, Kemeny NE, Jessup JM, et al. Tumor Marker Utility Grading System: a Framework to Evaluate Clinical Utility of Tumor Markers. J Natl Cancer Inst. 1996;88:1456–66.
65. Harris L, Fritsche H, Mennel R, Norton L, Ravdin P, Taube S, et al. American Society of Clinical Oncology 2007 update of recommendations for the use of tumor markers in breast cancer. J Clin Oncol. 2007;25:5287–312.
66. Chi Y, Huang S, Yuan L, Liu M. Role of BC040587 as a predictor of poor outcome in breast cancer. Cancer Cell Int. 2014;14:123.
67. Liu PY, Erriquez D, Marshall GM, Tee AE, Polly P, Wong M, et al. Effects of a novel long noncoding RNA, lncUSMycN, on N-Myc expression and neuroblastoma progression. J Natl Cancer Inst. 2014;106:dju113.

Acknowledgements

None.

Funding

No sources of funding to declare.

Availability of data and materials

The complete database upon which this review article has been constructed can be freely accessed here: https://goo.gl/EjCDAp. The size of this database does not permit its publication as an additional supporting file.

Authors’ contributions

SS: study design, acquisition, analysis and interpretation of data, manuscript drafting; AK: study design, acquisition of data, final approval of the manuscript; JPAI: study conception and design, data interpretation, drafting and critical appraisal of manuscript. All authors have given final approval to this version of the manuscript to be published.

Authors’ information

As previously declared.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.

Author information

Authors and Affiliations

St. John’s Hospital, Livingston, EH54 6PP, UK
Stylianos Serghiou
College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, UK
Stylianos Serghiou
University Hospital of North Durham, North Rd, Durham, DH1 5TW, UK
Aikaterini Kyriakopoulou
Stanford Prevention Research Center, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, USA
John P. A. Ioannidis
Department of Health Research and Policy, Stanford University School of Medicine, Stanford, CA, 94305, USA
John P. A. Ioannidis
Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, CA, 94305, USA
John P. A. Ioannidis
Meta-Research Innovation Center at Stanford (METRICS), Stanford University, 1265 Welch Rd, MSOB X306, Stanford, CA, 94305, USA
John P. A. Ioannidis

Corresponding author

Correspondence to John P. A. Ioannidis.

Additional files

Additional file 1: Table S2.

A table presenting all meta-analyses performed. 'Analysis' refers to the type of data used in each meta-analysis, as explained in Methods. 'Studies' refers to the number of studies included within each meta-analysis. Columns 'HR' and '95 % CI' refer to the summary Hazard Ratio (HR) of each meta-analysis with its 95 % Confidence Interval (CI) (lower and upper limit). 'Tau' refers to the square root of the estimate of between-study variance in each of our random-effects meta-analyses. Columns 'I²' and '95 % CI' refer to a measure of between-study heterogeneity and its corresponding 95 % CI. 'Q-statistic' and its 'P-value' refer to Cochran's Q measure of heterogeneity with its p-value. 'Observed', 'Expected' and 'P-value (binomial)' refer to the observed and expected number of statistically significant results and the comparison between the two, as described in Methods. (XLSX 50.7 kb)

Additional file 2:

The studies eligible for systematic review. (DOC 444 kb)

Additional file 3: Figure S1.

The covariates included within the multivariable models fitted by each paper. This is a data microarray in which the studies run along the Y-axis and the covariates run along the X-axis. Rows and columns are ordered in descending order, based on the total number of times each covariate was included in the multivariable models fitted by each study. Where patterns were similar between studies or covariates, those studies or covariates were placed next to each other. (PDF 82 kb)

Additional file 4: Table S1.

Explanation of how 92 studies provided data regarding 127 analyses. (DOC 36 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Serghiou, S., Kyriakopoulou, A. & Ioannidis, J.P.A. Long noncoding RNAs as novel predictors of survival in human cancer: a systematic review and meta-analysis. Mol Cancer 15, 50 (2016). https://doi.org/10.1186/s12943-016-0535-1

Received: 18 February 2016
Accepted: 14 June 2016
Published: 28 June 2016
DOI: https://doi.org/10.1186/s12943-016-0535-1
