Random gene sets in predicting survival of patients with hepatocellular carcinoma
Despite multiple publications, molecular signatures predicting the course of hepatocellular carcinoma (HCC) have not yet been integrated into routine clinical decision-making. Given the diversity of published signatures, the optimal number of genes, the best gene combinations, and the benefit of functional associations between genes in prognostic signatures remain to be defined. We investigated a vast number of randomly chosen gene sets (ranging from 1 to 10,000 genes) to cover the full range of prognostic gene sets on 242 transcriptomic profiles of patients with HCC. Depending on the selected size, 4.7 to 23.5% of all random gene sets exhibited prognostic potential by separating patient subgroups with significantly different survival. This was further substantiated by investigating gene sets and signaling pathways, which likewise yielded a comparably high number of significantly prognostic gene sets. However, combining multiple random gene sets using “swarm intelligence” resulted in significantly improved predictability for approximately 63% of all patients. In these patients, approximately 70% of all random 50-gene sets resulted in equal and stable prediction of survival. For all other patients, a reliable prediction seems highly unlikely for any selected gene set. Using a machine learning and independent validation approach, we demonstrated a high reliability of random gene sets and swarm intelligence in HCC prognosis. Ultimately, these findings were validated in two independent patient cohorts and on independent technical platforms (microarray, RNASeq). In conclusion, we demonstrate that using “swarm intelligence” of multiple gene sets for prognosis prediction may be not only superior but also more robust for predictive purposes.
Molecular signatures predicting the course of HCC have not yet been integrated into routine clinical practice
Depending on the selected size, 4.7 to 23.5% of all random gene sets exhibit prognostic potential, independent of the technical platform (microarray, RNASeq)
Combining multiple random gene sets using “swarm intelligence” resulted in significantly improved predictability
In these patients, approximately 70% of all random 50-gene sets resulted in equal and stable prediction of survival
Overall, “swarm intelligence” is superior and more robust for predictive purposes in HCC
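The voting idea behind combining many random gene sets can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not specify how patients are grouped or how votes are combined, so every detail here is an assumption made for illustration. The function names (`random_gene_set_votes`, `swarm_call`), the grouping rule (a median split on the mean expression of each random gene set), and the `min_agreement` threshold of 0.7 are all hypothetical, and no survival statistics are computed.

```python
import random
from statistics import median


def random_gene_set_votes(expr, n_sets, set_size, seed=0):
    """Collect one risk-group vote per patient from each random gene set.

    expr: dict mapping patient ID -> list of expression values
          (same gene order for every patient).
    Returns a dict mapping patient ID -> list of 0/1 votes.
    Assumption: a patient is voted "1" when the mean expression of the
    sampled genes lies above the cohort median for that gene set.
    """
    rng = random.Random(seed)
    patients = list(expr)
    n_genes = len(expr[patients[0]])
    votes = {p: [] for p in patients}
    for _ in range(n_sets):
        genes = rng.sample(range(n_genes), set_size)
        scores = {p: sum(expr[p][g] for g in genes) / set_size for p in patients}
        cut = median(scores.values())
        for p in patients:
            votes[p].append(1 if scores[p] > cut else 0)
    return votes


def swarm_call(votes, min_agreement=0.7):
    """Majority vote per patient across all gene sets.

    Returns 0 or 1 when at least `min_agreement` of the gene sets agree,
    and None when the votes are too split to make a stable call.
    The threshold is an illustrative assumption, not a published value.
    """
    calls = {}
    for patient, v in votes.items():
        frac_high = sum(v) / len(v)
        agreement = max(frac_high, 1 - frac_high)
        if agreement >= min_agreement:
            calls[patient] = 1 if frac_high >= 0.5 else 0
        else:
            calls[patient] = None
    return calls
```

In this sketch, patients who receive `None` correspond loosely to the group for whom no gene set gives a stable assignment, while patients with a consistent call are those on whom most random gene sets agree.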
Keywords
HCC Liver cancer Prognostic Signature Gene set Bioinformatics Transcriptome Profiling Random Swarm intelligence Microarray RNA Seq
The authors thank Dr. Snorri Thorgeirsson, NIH/NCI, Bethesda, MD for his generous support and providing clinical parameters to the GSE4024 and GSE1898 data sets. S.R. was supported by the German Research Foundation (DFG) CRC SFB/TR 209 Liver Cancer project B01.
Study concept and design: TI, RS, TM, SST, and AT; acquisition of data—public expression data, analysis, and interpretation of data: TI, RS, TM, SM, SR, MPE, ME, and AT; drafting of the manuscript: TI, RS, TM, SM, HJS, WH, ME, and AT; critical revision of the manuscript for important intellectual content: TI, RS, TM, SM, SR, SST, MPE, HJS, WH, ME, and AT; statistical analysis: RS; obtained funding: WH, ME, HJS, and AT.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.