iBATCGH: Integrative Bayesian Analysis of Transcriptomic and CGH Data

  • Alberto Cassese
  • Michele Guindani
  • Marina Vannucci
Conference paper
Part of the Abel Symposia book series (ABEL, volume 11)


We describe a method for integrating high-throughput data from different sources. Specifically, iBATCGH is a package for the integrative analysis of transcriptomic and genomic data, based on a hierarchical Bayesian model. A measurement error model relates gene expression levels to latent copy number states, which in turn are related to the observed surrogate CGH measurements via a hidden Markov model. Relevant associations are selected using variable selection priors that explicitly incorporate dependence information across adjacent copy number states. Posterior inference is carried out through Markov chain Monte Carlo techniques that efficiently explore the space of all possible associations. In this chapter we review the model and present the functions provided in iBATCGH, an R package based on a C implementation of the inferential algorithm. Lastly, we illustrate the method via a case study on ovarian cancer.
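
The layered structure described in the abstract can be illustrated with a small forward simulation. The sketch below is only a toy under stated assumptions: it is written in Python with the standard library, it does not use the iBATCGH API, and it performs no posterior inference. The four-state coding, the emission means, the transition probability, and the function names `simulate_states` and `simulate_data` are all hypothetical choices made for illustration. In particular, associations are switched on independently here, whereas the priors in the chapter incorporate dependence across adjacent copy number states.

```python
import random

# Assumed coding of latent copy number states (illustrative only):
# 1 = loss, 2 = neutral, 3 = gain, 4 = amplification.
STATES = (1, 2, 3, 4)
# Hypothetical state-specific means for the observed CGH intensities.
EMISSION_MEAN = {1: -0.65, 2: 0.0, 3: 0.65, 4: 1.5}
EMISSION_SD = 0.2
STAY_PROB = 0.9  # assumed probability of keeping the same state at the next probe


def simulate_states(n_probes, rng):
    """Draw one first-order Markov chain of latent states along the probes."""
    chain = [rng.choice(STATES)]
    for _ in range(n_probes - 1):
        if rng.random() < STAY_PROB:
            chain.append(chain[-1])
        else:
            chain.append(rng.choice([s for s in STATES if s != chain[-1]]))
    return chain


def simulate_data(n_samples=20, n_probes=15, n_genes=3, p_assoc=0.1, seed=1):
    """Simulate latent states, CGH measurements, sparse effects, and expression."""
    rng = random.Random(seed)
    # Hidden layer: one chain of copy number states per sample.
    xi = [simulate_states(n_probes, rng) for _ in range(n_samples)]
    # Observed surrogate: noisy CGH measurement given each latent state.
    x = [[rng.gauss(EMISSION_MEAN[s], EMISSION_SD) for s in row] for row in xi]
    # Sparse gene-by-probe effects; a zero means "no association".
    # Simplification: inclusion is independent across probes.
    beta = [[rng.gauss(0.0, 1.0) if rng.random() < p_assoc else 0.0
             for _ in range(n_probes)] for _ in range(n_genes)]
    # Expression depends on the latent states, not on the noisy measurements.
    y = [[sum(b * s for b, s in zip(beta[g], xi[i])) + rng.gauss(0.0, 0.5)
          for g in range(n_genes)] for i in range(n_samples)]
    return xi, x, beta, y
```

Calling `simulate_data()` returns the latent states, the CGH measurements, the sparse coefficient matrix, and the expression values as nested lists. The point of the sketch is the dependency order: expression is generated from the latent states, while the CGH values are only a noisy view of those same states.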





Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Alberto Cassese (1)
  • Michele Guindani (2)
  • Marina Vannucci (3, corresponding author)
  1. Maastricht University, Maastricht, The Netherlands
  2. UT MD Anderson Cancer Center, Houston, USA
  3. Department of Statistics, Rice University, Houston, USA
