Clustering and Network Analysis of Reverse Phase Protein Array Data

  • Adam Byron
Part of the Methods in Molecular Biology book series (MIMB, volume 1606)


Molecular profiling of proteins and phosphoproteins using a reverse phase protein array (RPPA) platform, with a panel of target-specific antibodies, enables the parallel, quantitative proteomic analysis of many biological samples in a microarray format. Hence, RPPA analysis can generate a high volume of multidimensional data that must be effectively interrogated and interpreted. A range of computational techniques for data mining can be applied to detect and explore data structure and to form functional predictions from large datasets. Here, two approaches for the computational analysis of RPPA data are detailed: the identification of similar patterns of protein expression by hierarchical cluster analysis and the modeling of protein interactions and signaling relationships by network analysis. The protocols use freely available, cross-platform software, are easy to implement, and do not require any programming expertise. Serving as data-driven starting points for further in-depth analysis, validation, and biological experimentation, these and related bioinformatic approaches can accelerate the functional interpretation of RPPA data.
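As an illustrative aside (the protocols themselves require no programming), the idea behind hierarchical cluster analysis can be sketched in a few lines of Python. The sketch assumes average linkage and a Pearson correlation distance, which are common choices but not necessarily the settings used in the chapter. The protein names and intensity values are invented toy data, not RPPA measurements.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)


def average_linkage(profiles):
    """Agglomerative clustering; returns the merges in the order they occur.

    Each merge is a tuple (cluster_1, cluster_2, distance), where a cluster
    is a tuple of profile names.
    """
    names = list(profiles)
    # Pairwise distances between the individual profiles.
    dist = {
        frozenset((a, b)): pearson_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def linkage(c1, c2):
        # Average of all pairwise distances between members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of clusters at each step.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical toy data: rows are proteins, columns are four samples.
toy = {
    "protein_A": [1.0, 2.0, 3.0, 4.0],
    "protein_B": [2.0, 4.1, 5.9, 8.0],
    "protein_C": [4.0, 3.0, 2.0, 1.0],
    "protein_D": [8.0, 5.0, 4.5, 2.0],
}

merges = average_linkage(toy)
for left, right, distance in merges:
    print(left, "+", right, "at distance", round(distance, 3))
```

Reading the merges from first to last gives the dendrogram from the leaves upward: profiles that rise and fall together across samples join early at small distances, while anticorrelated groups join last.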

Key words

Bioinformatics; Cell signaling; Data analysis; Hierarchical clustering; Interaction networks; Microarray analysis; Pathway analysis; Proteomics; Reverse phase protein array; Visualization



A.B. is funded by Cancer Research UK (grant C157/A15703 to M. C. Frame).



Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. Cancer Research UK Edinburgh Centre, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
