Bayesian Analysis of iTRAQ Data with Nonrandom Missingness: Identification of Differentially Expressed Proteins

Statistics in Biosciences

Abstract

iTRAQ (isobaric Tags for Relative and Absolute Quantitation) is a technique that allows simultaneous quantitation of proteins in multiple samples. In this paper, we describe a Bayesian hierarchical model-based method to infer relative protein expression levels and hence to identify differentially expressed proteins from iTRAQ data. Our model assumes that the measured peptide intensities are affected by both protein expression levels and peptide-specific effects. The values of these two effects across experiments are modeled as random effects. The nonrandom missingness of peptide data is modeled with a logistic regression that relates the missingness probability of a peptide to the expression level of the protein that produces it. We propose a Markov chain Monte Carlo (MCMC) method for inference of the model parameters, including the relative expression levels across samples. Our simulation results suggest that estimates of relative protein expression levels based on the MCMC samples have smaller bias than those obtained from ANOVA models or fold changes. We apply our method to an iTRAQ dataset studying the roles of caveolae in postnatal cardiovascular function.
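
The data structure described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation and does not perform the MCMC inference. It only generates data in which a log-intensity is the sum of a protein effect, a peptide-specific effect, and noise, and in which the probability that a value is missing follows a logistic function of the protein expression level. The function names, the Gaussian distributions, the coefficients `b0` and `b1`, and all default values are assumptions chosen for illustration.

```python
import math
import random


def simulate_itraq(n_proteins=200, n_peptides=5, n_samples=4,
                   b0=0.0, b1=-1.5, seed=0):
    """Simulate log-intensities with expression-dependent missingness.

    Every distribution and default value here is an illustrative assumption.
    Returns (true_levels, observed), where observed[i][j][k] is the value for
    protein i, peptide j, sample k, or None if it is missing.
    """
    rng = random.Random(seed)
    true_levels, observed = [], []
    for _ in range(n_proteins):
        # Protein expression level in each sample (random effect).
        protein = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        peptides = []
        for _ in range(n_peptides):
            pep_effect = rng.gauss(0.0, 0.5)  # peptide-specific random effect
            row = []
            for k in range(n_samples):
                y = protein[k] + pep_effect + rng.gauss(0.0, 0.3)
                # Logistic missingness model: with b1 < 0, low expression
                # makes a missing value more likely.
                p_missing = 1.0 / (1.0 + math.exp(-(b0 + b1 * protein[k])))
                row.append(None if rng.random() < p_missing else y)
            peptides.append(row)
        true_levels.append(protein)
        observed.append(peptides)
    return true_levels, observed


def missing_rate_by_level(true_levels, observed, cutoff=0.0):
    """Return (missing rate where level < cutoff, missing rate where level >= cutoff)."""
    counts = {"low": [0, 0], "high": [0, 0]}  # [missing, total]
    for protein, peptides in zip(true_levels, observed):
        for row in peptides:
            for k, value in enumerate(row):
                key = "low" if protein[k] < cutoff else "high"
                counts[key][0] += value is None
                counts[key][1] += 1
    return (counts["low"][0] / counts["low"][1],
            counts["high"][0] / counts["high"][1])


if __name__ == "__main__":
    levels, obs = simulate_itraq()
    low, high = missing_rate_by_level(levels, obs)
    print(f"missing rate, low expression:  {low:.2f}")
    print(f"missing rate, high expression: {high:.2f}")
```

Under these assumed settings, values from weakly expressed proteins are missing more often than values from strongly expressed ones, so the observed data over-represent high expression levels. This is the kind of nonrandom missingness that the abstract says the logistic regression component is meant to account for.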

Corresponding author

Correspondence to Ruiyan Luo.

Cite this article

Luo, R., Colangelo, C.M., Sessa, W.C. et al. Bayesian Analysis of iTRAQ Data with Nonrandom Missingness: Identification of Differentially Expressed Proteins. Stat Biosci 1, 228–245 (2009). https://doi.org/10.1007/s12561-009-9013-2

  • DOI: https://doi.org/10.1007/s12561-009-9013-2
