Abstract
iTRAQ (isobaric Tags for Relative and Absolute Quantitation) is a technique that allows simultaneous quantitation of proteins in multiple samples. In this paper, we describe a Bayesian hierarchical model-based method to infer relative protein expression levels and hence to identify differentially expressed proteins from iTRAQ data. Our model assumes that the measured peptide intensities are affected by both protein expression levels and peptide-specific effects. The values of these two effects across experiments are modeled as random effects. The nonrandom missingness of peptide data is modeled with a logistic regression that relates the probability that a peptide is missing to the expression level of the protein that produces it. We propose a Markov chain Monte Carlo (MCMC) method for the inference of model parameters, including the relative expression levels across samples. Our simulation results suggest that estimates of relative protein expression levels based on the MCMC samples have smaller bias than those obtained from ANOVA models or fold changes. We apply our method to an iTRAQ dataset from a study of the role of caveolae in postnatal cardiovascular function.
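To make the structure described in the abstract concrete, the following is a minimal simulation sketch of the data-generating process, not the authors' implementation. The abstract does not give the model's parametrization, so everything specific here is an assumption made for illustration: the additive form on the log scale, the normal distributions, the function name `simulate_itraq`, the parameters `a` and `b` of the logistic missingness model, the negative sign of `b` (lower expression makes a peptide more likely to be missing), and the choice to draw missingness independently for each peptide and channel.

```python
import numpy as np

def simulate_itraq(n_proteins=50, n_peptides=4, n_channels=4,
                   sd_protein=1.0, sd_peptide=0.5, sd_noise=0.2,
                   a=-1.0, b=-1.5, seed=0):
    """Simulate log peptide intensities with expression-dependent missingness.

    All distributions and parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Protein-level random effect: one value per protein and channel (sample).
    protein = rng.normal(0.0, sd_protein, size=(n_proteins, 1, n_channels))
    # Peptide-specific random effect: one value per peptide within a protein.
    peptide = rng.normal(0.0, sd_peptide, size=(n_proteins, n_peptides, 1))
    # Measurement noise for every protein/peptide/channel combination.
    noise = rng.normal(0.0, sd_noise, size=(n_proteins, n_peptides, n_channels))

    # Assumed additive model on the log-intensity scale.
    y = protein + peptide + noise

    # Logistic regression for missingness: the probability that a value is
    # missing depends on the expression level of its parent protein.
    p_missing = 1.0 / (1.0 + np.exp(-(a + b * protein)))
    p_missing = np.broadcast_to(p_missing, y.shape)
    missing = rng.random(y.shape) < p_missing

    # Missing intensities are recorded as NaN.
    y_obs = np.where(missing, np.nan, y)
    return protein[:, 0, :], y_obs, missing


if __name__ == "__main__":
    levels, y_obs, missing = simulate_itraq(n_proteins=200, seed=1)
    levels_full = np.broadcast_to(levels[:, None, :], y_obs.shape)
    print("fraction missing:", missing.mean())
    print("mean true level, all entries:     ", levels_full.mean())
    print("mean true level, observed entries:", levels_full[~missing].mean())
```

Under these assumptions, the observed entries over-represent highly expressed proteins, so a summary computed only from observed intensities would tend to be shifted upward. This is the kind of bias that modeling the missingness mechanism jointly with the intensities is meant to address.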
Cite this article
Luo, R., Colangelo, C.M., Sessa, W.C. et al. Bayesian Analysis of iTRAQ Data with Nonrandom Missingness: Identification of Differentially Expressed Proteins. Stat Biosci 1, 228–245 (2009). https://doi.org/10.1007/s12561-009-9013-2
DOI: https://doi.org/10.1007/s12561-009-9013-2