Abstract
Close to three percent of the world’s population suffers from diabetes. Despite the range of treatment options available, not all patients benefit from them. Investigating how different pathways correlate with the phenotype of interest may help reveal novel drug targets and discover a possible cure. Many pathway-based methods have been developed to incorporate biological knowledge into the study of microarray data. Most of these methods can analyze only individual pathways and cannot handle two or more pathways in a model-based framework. This is a serious limitation because, like genes, individual pathways do not work in isolation, and joint modeling may enable researchers to uncover patterns not seen in individual pathway-based analysis. In this paper, we propose a random effects model to analyze two or more pathways. We also derive score test statistics for the significance of pathway effects. We apply our method to a microarray study of Type II diabetes. Our method may elucidate how pathways crosstalk with each other. Further hypotheses on the biological mechanisms underlying the disease and traits of interest may be generated and tested based on this method.
References
American Diabetes Association (2013) Economic costs of diabetes in the U.S. in 2012. Diabetes Care 36:1033–1046
Algul H, Tando Y, Beil M, Weber C, Von Weyhern C, Schneider G, Adler G, Schmid R (2002) Different modes of NF-kappaB/Rel activation in pancreatic lobules. Am J Physiol Gastrointest Liver Physiol 283:G270–281
Baldi C, Cho S, Ellis R (2009) Mutations in two independent pathways are sufficient to create hermaphroditic nematodes. Science 326:1002–1005
Beinborn M, Worrall C, McBride E, Kopin A (2005) A human glucagon-like peptide-1 receptor polymorphism results in reduced agonist responsiveness. Regul Pept 130:1–6
Buse J, Hirst K (2003) The HEALTHY study: introduction. Int J Obes 33(Suppl 4):S1–2
Canty T, Boyle E Jr, Farr A, Morgan E, Verrier E, Pohlman T (1999) Oxidative stress induces NF-kappaB nuclear translocation without degradation of IkappaBalpha. Circulation 100:II361–364
Centers for Disease Control and Prevention (2011) National diabetes fact sheet: general information and national estimates on diabetes in the United States, 2011. U.S. Department of Health and Human Services 2011, Atlanta
Chakrabarti S, Varghese S, Vitseva O, Tanriverdi K, Freedman J (2005) CD40 ligand influences platelet release of reactive oxygen intermediates. Arterioscler Thromb Vasc Biol 25:2428–2434
Chen C, Chai H, Wang X, Jiang J, Jamaluddin M, Liao D, Zhang Y, Wang H, Bharadwaj U, Zhang S, Li M, Lin P, Yao Q (2008) Soluble CD40 ligand induces endothelial dysfunction in human and porcine coronary artery endothelial cells. Blood 112:3205–3216
Chung K (1974) A course in probability theory, 2nd edn. Academic Press, New York
Croom K, McCormack P (2009) Liraglutide: a review of its use in type 2 diabetes mellitus. Drugs 69:1985–2004
Dettling M (2004) BagBoosting for tumor classification with gene expression data. Bioinformatics 20:3583–3593
Duckworth W, Abraira C, Moritz T, Reda D, Emanuele N, Reaven P, Zieve F, Marks J, Davis S, Hayward R, Warren S, Goldman S, McCarren M, Vitek M, Henderson W, Huang G (2009) VADT Investigators. Glucose control and vascular complications in veterans with type 2 diabetes. N Engl J Med 360:129–139
Gerstein H, Miller M, Byington R, Goff D Jr, Bigger J, Buse J, Cushman W, Genuth S, Ismail-Beigi F, Grimm R Jr, Probstfield J, Simons-Morton D, Friedewald W (2008) Effects of intensive glucose lowering in type 2 diabetes. Action to Control Cardiovascular Risk in Diabetes Study Group. N Engl J Med 358:2545–2559
Goeman J, van de Geer S, de Kort F, van Houwelingen H (2004) A global test for groups of genes: testing association with a clinical outcome. Bioinformatics 20:93–99
Goeman J, Oosting J, Cleton-Jansen A, Anninga J, van Houwelingen H (2005) Testing association of a pathway with survival using gene expression data. Bioinformatics 21:1950–1957
Henderson C, Kempthorne O, Searle S, von Krosigk C (1959) The estimation of environmental and genetic trends from records subject to culling. Biometrics 15:192–218
Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M (2004) The KEGG resource for deciphering the genome. Nucleic Acids Res 32:D277–280
Ke Z, Calingasan N, DeGiorgio L, Volpe B, Gibson G (2005) CD40–CD40L interactions promote neuronal death in a model of neurodegeneration due to mild impairment of oxidative metabolism. Neurochem Int 47:204–215
Kim I, Pang H, Zhao H (2012) Semiparametric methods for evaluating pathway effects on clinical outcomes using gene expression data. Stat Med 10:1633–1651
Kingwell B, Formosa M, Muhlmann M, Bradley S, McConell G (2002) Nitric oxide synthase inhibition reduces glucose uptake during exercise in individuals with Type 2 diabetes more than in control subjects. Diabetes 51:2572–2580
Lin X (1997) Variance component testing in generalised linear models with random effects. Biometrika 84:309–326
Lin J, Wu H, Tarr P, Zhang C, Wu Z, Boss O, Michael L, Puigserver P, Isotani E, Olson E, Lowell B, Bassel-Duby R, Spiegelman B (2002) Transcriptional co-activator PGC-1 alpha drives the formation of slow-twitch muscle fibres. Nature 418:797–801
Liu D, Lin X, Ghosh D (2007) Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models. Biometrics 63:1079–1088
Malhotra R, Liu Z, Vincenz C, Brosius F 3rd (2001) Hypoxia induces apoptosis via two independent pathways in Jurkat cells: differential regulation by glucose. Am J Physiol Cell Physiol 281:C1596–1603
Mandrup-Poulsen T (2003) Apoptotic signal transduction pathways in diabetes. Biochem Pharmacol 66:1433–1440
Mansmann U, Meister R (2003) Testing differential gene expression in functional groups. Goeman’s global test versus an ANCOVA approach. Methods Inf Med 44:449–453
Mootha V, Lindgren C, Eriksson K, Subramanian A, Sihag S, Lehar J, Puigserver P, Carlsson E, Ridderstråle M, Laurila E, Houstis N, Daly M, Patterson N, Mesirov J, Golub T, Tamayo P, Spiegelman B, Lander E, Hirschhorn J, Altshuler D, Groop L (2003) PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet 34:267–273
Pande V, Sharma R, Inoue J, Otsuka M, Ramos M (2003) A molecular modeling study of inhibitors of nuclear factor kappa-B (p50)-DNA binding. J Comput Aided Mol Des 17:825–836
Pang H, Lin A, Holford M, Enerson BE, Lu B, Lawton MP, Floyd E, Zhao H (2006) Pathway analysis using random forests classification and regression. Bioinformatics 22:2028–2036
Pang H, Zhao H (2008) Building pathway clusters from random forests classification using class votes. BMC Bioinform 9:87
Pang H, Datta D, Zhao H (2010) Pathway analysis using random forests with bivariate node-split for survival outcomes. Bioinformatics 26:250–258
Pang H, Hauser M, Minvielle S (2011) Pathway-based identification of SNPs predictive of survival. Eur J Hum Genet 19:704–709
Pang H, George SL, Hui K, Tong T (2012) Gene selection using iterative feature elimination random forests for survival outcomes. IEEE/ACM Trans Comput Biol Bioinform 9:1422–1431
Raz I, Hanefeld M, Xu L, Caria C, Williams-Herman D, Khatami H, Sitagliptin Study 023 Group (2006) Efficacy and safety of the dipeptidyl peptidase-4 inhibitor sitagliptin as monotherapy in patients with type 2 diabetes mellitus. Diabetologia 49:2564–2571
Robinson G (1991) That BLUP is a good thing: the estimation of random effects. Stat Sci 6:15–32
Ryan G, Jobe L, Martin R (2010) Pramlintide in the treatment of type 1 and type 2 diabetes mellitus. Clin Ther 27:1500–1512
Shackelford D, Shaw R (2009) The LKB1-AMPK pathway: metabolism and growth control in tumor suppression. Nat Rev Cancer 9:563–575
Shaik Z, Fifer E, Nowak G (2010) Akt activation improves oxidative phosphorylation in renal proximal tubular cells following nephrotoxicant injury. Am J Physiol Renal Physiol 294:F423–432
Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck F, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksöz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H, Wanker E (2005) A human protein–protein interaction network: a resource for annotating the proteome. Cell 122:957–968
Wang X, Shaw S, Amiri F, Eaton D, Marrero M (2002) Inhibition of the JAK/STAT signaling pathway prevents the high glucose-induced increase in TGF-b and fibronectin synthesis in mesangial cells. Diabetes 51:3505–3509
Wei Z, Li H (2007) Nonparametric pathway-based regression models for analysis of genomic data. Biostatistics 8:265–284
Wild S, Roglic G, Green A, Sicree R, King H (2004) Global prevalence of diabetes: estimates for the year 2000 and projections for 2030. Diabetes Care 27:1047–1053
Yang J, Yeh H, Lin K, Wang P (2009) Insulin stimulates Akt translocation to mitochondria: implications on dysregulation of mitochondrial oxidative phosphorylation in diabetic myocardium. J Mol Cell Cardiol 46:919–926
Zeitler P, Epstein L, Grey M, Hirst K, Kaufman F, Tamborlane W, Wilfley D (2007) Treatment options for type 2 diabetes in adolescents and youth: a study of the comparative efficacy of metformin alone or in combination with rosiglitazone or lifestyle intervention in adolescents with type 2 diabetes. Pediatr Diabetes 8:74–87
Zhang D, Lin X (2003) Hypothesis testing in semiparametric additive mixed models. Biostatistics 4:57–74
Zhang L, Lon S, Subramani S (2006) Two independent pathways traffic the intraperoxisomal peroxin PpPex8p into peroxisomes. Mol Biol Cell 17:690–699
Acknowledgments
This research was partially supported by National Institutes of Health (NIH) grants GM59507, CA142538, and CA154295, a pilot grant from the Yale Pepper Center, National Science Foundation (NSF) grant DMS 1106738, and start-up funds from Duke University School of Medicine. We would also like to thank the Yale University Biomedical High Performance Computing Center and NIH grant RR19895, which funded the instrumentation.
Appendix
1.1 Asymptotic Distribution of Score Test for Individual Null
Regularity conditions:
1. \(\Sigma _{-j}^{-1} \Sigma _{\tau _j}\) is of full rank.
2. As \(n \rightarrow \infty \), the number of observations at any level of any random effect is bounded by a constant; this follows from the fact that each pathway effect is a vector of size \(n\).
3. The sequences \(\{ \xi _i \psi ^{-1}_{i} y_i \}\) and \(\{\xi _i\}\) are uniformly bounded \(\forall i=1,...,n\).
4. There exists a positive definite matrix \(I^{0}\) such that \(\lim _{n\rightarrow \infty } \frac{I}{n} = I^{0}\), which is reasonable given conditions 1 and 2.
5. For any given \(q\) by 1 constant vector \(\eta \), let \(\Phi _{\eta } = \sum _{l=1}^{q} \frac{1}{2} \eta \Sigma _{-j}^{-1} \Sigma _{\tau _j} \Sigma _{-j}^{-1}\). Then \(y^{*}_{\eta } = \Phi _{\eta } y = (y^{*}_{\eta ,1},...,y^{*}_{\eta ,n})^{T}\) forms an \(m\)-dependent sequence for some constant \(m\).
6. The usual asymptotic behavior of the maximum likelihood estimates of the parameter vector \(\tau _1, ..., \tau _{j-1}, \tau _{j+1}, ..., \tau _{q}\) holds, including consistency and efficiency.
Proof. Let \(\tau ^{*}_{-j}\) be the true value of \(\tau _{-j}\). First, we prove the asymptotic normality of \(n^{-\frac{1}{2}} \eta U(\tau ^{*}_{-j})\). Let \(\eta \) be any constant vector of size \(q\). The score test in Sect. 3.1 under \(H_0\) can be written as:
where \(\Phi = \sum _{l=1}^{m} \frac{1}{2} \eta \Sigma _{-j}^{-1} \Sigma _{\tau _j} \Sigma _{-j}^{-1}\). Under conditions 1 and 2, the above equation can be rewritten as:
where \(y^{*}_{\eta ,i} = \Phi _{\eta } y = (y^{*}_{\eta ,1},...,y^{*}_{\eta ,n})^{T}\) is a weighted sum of the \(Y_i\)s and \(\Gamma _{\eta ,i} = \sum _{l=1}^{m} \frac{1}{2} (\eta _{l}) \xi _{i} \).
Under condition 3, the sequence \(\{\xi _i \psi ^{-1}_{i} \xi _{i'} \psi ^{-1}_{i'} y_i y_{i'}\} \) for \(i,i' = 1,...,n\) is uniformly bounded. This implies that the \(U_{\eta , i}\) are also uniformly bounded for any given \(\eta \).
Under conditions 4 and 5, and by an application of Theorem 7.3.1 of Chung [10], it follows that:
in distribution as \(n\rightarrow \infty \). By the Cramér-Wold theorem, we have \(n^{-\frac{1}{2}} U(\tau ^{*}_{-j}) \rightarrow N(E(U_{\tau _j}), I^{0}(\tau ^{*}_j))\).
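For reference, the Cramér-Wold device invoked in this step can be stated as follows. The symbols \(W_n\) and \(W\) are generic random vectors in \(\mathbb{R}^{q}\), introduced here only for illustration and not part of the original notation:

\[
W_n \xrightarrow{d} W \quad \Longleftrightarrow \quad \eta ^{T} W_n \xrightarrow{d} \eta ^{T} W \ \text{ for every fixed } \eta \in \mathbb{R}^{q}.
\]

In the present argument it would be applied with \(W_n = n^{-\frac{1}{2}} U(\tau ^{*}_{-j})\), so that asymptotic normality of every fixed linear combination yields joint asymptotic normality of the score vector.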
Under condition 6, we have that \(n^{-\frac{1}{2}} U(\hat{\tau }_{-j}) \rightarrow N(E(U_{\hat{\tau }_j}), I^{0}(\hat{\tau }_j))\) and it follows from Slutsky’s theorem that:
where \( \tilde{\vartheta }_{jj} = \vartheta _{\tau _j \tau _j} - \vartheta _{\upsilon \tau _j}^{T}\vartheta _{\upsilon \upsilon }^{-1}\vartheta _{\upsilon \tau _j}\).
The above score test statistic is asymptotically normally distributed with mean \(E(U_{\tau _j})\) and variance 1, where \(E(U_{\tau _j}) = \sum _{ii} R_{ii}^{*} \mu _{2i}^{T} + 2 \sum _{i \ne j} R_{ij}^{*T} \mu _{1i}^{T} \mu _{1j}^{T}\).
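To make the standardization by an efficient information term such as \(\tilde{\vartheta }_{jj}\) more concrete, the following Python sketch computes a generic REML-type variance-component score statistic for one pathway while adjusting for a second pathway and the residual variance. It is only an illustration under stated assumptions: a Gaussian linear mixed model, linear kernels built from simulated expression matrices, and nuisance variance components plugged in at known values rather than estimated. The function names `projection` and `pathway_score_test`, and all simulated quantities, are hypothetical, and the statistic is not necessarily identical to the one defined in Sect. 3.1.

```python
import numpy as np

def projection(Sigma, X):
    """Return P = S^-1 - S^-1 X (X' S^-1 X)^-1 X' S^-1, which removes fixed effects."""
    Si = np.linalg.inv(Sigma)
    SiX = Si @ X
    return Si - SiX @ np.linalg.solve(X.T @ SiX, SiX.T)

def pathway_score_test(y, X, kernels, j, tau, sigma2):
    """Standardized score statistic for H0: tau_j = 0 (hypothetical helper).

    kernels : list of n x n pathway covariance kernels
    tau     : plug-in variance components; tau[j] is ignored under H0
    sigma2  : plug-in residual variance
    """
    n = len(y)
    # Covariance under the null: every pathway except j, plus residual noise.
    Sigma_null = sigma2 * np.eye(n)
    for l, K in enumerate(kernels):
        if l != j:
            Sigma_null = Sigma_null + tau[l] * K
    P = projection(Sigma_null, X)

    # Score for tau_j evaluated at tau_j = 0.
    Py = P @ y
    U = 0.5 * (Py @ kernels[j] @ Py - np.trace(P @ kernels[j]))

    # Information for (tau_j, nuisance components): 0.5 * tr(P V_a P V_b).
    derivs = [kernels[j]] + [K for l, K in enumerate(kernels) if l != j] + [np.eye(n)]
    PV = [P @ V for V in derivs]
    info = np.array([[0.5 * np.trace(A @ B) for B in PV] for A in PV])

    # Efficient information: Schur complement adjusting for the nuisance block.
    eff = info[0, 0] - info[0, 1:] @ np.linalg.solve(info[1:, 1:], info[1:, 0])
    return U / np.sqrt(eff), U, eff, info

# Simulated data; every value below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 60
X = np.ones((n, 1))                       # intercept-only fixed effects
Z1 = rng.standard_normal((n, 5))          # "expression" for pathway 1
Z2 = rng.standard_normal((n, 4))          # "expression" for pathway 2
kernels = [Z1 @ Z1.T / 5, Z2 @ Z2.T / 4]  # linear kernels
tau_true, sigma2_true = [0.5, 0.0], 1.0   # pathway 2 has no effect (null holds)
r1 = Z1 @ rng.standard_normal(5) / np.sqrt(5)  # covariance equals kernels[0]
y = 1.0 + np.sqrt(tau_true[0]) * r1 + np.sqrt(sigma2_true) * rng.standard_normal(n)

z, U, eff, info = pathway_score_test(y, X, kernels, j=1, tau=tau_true, sigma2=sigma2_true)
print(f"score = {U:.3f}, efficient information = {eff:.3f}, z = {z:.3f}")
```

The Schur complement in the last step plays the same role as the expression for \(\tilde{\vartheta }_{jj}\): it reduces the information for the tested component by the part explained by the nuisance parameters, so the efficient information can never exceed the unadjusted entry `info[0, 0]`. In practice the plug-in values would be replaced by estimates obtained under the null.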
Cite this article
Pang, H., Kim, I. & Zhao, H. Random Effects Model for Multiple Pathway Analysis with Applications to Type II Diabetes Microarray Data. Stat Biosci 7, 167–186 (2015). https://doi.org/10.1007/s12561-014-9109-1
DOI: https://doi.org/10.1007/s12561-014-9109-1