Information Bottleneck for Pathway-Centric Gene Expression Analysis
DNA microarrays make it convenient to measure expression profiles across thousands of genes, but the subsequent association studies typically suffer from a severe imbalance between the number of variables (genes) and the number of observations (subjects). Moreover, each gene is heavily perturbed by noise, which prevents any meaningful analysis at the single-gene level. Hence, the focus has shifted to pathways, that is, groups of functionally related genes, in the hope that aggregation amplifies the underlying signal. Technically, this leads to a feature extraction problem, which was previously tackled by principal component analysis. We reformulate the task using an extension of the Meta-Gaussian Information Bottleneck method as a means to compress a gene set while preserving information about a relevance variable. This opens up new possibilities, enabling us to use clinical side information to uncover hidden characteristics in the data.
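To make the idea of relevance-aware compression concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes the plain linear Gaussian Information Bottleneck, whose projection directions are the eigenvectors of Σ_{x|y} Σ_x^{-1} with the smallest eigenvalues, and it adds a rank-based normal-scores transform as a rough stand-in for the meta-Gaussian (copula) step. The function names `normal_scores` and `gaussian_ib_directions`, the synthetic data, and all parameter choices are illustrative assumptions. The sketch also assumes more subjects than genes in the set, so that the sample covariance is invertible.

```python
from statistics import NormalDist

import numpy as np


def normal_scores(A):
    """Map each column to normal scores via its ranks (copula-style marginal transform)."""
    A = np.asarray(A, dtype=float)
    if A.ndim == 1:
        A = A[:, None]
    n = A.shape[0]
    # Ranks 1..n per column; ties are broken arbitrarily in this sketch.
    ranks = A.argsort(axis=0).argsort(axis=0) + 1
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    return inv_cdf(ranks / (n + 1.0))


def gaussian_ib_directions(X, Y, k=1):
    """Return the k smallest eigenvalues of Sx^{-1} Sx|y and their eigenvectors.

    X is (n_subjects, n_genes); Y is the relevance variable, (n_subjects,) or 2-D.
    Small eigenvalues mark directions of X that are most informative about Y.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sx = Xc.T @ Xc / (n - 1)
    Sy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    # Conditional covariance of X given Y under a joint Gaussian model.
    Sx_given_y = Sx - Sxy @ np.linalg.solve(Sy, Sxy.T)
    # Solve Sx|y v = lambda Sx v through a Cholesky whitening of Sx.
    L = np.linalg.cholesky(Sx)
    C = np.linalg.solve(L, np.linalg.solve(L, Sx_given_y).T)
    C = (C + C.T) / 2  # remove round-off asymmetry before eigh
    eigvals, W = np.linalg.eigh(C)  # eigenvalues in ascending order
    V = np.linalg.solve(L.T, W[:, :k])
    return eigvals[:k], V


# Synthetic "pathway" of 5 genes; only the first two carry signal about the outcome.
rng = np.random.default_rng(0)
n, p = 200, 5
genes = rng.normal(size=(n, p))
outcome = genes[:, 0] - genes[:, 1] + 0.5 * rng.normal(size=n)

lam, V = gaussian_ib_directions(genes, outcome, k=1)
pathway_score = (genes - genes.mean(axis=0)) @ V[:, 0]

# Copula-style variant: transform the margins first, then apply the same step.
lam_mg, V_mg = gaussian_ib_directions(normal_scores(genes), normal_scores(outcome), k=1)
```

Unlike the first principal component, which depends on the gene set alone, the direction returned here changes with the relevance variable. With a one-dimensional outcome it coincides, up to scale, with the least-squares regression direction.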