Biologically-aware Latent Dirichlet Allocation (BaLDA) for the Classification of Expression Microarray
Cite this paper as:
Perina A., Lovato P., Murino V., Bicego M. (2010) Biologically-aware Latent Dirichlet Allocation (BaLDA) for the Classification of Expression Microarray. In: Dijkstra T.M.H., Tsivtsivadze E., Marchiori E., Heskes T. (eds) Pattern Recognition in Bioinformatics. PRIB 2010. Lecture Notes in Computer Science, vol 6282. Springer, Berlin, Heidelberg
Topic models have recently been shown to be useful tools for the analysis of microarray experiments. In particular, they have been successfully applied to gene clustering and, very recently, to sample classification. In the latter case, however, the basic assumption of functional independence between genes is limiting, since a priori information about gene interactions (co-regulation, spatial proximity, or other prior knowledge) is often available. This paper proposes a novel topic model that enriches and extends the Latent Dirichlet Allocation (LDA) model by integrating such dependencies, encoded in a categorization of genes. The proposed topic model is used to derive a highly informative and discriminant representation for microarray experiments. Its usefulness, in comparison with standard topic models, is demonstrated in two different classification tests.
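To make the baseline concrete, the following is a minimal sketch of plain LDA applied to expression data, not the BaLDA model itself: the abstract does not specify how the gene categorization enters the model, so that part is deliberately left out. The sketch assumes that each sample is treated as a document, each gene as a word, and a discretized expression level as a word count. The function name `lda_gibbs`, the hyperparameter values, and the toy matrix are all illustrative assumptions, and collapsed Gibbs sampling is used here only because it is compact, not because the paper is known to use it.

```python
import random


def lda_gibbs(counts, n_topics, n_iters=50, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for plain LDA (illustrative baseline only).

    counts[d][g] is a non-negative integer: the discretized expression level
    of gene g in sample d, read as "word g occurs counts[d][g] times in
    document d". Returns the per-sample topic proportions.
    """
    rng = random.Random(seed)
    n_docs, n_genes = len(counts), len(counts[0])

    # Expand each row into a token list: one token per unit of expression.
    docs = [[g for g in range(n_genes) for _ in range(row[g])] for row in counts]

    doc_topic = [[0] * n_topics for _ in range(n_docs)]
    topic_gene = [[0] * n_genes for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_d = []
        for g in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_gene[k][g] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, g in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_gene[k][g] -= 1
                topic_total[k] -= 1

                # Conditional distribution over topics for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_gene[t][g] + beta)
                    / (topic_total[t] + n_genes * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_gene[k][g] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per sample; each row sums to 1.
    return [
        [
            (doc_topic[d][t] + alpha) / (len(docs[d]) + n_topics * alpha)
            for t in range(n_topics)
        ]
        for d in range(n_docs)
    ]


# Toy data (made up): 4 samples x 6 genes, already discretized to counts.
toy_counts = [
    [5, 4, 6, 0, 1, 0],
    [6, 5, 4, 1, 0, 0],
    [0, 1, 0, 5, 6, 4],
    [1, 0, 0, 4, 5, 6],
]
theta = lda_gibbs(toy_counts, n_topics=2)
```

Each row of `theta` is a low-dimensional descriptor of one sample, which could then be passed to any standard classifier. In this plain form every gene is an independent word, which is the assumption the abstract identifies as limiting.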