Chapter

Pattern Recognition in Bioinformatics

Volume 6282 of the series Lecture Notes in Computer Science pp 230-241

Biologically-aware Latent Dirichlet Allocation (BaLDA) for the Classification of Expression Microarray

  • Alessandro Perina (affiliated with Carnegie Mellon University; University of Verona)
  • Pietro Lovato (affiliated with Carnegie Mellon University; University of Verona)
  • Vittorio Murino (affiliated with Carnegie Mellon University; University of Verona; Italian Institute of Technology (IIT))
  • Manuele Bicego (affiliated with Carnegie Mellon University; University of Verona; Italian Institute of Technology (IIT))


Abstract

Topic models have recently been shown to be useful tools for the analysis of microarray experiments. In particular, they have been successfully applied to gene clustering and, very recently, to sample classification. In the latter case, however, the basic assumption of functional independence between genes is limiting, since much a priori information about gene interactions may be available (co-regulation, spatial proximity, or other prior knowledge). In this paper, a novel topic model is proposed that enriches and extends the Latent Dirichlet Allocation (LDA) model by integrating such dependencies, encoded as a categorization of genes. The proposed topic model is used to derive a highly informative and discriminative representation for microarray experiments. Its usefulness, compared with standard topic models, is demonstrated in two different classification tests.