Discovering Significant Structures in Clustered Bio-molecular Data Through the Bernstein Inequality

  • Alberto Bertoni
  • Giorgio Valentini
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4694)

Abstract

Searching for structures in complex bio-molecular data is a central issue in several branches of bioinformatics. In particular, the reliability of clusters discovered by a given clustering algorithm has recently been assessed through methods based on the concept of stability with respect to random perturbations of the data. In this context, a major problem is to assess the confidence of the measures of reliability. We discuss a partially "distribution independent" method based on the classical Bernstein inequality to assess the statistical significance of the discovered clusterings. Experimental results with gene expression data show the effectiveness of the proposed approach.
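
As an illustration of the kind of bound the abstract refers to, the following minimal sketch evaluates the classical Bernstein inequality for the mean of n independent bounded random variables. It is not the authors' procedure: the function name `bernstein_tail_bound`, its parameters, and the assumption that stability scores lie in [0, 1] (so that deviations from the mean are at most 1 and the variance is at most 1/4) are hypothetical choices made for this example only.

```python
import math


def bernstein_tail_bound(n, variance, bound, eps):
    """Bernstein upper bound on P(sample_mean - true_mean >= eps).

    Assumes n independent variables X_i with |X_i - E[X_i]| <= bound
    and Var(X_i) <= variance. All names here are illustrative.
    """
    if n <= 0 or eps <= 0 or bound <= 0 or variance < 0:
        raise ValueError("n, eps, bound must be positive; variance non-negative")
    # Classical form: exp(-n * eps^2 / (2 * variance + 2 * bound * eps / 3))
    exponent = -n * eps ** 2 / (2.0 * variance + 2.0 * bound * eps / 3.0)
    return math.exp(exponent)


# Hypothetical usage: 100 perturbation runs, scores assumed to lie in [0, 1],
# worst-case variance 1/4, deviation of 0.2 from the expected score.
p_upper = bernstein_tail_bound(n=100, variance=0.25, bound=1.0, eps=0.2)
```

The bound needs only a range and a variance bound for the variables, not their full distribution, which is one way to read "partially distribution independent"; how the paper itself defines the variables and the test is not stated in the abstract.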

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Alberto Bertoni (1)
  • Giorgio Valentini (1)
  1. DSI, Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano, Via Comelico 39, 20135 Milano, Italia
