Dunn Index Bootstrap (DIBS): A procedure to empirically select a cluster analysis method that identifies biologically and clinically relevant molecular disease subgroups

Pawlikowska, Iwona; Liu, Zhifa; Shi, Lei; Lin, Tong; Gruber, Tanja; Robinson, Giles; Onar-Thomas, Arzu; Pounds, Stan

doi:10.1186/1471-2105-16-S15-P12

Poster presentation
Open access
Published: 23 October 2015

Volume 16, article number P12, (2015)

BMC Bioinformatics

Iwona Pawlikowska¹,³,
Zhifa Liu¹,
Lei Shi¹,
Tong Lin¹,
Tanja Gruber²,
Giles Robinson²,
Arzu Onar-Thomas¹ &
Stan Pounds¹

Background

Cluster analysis is widely used in cancer research to discover molecular subgroups that inform laboratory investigations and define risk classification criteria for subsequent clinical trials. However, for any data set, there is a very large number of candidate cluster analysis methods (CCAMs), owing to the many possible choices of feature selection criterion, number of selected features, number of clusters to define, and so on. Frequently, a specific CCAM is chosen without quantifying the validity of its results in terms of the reproducibility or distinctiveness of the reported subgroups.

Materials and methods

Here, we propose the Dunn Index Bootstrap (DIBS) procedure to quantify the reproducibility and distinctiveness of the subgroups defined by each of many CCAMs. DIBS applies each CCAM to the observed data and to many bootstrap data sets obtained by resampling subjects. The bootstrap results are then used to compute metrics of the reproducibility and distinctiveness of the subgroups defined by each CCAM.
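The abstract does not give the exact metrics or clustering algorithms used, so the following is only a minimal sketch of the general idea, under stated assumptions: subjects are resampled with replacement, a candidate method is re-applied to each bootstrap data set, and the Dunn index (smallest between-cluster distance divided by largest within-cluster distance) is averaged as one possible distinctiveness summary. The toy k-means stand-in, the function names `dunn_index`, `kmeans_labels` and `bootstrap_dunn`, and the two-feature toy data are all hypothetical and are not taken from the authors' implementation.

```python
import math
import random


def dunn_index(points, labels):
    """Smallest between-cluster distance over largest within-cluster distance."""
    min_between, max_within = math.inf, 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if labels[i] == labels[j]:
                max_within = max(max_within, d)
            else:
                min_between = min(min_between, d)
    if max_within == 0.0 or math.isinf(min_between):
        return math.nan  # undefined: a single cluster, or all clusters have zero diameter
    return min_between / max_within


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def kmeans_labels(points, k, n_iter=10):
    """Toy k-means (a stand-in for any CCAM) with farthest-first initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(n_iter):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties
                centers[c] = tuple(sum(col) / len(col) for col in zip(*members))
    return assign(points, centers)


def bootstrap_dunn(points, cluster_fn, n_boot=30, seed=0):
    """Dunn index of cluster_fn on n_boot subject-level bootstrap samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_boot):
        sample = [rng.choice(points) for _ in points]  # resample subjects
        value = dunn_index(sample, cluster_fn(sample))
        if not math.isnan(value):
            values.append(value)
    return values


# Toy data (assumption): two well-separated groups of nine subjects, two features each.
grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
subjects = grid + [(x + 20.0, y + 20.0) for x, y in grid]

# Two hypothetical candidate methods that differ only in the number of clusters.
mean_dunn = {}
for k in (2, 3):
    values = bootstrap_dunn(subjects, lambda s, k=k: kmeans_labels(s, k))
    mean_dunn[k] = sum(values) / len(values)
print(mean_dunn)
```

On this toy data the two-cluster candidate should receive the higher mean bootstrap Dunn index, because a third cluster can only split one of the two compact groups. A fuller treatment would also need a reproducibility metric, for example agreement between the bootstrap and observed-data cluster assignments, which this sketch omits.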

Results

DIBS was used to characterize the performance of each of 4,032 CCAMs in the analysis of one RNA-seq, two microarray gene expression, and one methylation array data set from three different cancers. In each example, DIBS identified specific CCAMs that defined subgroups of well-established biological and clinical relevance.

Author information

Authors and Affiliations

Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
Iwona Pawlikowska, Zhifa Liu, Lei Shi, Tong Lin, Arzu Onar-Thomas & Stan Pounds
Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
Tanja Gruber & Giles Robinson
Institute of Mathematics, University of Silesia, Katowice, 2469011, Poland
Iwona Pawlikowska

Corresponding author

Correspondence to Stan Pounds.

About this article

Cite this article

Pawlikowska, I., Liu, Z., Shi, L. et al. Dunn Index Bootstrap (DIBS): A procedure to empirically select a cluster analysis method that identifies biologically and clinically relevant molecular disease subgroups. BMC Bioinformatics 16 (Suppl 15), P12 (2015). https://doi.org/10.1186/1471-2105-16-S15-P12
