Clustering DNA Microarray Data

  • Henryk Maciejewski
  • Anna Jasinska
Conference paper
Part of the Advances in Soft Computing book series (AINSC, volume 30)

Abstract

Proper interpretation of the results of clustering gene expression data from DNA microarray experiments is one of the major challenges in experimental data analysis. Interpretation problems arise because different algorithms tend to produce different results, while some clusters appear to be invariant to the algorithm applied. The procedure described in this work can be a good starting point for a decision-making process that evaluates the biological relevance of the clustering results obtained. In our view, any similar approach aiming to discover biologically relevant clusters will have to include biological information. It would probably be beneficial if relevant biological knowledge could be incorporated on the input side of the clustering algorithm rather than at the results post-processing and interpretation stage, as is done in this work. Biasing the decisions of clustering algorithms towards biologically relevant groupings, thus forming a ‘supervised clustering’ approach, may be a motivation for further research in this area.
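The observation that some clusters are invariant to the algorithm applied can be illustrated with a minimal sketch. It is not the procedure described in the paper: it only assumes that each algorithm's result is available as a mapping from gene name to cluster label, and it reports the groups of genes that stay together under every algorithm. The function name `invariant_groups`, the input format, and the gene labels are all hypothetical.

```python
from collections import defaultdict


def invariant_groups(labelings, min_size=2):
    """Return groups of items that share a cluster in every labeling.

    labelings: list of dicts mapping item name -> cluster label,
               one dict per clustering algorithm (assumed input format).
    """
    items = set(labelings[0])
    for labels in labelings[1:]:
        items &= set(labels)  # keep only items present in every result

    groups = defaultdict(list)
    for item in sorted(items):
        # The tuple of labels across all algorithms identifies one cell
        # of the intersection of the partitions.
        signature = tuple(labels[item] for labels in labelings)
        groups[signature].append(item)

    return sorted(g for g in groups.values() if len(g) >= min_size)


# Made-up labels for six hypothetical genes, for illustration only.
average_linkage = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 2}
k_means = {"g1": "a", "g2": "a", "g3": "b", "g4": "b", "g5": "c", "g6": "c"}

stable = invariant_groups([average_linkage, k_means])
print(stable)
```

Groups found this way are only candidates: agreement between algorithms says nothing by itself about biological relevance, which still has to be assessed with biological information.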

Keywords

Cluster Algorithm, Average Linkage, FMR1 Gene, Experiment Data Analysis, Cluster Gene Expression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Henryk Maciejewski (1)
  • Anna Jasinska (2)
  1. Institute of Engineering Cybernetics, Wroclaw University of Technology, Wroclaw, Poland
  2. Laboratory of Cancer Genetics, Institute of Bioorganic Chemistry, Poznan, Poland
