Spot Detection and Image Segmentation in DNA Microarray Data

  • Methodology
Applied Bioinformatics

Abstract

Following the invention of microarrays in 1994, the development and applications of this technology have grown exponentially. The numerous applications of microarray technology include clinical diagnosis and treatment, drug design and discovery, tumour detection, and environmental health research. One of the key issues in the experimental approaches utilising microarrays is to extract quantitative information from the spots, which represent genes in a given experiment. For this process, the initial stages are important and they influence future steps in the analysis. Identifying the spots and separating the background from the foreground is a fundamental problem in DNA microarray data analysis. In this review, we present an overview of state-of-the-art methods for microarray image segmentation. We discuss the foundations of the circle-shaped approach, adaptive shape segmentation, histogram-based methods and the recently introduced clustering-based techniques. We analytically show that clustering-based techniques are equivalent to the one-dimensional, standard k-means clustering algorithm that utilises the Euclidean distance.

Acknowledgements

The authors would like to thank the referees, whose efforts substantially improved the quality of the paper. This research work has been partially supported by NSERC (Natural Sciences and Engineering Research Council of Canada), CFI (Canada Foundation for Innovation) and OIT (Ontario Innovation Trust).

The authors have provided no information on conflicts of interest directly relevant to the content of this article.

Author information

Corresponding author

Correspondence to Luis Rueda.

Appendix 1

Refer to figure A1.

Theorem 1: Let $D = \{x_1, \ldots, x_n\}$ be a set of samples that has to be clustered into two classes. As $n \to \infty$, KSCMIS produces the same results as the k-means algorithm with the Euclidean distance.

Proof: Let $n = N_1 + N_2$. Since empty clusters are not allowed, $n \to \infty$ implies that $N_1, N_2 \to \infty$. We can then write the asymptotic behaviour of equation 4 as follows (equation 6):

Additionally, it is straightforward that, in the 1-D Euclidean space, equation 6 is equivalent to $(x_i - \mu_1)^2 > (x_i - \mu_2)^2$. Clearly, equation 6 is the criterion used by the standard k-means algorithm, and thus the result follows.
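
To make the k-means criterion in the proof concrete, the following is a minimal sketch of one-dimensional, two-cluster k-means applied to pixel intensities, where each sample is assigned to whichever cluster mean is closer in squared Euclidean distance. It is not the KSCMIS implementation and is not taken from the article: the function name, the initialisation of the means at the minimum and maximum intensities, and the sample values are all illustrative assumptions.

```python
def kmeans_1d_two_clusters(intensities, max_iter=100):
    """Split 1-D pixel intensities into background (0) and foreground (1).

    Hypothetical helper for illustration only.
    """
    if len(set(intensities)) < 2:
        raise ValueError("need at least two distinct intensity values")

    # Assumed initialisation: start the means at the extremes so that
    # neither cluster is ever empty.
    mu_bg, mu_fg = float(min(intensities)), float(max(intensities))
    labels = []

    for _ in range(max_iter):
        # Assignment step: x is labelled foreground when its squared
        # distance to the background mean exceeds that to the foreground mean.
        new_labels = [1 if (x - mu_bg) ** 2 > (x - mu_fg) ** 2 else 0
                      for x in intensities]
        if new_labels == labels:
            break  # assignments are stable, so the means will not change
        labels = new_labels

        # Update step: recompute each mean from its current members.
        bg = [x for x, lab in zip(intensities, labels) if lab == 0]
        fg = [x for x, lab in zip(intensities, labels) if lab == 1]
        mu_bg = sum(bg) / len(bg)
        mu_fg = sum(fg) / len(fg)

    return labels, mu_bg, mu_fg


# Made-up intensities for one spot region: dim background, bright spot.
pixels = [10, 12, 11, 13, 200, 210, 205, 12, 198]
labels, mu_bg, mu_fg = kmeans_1d_two_clusters(pixels)
# labels == [0, 0, 0, 0, 1, 1, 1, 0, 1]
```

In one dimension the assignment step reduces to thresholding at the midpoint of the two means, which is why a clustering-based segmentation of this kind behaves like an adaptive intensity threshold.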

Qin, L., Rueda, L., Ali, A. et al. Spot Detection and Image Segmentation in DNA Microarray Data. Appl-Bioinformatics 4, 1–11 (2005). https://doi.org/10.2165/00822942-200504010-00001
