Clustering Analysis for Bacillus Genus Using Fourier Transform and Self-Organizing Map

  • Cheng-Chang Jeng
  • I-Ching Yang
  • Kun-Lin Hsieh
  • Chun-Nan Lin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4234)


Because the nucleotide sequences of microorganisms vary in length, it is difficult to compare complete nucleotide sequences directly. In this study, we adopted a method that converts the DNA sequences of microorganisms into numerical form and then applied the Fourier transform to the numerical sequences in order to investigate the distribution of nucleotides. We also proposed a visualization scheme for the transformed DNA sequences to help categorize microorganisms visually. Furthermore, the Self-Organizing Map (SOM), a well-known neural network technique, was applied to the transformed DNA sequences to infer taxonomic relationships among bacteria of the genus Bacillus. The results show that the relationships found among the bacteria correspond to recent biological findings.
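The abstract does not say which numerical encoding was used, so the following is only a minimal sketch under stated assumptions: each sequence is mapped to four binary indicator sequences (one per base, a common choice in spectral analyses of DNA), the power spectrum is obtained with NumPy's FFT, and the spectrum is averaged into a fixed number of frequency bands so that sequences of different lengths yield vectors of the same size. The function names and the `n_bins` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

BASES = "ACGT"  # assumed alphabet; other symbols contribute to no indicator


def indicator_sequences(dna: str) -> np.ndarray:
    """Return a 4 x N array; row k is 1.0 wherever BASES[k] occurs."""
    codes = np.frombuffer(dna.upper().encode("ascii"), dtype=np.uint8)
    return np.stack([(codes == ord(b)).astype(float) for b in BASES])


def power_spectrum(dna: str) -> np.ndarray:
    """Sum of the four per-base power spectra, with the zero-frequency term dropped."""
    u = indicator_sequences(dna)
    n = u.shape[1]
    per_base = np.abs(np.fft.rfft(u, axis=1)) ** 2 / n
    return per_base.sum(axis=0)[1:]


def fixed_length_features(dna: str, n_bins: int = 16) -> np.ndarray:
    """Average the spectrum into n_bins frequency bands and normalize to sum 1.

    This makes sequences of different lengths comparable (an assumption of
    this sketch, not necessarily the normalization used in the paper).
    """
    spectrum = power_spectrum(dna)
    if spectrum.size < n_bins:
        raise ValueError("sequence too short for the requested number of bins")
    bands = np.array_split(spectrum, n_bins)
    features = np.array([band.mean() for band in bands])
    return features / features.sum()
```

Vectors produced by `fixed_length_features` all have length `n_bins`, so they could be passed as input rows to any SOM implementation, regardless of the length of the original genomes.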


DNA sequence, Bacillus, Fourier transform, Self-organizing map





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Cheng-Chang Jeng (1)
  • I-Ching Yang (1)
  • Kun-Lin Hsieh (1)
  • Chun-Nan Lin (2)
  1. Systematic and Theoretical Science Research Group, National Taitung University, Taitung, Taiwan
  2. Department of Management Information System, National Chung Cheng University, Min-Hsiung, Chia-Yi, Taiwan
