Clustering Algorithms for ITS Sequence Data with Alignment Metrics

  • Andrei Kelarev
  • Byeong Kang
  • Dorothy Steane
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4304)

Abstract

This article describes two new clustering algorithms for DNA nucleotide sequences, summarizes an experimental analysis of their performance on an ITS-sequence data set, and compares the results with known biologically significant clusters of that data set. Both algorithms are shown to be efficient and usable in practice.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Andrei Kelarev (1)
  • Byeong Kang (1)
  • Dorothy Steane (2)
  1. School of Computing, University of Tasmania, Tasmania, Australia
  2. CRC Forestry and School of Plant Science, University of Tasmania, Tasmania, Australia
