Abstract
This article describes two new clustering algorithms for DNA nucleotide sequences, summarizes an experimental analysis of their performance on an ITS-sequence data set, and compares the results with the known biologically significant clusters of that data set. Both algorithms are shown to be efficient and usable in practice.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kelarev, A., Kang, B., Steane, D. (2006). Clustering Algorithms for ITS Sequence Data with Alignment Metrics. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_116
DOI: https://doi.org/10.1007/11941439_116
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49787-5
Online ISBN: 978-3-540-49788-2
eBook Packages: Computer Science, Computer Science (R0)