Abstract
The cytochrome c oxidase subunit 1 (COI) gene is used for species identification and is one of the so-called DNA barcode genes. Even with DNA barcoding, identifying a species can be difficult if the biological samples are degraded. A spectral representation of sequences combined with the General Regression Neural Network (GRNN) can give some interesting results in these difficult cases. The GRNN is based on the distance between the memorized example sequences and the unknown input sequence, both represented in a vector space using a spectral representation. In this paper we analyse the effectiveness of different distance models in the GRNN implementation and compare the results obtained when classifying full-length sequences and degraded samples.
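As a rough illustration of the idea described in the abstract, the sketch below classifies a sequence with a GRNN-style kernel over a pluggable distance. It is not the authors' implementation. It assumes that the spectral representation is a vector of normalized k-mer frequencies, that the kernel is the Gaussian one from Specht's GRNN, and that the distance models are members of the Minkowski family indexed by p. The function names (kmer_spectrum, minkowski, grnn_scores), the toy sequences, and the parameter values are all hypothetical.

```python
from itertools import product
import math


def kmer_spectrum(seq, k=3):
    """Assumed spectral representation: normalized k-mer frequency vector."""
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    vec = [0.0] * len(index)
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # k-mers with ambiguous bases are skipped
        if j is not None:
            vec[j] += 1.0
            total += 1
    return [v / total for v in vec] if total else vec


def minkowski(a, b, p=2.0):
    """Minkowski-type distance; p=2 is Euclidean, p=1 Manhattan, p<1 fractional."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def grnn_scores(train, query, sigma=0.5, dist=minkowski):
    """GRNN output with one-hot targets: kernel-weighted vote per label.

    train is a list of (vector, label) pairs memorized by the network.
    """
    scores = {}
    for vec, label in train:
        d = dist(vec, query)
        weight = math.exp(-(d * d) / (2.0 * sigma * sigma))
        scores[label] = scores.get(label, 0.0) + weight
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; try a larger sigma")
    return {label: s / total for label, s in scores.items()}


# Toy data (hypothetical): two memorized "species" and a truncated query.
train = [
    (kmer_spectrum("ACGTACGTACGT"), "species_A"),
    (kmer_spectrum("GGGGCCCCGGGG"), "species_B"),
]
query = kmer_spectrum("ACGTACGTAC")  # shorter, as a stand-in for a degraded sample

euclidean = grnn_scores(train, query)
fractional = grnn_scores(train, query, dist=lambda a, b: minkowski(a, b, p=0.5))
print(max(euclidean, key=euclidean.get), max(fractional, key=fractional.get))
```

Swapping the dist argument is the only change needed to compare distance models, which mirrors the kind of comparison the abstract describes. The smoothing parameter sigma would have to be tuned for each distance, since the distances live on different scales.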
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
La Rosa, M., Fiannaca, A., Rizzo, R., Urso, A. (2015). DNA Barcode Classification Using General Regression Neural Network with Different Distance Models. In: Zazzu, V., Ferraro, M., Guarracino, M. (eds) Mathematical Models in Biology. Springer, Cham. https://doi.org/10.1007/978-3-319-23497-7_9
DOI: https://doi.org/10.1007/978-3-319-23497-7_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23496-0
Online ISBN: 978-3-319-23497-7
eBook Packages: Mathematics and Statistics (R0)