DNA Barcode Classification Using General Regression Neural Network with Different Distance Models

Abstract

The cytochrome c oxidase subunit 1 (COI) gene is one of the so-called DNA barcode genes and is used to identify species. Identification can be difficult, even with DNA barcoding, if the biological samples are degraded. A spectral representation of sequences combined with the General Regression Neural Network (GRNN) can give interesting results in these difficult cases. The GRNN relies on the distance between the memorized example sequences and the unknown input sequence, both represented as vectors in a spectral vector space. In this paper we analyse the effectiveness of different distance models in the GRNN implementation and compare the results obtained when classifying full-length sequences and degraded samples.
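The idea of a distance-based GRNN over spectral vectors can be illustrated with a minimal sketch. It is not the chapter's implementation: the abstract does not specify the spectral representation, the distance models, or any parameters, so the sketch assumes normalized k-mer counts as the spectrum, the Minkowski family (exponent `p`) as the set of distance models, and a Gaussian kernel of width `sigma`. The function names, the toy sequences, and the values `k=3` and `sigma=0.05` are all hypothetical.

```python
import math
from itertools import product


def kmer_spectrum(seq, k=3):
    """Normalized k-mer count vector over the alphabet ACGT (assumed representation)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0.0] * len(kmers)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # windows with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def minkowski(u, v, p=2.0):
    """Minkowski distance: p=1 is Manhattan, p=2 is Euclidean (assumed distance family)."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def grnn_classify(x, examples, labels, sigma=0.05, p=2.0):
    """GRNN used as a classifier: each memorized example votes for its label
    with a Gaussian weight of its distance to x; the heaviest class wins."""
    scores = {}
    for example, label in zip(examples, labels):
        d = minkowski(x, example, p)
        weight = math.exp(-(d * d) / (2.0 * sigma ** 2))
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# Toy data (hypothetical): two memorized "species" and a short fragment of the first.
examples = [kmer_spectrum("ACGTACGTACGTACGTACGT"),
            kmer_spectrum("GGGCCCGGGCCCGGGCCCGG")]
labels = ["A", "B"]
query = kmer_spectrum("ACGTACGTAC")

for p in (1.0, 2.0):
    print(p, grnn_classify(query, examples, labels, p=p))
```

Because the spectrum is normalized by the number of k-mers, a short fragment and a full-length sequence of the same species land close together in the vector space, which is what lets a distance-based network handle degraded samples. Changing `p` swaps the distance model without touching the rest of the classifier.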

Author information

Corresponding author

Correspondence to Massimo La Rosa.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

La Rosa, M., Fiannaca, A., Rizzo, R., Urso, A. (2015). DNA Barcode Classification Using General Regression Neural Network with Different Distance Models. In: Zazzu, V., Ferraro, M., Guarracino, M. (eds) Mathematical Models in Biology. Springer, Cham. https://doi.org/10.1007/978-3-319-23497-7_9
