The General Regression Neural Network to Classify Barcode and mini-barcode DNA

  • Conference paper
Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB 2014)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8623)

Abstract

In the identification of living species through the analysis of their DNA sequences, the mitochondrial “cytochrome c oxidase subunit 1” (COI) gene has proved to be a good DNA barcode. Nevertheless, the quality of full-length barcode sequences often cannot be guaranteed because of DNA degradation in biological samples, so that only short sequences (mini-barcodes) are available. In this paper, a prototype-based classification approach for the analysis of DNA barcodes, exploiting a spectral representation of DNA sequences and a memory-based neural network, is proposed. The neural network is a modified version of the General Regression Neural Network (GRNN), used as a classification tool. Furthermore, the relationship between the characteristics of different species and their spectral distribution is investigated. Namely, a subset of the whole spectrum of a DNA sequence, composed of very high frequency DNA k-mers, is considered, providing a robust system for the classification of barcode sequences. The proposed approach is compared with standard classification algorithms, such as the Support Vector Machine (SVM), and obtains better results, especially when applied to mini-barcode sequences.
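The two ingredients named in the abstract, a spectral (k-mer frequency) representation and a memory-based GRNN used as a classifier, can be illustrated with a minimal sketch. This is not the authors' modified GRNN, whose details are not given here: it uses the standard Gaussian-kernel weighting from Specht's GRNN with one-hot class targets, which reduces to a kernel-weighted vote over stored prototypes. The function names, the choice of k = 3, the value of sigma, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def kmer_spectrum(seq, k=3):
    """Normalized frequency of every DNA k-mer in seq, in a fixed order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def grnn_classify(query, prototypes, labels, sigma=0.1):
    """Kernel-weighted vote: each stored prototype contributes a Gaussian
    weight to its own class; the class with the largest sum wins."""
    scores = {}
    for proto, label in zip(prototypes, labels):
        d2 = sum((q - p) ** 2 for q, p in zip(query, proto))
        weight = math.exp(-d2 / (2 * sigma ** 2))
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# Hypothetical toy data: synthetic sequences, not real COI barcodes.
training = [
    ("ACGTACGTACGTACGTACGT", "species_A"),
    ("ACGTACGTACGAACGTACGT", "species_A"),
    ("TTGGCCAATTGGCCAATTGG", "species_B"),
    ("TTGGCCAATTGGCCTATTGG", "species_B"),
]
prototypes = [kmer_spectrum(s) for s, _ in training]
labels = [lab for _, lab in training]

# A shorter fragment stands in for a mini-barcode.
fragment = "ACGTACGTACGT"
predicted = grnn_classify(kmer_spectrum(fragment), prototypes, labels)
print(predicted)
```

Because the spectrum is normalized by the number of k-mers, a short fragment and a full-length sequence map to vectors of the same dimension and comparable scale, which is what lets the same stored prototypes be queried with sequences of different lengths in this sketch.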

Corresponding author

Correspondence to Riccardo Rizzo.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Rizzo, R., Fiannaca, A., La Rosa, M., Urso, A. (2015). The General Regression Neural Network to Classify Barcode and mini-barcode DNA. In: Di Serio, C., Liò, P., Nonis, A., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2014. Lecture Notes in Computer Science, vol 8623. Springer, Cham. https://doi.org/10.1007/978-3-319-24462-4_13

  • DOI: https://doi.org/10.1007/978-3-319-24462-4_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24461-7

  • Online ISBN: 978-3-319-24462-4

  • eBook Packages: Computer Science, Computer Science (R0)
