From Linnaean System to Machine Learning Based-SNP Barcoding: A Changing Epitome of Mosquito Species Identification

Molecular Identification of Mosquito Vectors and Their Management

Abstract

Mosquitoes are among the best-known groups of Diptera, with 3556 valid species in 112 genera. Of these 112 genera, only three, viz. Anopheles, Culex, and Aedes, account for most vector-borne transmission. Accurate identification of mosquito vector species is paramount in several respects, including progress in vector control strategies. Conventional identification through morpho-taxonomy, which relies on morphological differences in the Linnaean tradition, cannot reliably distinguish sibling species and closely related, morphologically indistinguishable species. Although traditional morpho-taxonomy is generally regarded as important, the decline in taxonomic expertise and in the skill base needed to identify most mosquito vectors is a prominent reality. This may be one reason for the growing interest in shifting from conventional morpho-taxonomy to molecular taxonomy. Recently, molecular identification in the form of labeled DNA barcodes has come to be considered a landmark in the discrimination of mosquito vectors. Almost all DNA barcodes, however, contain detailed information from the barcoding gene alongside some uninformative sequences for a given species. A technique is therefore needed to eliminate or reduce these uninformative sequences and to create more accurate species-specific barcodes. To address this issue, minor genetic sequence variants in the form of single nucleotide polymorphisms (SNPs) have been analyzed and are regarded as a novel basis for a fast, reliable, and high-throughput technique for discriminating between known species. Machine learning approaches, in turn, help extract information from large and continuously growing data sets and are particularly useful where the data are too complicated to analyze critically by hand. To this end, the chapter explains the present state of knowledge on applying a machine learning approach to generate SNP barcodes from the DNA barcoding gene sequences of various evolutionarily related mosquito species.
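
To make the idea of an SNP barcode concrete, the following Python sketch reduces a set of aligned barcode sequences to the few variable positions needed to tell the species apart. It is only an illustration under stated assumptions: the sequences and species labels are invented toy data, the function names are hypothetical, and a simple greedy selection stands in for the machine learning step, so it should not be read as the chapter's actual method.

```python
from itertools import combinations

# Toy, made-up aligned fragments (hypothetical data, not real barcode sequences).
ALIGNED = {
    "species_A": "ATGCATGCAT",
    "species_B": "ATGCTTGCAT",
    "species_C": "ATGAATGCAT",
    "species_D": "ATGATTGCAT",
}


def variable_sites(aligned):
    """Return the alignment columns where at least two sequences differ."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def select_snp_barcode(aligned):
    """Greedily pick variable sites until every pair of species is separated.

    This greedy rule is an assumption made for illustration; it is a
    stand-in for a learned model such as a decision tree.
    """
    unresolved = set(combinations(list(aligned), 2))
    candidates = variable_sites(aligned)
    chosen = []

    def gain(site):
        # Number of still-unresolved pairs that differ at this site.
        return sum(aligned[a][site] != aligned[b][site] for a, b in unresolved)

    while unresolved:
        best = max(candidates, key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("some species have identical sequences")
        chosen.append(best)
        # Keep only the pairs that this site failed to separate.
        unresolved = {
            (a, b) for a, b in unresolved if aligned[a][best] == aligned[b][best]
        }
    return sorted(chosen)


def barcode(sequence, sites):
    """Concatenate the nucleotides found at the selected sites."""
    return "".join(sequence[i] for i in sites)


if __name__ == "__main__":
    sites = select_snp_barcode(ALIGNED)
    for name, seq in ALIGNED.items():
        print(name, barcode(seq, sites))
```

In this toy alignment only two of the ten columns vary, so each species collapses to a two-letter barcode while the eight uninformative columns are discarded. With real data the input would first need a proper multiple sequence alignment, and several specimens per species would be required to separate fixed differences from within-species variation.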

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

Cite this chapter

Swain, S.N., Barik, T.K. (2020). From Linnaean System to Machine Learning Based-SNP Barcoding: A Changing Epitome of Mosquito Species Identification. In: Barik, T.K. (eds) Molecular Identification of Mosquito Vectors and Their Management. Springer, Singapore. https://doi.org/10.1007/978-981-15-9456-4_2
