A novel locally guided genome reassembling technique using an artificial ant system

Applied Intelligence

Abstract

DNA reassembling is an NP-hard problem (Brun, Theor Comput Sci 395:31–46, 2008; Medvedev et al 2007; Ma and Lombardi 2008). This article presents a locally guided global learning system for genome reassembling. We use a reference DNA sequence that is 99 % similar to the unknown DNA sequence; two different sequences from the same organism generally have around 99 % similarity (Wei et al 2007). We took different DNA sequences from the NCBI website (http://www.ncbi.nlm.nih.gov) and simulated the tasks of cloning each sequence and then shearing the clones into a number of short reads. Our algorithm introduces a new concept to DNA reassembling with Ant Colony Optimization (ACO): the pheromone concentration is proportional to the score of the assembled DNA fragments against known reference sequences within the same organism. Instead of the local overlap between reads, we use the local alignment score of short reads against a known local reference region as the heuristic information. The results show that our algorithm reassembles on par with the state of the art. DNA reassembling techniques may need massively parallel computation and huge memory space (Kurniawan et al 2008) because mammalian DNA sequences are ~10^9 bp long (Miller et al, Genomics 95:315–327, 2010; Blazewicz et al, Comput Biol Chem 33:224–230, 2009; Butler et al, Genome Res 18:810–820, 2008; Joshi et al 2011; Stupar et al, Arch Oncol 19:3–4, 2011; Quail et al, BMC Genomics 13:1471–2164, 2012), and ACO is inherently concurrent in nature (Dorigo and Stutzle 2004). Owing to a lack of appropriate computational resources, we confined ourselves to sequences of length up to ~10^5 bp. We considered 22 sequences from different organisms, including the Homo sapiens BRCA1 gene (127429 bp). For large sequences, we applied hierarchical BAC-by-BAC sequencing (Fig. 2) (Myers, Comput Sci Eng 1:33–43, 1999) to stitch the individual segments together and retrieve the original DNA sequence.
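The two ingredients named in the abstract, a local alignment score used as heuristic information and a pheromone level tied to an assembly score, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The Smith-Waterman scoring parameters (`match`, `mismatch`, `gap`), the Ant System exponents `alpha` and `beta`, the evaporation rate `rho`, the deposit constant `q`, and all function names are assumptions chosen only for illustration. The selection rule is the textbook Ant System rule from Dorigo and Stutzle (2004).

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def choice_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Ant System rule: p_k is proportional to tau_k**alpha * eta_k**beta."""
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(pheromone, heuristic)]
    total = sum(weights)
    if total == 0:
        # No candidate has any support: fall back to a uniform choice.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def update_pheromone(pheromone, chosen, assembly_score, rho=0.1, q=0.01):
    """Evaporate all trails, then deposit on the chosen one in proportion to the score."""
    updated = [(1.0 - rho) * t for t in pheromone]
    updated[chosen] += q * assembly_score
    return updated


# Hypothetical toy data: one local reference region and three candidate reads.
reference_region = "ACGTTGCA"
candidate_reads = ["ACGTTGCA", "ACGATGCA", "GGGGCCCC"]

# Heuristic information: local alignment score of each read against the region.
heuristic = [local_alignment_score(r, reference_region) for r in candidate_reads]
pheromone = [1.0] * len(candidate_reads)
probabilities = choice_probabilities(pheromone, heuristic)

for read, eta, p in zip(candidate_reads, heuristic, probabilities):
    print(f"{read}  score={eta}  p={p:.3f}")
```

In this toy setting the read identical to the reference region receives the highest selection probability, and a read that differs by a single base still keeps substantial weight, which is the behaviour one would want when the reference is only about 99 % similar to the target.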

Notes

  1. https://www.cbcb.umd.edu/research/assembly_primer

  2. https://www.cbcb.umd.edu/research/assembly_primer

  3. https://www.cbcb.umd.edu/research/assembly_primer

  4. http://www.ncbi.nlm.nih.gov

  5. http://www.socscistatistics.com/tests/signedranks/Default2.aspx

References

  1. García S, Molina D, Lozano M, Herrera F (2009) A study on the use of non-parametric tests for analyzing the evolutionary algorithms behaviour: a case study on the CEC2005 Special Session on Real Parameter Optimization. Journal of Heuristics 15:617–644

  2. Indumathy R, Uma Maheswari S (2014) Solving DNA Sequence Assembly Using Particle Swarm Optimization With Inertia Weight and Constriction Factor. International Journal of Soft Computing and Artificial Intelligence 2(1):90–94

  3. Verma RS, Singh V, Kumar S (2011) DNA Sequence Assembly using Particle Swarm Optimization. Int J Comput Appl 28(10):34–38

  4. Fang S-C, Wang Y, Zhong J (2005) A Genetic Algorithm Approach to Solving DNA Fragment Assembly Problem. J Comput Theor Nanosci 2:1–7

  5. Parsons RJ, Forrest S, Burks C (1995) Genetic algorithms, operators, and DNA fragment assembly. Mach Learn 21(1-2):11–33

  6. Nebro AJ, Luque G, Luna F, Alba E (2008) DNA fragment assembly using a grid-based genetic algorithm. Comput Oper Res 35(9):2776–2790

  7. Luque G, Alba E, Khuri S (2005) Parallel Computing for Bioinformatics and Computational Biology, WILEY, Chapter-12: Assembling DNA Fragments with a Distributed Genetic Algorithm

  8. Karaboga D, Akay B (2009) A comparative study of Artificial Bee Colony algorithm. Appl Math Comput 25:108–132

  9. Karaboga D, Ozturk C, Karaboga N, Gorkemli B (2012) Artificial bee colony programming for symbolic regression. Inf Sci 209:1–15

  10. Firoz JS, Sohel Rahman M, Saha TK (2012) Bee Algorithms for Solving DNA Fragment Assembly Problem with Noisy and Noiseless data. GECCO ’12 Proceedings 14th Annual Conference on Genetic and Evolutionary Computation. ACM, NY, pp 201–208

  11. Ansorge WJ (2009) Next generation DNA sequencing techniques. New Biotechnol 25(4):167–260

  12. Blazewicz J, Bryja M, Figlerowicz M, Gawron P, Kasprzak M, Kirton E, Platt D, Przybytek J, Swiercz A, Szajkowski L (2009) Whole genome assembly from 454 sequencing output via modified DNA graph concept. Comput Biol Chem 33:224–230

  13. Blum C, Valles MY, Blesa MJ (2008) An ant colony optimization algorithm for DNA sequencing by hybridization. Comput Oper Res 35:3620–3635

  14. Brun Y (2008) Solving NP-complete problems in the tile assembly model. Theor Comput Sci 395:31–46

  15. Butler J, MacCallum I, Kleber M, Shlyakhter IA, Belmonte MK, Lander ES, Nusbaum C, Jaffe DB (2008) Allpaths: De novo assembly of whole-genome shotgun micro reads. Genome Res 18:810–820

  16. Dorigo M, Maniezzo V, Colorni A (1996) Ant system: optimization by a colony of cooperating agents. IEEE Trans Systems Man Cybern Part B 26:29–41

  17. Dorigo M, Stutzle T (2004) Ant Colony Optimization. MIT Press, London

  18. Isakov O, Shomron N (2011) Deep sequencing data analysis: Challenges and solutions. Bioinformatics Trends and Methodologies, Intech, ch 29

  19. Joshi N, Srivastava S, Kumar M, Kavalan J, Karandikar SK, Saraph A (2011) Parallelization of velvet, a de-novo genome sequence assembler. IEEE International Conference on High Performance Computing

  20. Kurniawan TB, Ibrahim Z, Saaid MFM, Yahya A (2008) Implementation of ant system for DNA sequence optimization. NANO-SciTech, Shah Alam

  21. Ma X, Lombardi F (2008) Combinatorial optimization problem in designing DNA self-assembly tile sets. 2008 IEEE International Workshop on Design and Test of Nano Devices, Circuits and Systems, pp 73–76

  22. Medvedev P, Georgiou K, Myers G, Brudno M (2007) Computability models of sequence assembly. Workshop on Algorithms in Bioinformatics, Philadelphia, 289–301

  23. Meksangsouy P, Chaiyaratana N (2003) DNA fragment assembly using an ant colony system algorithm. Proceedings Evolutionary Computation. CEC ’03 3:1756–1763

  24. Miller JR, Koren S, Sutton G (2010) Assembly algorithms for next generation sequencing data. Genomics 95:315–327

  25. Myers G (1999) Whole-genome DNA sequencing. Comput Sci Eng 1:33–43

  26. Myllykangas S, Buenrostro J, Ji HP (2012) Overview of sequencing technology platforms. Bioinformatics for High Throughput Sequencing, 11–25

  27. Quail MA, Smith M, Coupland P, Otto TD, Harris SR, Connor TR, Bertoni A, Swerdlow HP, Yong G (2012) A tale of three next generation sequencing platforms: comparison of ion torrent, pacific biosciences and illumina Miseq sequencers. BMC Genom 13:1471–2164

  28. Scheibye-Alsing K, Hoffmann S, Frankel AM, Jensen P, Stadler PF (2009) Sequence assembly. Comput Biol Chem 33

  29. Stupar M, Vidovi V, Luka D (2011) Functions of human non-coding DNA sequences. Arch Oncol 19:3–4

  30. Treangen TJ, Salzberg SL (2011) Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat Rev Genet 13(1):36–46

  31. Wei L-T, Yang C-B, Ann H-Y, Peng Y-H (2007) Ant colony optimization algorithms for sequence assembly with haplotyping. 6th Conference on Information Technology and Applications in Outlying Islands, Yunlin, Taiwan, 260–268

  32. Zerbino DR, Birney E (2008) Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821–829

  33. Fullwood MJ, Wei C-L, Liu ET (2009) Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses. Genome Res 19:521–532

Corresponding author

Correspondence to Rajat Kumar De.

Cite this article

Baidya, S., De, R.K. A novel locally guided genome reassembling technique using an artificial ant system. Appl Intell 43, 397–411 (2015). https://doi.org/10.1007/s10489-015-0650-5
