A Novel Hypergraph-Based Genetic Algorithm (HGGA) Built on Unimodular and Anti-homomorphism Properties for DNA Sequencing by Hybridization

Original Research Article

Abstract

Sequencing by hybridization (SBH), the problem of determining the order in which nucleotides occur on a DNA strand, remains an active topic for computational-intelligence research even though next-generation DNA sequencing has come into existence. Over the last decade, many graph-theoretic approaches to DNA sequencing have been reported in the literature. This paper proposes an SBH method that integrates a hypergraph with a genetic algorithm (HGGA) to form a novel analytic technique for reconstructing a DNA sequence from its spectrum. The elements of the spectrum and their relations are represented as a hypergraph, and the unimodular property is applied to ensure that the relations between l-mers are compatible. The hypergraph representation and the unimodular property are combined with a genetic algorithm customized with novel selection and crossover operators, which reduces the computational complexity and accelerates convergence. Once the primary strand has been determined, an anti-homomorphism is invoked to find the reverse complement of the sequence. The proposed algorithm is applied to the GenBank BioServer datasets, and the results demonstrate its efficiency. HGGA is a non-classical algorithm with computationally attractive complexity reductions, reaching \(O(n^{2})\), and improved accuracy, which makes it promising for applications beyond DNA sequencing, such as image processing, task scheduling and big data processing.
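
Two notions used in the abstract can be made concrete with a small sketch. The spectrum of a sequence is taken here to be the set of all its l-mers (length-l substrings) under ideal, error-free hybridization, and a map f on strings is an anti-homomorphism with respect to concatenation when f(xy) = f(y)f(x), a property the reverse complement satisfies. The function names below are hypothetical and the code only illustrates these definitions; it is not the HGGA procedure itself, whose hypergraph and genetic-algorithm steps are not specified in the abstract.

```python
from __future__ import annotations

# Watson-Crick base pairing used to complement a strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def spectrum(sequence: str, l: int) -> set[str]:
    """Return the ideal spectrum: every distinct length-l substring (l-mer).

    Assumption: no hybridization errors (no false positives or negatives).
    """
    return {sequence[i:i + l] for i in range(len(sequence) - l + 1)}


def reverse_complement(sequence: str) -> str:
    """Complement each base and reverse the order of the strand."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


if __name__ == "__main__":
    print(sorted(spectrum("ACGGT", 3)))

    # Anti-homomorphism check: f(x + y) == f(y) + f(x).
    x, y = "ACGG", "TTGA"
    print(reverse_complement(x + y) == reverse_complement(y) + reverse_complement(x))
```

The order reversal in the last check is what distinguishes an anti-homomorphism from an ordinary homomorphism, which would preserve the order of the two factors.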

Keywords

Hypergraph; Genetic algorithm; Unimodular property; Anti-homomorphism; L-mers; Computational complexity

Notes

Acknowledgements

The authors thank the Department of Science and Technology, Fund for Improvement of S&T Infrastructure in Universities and Higher Educational Institutions, Government of India (SR/FST/MSI-107/2015), for financial support. The authors are grateful to SASTRA University, Thanjavur, for providing the infrastructural facilities and academic support to carry out this research work. We also thank the anonymous reviewers who agreed to review this paper and provided valuable suggestions to improve its quality.

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. Discrete Mathematics Research Laboratory, Srinivasa Ramanujan Centre, SASTRA University, Thanjavur, India
  2. School of Computing, SASTRA University, Thanjavur, India
  3. School of Humanities and Sciences, SASTRA University, Thanjavur, India
