Chemical reaction optimization for solving longest common subsequence problem for multiple string

  • Md. Rafiqul Islam
  • C. M. Khaled Saifullah
  • Zarrin Tasnim Asha
  • Rezoana Ahamed
Methodologies and Application

Abstract

The longest common subsequence (LCS) problem is a well-known NP-hard optimization problem: given a set of strings, find the longest sequence that is a subsequence of every string in the set. In computational biology, sequence alignment is a fundamental technique for measuring the similarity of biological sequences, such as DNA and genome sequences. High sequence similarity often implies structural as well as functional similarity of the molecules and can be used to determine whether (and how) sequences are related. Finding the LCS is one way to measure the similarity of sequences. The problem also has applications in data compression, FPGA circuit minimization, and bioinformatics. Exact algorithms are impractical because they cannot solve the problem in polynomial time for many long strings. Several approximation, heuristic, and metaheuristic methods have been proposed to solve the problem. Chemical reaction optimization (CRO) is a recent metaheuristic that mimics the behavior of chemical reactions to solve optimization problems. In this paper, we propose a CRO-based technique to solve the longest common subsequence problem for multiple strings (MLCS). We redesign the four elementary operators of CRO for the LCS problem; these operators explore the search space both locally and globally. A novel correction method runs after each search operator to ensure that the changes made by the operator leave a valid solution. Both solution quality and execution time were considered in designing the operators and the correction method. The proposed system is therefore robust, efficient, and effective in solving the MLCS problem. Our approach is compared with hyper-heuristic, ant colony optimization, beam ant colony optimization, and memory-bounded anytime algorithms. In terms of the length of the returned common subsequence, the experimental results show that the proposed algorithm gives the same or better results than all the other algorithms in less execution time.
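To make the notion of a valid solution concrete, the following is a minimal sketch of a validity check for a candidate common subsequence, together with a simple greedy repair step. It is an illustration only: the function names (`is_subsequence`, `is_common_subsequence`, `repair`) and the greedy repair rule are assumptions made for this example and are not the correction method described in the paper.

```python
from typing import Sequence


def is_subsequence(candidate: str, text: str) -> bool:
    """Return True if candidate can be obtained from text by deleting characters."""
    it = iter(text)
    # `ch in it` consumes the iterator up to the first match, so order is respected.
    return all(ch in it for ch in candidate)


def is_common_subsequence(candidate: str, strings: Sequence[str]) -> bool:
    """Return True if candidate is a subsequence of every input string."""
    return all(is_subsequence(candidate, s) for s in strings)


def repair(candidate: str, strings: Sequence[str]) -> str:
    """Hypothetical greedy repair (an assumption, not the paper's method).

    Scan the candidate left to right and keep a character only if the
    string kept so far, extended by that character, is still a common
    subsequence of all inputs.
    """
    kept = ""
    for ch in candidate:
        if is_common_subsequence(kept + ch, strings):
            kept += ch
    return kept


if __name__ == "__main__":
    strings = ["ACGTAC", "AGTCAC", "CAGTAC"]
    print(is_common_subsequence("AGTAC", strings))
    print(repair("AXGTTAC", strings))
```

The repair step always returns a valid common subsequence, but not necessarily a longest one; in a metaheuristic such as CRO, a step of this kind would only keep candidates feasible after an operator modifies them.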

Keywords

Algorithm, Chemical reaction optimization, Longest common subsequence, NP-hard, Optimization

Notes

Compliance with Ethical Standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Md. Rafiqul Islam (1)
  • C. M. Khaled Saifullah (1)
  • Zarrin Tasnim Asha (1)
  • Rezoana Ahamed (1)

  1. Khulna University, Khulna, Bangladesh
