Journal of Mathematical Biology

Volume 70, Issue 6, pp 1327–1358

A combinatorial approach to the design of vaccines

  • Luis Martínez
  • Martin Milanič
  • Leire Legarreta
  • Paul Medvedev
  • Iker Malaina
  • Ildefonso M. de la Fuente


We present two new problems of combinatorial optimization and discuss their applications to the computational design of vaccines. In the shortest \(\lambda \)-superstring problem, given a family \(S_1,\ldots ,S_k\) of strings over a finite alphabet, a set \(\mathcal{T}\) of “target” strings over that alphabet, and an integer \(\lambda \), the task is to find a string of minimum length that contains, for each \(i\), at least \(\lambda \) of the target strings that are substrings of \(S_i\). In the shortest \(\lambda \)-cover superstring problem, given a collection \(X_1,\ldots , X_n\) of finite sets of strings over a finite alphabet and an integer \(\lambda \), the task is to find a string of minimum length containing, for each \(i\), at least \(\lambda \) elements of \(X_i\) as substrings. The two problems are polynomially equivalent, and the shortest \(\lambda \)-cover superstring problem is a common generalization of two well-known combinatorial optimization problems: the shortest common superstring problem and the set cover problem. We present two approaches to obtaining exact or approximate solutions to the shortest \(\lambda \)-superstring and \(\lambda \)-cover superstring problems: one based on integer programming, and the other a hill-climbing algorithm. We describe an application to the computational design of vaccines and apply the algorithms to experimental data taken from patients infected by H5N1 and HIV-1.
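To make the two problem statements concrete, the following Python sketch builds a \(\lambda \)-cover instance from a \(\lambda \)-superstring instance by taking \(X_i\) to be the set of target strings that occur in \(S_i\), checks whether a candidate string is feasible, and solves very small instances by exhaustive search. It is only an illustration under our own assumptions: the function names and the toy data are hypothetical, and the exhaustive search is neither the integer programming formulation nor the hill-climbing algorithm referred to in the abstract.

```python
from itertools import combinations, permutations


def restrict_targets(sequences, targets):
    """Build the sets X_i: target strings that are substrings of S_i."""
    return [{t for t in targets if t in s} for s in sequences]


def is_lambda_cover(candidate, sets, lam):
    """True if candidate contains at least lam elements of every X_i."""
    return all(sum(x in candidate for x in X) >= lam for X in sets)


def merge(a, b):
    """Append b to a, reusing the longest suffix of a that is a prefix of b."""
    if b in a:
        return a
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return a + b[k:]
    return a + b


def shortest_superstring(strings):
    """Shortest common superstring by trying every ordering (tiny inputs only)."""
    # Strings contained in another chosen string never need their own slot.
    strings = [s for s in strings if not any(s != t and s in t for t in strings)]
    best = None
    for order in permutations(strings):
        word = ""
        for s in order:
            word = merge(word, s)
        if best is None or len(word) < len(best):
            best = word
    return best


def brute_force_lambda_cover(sets, lam):
    """Exhaustive search over subsets of strings; returns None if infeasible."""
    universe = sorted(set().union(*sets))
    best = None
    for r in range(len(universe) + 1):
        for chosen in combinations(universe, r):
            picked = set(chosen)
            # Keep only subsets that hit every X_i at least lam times.
            if all(len(picked & X) >= lam for X in sets):
                word = shortest_superstring(chosen)
                if best is None or len(word) < len(best):
                    best = word
    return best


# Toy instance (hypothetical data, not taken from the paper).
sequences = ["ABCDE", "CDEFG"]
targets = {"ABC", "CDE", "EFG"}
sets = restrict_targets(sequences, targets)
print(brute_force_lambda_cover(sets, 1))
print(brute_force_lambda_cover(sets, 2))
```

With \(\lambda = 1\) the single shared target already covers both sequences, whereas \(\lambda = 2\) forces all three targets into the solution. The search is exponential in the number of strings, which is why exact methods of this kind are only practical for very small instances.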


Keywords

Vaccine design, Combinatorial optimization, Integer programming, Hill-climbing, Shortest common superstring problem, Set cover problem

Mathematics Subject Classification

68Q25 68W32 90C59 90C90 92C40 92C50 92D20



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Luis Martínez (1, 2)
  • Martin Milanič (3, 4)
  • Leire Legarreta (1, 2)
  • Paul Medvedev (5, 6, 7)
  • Iker Malaina (2, 8)
  • Ildefonso M. de la Fuente (1, 2, 9, 10)

  1. Department of Mathematics, University of the Basque Country UPV/EHU, Bilbao, Spain
  2. Biocruces Health Research Institute I.I.S. Biocruces, Basque Country, Spain
  3. University of Primorska, UP IAM, Koper, Slovenia
  4. University of Primorska, UP FAMNIT, Koper, Slovenia
  5. Department of Computer Science and Engineering, The Pennsylvania State University, State College, USA
  6. Department of Biochemistry and Molecular Biology, The Pennsylvania State University, State College, USA
  7. Genomic Sciences Institute of the Huck, The Pennsylvania State University, State College, USA
  8. Department of Physiology, University of the Basque Country UPV/EHU, Bilbao, Spain
  9. Institute of Parasitology and Biomedicine López-Neyra, CSIC, Granada, Spain
  10. Unit of Biophysics (CSIC, UPV/EHU), and Department of Biochemistry and Molecular Biology, University of the Basque Country, Bilbao, Spain
