
A Novel Multi-objectivisation Approach for Optimising the Protein Inverse Folding Problem

  • Sune S. Nielsen
  • Grégoire Danoy
  • Wiktor Jurkowski
  • Juan Luis Jiménez Laredo
  • Reinhard Schneider
  • El-Ghazali Talbi
  • Pascal Bouvry
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9028)

Abstract

In biology, protein structure prediction is of continued interest, not only for charting the molecular map of the living cell, but also for designing proteins with new functions. The Inverse Folding Problem (IFP) is an important research problem in itself, and it also lies at the heart of most rational protein design approaches. In brief, the IFP consists of finding sequences that will fold into a given structure, rather than determining the structure for a given sequence as in conventional structure prediction. In this work we present a Multi-Objective Genetic Algorithm (MOGA) using the diversity-as-objective (DAO) variant of multi-objectivisation to optimise secondary structure similarity and sequence diversity at the same time, hence pushing the search further into widely spread areas of the sequence solution space. To control the high diversity generated by the DAO approach, we add a novel Quantile Constraint (QC) mechanism that discards an adjustable worst quantile of the population. This DAO-QC approach can efficiently emphasise exploitation rather than exploration to a selectable degree, achieving a trade-off that produces both better and more diverse sequences than the standard Genetic Algorithm (GA). To validate the final results, a subset of the best sequences was selected for tertiary structure prediction. Superposition with the original protein structure demonstrated that meaningful sequences are generated, underlining the potential of this work.
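As a rough illustration of the DAO-QC idea described in the abstract, the following minimal Python sketch pairs a quality score with a diversity score and discards a worst quantile of the population. The abstract does not state how diversity is measured or which score the quantile is taken over, so the mean Hamming distance, the "higher score is better" convention, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def diversity(index: int, population: Sequence[str]) -> float:
    """Mean Hamming distance from one sequence to all others.

    Assumed diversity measure; the abstract does not specify one.
    """
    others = [s for j, s in enumerate(population) if j != index]
    if not others:
        return 0.0
    return sum(hamming(population[index], s) for s in others) / len(others)


def quantile_constraint(
    population: Sequence[str], scores: Sequence[float], q: float
) -> Tuple[List[str], List[float]]:
    """Discard the worst fraction q of the population by score.

    Assumes a higher score is better (e.g. secondary structure similarity).
    The original order of the surviving individuals is preserved.
    """
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    n_drop = int(q * len(population))
    ranked = sorted(range(len(population)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[: len(population) - n_drop])
    return [population[i] for i in keep], [scores[i] for i in keep]


def objectives(
    population: Sequence[str], scores: Sequence[float]
) -> List[Tuple[float, float]]:
    """Pair each individual's score with its diversity (both to be maximised)."""
    return [(scores[i], diversity(i, population)) for i in range(len(population))]
```

For example, calling `quantile_constraint(["AAAA", "AAAT", "TTTT", "GGGG"], [0.9, 0.8, 0.2, 0.5], 0.25)` drops the lowest-scoring sequence, and `objectives` on the survivors yields the two values a multi-objective selection step would then compare. Raising `q` shifts the balance towards exploitation, which is the adjustable trade-off the abstract refers to.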

Keywords

Inverse Folding Problem, Protein design, Genetic Algorithm, Multi-objectivisation

Notes

Acknowledgments

This work was funded by the National Research Fund of Luxembourg (FNR) as part of the EVOPERF project at the University of Luxembourg, under AFR contract no. 1356145. Experiments were carried out using the HPC facility of the University of Luxembourg [23].

References

  1. Alba, E., Dorronsoro, B.: The exploration/exploitation tradeoff in dynamic cellular genetic algorithms. IEEE Trans. Evol. Comput. 9(2), 126–142 (2005)
  2. Bowie, J.U., Lüthy, R., Eisenberg, D.: A method to identify protein sequences that fold into a known three-dimensional structure. Science (New York, N.Y.) 253(5016), 164–170 (1991)
  3. De Jong, K.A.: Analysis of the behavior of a class of genetic adaptive systems. Ph.D. thesis, University of Michigan, Ann Arbor, MI, USA (1975)
  4. Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II. Lect. Notes Comput. Sci. 1917, 849–858 (2000)
  5. Deb, K., Saha, A.: Finding multiple solutions for multimodal optimization problems using a multi-objective evolutionary approach. In: Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation, pp. 447–454. ACM (2010)
  6. DeLano, W.L.: The PyMOL Molecular Graphics System. DeLano Scientific, San Carlos, CA, USA (2002)
  7. Goldberg, D.E., Richardson, J.: Genetic algorithms with sharing for multimodal function optimization. In: Genetic Algorithms and Their Applications: Proceedings of the Second International Conference on Genetic Algorithms, pp. 41–49 (1987)
  8. Gutte, B., Däumigen, M., Wittschieber, E.A.: Design, synthesis and characterisation of a 34-residue polypeptide that interacts with nucleic acids. Nature 281(5733), 650–655 (1979)
  9. Jones, D.T.: De novo protein design using pairwise potentials and a genetic algorithm. Protein Sci. 3, 567–574 (1994)
  10. Kabsch, W., Sander, C.: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22(12), 2577–2637 (1983)
  11. Klein, F., Mouquet, H., Dosenovic, P., Scheid, J.F., Scharf, L., Nussenzweig, M.C.: Antibodies in HIV-1 vaccine development and therapy. Science (New York, N.Y.) 341(6151), 1199–1204 (2013)
  12. Klepeis, J.L., Floudas, C.A., Morikis, D., Tsokos, C.G., Lambris, J.D.: Design of peptide analogues with improved activity using a novel de novo protein design approach. Ind. Eng. Chem. Res. 43(14), 3817–3826 (2004)
  13. Jiménez Laredo, J.L., Nielsen, S.S., Danoy, G., Bouvry, P., Fernandes, C.M.: Cooperative selection: improving tournament selection via altruism. In: Blum, C., Ochoa, G. (eds.) EvoCOP 2014. LNCS, vol. 8600, pp. 85–96. Springer, Heidelberg (2014)
  14. Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., Higgins, D.G.: Clustal W and Clustal X version 2.0. Bioinformatics 23(21), 2947–2948 (2007)
  15. Mitra, P., Shultis, D., Brender, J.R., Czajka, J., Marsh, D., Gray, F., Cierpicki, T., Zhang, Y.: An evolution-based approach to de novo protein design and case study on Mycobacterium tuberculosis. PLoS Comput. Biol. 9(10), e1003298 (2013)
  16. Pabo, C.: Molecular technology: designing proteins and peptides. Nature 301(5897), 200 (1983)
  17. Ponder, J.W., Richards, F.M.: Tertiary templates for proteins: use of packing criteria in the enumeration of allowed sequences for different structural classes. J. Mol. Biol. 193(4), 775–791 (1987)
  18. Rost, B., Sander, C.: Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 19(1), 55–72 (1994)
  19. Roy, A., Kucukural, A., Zhang, Y.: I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protoc. 5(4), 725–738 (2010)
  20. Shimodaira, H.: DCGA: a diversity control oriented genetic algorithm. In: ICTAI, pp. 367–374 (1997)
  21. Smadbeck, J., Peterson, M.B., Khoury, G.A., Taylor, M.S., Floudas, C.A.: Protein WISDOM: a workbench for in silico de novo design of biomolecules. J. Vis. Exp. 77, e50476 (2013)
  22. Toffolo, A., Benini, E.: Genetic diversity as an objective in multi-objective evolutionary algorithms. Evol. Comput. 11(2), 151–167 (2003)
  23. Varrette, S., Bouvry, P., Cartiaux, H., Georgatos, F.: Management of an academic HPC cluster: the UL experience. In: Proceedings of the 2014 International Conference on High Performance Computing & Simulation (HPCS 2014), Bologna, Italy. IEEE, July 2014
  24. Wessing, S., Preuss, M., Rudolph, G.: Niching by multiobjectivization with neighbor information: trade-offs and benefits. In: 2013 IEEE Congress on Evolutionary Computation (CEC), pp. 103–110. IEEE (2013)
  25. Wilcoxon, F.: Individual comparisons by ranking methods. Biometrics Bull. 1(6), 80–83 (1945)
  26. Zemla, A.: LGA: a method for finding 3D similarities in protein structures. Nucleic Acids Res. 31(13), 3370–3374 (2003)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Sune S. Nielsen (1)
  • Grégoire Danoy (1)
  • Wiktor Jurkowski (3)
  • Juan Luis Jiménez Laredo (4)
  • Reinhard Schneider (2)
  • El-Ghazali Talbi (5)
  • Pascal Bouvry (1)

  1. FSTC, University of Luxembourg, Walferdange, Luxembourg
  2. LCSB, University of Luxembourg, Walferdange, Luxembourg
  3. TGAC, Norwich Research Park, Norwich, UK
  4. LITIS, Université du Havre, Le Havre, France
  5. INRIA Lille, Nord Europe Research Centre, Lille, France
