On the Use of Principal Component Analysis and Particle Swarm Optimization in Protein Tertiary Structure Prediction

  • Óscar Álvarez
  • Juan Luis Fernández-Martínez
  • Celia Fernández-Brillet
  • Ana Cernea
  • Zulima Fernández-Muñiz
  • Andrzej Kloczkowski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10842)


We discuss the applicability of Principal Component Analysis (PCA) and Particle Swarm Optimization (PSO) to protein tertiary structure prediction. The proposed algorithm establishes a low-dimensional space in which sampling and optimization are carried out with a particle swarm optimizer. The reduced space is obtained by applying PCA to a set of previously found low-energy protein models. A high-frequency term is added to this expansion by projecting the best decoy onto the PCA basis set and computing the residual model. Our results show that PSO improves the energy of the best decoy used in the PCA, provided that an adequate number of PCA terms is considered.
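The workflow summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the quadratic toy energy, the synthetic decoys, and the PSO parameters (inertia 0.7, accelerations 1.5, 30 particles) are all assumptions chosen for illustration, and a plain global-best PSO stands in for whichever PSO variant the paper uses. The sketch builds a PCA basis from an ensemble of decoys, appends the normalized residual of the best decoy as an extra basis vector, and runs PSO over the reduced coordinates.

```python
import numpy as np

def pca_basis(decoys, n_terms):
    """Mean model and the leading n_terms principal directions (as rows)."""
    mean = decoys.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered ensemble.
    _, _, vt = np.linalg.svd(decoys - mean, full_matrices=False)
    return mean, vt[:n_terms]

def add_residual_term(best, mean, basis):
    """Append the normalized residual of the best decoy as an extra basis row."""
    centered = best - mean
    residual = centered - basis.T @ (basis @ centered)
    norm = np.linalg.norm(residual)
    if norm < 1e-12:  # best decoy already lies in the span of the basis
        return basis
    return np.vstack([basis, residual / norm])

def pso_minimize(cost, start, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, spread=1.0, seed=0):
    """Plain global-best PSO; parameter values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    dim = start.size
    pos = start + spread * rng.standard_normal((n_particles, dim))
    pos[0] = start  # one particle starts exactly at the best decoy
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([cost(p) for p in pos])
    g = pbest[np.argmin(pbest_cost)].copy()
    g_cost = float(pbest_cost.min())
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = pos + vel
        costs = np.array([cost(p) for p in pos])
        improved = costs < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = costs[improved]
        if pbest_cost.min() < g_cost:
            g_cost = float(pbest_cost.min())
            g = pbest[np.argmin(pbest_cost)].copy()
    return g, g_cost

def refine_in_reduced_space(decoys, energy, n_terms):
    """PCA + residual term + PSO; returns (best decoy energy, refined energy, model)."""
    energies = np.array([energy(d) for d in decoys])
    best = decoys[np.argmin(energies)]
    mean, basis = pca_basis(decoys, n_terms)
    basis = add_residual_term(best, mean, basis)
    reconstruct = lambda a: mean + a @ basis   # reduced coordinates -> full model
    start = basis @ (best - mean)              # reduced coordinates of the best decoy
    coeffs, refined_energy = pso_minimize(lambda a: energy(reconstruct(a)), start)
    return float(energies.min()), refined_energy, reconstruct(coeffs)

# Toy usage: hypothetical "native" coordinates and a quadratic stand-in energy.
rng = np.random.default_rng(1)
native = rng.standard_normal(60)
decoys = native + 0.5 * rng.standard_normal((40, 60))
toy_energy = lambda x: float(np.sum((x - native) ** 2))
best_e, refined_e, model = refine_in_reduced_space(decoys, toy_energy, n_terms=5)
print(f"best decoy energy: {best_e:.3f}, refined energy: {refined_e:.3f}")
```

Because the residual vector is included in the basis, the best decoy is represented exactly in the reduced space, and seeding one particle at its coordinates means the swarm's best energy cannot be worse than that of the starting decoy. How much it improves depends on the number of PCA terms retained, in line with the observation in the abstract.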


Principal component analysis, Particle swarm optimization, Tertiary protein structure, Conformational sampling, Protein structure refinement



A. K. acknowledges financial support from NSF grant DBI 1661391 and from The Research Institute at Nationwide Children’s Hospital.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Óscar Álvarez (1)
  • Juan Luis Fernández-Martínez (1)
  • Celia Fernández-Brillet (1)
  • Ana Cernea (1)
  • Zulima Fernández-Muñiz (1)
  • Andrzej Kloczkowski (2, 3)
  1. Department of Mathematics, University of Oviedo, Oviedo, Spain
  2. Battelle Center for Mathematical Medicine, Nationwide Children’s Hospital, Columbus, USA
  3. Department of Pediatrics, The Ohio State University, Columbus, USA
