New Error Measures and Methods for Realizing Protein Graphs from Distance Data

Discrete & Computational Geometry

Abstract

The interval distance geometry problem consists in finding a realization in \(\mathbb {R}^K\) of a simple undirected graph \(G=(V,E)\) with non-negative intervals assigned to the edges in such a way that, for each edge, the Euclidean distance between the realizations of its two endpoints lies within the edge's interval bounds. In this paper, we focus on the application to the conformation of proteins in space, which is a basic step in determining protein function: given interval estimates of some of the inter-atomic distances, find the shape of the protein. Among the different families of methods for accomplishing this task, we look at mathematical programming based methods, which are well suited to dealing with intervals. The basic question we want to answer is: which such method is best for this problem? The most meaningful error measure for evaluating solution quality is the coordinate root mean square deviation. We first introduce a new error measure which addresses a particular feature of protein backbones, namely that many partial reflections also yield acceptable backbones. We then present a set of new and existing quadratic and semidefinite programming formulations of this problem, and a set of new and existing methods for solving these formulations. Finally, we perform a computational evaluation of all the feasible solver \(+\) formulation combinations according to new and existing error measures, finding that the best methodology is a new heuristic method based on multiplicative weights updates.
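
To make the problem statement concrete, the following is a minimal sketch of the feasibility condition only: it checks whether a candidate realization places every pair of adjacent vertices at a Euclidean distance inside the edge's interval. The names `interval_violations` and `is_realization`, the tolerance `tol`, and the toy triangle instance are illustrative assumptions and do not come from the paper; the sketch does not implement any of the formulations, solvers, or error measures the paper evaluates.

```python
import math


def interval_violations(x, intervals):
    """Return a dict mapping each edge to its interval violation.

    x         : dict, vertex -> coordinate tuple in R^K (hypothetical layout)
    intervals : dict, edge (u, v) -> (lower, upper) with 0 <= lower <= upper
    The violation of an edge is the amount by which the Euclidean distance
    between its endpoints falls outside [lower, upper]; it is 0.0 when the
    distance lies within the bounds.
    """
    violations = {}
    for (u, v), (lower, upper) in intervals.items():
        d = math.dist(x[u], x[v])  # Euclidean distance (Python 3.8+)
        violations[(u, v)] = max(lower - d, d - upper, 0.0)
    return violations


def is_realization(x, intervals, tol=1e-9):
    """True if every edge distance lies within its interval, up to tol."""
    return all(v <= tol for v in interval_violations(x, intervals).values())


# Toy instance (assumed for illustration): a triangle in R^2.
intervals = {(0, 1): (0.9, 1.1), (1, 2): (0.9, 1.1), (0, 2): (1.3, 1.5)}

# Right angle at vertex 1: the distance between 0 and 2 is sqrt(2).
x_good = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0)}

# Collinear points: the distance between 0 and 2 is 2, above the bound 1.5.
x_bad = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)}

print(is_realization(x_good, intervals))
print(is_realization(x_bad, intervals))
```

In this toy instance `x_good` satisfies all three intervals, while `x_bad` overshoots the upper bound on edge `(0, 2)`. Checking a given realization is straightforward; the difficulty the abstract refers to lies in finding one.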

Acknowledgements

We are grateful to the Editor-in-Chief for simplifying a technical argument, and to two anonymous referees for helping us improve this paper. The second author (VKK) is supported by a Microsoft Research PhD Fellowship. The third author (CL) is grateful to the Brazilian funding agencies FAPESP and CNPq for financial support. The fourth author (LL) is partly supported by the ANR grant “Bip:Bip” under contract ANR-10-BINF-0003. The fifth author (NM) is grateful to the Brazilian funding agencies FAPERJ and CNPq for financial support.

Author information

Corresponding author

Correspondence to Leo Liberti.

Additional information

Editor in Charge: Kenneth Clarkson

About this article

Cite this article

D’Ambrosio, C., Vu, K., Lavor, C. et al. New Error Measures and Methods for Realizing Protein Graphs from Distance Data. Discrete Comput Geom 57, 371–418 (2017). https://doi.org/10.1007/s00454-016-9846-7

  • DOI: https://doi.org/10.1007/s00454-016-9846-7
