Connectivity Inference in Mass Spectrometry Based Structure Determination

  • Deepesh Agarwal
  • Julio-Cesar Silva Araujo
  • Christelle Caillouet
  • Frederic Cazals
  • David Coudert
  • Stephane Pérennes
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8125)


We consider the following Minimum Connectivity Inference problem (MCI): given vertex sets V_i ⊆ V, i ∈ I, find a graph G = (V, E) minimizing the size of the edge set E, such that the sub-graph of G induced by each V_i is connected. This problem arises in structural biology, when one aims at finding the pairwise contacts between the proteins of a protein assembly, given the lists of proteins involved in sub-complexes. We present four contributions.
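To make the constraint in the problem statement concrete, the following minimal sketch checks whether a candidate edge set is feasible for MCI, that is, whether every V_i induces a connected sub-graph. The function names `induced_connected` and `is_feasible` and the toy subsets are illustrative assumptions, not taken from the paper.

```python
def induced_connected(subset, edges):
    """Return True if the sub-graph induced by `subset` is connected."""
    subset = set(subset)
    if len(subset) <= 1:
        return True
    # Keep only the edges whose two endpoints both lie in the subset.
    adjacency = {v: set() for v in subset}
    for u, v in edges:
        if u in subset and v in subset:
            adjacency[u].add(v)
            adjacency[v].add(u)
    # Depth-first search from an arbitrary vertex of the subset.
    start = next(iter(subset))
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in adjacency[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == subset


def is_feasible(subsets, edges):
    """Return True if every vertex subset induces a connected sub-graph."""
    return all(induced_connected(s, edges) for s in subsets)


# Hypothetical toy instance: three sub-complexes over proteins A, B, C, D.
subsets = [{"A", "B", "C"}, {"B", "C"}, {"C", "D"}]
print(is_feasible(subsets, [("A", "B"), ("B", "C"), ("C", "D")]))
print(is_feasible(subsets, [("A", "B"), ("A", "C"), ("C", "D")]))
```

In the second call the subset {B, C} has no internal edge, so the edge set is rejected even though the whole graph is connected. MCI asks for a feasible edge set of minimum size.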

First, using a reduction from the set cover problem, we establish that the MCI problem is APX-hard. Second, we show how to solve the problem to optimality using a mixed integer linear programming formulation (MILP). Third, we develop a greedy algorithm based on union-find data structures (Greedy), yielding a 2(log_2 |V| + log_2 κ)-approximation, with κ the maximum number of subsets V_i a vertex belongs to. Fourth, application-wise, we use the MILP and the greedy heuristic to solve the aforementioned connectivity inference problem in structural biology. We show that the solutions of MILP and Greedy are more parsimonious with respect to edges than those reported by the algorithm initially developed in biophysics, whose solutions are not assessed in terms of optimality. Since MILP outputs a set of optimal solutions, we introduce the notion of consensus solution. Using assemblies whose pairwise contacts are known exhaustively, we show an almost perfect agreement between the contacts predicted by our algorithms and the experimentally determined ones, especially for consensus solutions.
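As an illustration of how a union-find based greedy heuristic for MCI might look, the sketch below keeps one union-find structure per subset V_i and repeatedly adds the edge that merges components in the largest number of subsets. This selection rule, the class `UnionFind`, and the function `greedy_mci` are assumptions made for the example; the abstract does not spell out the exact rule used by Greedy, and the sketch makes no claim about matching its approximation guarantee.

```python
from itertools import combinations


class UnionFind:
    """Minimal union-find with path halving."""

    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return False
        self.parent[root_x] = root_y
        return True


def greedy_mci(subsets):
    """Greedily pick edges until every subset induces a connected sub-graph."""
    subsets = [set(s) for s in subsets]
    forests = [UnionFind(s) for s in subsets]  # one structure per subset
    vertices = sorted(set().union(*subsets))
    candidates = list(combinations(vertices, 2))
    chosen = []
    while True:
        best_gain, best_edge = 0, None
        for u, v in candidates:
            # Gain: number of subsets in which (u, v) joins two components.
            gain = sum(
                1
                for s, f in zip(subsets, forests)
                if u in s and v in s and f.find(u) != f.find(v)
            )
            if gain > best_gain:
                best_gain, best_edge = gain, (u, v)
        if best_edge is None:
            return chosen  # no edge helps: every subset is connected
        u, v = best_edge
        chosen.append(best_edge)
        for s, f in zip(subsets, forests):
            if u in s and v in s:
                f.union(u, v)


# Hypothetical toy instance: three sub-complexes over proteins A, B, C, D.
print(greedy_mci([{"A", "B", "C"}, {"B", "C"}, {"C", "D"}]))
```

On this toy instance the edge (B, C) is selected first because it serves two subsets at once, which is the kind of sharing that keeps the edge count low. Scanning all vertex pairs at every step is quadratic in |V| per iteration and is kept here only for readability.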


Greedy Algorithm; Mixed Integer Linear Programming; Network Design Problem; Subset Versus; Connectivity Constraint





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Deepesh Agarwal (1)
  • Julio-Cesar Silva Araujo (1, 2)
  • Christelle Caillouet (1, 2)
  • Frederic Cazals (1)
  • David Coudert (1, 2)
  • Stephane Pérennes (1, 2)

  1. INRIA Sophia-Antipolis - Méditerranée, France
  2. CNRS, I3S, UMR 7271, Univ. Nice Sophia Antipolis, Sophia Antipolis, France
