Abstract
We consider the following Minimum Connectivity Inference problem (MCI): given vertex sets V_i ⊆ V, i ∈ I, find a graph G = (V, E) minimizing the size of the edge set E, such that the subgraph of G induced by each V_i is connected. This problem arises in structural biology, when one aims at finding the pairwise contacts between the proteins of a protein assembly, given the lists of proteins involved in sub-complexes. We present four contributions.
First, using a reduction from the set cover problem, we establish that the MCI problem is APX-hard. Second, we show how to solve the problem to optimality using a mixed integer linear programming formulation (MILP). Third, we develop a greedy algorithm based on union-find data structures (Greedy), yielding a 2(log_2 |V| + log_2 κ)-approximation, with κ the maximum number of subsets V_i a vertex belongs to. Fourth, application-wise, we use MILP and Greedy to solve the aforementioned connectivity inference problem in structural biology. We show that the solutions of MILP and Greedy are more parsimonious with respect to edges than those reported by the algorithm initially developed in biophysics, whose solutions are not qualified in terms of optimality. Since MILP outputs a set of optimal solutions, we introduce the notion of consensus solution. Using assemblies whose pairwise contacts are known exhaustively, we show an almost perfect agreement between the contacts predicted by our algorithms and the experimentally determined ones, especially for consensus solutions.
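To make the problem statement and the union-find idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not spell out the greedy rule, so the sketch assumes one plausible rule (repeatedly pick the vertex pair that merges two components in the largest number of subsets V_i), and it makes no claim about matching the approximation ratio quoted above. The names `UnionFind`, `is_feasible`, and `greedy_mci` are hypothetical.

```python
from itertools import combinations


class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, items):
        self.parent = {x: x for x in items}
        self.size = {x: 1 for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets of a and b; return True if they were distinct."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def is_feasible(subsets, edges):
    """True if every subset V_i induces a connected subgraph of (V, edges)."""
    for subset in subsets:
        subset = set(subset)
        if not subset:
            continue
        uf = UnionFind(subset)
        components = len(subset)
        for u, v in edges:
            # Only edges with both endpoints in V_i belong to the induced subgraph.
            if u in subset and v in subset and uf.union(u, v):
                components -= 1
        if components > 1:
            return False
    return True


def greedy_mci(subsets):
    """Assumed greedy rule: add the pair that merges components in most subsets."""
    subsets = [set(s) for s in subsets]
    forests = [UnionFind(s) for s in subsets]  # one forest per subset V_i
    # Total number of merges still needed over all subsets.
    remaining = sum(len(s) - 1 for s in subsets if s)
    vertices = sorted(set().union(*subsets))
    edges = []

    def gain(edge):
        u, v = edge
        return sum(
            1
            for s, uf in zip(subsets, forests)
            if u in s and v in s and uf.find(u) != uf.find(v)
        )

    while remaining > 0:
        # While some subset is disconnected, at least one pair has gain >= 1.
        u, v = max(combinations(vertices, 2), key=gain)
        for s, uf in zip(subsets, forests):
            if u in s and v in s and uf.union(u, v):
                remaining -= 1
        edges.append((u, v))
    return edges


if __name__ == "__main__":
    complexes = [{"a", "b", "c"}, {"b", "c", "d"}, {"a", "d"}]
    contacts = greedy_mci(complexes)
    print(contacts, is_feasible(complexes, contacts))
```

The sketch scans all vertex pairs at every step, which is fine for small assemblies but quadratic per iteration; a practical implementation would likely maintain the gains incrementally.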
Keywords
- Greedy Algorithm
- Mixed Integer Linear Programming
- Network Design Problem
- Subset Versus
- Connectivity Constraint
This work has been partially supported by ANR AGAPE and CNPq-Brazil 202049/2012-4.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Agarwal, D., Araujo, JC.S., Caillouet, C., Cazals, F., Coudert, D., Pérennes, S. (2013). Connectivity Inference in Mass Spectrometry Based Structure Determination. In: Bodlaender, H.L., Italiano, G.F. (eds) Algorithms – ESA 2013. ESA 2013. Lecture Notes in Computer Science, vol 8125. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40450-4_25
DOI: https://doi.org/10.1007/978-3-642-40450-4_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40449-8
Online ISBN: 978-3-642-40450-4
eBook Packages: Computer Science (R0)