Connectivity Inference in Mass Spectrometry Based Structure Determination
We consider the following Minimum Connectivity Inference problem (MCI): given vertex sets V_i ⊆ V, i ∈ I, find a graph G = (V, E) minimizing the size of the edge set E, such that the sub-graph of G induced by each V_i is connected. This problem arises in structural biology, when one aims at finding the pairwise contacts between the proteins of a protein assembly, given the lists of proteins involved in sub-complexes. We present four contributions.
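To make the constraint concrete, the following minimal sketch checks whether a candidate edge set is feasible for MCI, that is, whether every V_i induces a connected sub-graph. The function names `induced_connected` and `is_feasible` and the toy instance are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def induced_connected(edges, subset):
    """Return True if the sub-graph induced by `subset` is connected."""
    subset = set(subset)
    if len(subset) <= 1:
        return True
    # Keep only the edges whose two endpoints both lie in the subset.
    adj = defaultdict(set)
    for u, v in edges:
        if u in subset and v in subset:
            adj[u].add(v)
            adj[v].add(u)
    # Depth-first search from an arbitrary vertex of the subset.
    start = next(iter(subset))
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == subset


def is_feasible(edges, subsets):
    """Return True if every subset V_i induces a connected sub-graph."""
    return all(induced_connected(edges, s) for s in subsets)


# Hypothetical toy instance: three sub-complexes over proteins a, b, c.
subsets = [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}]
print(is_feasible([("a", "b"), ("b", "c")], subsets))  # True
print(is_feasible([("a", "b"), ("a", "c")], subsets))  # False
```

In the second call, the subset {b, c} induces no edge at all, so the edge set is rejected even though the whole graph is connected.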
First, using a reduction from the set cover problem, we establish that the MCI problem is APX-hard. Second, we show how to solve the problem to optimality using a mixed integer linear programming formulation (MILP). Third, we develop a greedy algorithm based on union-find data structures (Greedy), yielding a 2(log_2 |V| + log_2 κ)-approximation, with κ the maximum number of subsets V_i a vertex belongs to. Fourth, application-wise, we use the MILP and the greedy heuristic to solve the aforementioned connectivity inference problem in structural biology. We show that the solutions of MILP and Greedy are more parsimonious with respect to edges than those reported by the algorithm initially developed in biophysics, whose solutions are not characterized in terms of optimality. Since MILP outputs a set of optimal solutions, we introduce the notion of consensus solution. Using assemblies whose pairwise contacts are known exhaustively, we show an almost perfect agreement between the contacts predicted by our algorithms and the experimentally determined ones, especially for consensus solutions.
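The abstract does not spell out the selection rule used by Greedy. As a hedged illustration only, the sketch below keeps one union-find structure per subset V_i and repeatedly picks the edge that merges components in the largest number of subsets. This rule, the class `UnionFind`, the function `greedy_mci`, and the toy instance are all assumptions for illustration; the sketch is not claimed to reproduce the authors' algorithm or its approximation ratio.

```python
from itertools import combinations


class UnionFind:
    """Union-find over a fixed set of items, with path halving."""

    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        self.parent[rx] = ry
        return True


def greedy_mci(subsets):
    """Greedily pick edges until every subset induces a connected sub-graph."""
    subsets = [set(s) for s in subsets]
    forests = [UnionFind(s) for s in subsets]  # one structure per V_i
    vertices = sorted(set().union(*subsets))
    # Total number of component merges still needed over all subsets.
    remaining = sum(len(s) - 1 for s in subsets if s)
    chosen = []
    while remaining > 0:
        best_edge, best_gain = None, 0
        for u, v in combinations(vertices, 2):
            # Gain: number of subsets in which (u, v) joins two components.
            gain = sum(
                1
                for s, uf in zip(subsets, forests)
                if u in s and v in s and uf.find(u) != uf.find(v)
            )
            if gain > best_gain:
                best_edge, best_gain = (u, v), gain
        u, v = best_edge
        for s, uf in zip(subsets, forests):
            if u in s and v in s:
                uf.union(u, v)
        chosen.append(best_edge)
        remaining -= best_gain
    return chosen


# Hypothetical toy instance: three sub-complexes over proteins a, b, c.
subsets = [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}]
print(greedy_mci(subsets))  # [('a', 'b'), ('b', 'c')]
```

Each chosen edge is applied to every subset containing both endpoints, so the union-find components of a subset always coincide with the connected components of the sub-graph it induces. The loop terminates because, while some subset is still disconnected, at least one vertex pair has positive gain.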
Keywords: Greedy Algorithm, Mixed Integer Linear Programming, Network Design Problem, Subset Versus Connectivity Constraint