Abstract
Proteins are highly dynamic entities attaining a myriad of different conformations. Protein side chains change their states during dynamics, causing clashes that are propagated to distal sites. A convenient formalism to analyze protein dynamics is based on network theory using Protein Structure Networks (PSNs). Despite their broad applicability, few efforts have been devoted to benchmarking PSN methods and to providing the community with best practices. In many applications, it is convenient to use the centers of mass of the side chains as nodes. It thus becomes critical to evaluate the minimal distance cutoff between the centers of mass which will provide stable network properties. Moreover, when the PSN is derived from a structural ensemble collected by molecular dynamics (MD), the impact of the MD force field has to be evaluated. We selected a dataset of proteins with different folds and sizes and assessed the two fundamental properties of the PSN, i.e. hubs and connected components. We identified an optimal cutoff of 5 Å that is robust to changes in the force field and the proteins. Our study builds solid foundations for the harmonization and standardization of the PSN approach.
Introduction
Proteins are complex and highly dynamic entities attaining a myriad of different conformations in solution1,2,3,4,5 that are often related to the protein function. Indeed, they can resemble bound states to a biological partner6,7,8,9,10, active states of enzymes11,12,13,14, or conformations that are stabilized by a post-translational modification (PTM)6, 11, as well as altered by a disease-related mutation15.
An interesting property of proteins is that a perturbation (e.g. a binding event, a mutation or a PTM) occurring at a certain site of the structure can be transmitted over long distances to another location16,17,18,19. This long-range communication is often related to allostery and may affect critical distal sites for protein function.
At the atom-level, the perturbation from one protein site to a distal one can be propagated by a cascade of collisional clashes between residue side chains, which undergo changes of their rotameric states during protein dynamics19, 20. Local rearrangements occurring in the intramolecular contacts during the protein dynamics are thus at the base of this long-range communication19.
A convenient formalism to unravel the complexity behind long-range structural communication in proteins is the application of network theory to protein structure, i.e. the so-called Protein Structure Networks (PSNs). In a PSN, the protein residues become the nodes of the network connected by edges which can, for example, be described as the contact strength between each pair of residues20,21,22,23,24,25,26,27,28,29,30. Networks indeed are proper tools to link the local to global perturbations occurring during protein dynamics since they are by definition mediators of communication from local to global scales19.
Nowadays, PSN-based strategies are widely used in structural biology, and a plethora of different methodologies has been proposed25,26,27,28, 31,32,33,34,35,36,37. PSN approaches are often integrated with the dynamic description of proteins that all-atom molecular dynamics (MD) simulations or other sampling methods provide21, 31, 38,39,40,41,42,43,44,45.
Despite their broad applicability, few efforts have been devoted so far to the benchmarking of PSN and PSN-MD methods, to define best practices in the field and to ultimately provide the community with clear rules to determine PSN optimal parameters. The definition of arbitrary cutoffs is one of the major weaknesses of contact-based networks applied to protein structure and dynamics46, 47. As previously shown, many options are available to select suitable distance cutoffs for the prediction of residue contacts in protein structures47. Alternative solutions exist, i.e. using different principles for edge and weight definition such as energies or correlated motions. Nevertheless, a contact-based approach is still valuable especially if we consider the major advances that techniques such as atomistic biomolecular simulations have achieved in the last decade48, 49. Indeed, MD simulations have now reached high accuracy in describing conformational changes even at the side-chain levels and occurring on different time scales, as attested by the agreement with experimental observables4, 50,51,52,53.
In many PSN-MD applications, it is convenient to use the centers of mass of residue side chains as PSN nodes, the distance between the centers of mass for edge definition and their occurrence as weight20, 31, 41, 54, 55. It becomes, thus, critical to evaluate the minimal distance cutoff between the centers of mass of two residues to include an edge in the PSN. Moreover, when the PSN is derived from a structural ensemble collected by MD simulations and not from experimental structures, it is mandatory to evaluate the impact of the physical model (i.e. force field) on the PSN parameters.
We selected a dataset of proteins with different architectures and sizes and assessed the distribution of the two fundamental properties of a PSN, i.e. the hubs and the connected components. We also evaluated the influence of the force field selection on the PSN parameters, and we propose an optimal distance cutoff for PSNs based on distances between the centers of mass of protein residues. The cutoff identified here is robust regardless of the protein size, the fold, and the MD force field employed. Our study builds strong foundations toward the harmonization and standardization of PSN strategies, and provides a framework that can also be applied more generally to the choice of parameters for other PSN-based approaches.
Results and Discussion
Selected protein structures for PSN-MD analyses
We selected four different three-dimensional (3D) structures of monomeric proteins of various size and fold (Fig. 1) and four different force fields (Table S1). In particular, we chose state-of-the-art physical models from each of the most used force-field families for MD simulations of proteins, i.e. CHARMM (CHARMM22*56 and CHARMM3657), AMBER (Amber99SB*-ILDN58, 59) and GROMOS (GROMOS54a760). We carried out the MD simulations in explicit solvent for one μs so that they could reflect the MD sampling that is employed for PSN-MD studies40, 54. For each MD ensemble, PSN based on distances between the side-chain centers of mass have been calculated as detailed in the Materials and Methods.
A distance cutoff of 5 Å allows a robust description of PSN properties independently of the protein and the MD force field employed
The choice of the distance cutoff is essential for the PSN definition. Indeed, the distance cutoff determines whether a contact between two side chains is included as a link of the network, ultimately affecting the network topology. When the distance is calculated between the centers of mass of the residue side chains, the choice of the cutoff becomes even more critical. Indeed, we cannot arbitrarily assume that the distances commonly used in structural biology to define an interaction between two amino acids - such as 4 or 4.5 Å - are valid. The issue becomes even more pressing when a PSN is derived from an MD ensemble where each force field relies on different atomic masses.
The two most important properties of a PSN, which ultimately dictate how distant regions of the PSN are linked, are the so-called hubs and connected components (also known as clusters of nodes) (Fig. 2).
Hubs are nodes that have a high degree of connectivity in a network. The highest degree of residue hubs is limited by steric constraints and it could vary from three to ten in PSN27. Protein structures are known to be made up of a significant number of strongly and weakly interacting residue hubs that stabilize the tertiary structure of the protein and provide resilience against random mutations19, 27.
A robust PSN should feature a certain amount of hub residues that have a node degree of at least three (i.e. connected with three or more other nodes by an edge in the PSN) and it should be composed of multiple connected components which are not too fragmented. Cluster fragmentation is particularly critical in the PSN definition. We and others showed that central parameters that influence the size of the connected components are the \(p_{crit}\)31, 42 or \(I_{crit}\)40, 61, 62, depending on the methods used for PSN construction. Indeed, edges that have extremely low weights would increase the noise and connect all the clusters into a single one. Conversely, if only high weights are retained, only sparsely populated and highly fragmented clusters will be observed with a minimal number of communication paths between distal regions.
In a PSN approach based on side-chain-side-chain contacts, the distance cutoffs used can affect the network in a similar way. Indeed, if a distance that is too short and restrictive is chosen, the network will appear as very fragmented with small separated clusters and few or virtually no hubs. If the distance is too long, each residue of the network will be connected, resulting in a single cluster that embraces the entire network. It is thus critical to find an optimal distance cutoff.
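The two properties discussed above can be illustrated with a few lines of Python. This is a minimal sketch and not the PyInteraph implementation: the function names, the toy residue labels and the edge list are hypothetical, and only the hub threshold of three follows the definition given in the text.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Return an undirected adjacency map from an iterable of (node, node) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def find_hubs(adjacency, min_degree=3):
    """Map each node with degree >= min_degree to its degree."""
    return {node: len(nbrs) for node, nbrs in adjacency.items()
            if len(nbrs) >= min_degree}


def connected_components(adjacency, nodes):
    """Connected components (largest first) found by breadth-first search.

    `nodes` lists every residue, so isolated residues form singleton components.
    """
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        queue, component = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for nbr in adjacency.get(node, ()):
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical toy network (labels and edges are illustrative only).
residues = ["A1", "L2", "V3", "F4", "K5", "E6", "W7"]
edges = [("A1", "L2"), ("A1", "V3"), ("A1", "F4"), ("L2", "V3"), ("K5", "E6")]

adjacency = build_adjacency(edges)
print(find_hubs(adjacency))
print([sorted(c) for c in connected_components(adjacency, residues)])
```

In this toy case only A1 reaches a degree of three, and the network splits into one four-residue cluster, one pair and one isolated residue, which is the kind of fragmentation a too-short cutoff would be expected to produce.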
Moreover, since the PSN-MD approaches, as the one here employed, generally rely on extracting an average and static PSN from an MD trajectory, it becomes fundamental to assess the convergence of hubs and connected components over the simulation time.
We thus evaluated: (i) the convergence of hubs and connected components in PSNs derived from MD simulations using a Jackknife approach (see Materials and Methods) and (ii) the distribution of hubs and connected components at different distance cutoffs (Figs 3 and 4, Fig. S1). In an attempt to harmonize the PSN protocol and to allow reproducibility of the analyses, we also implemented a Python-based pipeline (PyInKnife.py) to automate the steps described above, which can be used free of charge (see Materials and Methods for details).
At first, we evaluated whether hubs and connected components are stable properties in the MD ensembles here collected (Figs 3 and 4, Fig. S1). With regard to the distance cutoff, we identified common trends in the distribution of hubs and connected components, independently of the protein under investigation and the force field employed in the simulations. Indeed, in all cases distance cutoffs lower than 5 Å resulted in a minimal number of hubs (fewer than four hub residues) where the connection degree was smaller than three (Fig. 3). On the contrary, distance cutoffs higher than 5 Å showed only one large cluster accounting for most of the protein residues (Fig. 4), indicating that 5 Å is the most appropriate cutoff for a PSN-MD where the contacts are calculated as distances between the centers of mass of residue side chains.
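A cutoff scan of this kind can be sketched for a single set of coordinates as follows. The sketch is illustrative only: `scan_cutoffs` and the toy grid are hypothetical, a contact is assumed to exist when the distance is less than or equal to the cutoff, and no persistence filtering over an MD ensemble is applied.

```python
import math


def _find(parent, i):
    """Root of node i in a union-find forest (with path halving)."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def scan_cutoffs(coords, cutoffs, min_degree=3):
    """For each cutoff (Å) return (hub count, component count, largest component size)."""
    n = len(coords)
    summary = {}
    for cutoff in cutoffs:
        degree = [0] * n
        parent = list(range(n))
        for i in range(n):
            for j in range(i + 1, n):
                if math.dist(coords[i], coords[j]) <= cutoff:
                    degree[i] += 1
                    degree[j] += 1
                    parent[_find(parent, i)] = _find(parent, j)  # merge clusters
        sizes = {}
        for i in range(n):
            root = _find(parent, i)
            sizes[root] = sizes.get(root, 0) + 1
        hubs = sum(1 for d in degree if d >= min_degree)
        summary[cutoff] = (hubs, len(sizes), max(sizes.values()))
    return summary


# Toy input: a 3 x 3 grid of points spaced 4.5 Å apart (not real protein data).
grid = [(4.5 * i, 4.5 * j, 0.0) for i in range(3) for j in range(3)]
for cutoff, (hubs, n_components, largest) in scan_cutoffs(grid, [4.0, 5.0, 6.5]).items():
    print(cutoff, hubs, n_components, largest)
```

On the toy grid the shortest cutoff leaves every node isolated with no hubs, whereas the longer ones merge everything into one cluster. The numbers come from the grid geometry and are not meant to reproduce the protein results reported here.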
Localization of hubs and connected components on the 3D structure is conserved using the 5 Å distance cutoff
The 5 Å distance cutoff allows for similar general features of the PSN of the same protein described by different force fields (Figs 3 and 4). Although this result is encouraging, we need to take into consideration that PSNs are employed to achieve residue-level details in structural biology. PSNs are used to identify the localization of the hub residues, the specific residues that belong to the same cluster or even the paths of communication between distal residues and their intermediate nodes. These are all important PSN properties that can, for example, be altered by interactions with biological partners6, 40, 63 or mutations21, 40, 42, 51, 64. It is thus not enough to observe that the PSN description is robust regarding the overall distribution of hubs and connected components. Indeed, the PSNs collected for the same protein with the 5 Å distance cutoff, but using different MD force fields, might differ in the localization of hub residues in the 3D structure or in the individual residues that belong to the same cluster without affecting the total number of hubs and connected components. The same observation holds for the localization of hubs and connected components when the entire MD trajectory is compared to the resampled MD trajectories collected from the Jackknife approach.
We thus compared the hubs and connected components at the residue-level as derived by the PSN analyses of the entire MD trajectories or of the resampled MD trajectories obtained with the Jackknife procedure (see Materials and Methods). The analyses showed a reasonable convergence of hubs and connected components also at the residue-level with only minor discrepancies among the PSN calculated from the entire MD trajectory and few of the resampled trajectories (Figs S2 and S3).
Moreover, we analyzed the hub localization and their degree in the MD simulations of CypA where different force fields have been used (Figs 5 and 6A). We noticed that the localization of the hubs appears to be equally distributed on the 3D structure coming from different force fields, apart from minor changes in their node degree. Similar results were obtained for Trx using CHARMM22* and GROMOS54a7 force fields.
In parallel, we also mapped the five most populated connected components onto the CypA sequence and 3D structure (Figs 6B and 7). The composition and distribution of the clusters are different only in CHARMM36 simulations. This apparent difference is only due to a splitting of the connected component number 1 into three smaller clusters, as well as to a different localization of the 5th cluster (i.e. the smallest one). Only subtle differences have been observed for Amber99SB*-ILDN and CHARMM22*, suggesting a robust description of the connected components with these two force fields, as also found in a recent PSN study of a dimer54.
Conclusions
In the protein world, a perturbation occurring at a certain site of the protein structure can be transmitted over long distances to another site. These structural rearrangements can be propagated by a cascade of changes in the conformational states of the residue side chains. Local changes occurring in the residue-residue contacts during the protein dynamics are thus at the base of this long-range communication. Network theory is a suitable formalism to analyze protein structures and to identify the paths of residues that can transmit the structural changes over long distances. In this context, a plethora of different approaches to define a PSN has been developed, often integrated with molecular dynamics simulations to account for the protein dynamics.
Despite the broad application of these methods, the community is missing clear rules and a solid framework to define the PSN parameters. It becomes thus critical to evaluate the minimal distance cutoff that can be used to include an edge in the PSN and that provides stable network properties, as well as the influence of the physical model used to describe the protein in the simulations.
Indeed, there are no consolidated and uniform protocols in the PSN-MD field, especially when the edges are defined according to the distance between the centers of mass of protein side chains. Moreover, most of the PSN approaches have been optimized using datasets of static experimental structures from the Protein Data Bank. A careful evaluation of the PSN parameters in an MD ensemble of structures has rarely been carried out. PSN parameters that are optimal for the network analyses of experimental crystallographic structures are not necessarily suitable for the analysis of an MD ensemble, as recently pointed out40. Most of the publications in which a PSN was calculated using the PyInteraph suite of tools, for example, employ very different distance cutoffs.
We thus selected a dataset of proteins to use as model systems to assess important PSN properties as a function of different distance cutoffs and physical models. In particular, we focused on two fundamental properties of the PSN, i.e. the hubs and the connected components. We identified an optimal value for the distance cutoff (5 Å) that is robust to changes in the MD force field and applicable to proteins with different sizes or folds. Our study provides a general framework to select PSN parameters and to improve reproducibility of the results thanks to a free-of-charge Python-based pipeline, PyInKnife. We here built the foundations toward the harmonization and standardization of the PSN-MD approach.
Materials and Methods
Molecular dynamics simulations
We performed explicit solvent MD simulations using the GROMACS software (version 4.6)66 with different force fields and solvent models. A summary of the starting structures, protein size, force fields and solvent models used in this study is reported in Table S1. The MD simulation of Dri ARID domain has been published before40 and is here employed for the analyses. 500-ns simulations of CypA have been published before51 and we here extended them to achieve one μs of sampling. We collected the remaining simulations for the first time in this study at 300 K and 1 bar in the NVT ensemble with 150 mM of NaCl. We employed periodic boundary conditions and we set a distance equal to or greater than 1.8 Å between the protein atoms and the box edges of a dodecahedral box of water molecules. Preparation steps have been carried out according to a protocol recently applied to other proteins67. We applied a 2-fs time step and the LINCS algorithm68, as well as the Particle-Mesh Ewald (PME) summation scheme69 to treat long-range electrostatic interactions. Van der Waals and short-range Coulomb interactions were truncated at 9 Å and conformations were stored every 10 ps. We carried out production MD simulations for one μs.
We calculated the minimal distance between each protein and its image to rule out artifacts due to periodic boundary conditions and artificial contacts between the protein and the corresponding image.
PSN definition
We used the PyInteraph suite of tools31 to construct a PSN-MD based on side-chain contacts using all the residues except for glycines. The contacts are defined as distances between the centers of mass of side chains on the basis of the atomic mass files provided by PyInteraph. Different distance cutoffs have been assessed in this study in the range of 4–6 Å (see below) to include a certain contact as an edge of the network. Moreover, to derive a weighted network, the persistence of the contact in each MD ensemble was measured and a \(p_{crit}\) of 20% was employed to filter out meaningless interactions and to maintain the network structure, in agreement with previous applications of the same method31, 42, 70. We also used the xPyder plugin65 for PyMOL to map the PSN connected components on the 3D structure.
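The construction described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the PyInteraph code: the function names, residue labels and coordinates are hypothetical, a contact is counted when the distance is less than or equal to the cutoff, an edge is kept when its persistence is greater than or equal to the threshold, and the exclusion of glycines is left to the caller.

```python
import math
from itertools import combinations


def center_of_mass(atoms):
    """Mass-weighted center of a list of (mass, (x, y, z)) tuples."""
    total = sum(mass for mass, _ in atoms)
    return tuple(sum(mass * xyz[k] for mass, xyz in atoms) / total for k in range(3))


def contact_persistence(frames, cutoff):
    """Percentage of frames in which each residue pair lies within `cutoff` Å.

    `frames` is a list of dicts mapping residue label -> side-chain center of mass.
    """
    counts = {}
    for frame in frames:
        for a, b in combinations(sorted(frame), 2):
            if math.dist(frame[a], frame[b]) <= cutoff:
                counts[(a, b)] = counts.get((a, b), 0) + 1
    return {pair: 100.0 * c / len(frames) for pair, c in counts.items()}


def filter_edges(persistence, p_crit=20.0):
    """Keep only the contacts whose persistence (in %) reaches p_crit."""
    return {pair: w for pair, w in persistence.items() if w >= p_crit}


# Hypothetical ten-frame ensemble with three side chains (coordinates in Å).
frames = []
for t in range(10):
    phe_x = 9.0 if t == 0 else 12.0  # PHE40 approaches VAL25 in the first frame only
    frames.append({"LEU10": (0.0, 0.0, 0.0),
                   "VAL25": (4.8, 0.0, 0.0),
                   "PHE40": (phe_x, 0.0, 0.0)})

persistence = contact_persistence(frames, cutoff=5.0)
print(persistence)
print(filter_edges(persistence, p_crit=20.0))
```

In this toy ensemble the LEU10-VAL25 contact is present in every frame and survives the filter, while the PHE40-VAL25 contact occurs in one frame out of ten and is discarded as a transient interaction.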
The PyInKnife pipeline
We developed a Python-based pipeline (which is available free of charge at https://github.com/ELELAB/PyInKnife) called PyInKnife in order to: (i) automate the pre-processing of the trajectories for PSN analyses, (ii) subset the trajectories into shorter trajectory files that each retain 90% of the frames (see below), (iii) run the different steps of PyInteraph on each trajectory subset and using different distance cutoffs, including the creation of the PSN, calculation of hubs and connected components and their distribution, and (iv) generate a final report with publication-ready plots and figures. The pipeline is illustrated in Fig. 8.
PyInKnife requires the pre-processing of the MD trajectory to remove artefacts due to the periodic boundary conditions and to extract a reference structure along with the topology required for the PSN calculations. The pre-processing is carried out by three different GROMACS tools (www.Gromacs.org): make_ndx, trjconv and editconf. These tools allow us to generate the index file, convert and manipulate the trajectories and structures, respectively.
PyInKnife can also be used on trajectories obtained with other simulation packages, such as Amber, CHARMM and NAMD, after conversion of the MD trajectory to the GROMACS format (.xtc or .trr file). This can be achieved with several tools such as WORDOM71, the MDAnalysis package72 and the Catdcd plugin (http://www.ks.uiuc.edu/Development/MDTools/catdcd/). The user can employ the GROMACS tool editconf to convert the PDB file of the starting structure, or one frame extracted from the trajectory, into the file format required by PyInteraph (GROMACS .gro file).
PyInKnife automates the analyses of contact-based PSNs, hydrophobic interactions, and hydrogen bond networks implemented in PyInteraph. The user can specify from the command line the PyInteraph atomic mass databases, the distance cutoff values to be tested and other PSN parameters.
After the PSN for each MD trajectory is obtained, it is possible to calculate with PyInKnife the hubs and connected components for each class of interactions by using the graph_analysis tool of the PyInteraph suite.
PyInKnife also implements a pipeline to evaluate the convergence of the two most important PSN properties, i.e. hubs and connected components, in the MD trajectory. We used the Jackknife resampling method73 to calculate the deviation from resampled trajectories in which 10% of the simulation frames have been discarded at regular intervals. The resampled trajectories are calculated using the GROMACS tool trjcat. The procedure is illustrated in Fig. 9.
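One way to picture the resampling is as a set of frame-index lists, each missing a different tenth of the trajectory. The sketch below is an assumption about the scheme, not the PyInKnife code: it treats the trajectory as ten consecutive blocks and drops one block per resampling, and the function name `jackknife_frame_sets` is hypothetical.

```python
def jackknife_frame_sets(n_frames, n_resamplings=10):
    """Return one list of retained frame indices per resampling.

    The trajectory is split into `n_resamplings` consecutive blocks and each
    resampling omits exactly one block, keeping the remaining frames in order.
    """
    if not 1 < n_resamplings <= n_frames:
        raise ValueError("need 1 < n_resamplings <= n_frames")
    # Block boundaries computed with integer arithmetic to avoid rounding issues.
    bounds = [(k * n_frames) // n_resamplings for k in range(n_resamplings + 1)]
    subsets = []
    for k in range(n_resamplings):
        start, stop = bounds[k], bounds[k + 1]
        subsets.append(list(range(0, start)) + list(range(stop, n_frames)))
    return subsets


# Example with a hypothetical 100-frame trajectory.
subsets = jackknife_frame_sets(100)
print(len(subsets), len(subsets[0]))
print(subsets[1][:12])
```

Each index list could then be used to write a reduced trajectory on which the PSN, hubs and connected components are recalculated.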
PyInKnife also includes R-based scripts to plot the results and produce publication-ready figures. To use the plotting R scripts, the R packages ggplot, ggplot2 and lattice are required.
The Jackknife standard error is calculated as
\[S{E}_{jack}=\sqrt{\frac{n-1}{n}\sum _{i=1}^{n}{({\hat{\theta }}_{i}-{\hat{\theta }}_{(.)})}^{2}}\]
where n is the number of resampled trajectories (10 as default), \({\hat{\theta }}_{i}\) is the estimator of the ith resampled trajectory and \({\hat{\theta }}_{(.)}\) is the empirical average of the estimator on the resampled trajectories.
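As a worked example, the standard error can be computed from the per-resampling estimates with a short function. The sketch assumes the usual jackknife form, i.e. the square root of (n - 1)/n times the sum of squared deviations of the estimates from their average; the function name and the hub counts are hypothetical and serve only as an illustration.

```python
import math


def jackknife_standard_error(estimates):
    """Jackknife standard error of a list of per-resampling estimates."""
    n = len(estimates)
    if n < 2:
        raise ValueError("at least two resampled estimates are required")
    mean = sum(estimates) / n  # empirical average over the resampled trajectories
    squared_deviations = sum((value - mean) ** 2 for value in estimates)
    return math.sqrt((n - 1) / n * squared_deviations)


# Hypothetical number of hubs found in each of ten resampled trajectories.
hub_counts = [12, 13, 12, 11, 12, 13, 12, 12, 11, 12]
print(round(jackknife_standard_error(hub_counts), 3))
```

For these illustrative counts the average is 12 hubs and the sum of squared deviations is 4, so the standard error is the square root of 3.6, roughly 1.9 hubs.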
References
Karplus, M. & Kuriyan, J. Molecular dynamics and protein function. Proc. Natl. Acad. Sci. USA 102, 6679–85, doi:10.1073/pnas.0408930102 (2005).
Klepeis, J. L., Lindorff-Larsen, K., Dror, R. O. & Shaw, D. E. Long-timescale molecular dynamics simulations of protein structure and function. Curr. Opin. Struct. Biol. 19, 120–7, doi:10.1016/j.sbi.2009.03.004 (2009).
Grant, B. J., Gorfe, A. A. & McCammon, J. A. Large conformational changes in proteins: Signaling and other functions. Curr. Opin. Struct. Biol. 20, 142–147, doi:10.1016/j.sbi.2009.12.004 (2010).
Lindorff-Larsen, K. et al. Systematic validation of protein force fields against experimental data. PLoS One 7, e32131, doi:10.1371/journal.pone.0032131 (2012).
Kovermann, M., Rogne, P. & Wolf-Watz, M. Protein dynamics and function from solution state NMR spectroscopy. Q. Rev. Biophys. 49, e6, doi:10.1017/S0033583516000019 (2016).
Lambrughi, M. et al. DNA-binding protects p53 from interactions with cofactors involved in transcription-independent functions. Nucleic Acids Res. 44, 9096–9109, doi:10.1093/nar/gkw770 (2016).
Boehr, D. D., Nussinov, R. & Wright, P. E. The role of dynamic conformational ensembles in biomolecular recognition. Nat. Chem. Biol. 5, 789–96, doi:10.1038/nchembio.232 (2009).
Baldwin, A. J. & Kay, L. E. NMR spectroscopy brings invisible protein states into focus. Nat. Chem. Biol. 5, 808–814, doi:10.1038/nchembio.238 (2009).
Masterson, L. R. et al. Dynamics connect substrate recognition to catalysis in protein kinase A. Nat. Chem. Biol. 6, 821–8, doi:10.1038/nchembio.452 (2010).
Sumbul, F., Acuner-Ozbabacan, S. E. & Haliloglu, T. Allosteric Dynamic Control of Binding. Biophys. J. 109, 1190–1201, doi:10.1016/j.bpj.2015.08.011 (2015).
Papaleo, E. et al. An Acidic Loop and Cognate Phosphorylation Sites Define a Molecular Switch That Modulates Ubiquitin Charging Activity in Cdc34-Like Enzymes. PLoS Comput. Biol. 7 (2011).
Papaleo, E. et al. Loop 7 of E2 Enzymes: An Ancestral Conserved Functional Motif Involved in the E2-Mediated Steps of the Ubiquitination Cascade. PLoS One 7 (2012).
Campbell, E. et al. Changes in protein dynamics optimize the active site during evolution of new enzyme function. Nat. Chem. Biol. 12, 944–950, doi:10.1038/nchembio.2175 (2016).
Ma, B. & Nussinov, R. Enzyme dynamics point to stepwise conformational selection in catalysis. Curr. Opin. Chem. Biol. 14, 652–9, doi:10.1016/j.cbpa.2010.08.012 (2010).
Demir, Ö. et al. Ensemble-based computational approach discriminates functional activity of p53 cancer and rescue mutants. PLoS Comput. Biol. 7 (2011).
Guo, J. & Zhou, H. X. Protein Allostery and Conformational Dynamics. Chem. Rev. 116, 6503–6515, doi:10.1021/acs.chemrev.5b00590 (2016).
Ribeiro, A. A. S. T. & Ortiz, V. A Chemical Perspective on Allostery. Chem. Rev. 116, 6488–6502, doi:10.1021/acs.chemrev.5b00543 (2016).
Papaleo, E. et al. The Role of Protein Loops and Linkers in Conformational Dynamics and Allostery. Chem. Rev. 116, 6391–6423, doi:10.1021/acs.chemrev.5b00623 (2016).
Vuillon, L. & Lesieur, C. From local to global changes in proteins: a network view. Curr. Opin. Struct. Biol. 31, 1–8, doi:10.1016/j.sbi.2015.02.015 (2015).
Papaleo, E. Integrating atomistic molecular dynamics simulations, experiments, and network analysis to study protein dynamics: strength in unity. Front. Mol. Biosci. 2, 28, doi:10.3389/fmolb.2015.00028 (2015).
Angelova, K. et al. Conserved amino acids participate in the structure networks deputed to intramolecular communication in the lutropin receptor. Cell. Mol. Life Sci. 68, 1227–39, doi:10.1007/s00018-010-0519-z (2011).
Di Paola, L., De Ruvo, M., Paci, P., Santoni, D. & Giuliani, A. Protein Contact Networks: An Emerging Paradigm in Chemistry. Chem. Rev. 113, 1598–1613, doi:10.1021/cr3002356 (2013).
Di Paola, L. & Giuliani, A. Protein contact network topology: a natural language for allostery. Curr. Opin. Struct. Biol. 31, 43–8, doi:10.1016/j.sbi.2015.03.001 (2015).
Cheng, S., Fu, H. & Cui, D.-X. Characteristics Analyses and Comparisons of the Protein Structure Networks Constructed by Different Methods. Interdiscip. Sci Comput Life Sci 8, 65–74, doi:10.1007/s12539-015-0106-y (2016).
O’Rourke, K. F., Gorman, S. D. & Boehr, D. D. Biophysical and computational methods to analyze amino acid interaction networks in proteins. Comput. Struct. Biotechnol. J. 14, 245–251, doi:10.1016/j.csbj.2016.06.002 (2016).
Feher, V. A., Durrant, J. D., Van Wart, A. T. & Amaro, R. E. Computational approaches to mapping allosteric pathways. Curr. Opin. Struct. Biol. 25, 98–103, doi:10.1016/j.sbi.2014.02.004 (2014).
Bhattacharyya, M., Ghosh, S. & Vishveshwara, S. Protein Structure and Function: Looking through the Network of Side-Chain Interactions. Curr. Protein Pept. Sci. 17, 4–25, doi:10.2174/1389203716666150923105727 (2016).
van den Bedem, H., Bhabha, G., Yang, K., Wright, P. E. & Fraser, J. S. Automated identification of functional dynamic contact networks from X-ray crystallography. Nat. Methods 10, 896–902, doi:10.1038/nmeth.2592 (2013).
Csermely, P., Nussinov, R. & Szilágyi, A. From allosteric drugs to allo-network drugs: state of the art and trends of design, synthesis and computational methods. Curr. Top. Med. Chem. 13, 2–4, doi:10.2174/1568026611313010002 (2013).
Csermely, P., Korcsmáros, T., Kiss, H. J. M., London, G. & Nussinov, R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol. Ther. 138, 333–408, doi:10.1016/j.pharmthera.2013.01.016 (2013).
Tiberti, M. et al. PyInteraph: A Framework for the Analysis of Interaction Networks in Structural Ensembles of Proteins. J Chem Inf Model 54, 1537–1551, doi:10.1021/ci400639r (2014).
Van Wart, A. T., Durrant, J., Votapka, L. & Amaro, R. E. Weighted implementation of suboptimal paths (WISP): An optimized algorithm and tool for dynamical network analysis. J. Chem. Theory Comput. 10, 511–517, doi:10.1021/ct4008603 (2014).
Chakrabarty, B. & Parekh, N. NAPS: Network analysis of protein structures. Nucleic Acids Res. 44, W375–W382, doi:10.1093/nar/gkw383 (2016).
Seeber, M., Felline, A., Raimondi, F., Mariani, S. & Fanelli, F. WebPSN: A web server for high-throughput investigation of structural communication in biomacromolecules. Bioinformatics 31, 779–781, doi:10.1093/bioinformatics/btu718 (2015).
Stolzenberg, S., Michino, M., Levine, M. V., Weinstein, H. & Shi, L. Computational approaches to detect allosteric pathways in transmembrane molecular machines. Biochim. Biophys. Acta - Biomembr. 1858, 1652–1662, doi:10.1016/j.bbamem.2016.01.010 (2016).
Nepomnyachiy, S., Ben-Tal, N. & Kolodny, R. CyToStruct: Augmenting the network visualization of CyToStruct with the power of molecular viewers. Structure 23, 941–948, doi:10.1016/j.str.2015.02.013 (2015).
Niknam, N., Khakzad, H., Arab, S. S. & Naderi-Manesh, H. PDB2Graph: A toolbox for identifying critical amino acids map in proteins based on graph theory. Comput. Biol. Med. 72, 151–159, doi:10.1016/j.compbiomed.2016.03.012 (2016).
Ghosh, A. & Vishveshwara, S. A study of communication pathways in methionyl- tRNA synthetase by molecular dynamics simulations and structure network analysis. Proc. Natl. Acad. Sci. USA 104, 15711–6, doi:10.1073/pnas.0704459104 (2007).
Karami, Y., Laine, E. & Carbone, A. Dissecting protein architecture with communication blocks and communicating segment pairs. BMC Bioinformatics 17, 13, doi:10.1186/s12859-015-0855-y (2016).
Invernizzi, G., Tiberti, M., Lambrughi, M., Lindorff-Larsen, K. & Papaleo, E. Communication Routes in ARID Domains between Distal Residues in Helix 5 and the DNA-Binding Loops. PLoS Comput. Biol. 10, e1003744, doi:10.1371/journal.pcbi.1003744 (2014).
Marino, V. & Dell’Orco, D. Allosteric communication pathways routed by Ca2+/Mg2+ exchange in GCAP1 selectively switch target regulation modes. Sci. Rep. 6, 34277, doi:10.1038/srep34277 (2016).
Papaleo, E., Renzetti, G. & Tiberti, M. Mechanisms of intramolecular communication in a hyperthermophilic acylaminoacyl peptidase: a molecular dynamics investigation. PLoS One 7, e35686, doi:10.1371/journal.pone.0035686 (2012).
Skjærven, L., Yao, X.-Q., Scarabelli, G. & Grant, B. J. Integrating protein structural dynamics and evolutionary analysis with Bio3D. BMC Bioinformatics 15, 399, doi:10.1186/s12859-014-0399-6 (2014).
Ribeiro, A. A. S. T. & Ortiz, V. MDN: A Web Portal for Network Analysis of Molecular Dynamics Simulations. Biophys. J. 109, 1110–1116, doi:10.1016/j.bpj.2015.06.013 (2015).
Ribeiro, A. A. S. T. & Ortiz, V. Energy propagation and network energetic coupling in proteins. J. Phys. Chem. A 119, 1835–1846, doi:10.1021/jp509906m (2015).
Ribeiro, A. A. S. T. & Ortiz, V. Determination of signaling pathways in proteins through network theory: Importance of the topology. J. Chem. Theory Comput. 10, 1762–1769, doi:10.1021/ct400977r (2014).
Da Silveira, C. H. et al. Protein cutoff scanning: A comparative analysis of cutoff dependent and cutoff free methods for prospecting contacts in proteins. Proteins Struct. Funct. Bioinforma. 74, 727–743, doi:10.1002/prot.v74:3 (2009).
Hertig, S., Latorraca, N. R. & Dror, R. O. Revealing Atomic-Level Mechanisms of Protein Allostery with Molecular Dynamics Simulations. PLoS Comput. Biol. 12, 1–16, doi:10.1371/journal.pcbi.1004746 (2016).
Dror, R. O., Dirks, R. M., Grossman, J. P., Xu, H. & Shaw, D. E. Biomolecular simulation: a computational microscope for molecular biology. Annu. Rev. Biophys. 41, 429–52, doi:10.1146/annurev-biophys-042910-155245 (2012).
Martín-García, F., Papaleo, E., Gomez-Puertas, P., Boomsma, W. & Lindorff-Larsen, K. Comparing Molecular Dynamics Force Fields in the Essential Subspace. PLoS One 10, e0121114, doi:10.1371/journal.pone.0121114 (2015).
Papaleo, E., Sutto, L., Gervasio, F. L. & Lindorff-Larsen, K. Conformational Changes and Free Energies in a Proline Isomerase. J. Chem. Theory Comput. 10, 4169–4174, doi:10.1021/ct500536r (2014).
Wang, Y., Papaleo, E. & Lindorff-Larsen, K. Mapping transiently formed and sparsely populated conformations on a complex energy landscape. Elife 5, e17505, doi:10.7554/eLife.17505 (2016).
Lindorff-Larsen, K., Maragakis, P., Piana, S. & Shaw, D. E. Picosecond to Millisecond Structural Dynamics in Human Ubiquitin. J. Phys. Chem. B acs.jpcb.6b02024, doi:10.1021/acs.jpcb.6b02024 (2016).
Nygaard, M. et al. The mutational landscape of the oncogenic MZF1 SCAN domain in cancer. Front. Mol. Biosci. doi:10.3389/fmolb.2016.00078 (2016).
Marino, V., Scholten, A., Koch, K. W. & Dell’Orco, D. Two retinal dystrophy-associated missense mutations in GUCA1A with distinct molecular properties result in a similar aberrant regulation of the retinal guanylate cyclase. Hum. Mol. Genet. 24, 6653–6666, doi:10.1093/hmg/ddv370 (2015).
Piana, S., Lindorff-Larsen, K. & Shaw, D. E. How robust are protein folding simulations with respect to force field parameterization? Biophys. J. 100, L47–9, doi:10.1016/j.bpj.2011.03.051 (2011).
Best, R. B. et al. Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain χ1 and χ2 Dihedral Angles. J. Chem. Theory Comput. 8, 3257–3273, doi:10.1021/ct300400x (2012).
Best, R. B. & Hummer, G. Optimized molecular dynamics force fields applied to the helix-coil transition of polypeptides. J. Phys. Chem. B 113, 9004–15, doi:10.1021/jp901540t (2009).
Lindorff-Larsen, K. et al. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins 78, 1950–8, doi:10.1002/prot.22711 (2010).
Schmidt, C., Beilsten-Edmands, V. & Robinson, C. V. Insights into Eukaryotic Translation Initiation from Mass Spectrometry of Macromolecular Protein Assemblies. J. Mol. Biol. 1–13, doi:10.1016/j.jmb.2015.10.011 (2015).
Brinda, K. V. & Vishveshwara, S. A network representation of protein structures: implications for protein stability. Biophys. J. 89, 4159–70, doi:10.1529/biophysj.105.064485 (2005).
Kannan, N. & Vishveshwara, S. Identification of side-chain clusters in protein structures by a graph spectral method. J. Mol. Biol. 292, 441–64, doi:10.1006/jmbi.1999.3058 (1999).
Stetz, G. & Verkhivker, G. M. Probing Allosteric Inhibition Mechanisms of the Hsp70 Chaperone Proteins Using Molecular Dynamics Simulations and Analysis of the Residue Interaction Networks. J. Chem. Inf. Model. 56, 1490–1517, doi:10.1021/acs.jcim.5b00755 (2016).
Mariani, S., Dell’Orco, D., Felline, A., Raimondi, F. & Fanelli, F. Network and Atomistic Simulations Unveil the Structural Determinants of Mutations Linked to Retinal Diseases. PLoS Comput. Biol. 9 (2013).
Pasi, M., Tiberti, M., Arrigoni, A. & Papaleo, E. xPyder: A PyMOL Plugin To Analyze Coupled Residues and Their Networks in Protein Structures. J. Chem. Inf. Model. 279, 1–6, doi:10.1021/ci300213c (2012).
Hess, B., Kutzner, C., van der Spoel, D. & Lindahl, E. GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 4, 435–447, doi:10.1021/ct700301q (2008).
Tiberti, M., Invernizzi, G. & Papaleo, E. (Dis)similarity Index To Compare Correlated Motions in Molecular Simulations. J. Chem. Theory Comput. 11, 4404–14, doi:10.1021/acs.jctc.5b00512 (2015).
Hess, B., Bekker, H., Berendsen, H. & Fraaije, J. LINCS: A linear constraint solver for molecular simulations. J. Comput. Chem. 18, 1463–1472 (1997).
Essmann, U. et al. A smooth particle mesh Ewald method. J. Chem. Phys. 103, 8577–8593, doi:10.1063/1.470117 (1995).
Papaleo, E., Pasi, M., Tiberti, M. & De Gioia, L. Molecular dynamics of mesophilic-like mutants of a cold-adapted enzyme: insights into distal effects induced by the mutations. PLoS One 6, e24214, doi:10.1371/journal.pone.0024214 (2011).
Seeber, M. et al. Wordom: a user-friendly program for the analysis of molecular structures, trajectories, and free energy surfaces. J. Comput. Chem. 32, 1183–94, doi:10.1002/jcc.21688 (2011).
Michaud-Agrawal, N., Denning, E. J., Woolf, T. B. & Beckstein, O. MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations. J. Comput. Chem. 32, 2319–2327, doi:10.1002/jcc.21787 (2011).
Miller, R. G. The jackknife-a review. Biometrika 61, 1–15 (1974).
Acknowledgements
The authors would like to thank Matteo Tiberti and Wouter Boomsma for fruitful discussion and suggestions.
Contributions
E.P. conceived and designed the research; J.S.V. and M.F.A. carried out the experiments; E.P., J.S.V., M.L., and M.F.A. discussed the data; J.S.V. and M.F.A. prepared the figures and tables; E.P. wrote the manuscript with input from all the coauthors.
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Salamanca Viloria, J., Allega, M.F., Lambrughi, M. et al. An optimal distance cutoff for contact-based Protein Structure Networks using side-chain centers of mass. Sci Rep 7, 2838 (2017). https://doi.org/10.1038/s41598-017-01498-6
DOI: https://doi.org/10.1038/s41598-017-01498-6