Towards Data Warehousing and Mining of Protein Unfolding Simulation Data

Berrar, Daniel; Stahl, Frederic; Silva, Candida; Rodrigues, J. Rui; Brito, Rui M. M.; Dubitzky, Werner

doi:10.1007/s10877-005-0676-z

Published: October 2005

Volume 19, pages 307–317, (2005)

Journal of Clinical Monitoring and Computing

Daniel Berrar¹,
Frederic Stahl²,
Candida Silva³,
J. Rui Rodrigues³,
Rui M. M. Brito³ &
…
Werner Dubitzky⁴

Abstract

Objectives. The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data, and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data.

Methods. To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis.

Results. To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse.

Conclusions. Web and grid services, especially pre-defined data mining services that can run on or ‘near’ the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
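To make the idea of warehousing trajectory data together with its meta-data more concrete, the following is a minimal sketch and not the schema of the system described in the article, which is not given in this abstract. It assumes a tiny two-table layout in an in-memory SQLite database: a hypothetical `simulation` table holding meta-data (protein, variant) and a hypothetical `frame_property` table holding one numeric value per simulation frame. All table names, column names, the property name `demo_property`, and the numeric values are placeholders chosen for illustration only; they are not simulation results.

```python
import sqlite3

# Hypothetical schema (an assumption for illustration, not the article's design):
# one meta-data table per simulation run, one table of per-frame measurements.
SCHEMA = """
CREATE TABLE simulation (
    sim_id  INTEGER PRIMARY KEY,
    protein TEXT NOT NULL,
    variant TEXT NOT NULL
);
CREATE TABLE frame_property (
    sim_id  INTEGER NOT NULL REFERENCES simulation(sim_id),
    time_ps REAL    NOT NULL,
    name    TEXT    NOT NULL,
    value   REAL    NOT NULL,
    PRIMARY KEY (sim_id, time_ps, name)
);
"""


def build_demo_warehouse():
    """Create an in-memory database and load placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Meta-data for the two test-case variants named in the abstract.
    conn.executemany(
        "INSERT INTO simulation (sim_id, protein, variant) VALUES (?, ?, ?)",
        [(1, "TTR", "WT"), (2, "TTR", "L55P")],
    )
    # Placeholder per-frame values; these numbers are made up for the demo.
    conn.executemany(
        "INSERT INTO frame_property (sim_id, time_ps, name, value) "
        "VALUES (?, ?, ?, ?)",
        [
            (1, 0.0, "demo_property", 1.0),
            (1, 1.0, "demo_property", 3.0),
            (2, 0.0, "demo_property", 2.0),
            (2, 1.0, "demo_property", 6.0),
        ],
    )
    conn.commit()
    return conn


def mean_property_by_variant(conn, name):
    """Return {variant: mean value} for one named per-frame property."""
    rows = conn.execute(
        """
        SELECT s.variant, AVG(f.value)
        FROM frame_property AS f
        JOIN simulation AS s USING (sim_id)
        WHERE f.name = ?
        GROUP BY s.variant
        ORDER BY s.variant
        """,
        (name,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    print(mean_property_by_variant(warehouse, "demo_property"))
```

Keeping meta-data and per-frame values in separate tables lets an aggregate query of this kind run inside the repository, so only the small summary leaves the database. This is one simple reading of analysis services that run on or ‘near’ the data.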

Author information

Authors and Affiliations

School of Biomedical Sciences, University of Ulster, Cromore Road, Coleraine, BT52 1SA, Northern Ireland
Daniel Berrar
School of Biomedical Sciences, University of Ulster, Coleraine, Northern Ireland
Frederic Stahl
Weihenstephan University of Applied Sciences, Freising, Germany
Candida Silva, J. Rui Rodrigues & Rui M. M. Brito
Departamento de Química, Faculdade de Ciências e Tecnologia, and Centro de Neurociências de Coimbra, Universidade de Coimbra, Coimbra, Portugal
Werner Dubitzky

Corresponding author

Correspondence to Daniel Berrar.

Additional information

Based on “Grid Warehousing of Molecular Dynamics Protein Unfolding Data”, by Frederic Stahl, Daniel Berrar, Candida Silva, J. Rui Rodrigues, Rui M.M. Brito, and Werner Dubitzky, which appeared in Proceedings of the IEEE/ACM International Symposium on Cluster Computing and the Grid, Cardiff, UK, May 9–12, 2005.

About this article

Cite this article

Berrar, D., Stahl, F., Silva, C. et al. Towards Data Warehousing and Mining of Protein Unfolding Simulation Data. J Clin Monit Comput 19, 307–317 (2005). https://doi.org/10.1007/s10877-005-0676-z

Received: 30 June 2005
Accepted: 30 June 2005
Issue Date: October 2005
DOI: https://doi.org/10.1007/s10877-005-0676-z
