Trajectory NG: portable, compressed, general molecular dynamics trajectories

Spångberg, Daniel; Larsson, Daniel S. D.; van der Spoel, David

doi:10.1007/s00894-010-0948-5

Original Paper
Published: 26 January 2011

Volume 17, pages 2669–2685, (2011)
Journal of Molecular Modeling

Daniel Spångberg¹,
Daniel S. D. Larsson² &
David van der Spoel²

Abstract

We present general algorithms for the compression of molecular dynamics trajectories. The standard ways to store MD trajectories as text or as raw binary floating point numbers result in very large files when efficient simulation programs are used on supercomputers. Our algorithms are based on the observation that differences in atomic coordinates/velocities, in either time or space, are generally smaller than the absolute values of the coordinates/velocities. Also, it is often possible to store values at a lower precision. We apply several compression schemes to compress the resulting differences further. The most efficient algorithms developed here use a block sorting algorithm in combination with Huffman coding. Depending on the frequency of storage of frames in the trajectory, either space, time, or combinations of space and time differences are usually the most efficient. We compare the efficiency of our algorithms with each other and with other algorithms present in the literature for various systems: liquid argon, water, a virus capsid solvated in 15 mM aqueous NaCl, and solid magnesium oxide. We perform tests to determine how much precision is necessary to obtain accurate structural and dynamic properties, as well as benchmark a parallelized implementation of the algorithms. We obtain compression ratios (compared to single precision floating point) of 1:3.3–1:35 depending on the frequency of storage of frames and the system studied.
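The two observations in the abstract (values can be stored at reduced precision, and differences are smaller than absolute values) can be illustrated with a minimal sketch. The function names, the precision of 0.001, and the sample coordinates are assumptions chosen for illustration; they are not taken from the TrajNG library, and the sketch omits the block sorting and Huffman coding stages entirely.

```python
def quantize(values, precision):
    """Convert floats to fixed-point integers (multiples of `precision`)."""
    return [round(v / precision) for v in values]


def time_differences(previous, current):
    """Difference of each value against the same value in the previous frame."""
    return [c - p for p, c in zip(previous, current)]


def space_differences(frame):
    """First value as is, then each value minus its predecessor in the frame."""
    return frame[:1] + [b - a for a, b in zip(frame, frame[1:])]


def undo_time_differences(previous, deltas):
    """Rebuild a frame from the previous frame and its time differences."""
    return [p + d for p, d in zip(previous, deltas)]


def undo_space_differences(deltas):
    """Rebuild a frame from its space differences by a running sum."""
    out, running = [], 0
    for d in deltas:
        running += d
        out.append(running)
    return out


# Hypothetical x coordinates of three nearby atoms in two consecutive frames.
precision = 0.001
frame0 = quantize([1.234, 1.301, 1.250], precision)
frame1 = quantize([1.236, 1.298, 1.251], precision)

print(frame1)                            # [1236, 1298, 1251]
print(time_differences(frame0, frame1))  # [2, -3, 1]
print(space_differences(frame1))         # [1236, 62, -47]
```

In this toy case the time differences are the smallest integers, which matches the situation where frames are stored frequently; with sparsely stored frames the space differences may be the better choice, as the abstract notes.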

Acknowledgements

The computations were performed on resources provided by the Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX), and resources provided by SNIC through the National Supercomputer Centre (NSC).

Author information

Authors and Affiliations

Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) and Department of Materials Chemistry, Uppsala University, Box 538, SE-751 21, Uppsala, Sweden
Daniel Spångberg
Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-751 24, Uppsala, Sweden
Daniel S. D. Larsson & David van der Spoel

Corresponding author

Correspondence to Daniel Spångberg.

Appendix

A. Automatic selection of optimal compression algorithms

The optimal compression algorithm to use depends on the system simulated and the frequency with which frames are written to the trajectory file. The first time a block of frames is to be compressed and written to disk, we run a test of all compression algorithms and choose the one that gives the smallest compressed size. This test is performed only once, so all subsequent blocks are compressed using the same compression algorithm as initially determined. The selection of algorithms to include in the test is controlled by a parameter to the library routines.
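The selection strategy described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the candidate compressors are general-purpose codecs from the Python standard library standing in for the paper's algorithms, and the class name `BlockWriter` and its `allowed` parameter are hypothetical.

```python
import bz2
import lzma
import zlib

# Stand-in candidates (assumption): stdlib codecs, not the TrajNG algorithms.
CANDIDATES = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


class BlockWriter:
    """Pick the best compressor on the first block, then reuse it."""

    def __init__(self, allowed=None):
        # `allowed` limits which algorithms take part in the test.
        names = list(CANDIDATES) if allowed is None else list(allowed)
        self.candidates = {name: CANDIDATES[name] for name in names}
        self.chosen = None  # decided when the first block arrives

    def compress_block(self, block: bytes) -> bytes:
        if self.chosen is None:
            # One-time test: try every candidate, keep the smallest output.
            self.chosen = min(
                self.candidates,
                key=lambda name: len(self.candidates[name](block)),
            )
        return self.candidates[self.chosen](block)


writer = BlockWriter(allowed=["zlib", "bz2"])
first = writer.compress_block(bytes(range(256)) * 64)
second = writer.compress_block(b"another block of frames")
print(writer.chosen)  # the algorithm selected on the first block
```

Running the test only once keeps the cost of the comparison out of the steady-state write path, at the risk of a suboptimal choice if later blocks differ in character from the first.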

B. Portable storage

Our implementation writes all integers with the least significant byte first, making the file format essentially little endian. However, the file format is completely portable, since all external (I/O) references in our implementation are done using individual bytes only. This means that any system endianness (big, little, or mixed) is handled portably. Also, we never store floating point values, only properly scaled fixed point numbers (integers). All text stored in the file is written as ASCII (automatic conversion to/from the source encoding is performed).
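A minimal sketch of byte-wise integer storage with the least significant byte first, combined with fixed point scaling, is given here. The helper names, the 32-bit unsigned width, and the precision value are assumptions made for illustration; the actual field widths and the handling of signed values in the file format are not specified in this section.

```python
def write_u32(value: int) -> bytes:
    """Serialize an unsigned 32-bit integer one byte at a time, LSB first.

    Only shifts and masks are used, so the result does not depend on the
    endianness of the machine running the code.
    """
    if not 0 <= value < 2**32:
        raise ValueError("value does not fit in 32 bits")
    return bytes((value >> (8 * i)) & 0xFF for i in range(4))


def read_u32(data: bytes) -> int:
    """Rebuild the integer from four bytes stored LSB first."""
    if len(data) < 4:
        raise ValueError("need at least four bytes")
    return sum(data[i] << (8 * i) for i in range(4))


def to_fixed(value: float, precision: float) -> int:
    """Scale a float to an integer count of `precision` units."""
    return round(value / precision)


def from_fixed(count: int, precision: float) -> float:
    """Convert a fixed point integer back to a float."""
    return count * precision


print(write_u32(0x12345678).hex())  # 78563412
stored = write_u32(to_fixed(1.25, 0.001))
print(read_u32(stored))             # 1250
```

Because the file only ever contains integers serialized this way, a reader needs no knowledge of the writer's floating point representation or byte order.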

C. File sizes

Tables 3, 4, 5, 6, 7 and 8 show the raw file sizes from the simulation trajectories compressed with the different algorithms.

Table 3 Trajectory file sizes in bytes from the liquid argon simulation trajectory. The results for different compression algorithms are shown. For comparison, the uncompressed trajectories where values are stored as 32-bit floats are also shown

Table 4 Trajectory file sizes in bytes from the liquid water simulation trajectory. The results for different compression algorithms are shown. For comparison, the uncompressed trajectories where values are stored as 32-bit floats are shown

Table 5 Trajectory file sizes in bytes from the liquid water simulation trajectory stored with high accuracy. The results for different compression algorithms are shown. For comparison, the uncompressed trajectories where values are stored as 32-bit floats are shown

Table 6 Trajectory file sizes in bytes from the solid magnesium oxide simulation trajectory. The results for different compression algorithms are shown. For comparison, the uncompressed trajectories where values are stored as 32-bit floats are shown

Table 7 Trajectory file sizes in bytes from the virus-in-water simulation trajectory. The results for different compression algorithms are shown. For comparison, the uncompressed trajectories where values are stored as 32-bit floats are shown

Table 8 Trajectory file sizes in bytes from the virus-in-water simulation trajectory. The results for different compression algorithms are shown. For comparison, the uncompressed trajectories where values are stored as 32-bit floats are shown

About this article

Cite this article

Spångberg, D., Larsson, D.S.D. & van der Spoel, D. Trajectory NG: portable, compressed, general molecular dynamics trajectories. J Mol Model 17, 2669–2685 (2011). https://doi.org/10.1007/s00894-010-0948-5

Received: 21 November 2010
Accepted: 27 December 2010
Published: 26 January 2011
Issue Date: October 2011
DOI: https://doi.org/10.1007/s00894-010-0948-5
