RADAR: Runtime Asymmetric Data-Access Driven Scientific Data Replication

Jenkins, John; Zou, Xiaocheng; Tang, Houjun; Kimpe, Dries; Ross, Robert; Samatova, Nagiza F.

doi:10.1007/978-3-319-07518-1_19

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8488)

Included in the following conference series:

International Supercomputing Conference

Abstract

Efficient I/O on large-scale spatiotemporal scientific data requires scrutiny of both the logical layout of the data (e.g., row-major vs. column-major) and the physical layout (e.g., distribution on parallel filesystems). For increasingly complex datasets, hand optimization is difficult, error-prone, and does not scale to the increasing heterogeneity of analysis workloads. Given these factors, we present a partial data replication system called RADAR. We capture datatype- and collective-aware I/O access patterns (indicating logical access) via MPI-IO tracing and use a combination of coarse-grained and fine-grained performance modeling to evaluate and select optimized physical data distributions for the task at hand. Unlike conventional methods, we store all replica data and metadata, along with the original untouched data, under a single file container using the object abstraction in parallel filesystems. Our system results in manyfold improvements in some commonly used subvolume decomposition access patterns. Moreover, the modeling approach can determine whether such optimizations should be undertaken in the first place.
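To illustrate why the logical layout matters for subvolume access, the following is a minimal sketch, not RADAR's performance model. It counts how many contiguous file extents a subvolume read touches when a dense array is stored in row-major versus column-major order. The function name `contiguous_runs` and its parameters are hypothetical and chosen only for this illustration.

```python
def contiguous_runs(shape, count, order="C"):
    """Count contiguous extents needed to read a subvolume of a dense array.

    shape: full array dimensions.
    count: subvolume extent along each dimension (same length as shape).
    order: "C" for row-major (last index varies fastest),
           "F" for column-major (first index varies fastest).
    Illustrative assumption: elements are stored densely with no striping.
    """
    dims = list(zip(shape, count))
    if order == "C":
        dims = dims[::-1]  # visit the fastest-varying dimension first

    runs = 1
    merged = True  # True while the current run still spans whole dimensions
    for full, sub in dims:
        if merged:
            # The run stays contiguous into the next dimension only if
            # the subvolume covers this dimension completely.
            merged = (sub == full)
        else:
            # Contiguity is already broken: each index along this
            # dimension starts a separate set of runs.
            runs *= sub
    return runs


# A plane that is one element thick along the last dimension.
shape = (1024, 1024, 1024)
plane = (1024, 1024, 1)
row_major_runs = contiguous_runs(shape, plane, order="C")
col_major_runs = contiguous_runs(shape, plane, order="F")
```

Under these assumptions the same plane maps to a single extent in column-major order but to more than a million single-element extents in row-major order, which is the kind of asymmetry that a layout-specific replica could remove.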


Author information

Authors and Affiliations

North Carolina State University, Raleigh, NC, 27695, USA
John Jenkins, Xiaocheng Zou, Houjun Tang & Nagiza F. Samatova
Argonne National Laboratory, Argonne, IL, 60439, USA
John Jenkins, Dries Kimpe & Robert Ross
Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
Nagiza F. Samatova

Editor information

Editors and Affiliations

MIN Faculty, Department of Informatics Scientific Computing, University of Hamburg, Bundesstraße 45a, 20146, Hamburg, Germany
Julian Martin Kunkel
Deutsches Klimarechenzentrum, Bundesstraße 45a, 20146, Hamburg, Germany
Thomas Ludwig
Germany and Prometeus GmbH, University of Mannheim, Fliederstraße 2, 74915, Waibstadt, Germany
Hans Werner Meuer

Cite this paper

Jenkins, J., Zou, X., Tang, H., Kimpe, D., Ross, R., Samatova, N.F. (2014). RADAR: Runtime Asymmetric Data-Access Driven Scientific Data Replication. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2014. Lecture Notes in Computer Science, vol 8488. Springer, Cham. https://doi.org/10.1007/978-3-319-07518-1_19

DOI: https://doi.org/10.1007/978-3-319-07518-1_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07517-4
Online ISBN: 978-3-319-07518-1
eBook Packages: Computer Science, Computer Science (R0)