Cluster Computing, Volume 17, Issue 4, pp 1101–1119

DIRAQ: scalable in situ data- and resource-aware indexing for optimized query performance

  • Sriram Lakshminarasimhan
  • Xiaocheng Zou
  • David A. Boyuka II
  • Saurabh V. Pendse
  • John Jenkins
  • Venkatram Vishwanath
  • Michael E. Papka
  • Scott Klasky
  • Nagiza F. Samatova


Scientific data analytics in high-performance computing environments has been evolving along with the advancement of computing capabilities. With the onset of exascale computing, the increasing gap between compute performance and I/O bandwidth has made traditional post-simulation processing tedious. Despite the challenges due to increased data production, there exists an opportunity to benefit from “cheap” computing power to perform query-driven exploration and visualization during simulation time. To accelerate such analyses, applications traditionally augment raw data, post-simulation, with large indexes, which are then repeatedly utilized for data exploration. However, generating current state-of-the-art indexes involves compute- and memory-intensive processing, which renders them inapplicable in an in situ context. In this paper we propose DIRAQ, a parallel in situ, in-network data encoding and reorganization technique that enables the transformation of simulation output into a query-efficient form, with negligible runtime overhead to the simulation run. DIRAQ’s effective core-local, precision-based encoding approach incorporates an embedded compressed index that is 3–6× smaller than current state-of-the-art indexing schemes. Its data-aware index adjustment improves performance of group-level index layout creation by up to 35% and reduces the size of the generated index by up to 27%. Moreover, DIRAQ’s in-network index merging strategy enables the creation of aggregated indexes that speed up spatial-context query responses by up to 10× versus alternative techniques. DIRAQ’s topology-, data-, and memory-aware aggregation strategy results in efficient I/O and yields overall end-to-end encoding and I/O time that is less than that required to write the raw data with MPI collective I/O.


Exascale computing · Indexing · Query processing · Compression



We would like to thank the FLASH Center for Computational Science at the University of Chicago for providing access to the FLASH simulation code and both the FLASH and S3D teams for providing access to the related datasets. We would like to acknowledge the use of resources at the Leadership Computing Facilities at Argonne National Laboratory and Oak Ridge National Laboratory, ALCF and OLCF respectively. Oak Ridge National Laboratory is managed by UT-Battelle, LLC for the U.S. D.O.E. under Contract DE-AC05-00OR22725. This work was supported in part by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (SDAVI Institute and RSVP Project) and the U.S. National Science Foundation (Expeditions in Computing and EAGER programs). The work of MEP and VV was supported by the DOE Contract DE-AC02-06CH11357.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Sriram Lakshminarasimhan (1, 2, 3, 4)
  • Xiaocheng Zou (2, 3)
  • David A. Boyuka II (2, 3)
  • Saurabh V. Pendse (2, 3)
  • John Jenkins (2, 3)
  • Venkatram Vishwanath (4)
  • Michael E. Papka (4, 5)
  • Scott Klasky (3)
  • Nagiza F. Samatova (2, 3)

  1. IBM India Research Lab, Bangalore, India
  2. North Carolina State University, Raleigh, USA
  3. Oak Ridge National Laboratory, Oak Ridge, USA
  4. Argonne National Laboratory, Argonne, USA
  5. Northern Illinois University, DeKalb, USA