Abstract
Among the major challenges of transitioning to exascale in HPC is the ubiquitous I/O bottleneck. For analysis and visualization applications in particular, this bottleneck is exacerbated by the write-once-read-many property of most scientific datasets combined with typically complex access patterns. One promising way to alleviate this problem is to recognize the application's access patterns and use them to prefetch data, thereby overlapping computation and I/O. However, existing methods for analyzing access patterns are offline-only, lack support for complex access patterns such as high-dimensional strided or composition-based unstructured access patterns, or both. Therefore, we propose an online analyzer capable of detecting both simple and complex access patterns with low computational and memory overhead and high accuracy. By combining our pattern detection with prefetching, we consistently observe run-time reductions of up to 26% across 18 configurations of PIOBench and 4 configurations of a micro-benchmark with both structured and unstructured access patterns.
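To make the idea of online pattern detection concrete, the following is a minimal sketch and not the analyzer proposed in the paper. It assumes the simplest possible case, a one-dimensional constant-stride sequence of read offsets, and does not cover the high-dimensional strided or unstructured patterns the abstract mentions. The names `StrideDetector`, `observe`, `confirm`, and `simulate` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Optional


class StrideDetector:
    """Toy online detector for 1-D constant-stride read offsets (illustrative only)."""

    def __init__(self, confirm: int = 2) -> None:
        self.confirm = confirm               # equal strides required before predicting
        self.last: Optional[int] = None      # most recent offset seen
        self.stride: Optional[int] = None    # current candidate stride
        self.run = 0                         # times the candidate stride has repeated

    def observe(self, offset: int) -> Optional[int]:
        """Record one read offset; return the predicted next offset, or None."""
        if self.last is not None:
            delta = offset - self.last
            if delta == self.stride:
                self.run += 1
            else:
                # Pattern broke (or first delta): start a new candidate stride.
                self.stride, self.run = delta, 1
        self.last = offset
        # A zero stride (re-reading the same offset) is deliberately not predicted.
        if self.stride and self.run >= self.confirm:
            return offset + self.stride
        return None


def simulate(offsets: Iterable[int]) -> int:
    """Count reads whose offset matched the prediction made one step earlier."""
    detector = StrideDetector()
    predicted: Optional[int] = None
    hits = 0
    for off in offsets:
        if predicted == off:
            hits += 1                        # a prefetch issued earlier would have helped
        predicted = detector.observe(off)
    return hits
```

In a real system the predicted offset would be handed to an asynchronous read so that the transfer overlaps with computation; here `simulate` only counts how many reads would have been covered by such a prediction.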
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Tang, H. et al. (2014). Improving Read Performance with Online Access Pattern Analysis and Prefetching. In: Silva, F., Dutra, I., Santos Costa, V. (eds) Euro-Par 2014 Parallel Processing. Euro-Par 2014. Lecture Notes in Computer Science, vol 8632. Springer, Cham. https://doi.org/10.1007/978-3-319-09873-9_21
DOI: https://doi.org/10.1007/978-3-319-09873-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09872-2
Online ISBN: 978-3-319-09873-9
eBook Packages: Computer Science, Computer Science (R0)