Abstract
With the significant advances in Cloud Computing, exploring the use of Cloud technology in HPC workflows is inevitable. While many Cloud vendors offer to move complete HPC workloads into the Cloud, this is limited by the massive demand for computing power and storage resources typically required by I/O-intensive HPC applications. It is widely believed that HPC hardware and software protocols like MPI yield superior performance and lower resource consumption compared to the HTTP transfer protocol used by RESTful Web Services, which are prominent in Cloud execution and Cloud storage. With the advent of enhanced versions of HTTP, it is time to reevaluate the effective usage of cloud-based storage in HPC and its ability to cope with various types of data-intensive workloads. In this paper, we investigate the overhead of the REST protocol via HTTP compared to the HPC-native communication protocol MPI when storing and retrieving objects. Although MPI is compared here in a communication use case, we can still evaluate the impact of data communication and, with it, the efficiency of data transfer for data access patterns. We accomplish this by modeling the impact of data transfer using measurable performance metrics. Hence, our contribution is a performance model based on hardware counters that provides an analytical representation of data transfer over current and future protocols. We validate this model by comparing the results obtained for REST and MPI on two different cluster systems, one equipped with InfiniBand and one with Gigabit Ethernet. The evaluation shows that REST can be a viable, performant, and resource-efficient solution, in particular for accessing large files.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Gadban, F., Kunkel, J., Ludwig, T. (2020). Investigating the Overhead of the REST Protocol When Using Cloud Services for HPC Storage. In: Jagode, H., Anzt, H., Juckeland, G., Ltaief, H. (eds) High Performance Computing. ISC High Performance 2020. Lecture Notes in Computer Science, vol 12321. Springer, Cham. https://doi.org/10.1007/978-3-030-59851-8_10
DOI: https://doi.org/10.1007/978-3-030-59851-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59850-1
Online ISBN: 978-3-030-59851-8
eBook Packages: Computer Science (R0)