Building a Wide-Area File Transfer Performance Predictor: An Empirical Study

  • Conference paper
Machine Learning for Networking (MLN 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11407)

Abstract

Wide-area data transfer is central to geographically distributed scientific workflows. Faster delivery of data is important for these workflows, and predictability is equally (or even more) important. To provide a reasonably accurate estimate of data transfer time, and thereby improve resource allocation and scheduling for workflows and enable end-to-end data transfer optimization, we apply machine learning methods to develop predictive models for data transfer times over a variety of wide-area networks. To build and evaluate these models, we use 201,388 transfers, involving 759 million files totaling 9 PB, over 115 heavily used source-destination pairs (“edges”) between 135 unique endpoints. We evaluate the models for different retraining frequencies and different window sizes of history data. In the best case, the resulting models have a median prediction error of \(\le\)21% for 50% of the edges and \(\le\)32% for 75% of the edges. We present a detailed analysis of these results that provides insights into the causes of some of the high errors. We envision that the performance predictor will be informative for scheduling geo-distributed workflows. The insights also suggest obvious directions for both further analysis and transfer service optimization.
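The per-edge error summary quoted in the abstract can be illustrated with a minimal sketch. It assumes, since the abstract does not give the formula, that the prediction error is the absolute percentage error of predicted versus actual transfer time, summarized per edge by its median. The function names and the sample records are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import median


def median_ape_per_edge(records):
    """Median absolute percentage error (in %) for each edge.

    records: iterable of (edge, actual_seconds, predicted_seconds).
    Assumption: error = 100 * |predicted - actual| / actual.
    """
    errors = defaultdict(list)
    for edge, actual, predicted in records:
        errors[edge].append(100.0 * abs(predicted - actual) / actual)
    return {edge: median(errs) for edge, errs in errors.items()}


def fraction_of_edges_within(per_edge_error, threshold_pct):
    """Fraction of edges whose median error is at most threshold_pct."""
    within = sum(1 for err in per_edge_error.values() if err <= threshold_pct)
    return within / len(per_edge_error)


# Hypothetical transfers on three edges (times in seconds).
records = [
    ("A->B", 100.0, 110.0),
    ("A->B", 100.0, 90.0),
    ("A->B", 100.0, 120.0),
    ("B->C", 200.0, 260.0),
    ("B->C", 200.0, 240.0),
    ("C->D", 50.0, 60.0),
]

per_edge = median_ape_per_edge(records)
share = fraction_of_edges_within(per_edge, 21.0)
```

Under this reading, a statement such as "median prediction error of \(\le\)21% for 50% of the edges" corresponds to `fraction_of_edges_within(per_edge, 21.0)` being at least 0.5.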

Acknowledgments

This material is based upon work supported by the U.S. Department of Energy, Office of Science, under contract DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.

Author information

Corresponding author

Correspondence to Zhengchun Liu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, Z., Kettimuthu, R., Balaprakash, P., Rao, N.S.V., Foster, I. (2019). Building a Wide-Area File Transfer Performance Predictor: An Empirical Study. In: Renault, É., Mühlethaler, P., Boumerdassi, S. (eds) Machine Learning for Networking. MLN 2018. Lecture Notes in Computer Science, vol 11407. Springer, Cham. https://doi.org/10.1007/978-3-030-19945-6_5

  • DOI: https://doi.org/10.1007/978-3-030-19945-6_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-19944-9

  • Online ISBN: 978-3-030-19945-6

  • eBook Packages: Computer Science, Computer Science (R0)
