Journal of Grid Computing, Volume 14, Issue 3, pp 463–476

Time-Series Forecast Modeling on High-Bandwidth Network Measurements

Article

Abstract

With the increasing number of geographically distributed scientific collaborations and the growing size of scientific data, it has become challenging for users to achieve the best possible network performance on a shared network. We have developed a model to forecast expected bandwidth utilization on high-bandwidth wide area networks. The forecast model can improve the efficiency of resource utilization and the scheduling of data movements on high-bandwidth networks to accommodate the ever-increasing data volumes of large-scale scientific applications. The univariate time-series forecast model is built with Seasonal decomposition of Time series by Loess (STL) and the AutoRegressive Integrated Moving Average (ARIMA) model on Simple Network Management Protocol (SNMP) path utilization measurement data. Compared with a traditional approach such as the Box-Jenkins methodology for training the ARIMA model, our forecast model reduces computation time by up to 92.6%. It is also resilient to abrupt changes in network usage. Our forecast model produces a large number of multi-step forecasts, and the forecast errors are within the mean absolute deviation (MAD) of the monitored measurements.
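The accuracy criterion stated in the abstract, forecast errors that stay within the MAD of the monitored measurements, can be illustrated with a minimal sketch. It assumes MAD means the mean absolute deviation around the arithmetic mean and that the forecast error is summarized as a mean absolute error; the abstract does not give the exact definitions used in the article. The function names and the utilization values (in Gbps) are hypothetical and serve only as illustration.

```python
from statistics import fmean


def mean_absolute_deviation(values):
    """Mean absolute deviation around the arithmetic mean (assumed definition)."""
    center = fmean(values)
    return fmean([abs(v - center) for v in values])


def errors_within_mad(observed, actual, forecast):
    """Compare the mean absolute forecast error with the MAD of the observations.

    observed: measurements the forecast was built from
    actual:   measurements seen over the forecast horizon
    forecast: multi-step forecast for the same horizon
    Returns (mad, mae, ok), where ok is True when mae <= mad.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    mad = mean_absolute_deviation(observed)
    mae = fmean([abs(a - f) for a, f in zip(actual, forecast)])
    return mad, mae, mae <= mad


# Hypothetical path utilization samples in Gbps, not data from the article.
observed = [2.0, 4.0, 6.0, 8.0]
actual = [5.0, 7.0]
forecast = [4.0, 8.5]

mad, mae, ok = errors_within_mad(observed, actual, forecast)
print(f"MAD={mad:.2f} Gbps, MAE={mae:.2f} Gbps, within MAD: {ok}")
```

Summarizing the error as a mean over the horizon is one design choice among several; a stricter variant would require every individual forecast step to fall within the MAD.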

Keywords

Data modeling; Time series; Prediction model; Network measurements; Network traffic



Copyright information

© Springer Science+Business Media Dordrecht (outside the USA) 2016

Authors and Affiliations

  1. Lawrence Berkeley National Laboratory, Berkeley, USA
