International Journal of Parallel Programming

, Volume 44, Issue 5, pp 1054–1077 | Cite as

SARP: Synopsis-Based Approximate Request Processing for Low Latency and Small Correctness Loss in Cloud Online Services

  • Rui Han
  • Jianfeng Zhan
  • Jose Vazquez-Poletti Luis
Article

Abstract

Despite the importance of providing quick responsiveness to user requests for online services, such request processing is very resource expensive when dealing with large-scale service datasets. These often exceed the service providers’ budget when services are deployed on a cloud, in which resources are charged in monetary terms. Providing approximate processing results in request processing is a feasible solution for such problem that trades off result correctness (e.g. prediction or query accuracy) for response time reduction. However, existing techniques in this area either use parts of datasets or skip expensive computations to produce approximate results, thus resulting in large losses in result correctness on a tight resource budget. In this paper, we propose Synopsis-based Approximate Request Processing (SARP), a SARP framework to produce approximate results with small correctness losses even using small amount of resources. To achieve this, SARP conducts computations over synopses, which aggregate the statistical information of the entire service dataset at different approximation levels, based on two key ideas: (1) offline synopsis management that generates and maintains a set of synopses that represent the aggregation information of the dataset at different approximation levels. (2) Online synopsis selection that considers both the current resource allocation and the workload status so as to select the synopsis with the maximal length that can be processed within the required response time. We demonstrate the effectiveness of our approach by testing the recommendation services in e-commerce sites using a large, real-world dataset. Using prediction accuracy as the result correctness metric, the results demonstrate: (i) SARP achieves significant response time reduction with very small correctness losses compared to the exact processing results; (ii) using the same processing time, SARP demonstrates a considerable reduction in correctness loss compared to existing approximation techniques.

Keywords

Cloud online service Approximate request processing Result correctness Synopsis 

References

  1. 1.
    Dean, J., Barroso, L.A.: The tail at scale. Commun. ACM 56(2), 74–80 (2013)CrossRefGoogle Scholar
  2. 2.
    Farber, D.: Google’s marissa mayer: speed wins. ZDNet Between the Lines. http://www.zdnet.com/article/googles-marissa-mayer-speed-wins/#! (2006)
  3. 3.
    Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Adv. Artif. Intell. 2009, 4 (2009)CrossRefGoogle Scholar
  4. 4.
    Chippa, V.K., Chakradhar, S.T., Roy, K. and Raghunathan, A.: Analysis and characterization of inherent application resilience for approximate computing. In: DAC’13, pp. 113. ACM (2013)Google Scholar
  5. 5.
    Agarwal, S., Mozafari, B., Panda, A., Milner, H., Madden, S. and Stoica, I.: Blinkdb: queries with bounded errors and bounded response times on very large data. In: EuroSys’13, pp. 29–42. ACM (2013)Google Scholar
  6. 6.
    Kranen, P., Assent, I., Baldauf, C. and Seidl, T.: Self-adaptive anytime stream clustering. In: ICDM’09, pp. 249–258. IEEE, Piscataway, NJ (2009)Google Scholar
  7. 7.
    Wu, G., Chang, E., Chen, Y.K. and Hughes, C.: Incremental approximate matrix factorization for speeding up support vector machines. In: SIGKDD’06, pp. 760–766. ACM, New York, NY (2006)Google Scholar
  8. 8.
    Yang, Y., Webb, G., Korb, K., Ting, K.M.: Classifying under computational resource constraints: anytime classification using probabilistic estimators. Mach. Learn. 69(1), 35–53 (2007)CrossRefGoogle Scholar
  9. 9.
    Zilberstein, S.: Using anytime algorithms in intelligent systems. AI Mag. 17(3), 73 (1996)Google Scholar
  10. 10.
    Kirsch, C.M. and Payer, H.: Incorrect systems: it’s not the problem, it’s the solution. In: DAC’12, pp. 913–917. ACM, New York, NY (2012)Google Scholar
  11. 11.
  12. 12.
  13. 13.
  14. 14.
    Berry, M.W., Dumais, S.T., O’Brien, G.W.: Using linear algebra for intelligent information retrieval. SIAM Rev. 37(4), 573–595 (1995)MathSciNetCrossRefMATHGoogle Scholar
  15. 15.
    Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)CrossRefGoogle Scholar
  16. 16.
    Gorrell, G.: Generalized hebbian algorithm for incremental singular value decomposition in natural language processing. In: EACL’06, Citeseer pp. 97–104. Association for Computational Linguistics (ACL), New Brunswick, NJ (2006)Google Scholar
  17. 17.
    Guttman, A.: R-trees: a dynamic index structure for spatial searching, In: SIGMOD ’84, pp. 47–57. ACM, New York, NY (1984)Google Scholar
  18. 18.
    Amazon movie reviews dataset. http://snap.stanford.edu/data/web-Movies.html
  19. 19.
    Pearson, K.: Notes on the history of correlation. Biometrika 13(1), 25–45 (1920)CrossRefGoogle Scholar
  20. 20.
  21. 21.
    Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. ACM SIGOPS Oper. Syst. Rev. 37(5), 164–177 (2003)CrossRefGoogle Scholar
  22. 22.
    Candillier, L., Meyer, F. and Boullé M.: Comparing state-of-the-art collaborative filtering systems. In: Petra Perner (ed.) Machine Learning and Data Mining in Pattern Recognition, pp. 548–562. Springer, Berlin Heidelberg (2007)Google Scholar
  23. 23.
    Bacigalupo, D.A., Van Hemert, J., Chen, X., Usmani, A., Chester, A.P., He, L., Dillenberger, D.N., Wills, G.B., Gilbert, L., Jarvis, S.A.: Managing dynamic enterprise and urgent workloads on clouds using layered queuing and historical performance models. Simul. Model. Pract. Theory 19(6), 1479–1495 (2011)CrossRefGoogle Scholar
  24. 24.
    Movielens 1 million dataset. http://grouplens.org/datasets/movielens/
  25. 25.
  26. 26.
  27. 27.
    Han, R., Ghanem, M.M., Guo, L., Guo, Y., Osmond, M.: Enabling cost-aware and adaptive elasticity of multi-tier cloud applications. Future Gener. Comput. Syst. 32, 82–98 (2014)CrossRefGoogle Scholar
  28. 28.
    Han, R., Guo, L., Ghanem, M.M. and Guo, Y.: Lightweight resource scaling for cloud applications. In: CCGrid’12, pp. 644–651. IEEE (2012)Google Scholar
  29. 29.
    Liu, S., Ren, S., Quan, G., Zhao, M. and Ren, S.: Profit aware load balancing for distributed cloud data centers. In: IPDPS’13, pp. 611–622. IEEE (2013)Google Scholar
  30. 30.
    Schwiegelshohn, U. and Tchernykh, A.: Online scheduling for cloud computing and different service levels. In: IPDPSW’12, pp. 1067–1074. IEEE (2012)Google Scholar
  31. 31.
    Luis, J.V.-P., Moreno-Vozmediano, R., Llorente, I.M.: Admission Control in the Cloud: Algorithms for SLA-Based Service Model. In Handbook of Research on Architectural Trends in Service-Driven Computing. IGI Global, Hershey (2014)Google Scholar
  32. 32.
    Han, R., Ghanem, M. and Guo, Y.: Towards elastic algorithms as a new model of computation for the cloud. Int. J. Next Gener. Comput. 4(2), 82–98. (2013)Google Scholar
  33. 33.
    He, Y., Elnikety, S. and Sun, H.: Tians scheduling: using partial processing in best-effort applications. In: ICDCS’11, pp. 434–445. IEEE, Piscataway, NJ (2011)Google Scholar
  34. 34.
    Kranen, P., Seidl, T.: Harnessing the strengths of anytime algorithms for constant data streams. Data Mining Knowl. Discov. 19(2), 245–260 (2009)MathSciNetCrossRefGoogle Scholar
  35. 35.
    Chakrapani, L.N., Muntimadugu, K.K., Lingamneni, A., George, J. and Palem, K.V.: Highly energy and performance efficient embedded computing through approximately correct arithmetic: a mathematical foundation and preliminary experimental validation. In: CASES’08, pp. 187–196. ACM (2008)Google Scholar
  36. 36.
    Trancoso, P.: Getting ready for approximate computing: trading parallelism for accuracy for dss workloads. In: CF’14, p. 8. ACM, New York, NY (2014)Google Scholar
  37. 37.
    Gaber, M.M. and Yu P.S.: A framework for resource-aware knowledge discovery in data streams: a holistic approach with its application to clustering. In: SAC’06, pp. 649–656. ACM (2006)Google Scholar
  38. 38.
    Poladian, V., Sousa, J.P., Garlan, D. and Shaw, M.: Dynamic configuration of resource-aware services. In: ICSE’04, pp. 604–613. IEEE, Piscataway, NJ (2004)Google Scholar
  39. 39.
    Han, R., Zhan, S., Shao, C., Wang, J., Xu, J., John, L.K., Gang, L. and Wang, L.: Bigdatabench-mt: a benchmark tool for generating realistic mixed data center workloads. In: SoCC’15, vol. 9495, pp. 7–18. Springer, Berlin, Heidelberg, Germany (2015)Google Scholar
  40. 40.
    Reiss, C., Tumanov, A., Ganger, G.R., Katz, R.H. and Kozuch, M.A.: Heterogeneity and dynamicity of clouds at scale: google trace analysis. In: SoCC’12, p. 7. ACM, New York, NY (2012)Google Scholar
  41. 41.
    Han, R., Wang, J., Huang, S., Shao, C., Zhan, S., Zhan, J. and Vazquez-Poletti, J.L.: Pcs: Predictive component-level scheduling for reducing tail latency in cloud online services. In: ICPP’15, pp. 490–499. IEEE, Piscataway, NJ (2015)Google Scholar
  42. 42.
    Han, R., Wang, J., Ge, F., Vazquez-Poletti, J.L. and Zhan, J.: Sarp: producing approximate results with small correctness losses for cloud interactive services. In: CF’15, p. 22. ACM, New York, NY (2015)Google Scholar

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Rui Han
    • 1
  • Jianfeng Zhan
    • 1
  • Jose Vazquez-Poletti Luis
    • 2
  1. 1.Institute Of Computing TechnologyChinese Academy of SciencesBeijingChina
  2. 2.Universidad Complutense de MadridMadridSpain

Personalised recommendations