Predico: A System for What-if Analysis in Complex Data Center Applications

  • Rahul Singh
  • Prashant Shenoy
  • Maitreya Natu
  • Vaishali Sadaphal
  • Harrick Vin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7049)


Modern data center applications are complex distributed systems with tens or hundreds of interacting software components. An important management task in data centers is to predict the impact of a workload or reconfiguration change on application performance. Such predictions require “what-if” models of the application that take as input hypothetical changes in the application’s workload or environment and estimate their impact on performance.

We present Predico, a workload-based what-if analysis system that uses monitoring information commonly available in large-scale systems to let administrators ask a variety of workload-based “what-if” queries about the system. Predico uses a network of queues to analytically model the behavior of large distributed applications. It automatically generates node-level queueing models and then uses model composition to build system-wide models. Predico provides a simple what-if query language and an intelligent query execution algorithm that combines on-the-fly model construction with change propagation to answer queries efficiently on large-scale systems. We have built a prototype of Predico and have evaluated its what-if modeling framework using traces from two large production applications at a financial institution as well as real-world synthetic applications. Our experimental evaluation validates the accuracy of Predico’s node-level resource usage, latency, and workload models and then shows how Predico enables what-if analysis in two different applications.
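The idea of composing node-level queueing models and propagating a workload change through the application can be illustrated with a minimal sketch. This is not Predico's implementation: the abstract does not specify the node models or the propagation rules, so the sketch assumes each node behaves as an M/M/1 queue (mean latency 1/(μ − λ)), that the application graph is acyclic, and that each node emits a fixed number of downstream requests per incoming request. The names `what_if`, `edges`, `fan_out`, and the three-tier example are all hypothetical.

```python
from collections import deque


def topological_order(edges, nodes):
    """Return the nodes of an acyclic graph in topological order (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    for n in nodes:
        for child, _ in edges.get(n, []):
            indegree[child] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for child, _ in edges.get(n, []):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return order


def what_if(edges, base_rates, service_rates, source, new_rate):
    """Estimate per-node arrival rate and latency if `source` received `new_rate`.

    edges: node -> list of (child, fan_out); fan_out is the assumed number of
           requests sent to child per request handled at node.
    base_rates: observed arrival rate per node (requests/s).
    service_rates: assumed M/M/1 service rate per node (requests/s).
    """
    # Change propagation: push only the workload delta downstream.
    delta = {n: 0.0 for n in base_rates}
    delta[source] = new_rate - base_rates[source]
    for n in topological_order(edges, list(base_rates)):
        if delta[n] == 0.0:
            continue  # nodes unaffected by the change are skipped
        for child, fan_out in edges.get(n, []):
            delta[child] += delta[n] * fan_out

    # Node-level model: M/M/1 mean latency, infinite if the node saturates.
    result = {}
    for n, base in base_rates.items():
        lam = base + delta[n]
        mu = service_rates[n]
        latency = 1.0 / (mu - lam) if lam < mu else float("inf")
        result[n] = (lam, latency)
    return result


# Hypothetical three-tier application: web -> app -> db.
edges = {"web": [("app", 1.0)], "app": [("db", 2.0)]}
base_rates = {"web": 10.0, "app": 10.0, "db": 20.0}
service_rates = {"web": 50.0, "app": 40.0, "db": 100.0}

# Query: what if the web tier's workload doubles to 20 requests/s?
prediction = what_if(edges, base_rates, service_rates, "web", 20.0)
for node, (rate, latency) in prediction.items():
    print(f"{node}: rate={rate:.1f} req/s, latency={latency * 1000:.1f} ms")
```

Propagating only the delta means nodes that do not lie downstream of the changed node keep their observed rates, which is one plausible reading of why change propagation helps on large systems; a real system would also need to handle cycles and non-linear workload relationships.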



Copyright information

© IFIP International Federation for Information Processing 2011

Authors and Affiliations

  • Rahul Singh (1)
  • Prashant Shenoy (1)
  • Maitreya Natu (2)
  • Vaishali Sadaphal (2)
  • Harrick Vin (2)

  1. Dept. of Computer Science, University of Massachusetts, Amherst, USA
  2. Tata Research Development and Design Center, Pune, India
