A Science-Gateway Workload Archive to Study Pilot Jobs, User Activity, Bag of Tasks, Task Sub-steps, and Workflow Executions

  • Rafael Ferreira da Silva
  • Tristan Glatard
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7640)


Archives of distributed workloads acquired at the infrastructure level are known to lack information about users and application-level middleware. Science gateways provide consistent access points to the infrastructure and are therefore an interesting source of information to address this issue. In this paper, we describe a workload archive acquired at the science-gateway level, and we show its added value in several case studies related to user accounting, pilot jobs, fine-grained task analysis, bags of tasks, and workflows. Results show that science-gateway workload archives can detect workload wrapped in pilot jobs, improve user identification, give information on distributions of data transfer times, make bag-of-task detection accurate, and retrieve characteristics of workflow executions. Some limits are also identified.


Grid Infrastructure, Science Gateway, Infrastructure Level, European Grid Infrastructure, Critical Path Length
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Rafael Ferreira da Silva (1)
  • Tristan Glatard (1)
  1. CNRS, INSERM, CREATIS, University of Lyon, Villeurbanne, France
