Job Provenance – Insight into Very Large Provenance Datasets
Following the job-centric monitoring concept, the Job Provenance (JP) service organizes provenance records on a per-job basis. It is designed to manage a very large number of records, as required in the EGEE project, where it was originally developed.
The quantitative aspect is also a focus of the presented demonstration. We show JP's capability to retrieve data items of interest from a large dataset containing the full records of more than 1 million jobs, to perform non-trivial transformations on those data, and to organize the results in such a way that repeated interactive queries are possible.
The application area of the demo is derived from that of previous Provenance Challenges. Although the topic of the demo, a computational experiment, is arranged rather artificially, the demonstration still delivers its main message: JP supports non-trivial transformations and interactive queries on large datasets.
Keywords: Hippocampus Volume, Grid Environment, Main Message, Interactive Query, Pilot Application