Abstract
Workflow management is a ubiquitous task that entails the coordination of an organization's various activities. This coordination is increasingly carried out by software systems called workflow management systems (WFMSs). An important component of many WFMSs is a DBMS for keeping track of workflow activity. This DBMS maintains an audit trail, or event history, that records the results of each activity. Like other data, the event history can be indexed and queried, and views can be defined on top of it. In addition, a WFMS must accommodate frequent workflow changes, which result from a rapidly evolving business environment. Since the database schema depends on the workflow, the DBMS must also support dynamic schema evolution. These requirements are especially challenging in high-throughput WFMSs, i.e., systems for managing high-volume, mission-critical workflows. Unfortunately, existing database benchmarks do not capture the combination of flexibility and performance required by these systems. To address this issue, we have developed LabFlow-1, the first version of a benchmark that concisely captures the DBMS requirements of high-throughput WFMSs. LabFlow-1 is based on the data and workflow management needs of a large genome-mapping laboratory, and reflects the laboratory's real-world experience. In addition, we used LabFlow-1 to test the usability and performance of two object storage managers. These tests revealed substantial differences between the two systems, and highlighted the critical importance of being able to control locality of reference to persistent data.
This work was supported by funds from the U.S. National Institutes of Health, National Center for Human Genome Research, grant number P50 HG00098, and from the U.S. Department of Energy under contract DE-FG02-95ER62101.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bonner, A., Shrufi, A., Rozen, S. (1996). LabFlow-1: A database benchmark for high-throughput workflow management. In: Apers, P., Bouzeghoub, M., Gardarin, G. (eds) Advances in Database Technology — EDBT '96. EDBT 1996. Lecture Notes in Computer Science, vol 1057. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014171
DOI: https://doi.org/10.1007/BFb0014171
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61057-1
Online ISBN: 978-3-540-49943-5
eBook Packages: Springer Book Archive