Evaluation: Given a mapping of nodes to processors, how can one compute the period and latency?
Optimization: Given a filtering workflow, how can one compute the mapping and schedule that minimize the period or latency? A solution to this problem must provide both the mapping and the associated operation list, that is, the time-steps at which each computation and each communication takes place.
We address this general problem in two steps. First, we consider the simplified model without communication costs. In this case, the evaluation problems are easy, and the optimization problems have polynomial complexity on homogeneous platforms. However, we show that the optimization problems become NP-hard on heterogeneous platforms. Second, we consider platforms with communication costs. By the previous results, the optimization problems on heterogeneous platforms remain NP-hard. We therefore return to homogeneous platforms and extend the framework with three realistic communication models. Even the evaluation problems now become difficult, because the mapping must be enriched with an operation list that provides the time-steps at which each computation and each communication occurs in the system; determining the best operation list is combinatorial in nature. Not surprisingly, the optimization problems are NP-hard as well. Altogether, this paper provides a comprehensive overview of the additional difficulties induced by heterogeneity and communication costs.
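To illustrate why evaluation is easy when communication costs are ignored, the following is a minimal sketch for the simplest case only: a linear chain of filters, one filter per processor, on identical processors. The cost and selectivity model used here (each filter has a computation cost, and its selectivity scales the amount of data seen by every later filter) is an assumption made for illustration and is not defined in the text above; the function name `evaluate_chain` and its parameters are hypothetical.

```python
from typing import List, Tuple


def evaluate_chain(costs: List[float],
                   selectivities: List[float],
                   speed: float = 1.0) -> Tuple[float, float]:
    """Return (period, latency) for a chain of filters, one per processor.

    Assumed model (not taken from the text): filter i takes
    costs[i] * (product of selectivities of earlier filters) / speed
    time units per data set, and there is no communication cost.
    """
    if not costs or len(costs) != len(selectivities):
        raise ValueError("costs and selectivities must be non-empty and equal in length")

    scale = 1.0           # fraction of the original data reaching the current filter
    times: List[float] = []
    for cost, selectivity in zip(costs, selectivities):
        times.append(scale * cost / speed)
        scale *= selectivity  # later filters see data scaled by this filter

    period = max(times)   # the slowest processor bounds the throughput
    latency = sum(times)  # one data set traverses the filters one after another
    return period, latency


if __name__ == "__main__":
    # Three filters; the first two each halve the data volume.
    print(evaluate_chain([4.0, 2.0, 6.0], [0.5, 0.5, 1.0]))
```

Under these assumptions both quantities follow from a single pass over the chain. The difficulty described above arises once communications must be scheduled as well, because the result then depends on the operation list and not only on the mapping.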
Keywords: Query optimization, Web service, Streaming application, Workflow, Communication model, Period, Latency, Complexity results