Characteristics of Workloads Using the Pipeline Programming Model

  • Christian Bienia
  • Kai Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6161)

Abstract

Pipeline parallel programming is a frequently used model for programming applications on multiprocessors. Despite its popularity, the characteristics of such workloads have received little study. This paper gives an overview of the pipeline model and its typical implementations on multiprocessors. We present implementation choices and analyze their impact on the program. We furthermore show that workloads using the pipeline model have their own unique characteristics that should be considered when selecting a set of benchmarks. This information can benefit program developers as well as computer architects who want to understand the behavior of applications.
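The abstract's key terms (work units, pipeline stages, input and output queues) describe a common implementation style in which each stage runs in its own thread and passes work units downstream through bounded queues. The following is a minimal illustrative sketch of that style, not the authors' implementation; all names and the choice of Python threads with `queue.Queue` are assumptions made for the example.

```python
import queue
import threading

SENTINEL = None  # marks the end of the work-unit stream

def stage(transform, input_queue, output_queue):
    """One pipeline stage: repeatedly pull a work unit from the input
    queue, apply this stage's computation, and push the result to the
    output queue. A sentinel is forwarded downstream on shutdown."""
    while True:
        unit = input_queue.get()
        if unit is SENTINEL:
            output_queue.put(SENTINEL)
            break
        output_queue.put(transform(unit))

def run_pipeline(work_units, transforms):
    """Chain one thread per stage via bounded queues; the bounded
    capacity provides backpressure between faster and slower stages."""
    queues = [queue.Queue(maxsize=4) for _ in range(len(transforms) + 1)]
    threads = [
        threading.Thread(target=stage, args=(t, queues[i], queues[i + 1]))
        for i, t in enumerate(transforms)
    ]
    for t in threads:
        t.start()
    for unit in work_units:
        queues[0].put(unit)
    queues[0].put(SENTINEL)
    results = []
    while (item := queues[-1].get()) is not SENTINEL:
        results.append(item)
    for t in threads:
        t.join()
    return results

results = run_pipeline(range(5), [lambda x: x + 1, lambda x: x * 2])
print(results)  # [2, 4, 6, 8, 10]
```

Because each stage is a single thread draining a FIFO queue, work units leave the pipeline in input order; implementations that run multiple threads per stage trade this ordering guarantee for throughput, one of the implementation choices the paper analyzes.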

Keywords

Work Unit, Pipeline Stage, Data Approach, Input Queue, Output Queue



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Christian Bienia, Princeton University, Princeton, USA
  • Kai Li, Princeton University, Princeton, USA
