SmartApps: An Application Centric Approach to High Performance Computing

Rauchwerger, Lawrence; Amato, Nancy M.; Torrellas, Josep

doi:10.1007/3-540-45574-4_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2017)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

State-of-the-art run-time systems are a poor match for diverse, dynamic distributed applications because they are designed to support a wide variety of applications, with little customization to the specific requirements of any one of them. Little or no guiding information flows directly from the application to the run-time system to allow the latter to fully tailor its services to the application. As a result, performance is disappointing. To address this problem, we propose application-centric computing, or SMART APPLICATIONS. In the executable of a smart application, the compiler embeds most run-time system services, together with a performance-optimizing feedback loop that monitors the application’s performance and adaptively reconfigures the application and the OS/hardware platform. At run-time, after incorporating the code’s input and the system’s resources and state, the SmartApp performs a global optimization. This optimization is instance-specific and thus much more tractable than a global generic optimization across application, OS, and hardware. The resulting code and resource customization should lead to major speedups. In this paper, we first describe the overall architecture of SmartApps and then present the achievements to date: run-time optimizations, performance modeling, and moderately reconfigurable hardware.
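The feedback loop described in the abstract can be illustrated with a minimal Python sketch. It is not the SmartApps implementation: the class `FeedbackLoop`, the two reduction variants, and the cost model `toy_cost` are all hypothetical names introduced here, and the "monitor" is a made-up cost function rather than a real performance measurement. The sketch only shows the general pattern of trying each variant once, recording its observed cost, and then reconfiguring to the cheapest one.

```python
from typing import Callable, Dict, Sequence


def pairwise_sum(xs: Sequence[int]) -> int:
    """Tree-shaped reduction: repeatedly add neighbouring pairs."""
    xs = list(xs)
    if not xs:
        return 0
    while len(xs) > 1:
        xs = [sum(xs[i:i + 2]) for i in range(0, len(xs), 2)]
    return xs[0]


def toy_cost(name: str, data: Sequence[int]) -> float:
    """Hypothetical monitor: a made-up cost model, not a real measurement."""
    if name == "sequential":
        return len(data)
    # Pretend the tree reduction runs four ways in parallel with fixed overhead.
    return len(data) / 4 + 50


class FeedbackLoop:
    """Toy monitor-and-reconfigure loop over interchangeable variants."""

    def __init__(self, variants: Dict[str, Callable], cost_of: Callable):
        self.variants = dict(variants)
        self.cost_of = cost_of
        self.observed: Dict[str, float] = {}   # last observed cost per variant
        self.untried = list(self.variants)      # variants not yet explored
        self.current = self.untried[0]

    def run(self, data):
        # Explore every variant once, then exploit the cheapest observed one.
        if self.untried:
            name = self.untried.pop(0)
        else:
            name = min(self.observed, key=self.observed.get)
        self.current = name
        result = self.variants[name](data)
        # Feed the monitored cost back so the next call can reconfigure.
        self.observed[name] = self.cost_of(name, data)
        return result


loop = FeedbackLoop({"sequential": sum, "pairwise": pairwise_sum}, toy_cost)
data = list(range(1000))
results = [loop.run(data) for _ in range(3)]
```

Under this assumed cost model the loop settles on the `pairwise` variant by the third call. A real system would replace `toy_cost` with measured run-time data and would also need to revisit variants when the input or machine state changes, which this sketch does not do.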

Research supported in part by NSF CAREER Award CCR-9734471, NSF CAREER Award CCR-9624315, NSF Grant ACI-9872126, NSF-NGS EIA-9975018, DOE ASCI ASAP Level 2 Grant B347886, and Hewlett-Packard Equipment Grants.

Author information

Authors and Affiliations

Department of Computer Science, Texas A&M University, USA
Lawrence Rauchwerger & Nancy M. Amato
Department of Computer Science, University of Illinois, Illinois
Josep Torrellas

Editor information

Editors and Affiliations

IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY, 10598, USA
Samuel P. Midkiff, José E. Moreira, Manish Gupta & Siddhartha Chatterjee
Computer Science and Engineering, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, 92093-0114, USA
Jeanne Ferrante
Department of Computer Science, University of North Carolina, Chapel Hill, NC, 27599-3175, USA
Jan Prins
Department of Computer Science, University of Maryland, College Park, MD, 20742, USA
William Pugh & Chau-Wen Tseng

Cite this paper

Rauchwerger, L., Amato, N.M., Torrellas, J. (2001). SmartApps: An Application Centric Approach to High Performance Computing. In: Midkiff, S.P., et al. Languages and Compilers for Parallel Computing. LCPC 2000. Lecture Notes in Computer Science, vol 2017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45574-4_6

DOI: https://doi.org/10.1007/3-540-45574-4_6
Published: 04 December 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42862-6
Online ISBN: 978-3-540-45574-5
eBook Packages: Springer Book Archive
