A Compiler Directed Approach to Hiding Configuration Latency in Chameleon Processors

Tang, Xinan; Aalsma, Manning; Jou, Raymond

doi:10.1007/3-540-44614-1_4

Conference paper
First Online: 01 January 2002

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1896)

Abstract

The Chameleon CS2112 chip is the industry’s first reconfigurable communication processor. To attain high performance, such a processor must effectively tolerate reconfiguration latency. In this paper, we present a compiler-directed approach to hiding the configuration loading latency that integrates multithreading, instruction scheduling, register allocation, and prefetching techniques. Furthermore, configuration loading is overlapped with communication to further enhance performance. By running some kernel programs on a cycle-accurate simulator, we show that chip performance is significantly improved by leveraging these compiler and multithreading techniques.
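
As a rough illustration of why prefetching configurations can hide loading latency, the following minimal sketch compares a schedule that loads and executes each kernel in turn with one that loads the next kernel's configuration while the current kernel runs. It is not the authors' compiler or simulator: the cycle counts, the single background loading slot, and the function names `serial_cycles` and `prefetch_cycles` are all assumptions made for this example.

```python
from typing import List, Tuple

# Each kernel is (config_load_cycles, execute_cycles).
# All numbers used with this model are hypothetical.
Kernel = Tuple[int, int]


def serial_cycles(kernels: List[Kernel]) -> int:
    """Load each configuration, then execute it, with no overlap."""
    return sum(load + run for load, run in kernels)


def prefetch_cycles(kernels: List[Kernel]) -> int:
    """Load kernel i+1's configuration while kernel i executes.

    Assumes a single background loading slot, so the load for kernel i+1
    can begin only when kernel i starts executing.
    """
    if not kernels:
        return 0
    total = kernels[0][0]  # the first load has nothing to hide behind
    for i, (_, run) in enumerate(kernels):
        next_load = kernels[i + 1][0] if i + 1 < len(kernels) else 0
        # The next kernel can start only once the current execution
        # and the next configuration load have both finished.
        total += max(run, next_load)
    return total


if __name__ == "__main__":
    example = [(100, 300), (100, 50), (100, 400)]
    print("serial:  ", serial_cycles(example))
    print("prefetch:", prefetch_cycles(example))
```

In this toy model a load is fully hidden whenever the preceding kernel runs at least as long as the load takes; a short kernel exposes part of the next load, which is the kind of case where scheduling and multithreading would have to supply additional overlap.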

Author information

Authors and Affiliations

Chameleon Systems, Inc., 161 Nortech Parkway, San Jose, CA, 95134
Xinan Tang, Manning Aalsma & Raymond Jou

Editor information

Editors and Affiliations

Computer Science Department, University of Kaiserslautern, P. O. Box. 30 49, 67653, Kaiserslautern, Germany
Reiner W. Hartenstein
Carinthia Tech Institute, Richard-Wagner-Str. 19, 9500, Villach, Austria
Herbert Grünbacher

About this paper

Cite this paper

Tang, X., Aalsma, M., Jou, R. (2000). A Compiler Directed Approach to Hiding Configuration Latency in Chameleon Processors. In: Hartenstein, R.W., Grünbacher, H. (eds) Field-Programmable Logic and Applications: The Roadmap to Reconfigurable Computing. FPL 2000. Lecture Notes in Computer Science, vol 1896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44614-1_4

DOI: https://doi.org/10.1007/3-540-44614-1_4
Published: 12 April 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67899-1
Online ISBN: 978-3-540-44614-9
eBook Packages: Springer Book Archive
