Synthesis of dedicated SIMD processors

Auguin, M.; Boéri, F.; Carrière, C.; Ménez, G.

doi:10.1007/BF02407083

Abstract

This paper introduces CAPSYS, a method for synthesizing dedicated architectures. Its aim is to produce optimized systems derived from the algorithmic expression of a numerical application. The approach targets dedicated systems for applications that require intensive numerical computation. Hardware resources are used efficiently through vector processing with a SIMD implementation. The synthesis algorithm simultaneously designs the SIMD structure and generates the microcode needed to software-pipeline the operations of the source program. CAPSYS relies on a generic model that comprises both the mechanisms required to manage control flow in a SIMD machine and the description of a parallel data memory. All synthesized architectures derive from this generic model. The capabilities of CAPSYS are illustrated through the design of an image convolution processor and a two-dimensional median filtering processor. The latter example also shows a notable feature of CAPSYS: dedicated hardware components can be instantiated in the program of the target application. Realizing the conditional constructs inside loops in hardware yields an efficient vectorization of the algorithm and an efficient dedicated architecture.
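
The abstract does not say which hardware component replaces the conditionals in the median filter, so the following is only an illustrative sketch under stated assumptions. It assumes a 3x3 window and models the dedicated component as a compare-exchange unit (a min/max pair), which removes every data-dependent branch from the loop body so that all lanes can execute the same operation sequence in lockstep. All names (`vmin`, `vmax`, `cmpx`, `sort3`, `med3`, `median3x3_row`) are hypothetical and do not come from the paper; plain Python lists stand in for SIMD lanes.

```python
"""Branch-free 3x3 median, written as lockstep operations over lanes.

Illustrative sketch only: the compare-exchange unit below is an assumed
stand-in for a dedicated hardware component; it is not taken from CAPSYS.
"""


def vmin(a, b):
    """Elementwise minimum of two equal-length lane vectors."""
    return [min(x, y) for x, y in zip(a, b)]


def vmax(a, b):
    """Elementwise maximum of two equal-length lane vectors."""
    return [max(x, y) for x, y in zip(a, b)]


def cmpx(a, b):
    """Compare-exchange: returns (smaller, larger) per lane, with no branch."""
    return vmin(a, b), vmax(a, b)


def sort3(a, b, c):
    """Sort three lane vectors per lane using three compare-exchanges."""
    a, b = cmpx(a, b)
    b, c = cmpx(b, c)  # c now holds the per-lane maximum
    a, b = cmpx(a, b)
    return a, b, c


def med3(a, b, c):
    """Per-lane median of three values, expressed with min/max only."""
    lo, hi = cmpx(a, b)
    return vmax(lo, vmin(hi, c))


def median3x3_row(image, r):
    """3x3 median for the interior pixels of row r (requires 1 <= r <= height - 2).

    Each lane handles one output pixel. The window is split into its three
    columns; each column is sorted, then the median of the nine values is
    med3(max of column minima, median of column medians, min of column maxima).
    """
    w = len(image[r])
    lows, mids, highs = [], [], []
    for dx in (-1, 0, 1):
        # Shifted slices give every lane its neighbour at horizontal offset dx.
        top, cur, bot = (image[r + dy][1 + dx : w - 1 + dx] for dy in (-1, 0, 1))
        lo, mid, hi = sort3(top, cur, bot)
        lows.append(lo)
        mids.append(mid)
        highs.append(hi)
    max_low = vmax(vmax(lows[0], lows[1]), lows[2])
    med_mid = med3(mids[0], mids[1], mids[2])
    min_high = vmin(vmin(highs[0], highs[1]), highs[2])
    return med3(max_low, med_mid, min_high)
```

Calling `median3x3_row(image, r)` on a list-of-lists image returns one median per interior column of row `r`. The point of the sketch is that the operation sequence is identical for every pixel, which is the property a SIMD machine needs; a version that sorts the window with `if` statements would take a data-dependent path in each lane.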

Author information

Authors and Affiliations

Laboratoire Informatique Signaux Systèmes (I3S), CNRS, Université de Nice Sophia-Antipolis, 41 Bd Napoléon III, 06041, Nice cedex, France
M. Auguin, F. Boéri, C. Carrière & G. Ménez

About this article

Cite this article

Auguin, M., Boéri, F., Carrière, C. et al. Synthesis of dedicated SIMD processors. Journal of VLSI Signal Processing 9, 167–179 (1995). https://doi.org/10.1007/BF02407083

Received: 14 November 1993
Revised: 21 May 1994
Published: 01 April 1995
Issue Date: April 1995
DOI: https://doi.org/10.1007/BF02407083
