Sorting networks on FPGAs

Mueller, Rene; Teubner, Jens; Alonso, Gustavo

doi:10.1007/s00778-011-0232-z


Regular Paper
Published: 01 June 2011

Volume 21, pages 1–23, (2012)

The VLDB Journal

Rene Mueller¹,
Jens Teubner² &
Gustavo Alonso²


Abstract

Computer architectures are quickly changing toward heterogeneous many-core systems. Such a trend opens up interesting opportunities but also raises immense challenges since the efficient use of heterogeneous many-core systems is not a trivial problem. Software-configurable microprocessors and FPGAs add further diversity but also increase complexity. In this paper, we explore the use of sorting networks on field-programmable gate arrays (FPGAs). FPGAs are very versatile in terms of how they can be used and can also be added as additional processing units in standard CPU sockets. Our results indicate that efficient usage of FPGAs involves non-trivial aspects such as having the right computation model (a sorting network in this case); a careful implementation that balances all the design constraints in an FPGA; and the proper integration strategy to link the FPGA to the rest of the system. Once these issues are properly addressed, our experiments show that FPGAs exhibit performance figures competitive with those of modern general-purpose CPUs while offering significant advantages in terms of power consumption and parallel stream evaluation.
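
To make the notion of a sorting network concrete, here is a minimal software sketch; it is not the paper's FPGA implementation. A sorting network is a fixed sequence of compare-and-swap steps whose order does not depend on the input data. The five-comparator layout for four inputs is a standard textbook network; the names `NETWORK_4` and `apply_network` are illustrative assumptions.

```python
from itertools import permutations

# Fixed comparator sequence for 4 inputs (a standard 5-comparator network).
# Each pair (i, j) with i < j means: put the smaller value at i, the larger at j.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def apply_network(values, network=NETWORK_4):
    """Run the values through the comparators in order and return a new list."""
    out = list(values)
    for i, j in network:
        # Compare-and-swap: the only data-dependent action in the network.
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


if __name__ == "__main__":
    # Exhaustively check every ordering of four distinct keys.
    ok = all(apply_network(p) == sorted(p) for p in permutations(range(4)))
    print("network sorts all 4-element permutations:", ok)
```

Because the comparator sequence is fixed in advance, comparators that touch disjoint positions, such as (0, 1) and (2, 3), can be evaluated in parallel, which is why the model maps naturally onto hardware.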



Author information

Authors and Affiliations

IBM Almaden Research Center, San Jose, CA, USA
Rene Mueller
Systems Group, Department of Computer Science, ETH, Zurich, Switzerland
Jens Teubner & Gustavo Alonso


Corresponding author

Correspondence to Rene Mueller.

Additional information

The work reported in this article was done while Rene Mueller was at ETH Zurich.


About this article

Cite this article

Mueller, R., Teubner, J. & Alonso, G. Sorting networks on FPGAs. The VLDB Journal 21, 1–23 (2012). https://doi.org/10.1007/s00778-011-0232-z


Received: 31 March 2010
Accepted: 10 August 2010
Published: 01 June 2011
Issue Date: February 2012
DOI: https://doi.org/10.1007/s00778-011-0232-z
