Correlating Radio Astronomy Signals with Many-Core Hardware

van Nieuwpoort, Rob V.; Romein, John W.

doi:10.1007/s10766-010-0144-3

Open access
Published: 26 June 2010

Volume 39, pages 88–114, (2011)

International Journal of Parallel Programming

Rob V. van Nieuwpoort¹ &
John W. Romein¹

Abstract

A recent development in radio astronomy is to replace traditional dishes with many small antennas. The signals are combined to form one large, virtual telescope. The enormous data streams are cross-correlated to filter out noise. This is especially challenging, since the computational demands grow quadratically with the number of data streams. Moreover, the correlator is not only computationally intensive, but also very I/O intensive. The LOFAR telescope, for instance, will produce over 100 terabytes per day. The future SKA telescope will require on the order of exaflops of processing power and petabits/s of I/O. A recent trend is to correlate in software instead of dedicated hardware, to increase flexibility and to reduce development effort. We evaluate the correlator algorithm on multi-core CPUs and many-core architectures, such as NVIDIA and ATI GPUs, and the Cell/B.E. The correlator is a streaming, real-time application, and is much more I/O intensive than applications that are typically implemented on many-core hardware today. We compare with the LOFAR production correlator on an IBM Blue Gene/P supercomputer. We investigate performance, power efficiency, and programmability. We identify several important architectural problems that cause architectures to perform suboptimally. Our findings are applicable to data-intensive applications in general. The processing power and memory bandwidth of current GPUs are highly imbalanced for correlation purposes. While the production correlator on the Blue Gene/P achieves a superb 96% of the theoretical peak performance, the figure is only 16% on ATI GPUs and 32% on NVIDIA GPUs. The Cell/B.E. processor, in contrast, achieves an excellent 92%. We found that the Cell/B.E. and NVIDIA GPUs are the most energy-efficient solutions: they run the correlator at least 4 times more energy efficiently than the Blue Gene/P. The research presented is an important pathfinder for next-generation telescopes.
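To make the quadratic growth mentioned in the abstract concrete, the following is a minimal sketch of a cross-correlation over pairs of data streams. It is not the authors' implementation: the function names `baseline_count` and `correlate`, the inclusion of autocorrelations, and the toy sample values are all assumptions chosen for illustration, and a real correlator would also handle frequency channels and polarizations.

```python
from itertools import combinations_with_replacement


def baseline_count(num_streams: int) -> int:
    """Number of stream pairs, autocorrelations included: n(n+1)/2 (assumed convention)."""
    return num_streams * (num_streams + 1) // 2


def correlate(streams):
    """For every pair i <= j, accumulate sum over time of x_i[t] * conj(x_j[t])."""
    visibilities = {}
    for i, j in combinations_with_replacement(range(len(streams)), 2):
        visibilities[(i, j)] = sum(
            a * b.conjugate() for a, b in zip(streams[i], streams[j])
        )
    return visibilities


# Hypothetical toy input: three streams of two complex samples each.
samples = [
    [1 + 1j, 2 - 1j],
    [0 + 1j, 1 + 0j],
    [1 + 0j, 1 + 1j],
]
vis = correlate(samples)

# Doubling the number of streams roughly quadruples the number of pairs.
pairs_64 = baseline_count(64)
pairs_128 = baseline_count(128)
```

Under these assumptions, 64 streams yield 2080 pairs and 128 streams yield 8256, which illustrates why the work per time sample grows roughly with the square of the number of streams.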

Acknowledgements

This work was performed in the context of the NWO STARE AstroStream project. We gratefully acknowledge NVIDIA, and in particular Dr. David Luebke, for freely providing some of the GPU cards used in this work. Finally, we thank Chris Broekema, Jan David Mol, and Alexander van Amesfoort for their comments on an earlier version of this paper.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Author information

Authors and Affiliations

ASTRON, Netherlands Institute for Radio Astronomy, Oude Hoogeveensedijk 4, 7991 PD, Dwingeloo, The Netherlands
Rob V. van Nieuwpoort & John W. Romein

Corresponding author

Correspondence to Rob V. van Nieuwpoort.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

van Nieuwpoort, R.V., Romein, J.W. Correlating Radio Astronomy Signals with Many-Core Hardware. Int J Parallel Prog 39, 88–114 (2011). https://doi.org/10.1007/s10766-010-0144-3

Received: 25 September 2009
Accepted: 11 June 2010
Published: 26 June 2010
Issue Date: February 2011
DOI: https://doi.org/10.1007/s10766-010-0144-3
