HyGrid: A CPU-GPU Hybrid Convolution-Based Gridding Algorithm in Radio Astronomy

  • Qi Luo
  • Jian Xiao (Email author)
  • Ce Yu
  • Chongke Bi
  • Yiming Ji
  • Jizhou Sun
  • Bo Zhang
  • Hao Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11334)


New-generation radio telescopes produce data at an unprecedented scale every day and urgently require fast algorithms to speed up their data-processing workflow. The most data-intensive computing phase of the entire workflow is gridding, which converts the original data from an irregular sampling space to a regular grid space. Current methods either focus mainly on interferometers or are limited in resolution by the memory wall. Here we propose a CPU-GPU hybrid algorithm that accelerates gridding. It employs multi-CPU pre-ordering and GPU-accelerated convolution-based gridding. Several optimization strategies are further proposed to reduce unnecessary memory accesses and to maximize utilization of the heterogeneous architecture. Test results demonstrate that the proposed method is especially suitable for gridding large-scale data and can improve performance by up to 71.25 times compared with the traditional multi-threaded CPU-based approach.
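The two stages named in the abstract, pre-ordering followed by convolution-based gridding, can be illustrated with a minimal single-threaded sketch. It is not the authors' implementation: the function name `grid_convolve`, the Gaussian kernel, the parameters `sigma` and `support`, and the flat unit-spaced grid are all assumptions chosen for brevity, and the loop runs on the CPU rather than in a GPU kernel.

```python
import math


def grid_convolve(xs, ys, values, nx, ny, sigma=0.5, support=3.0):
    """Resample irregular samples onto an nx-by-ny grid with unit spacing.

    Hypothetical illustration: Gaussian kernel truncated at support * sigma.
    Returns a list of rows; cells that no sample reaches hold NaN.
    """
    radius = support * sigma

    # Pre-ordering (assumed form): sort samples by the row-major index of
    # their nearest grid cell so that samples touching the same cells are
    # processed together. This only affects memory locality, not the result.
    def cell_key(k):
        i = min(max(round(xs[k]), 0), nx - 1)
        j = min(max(round(ys[k]), 0), ny - 1)
        return j * nx + i

    order = sorted(range(len(values)), key=cell_key)

    acc = [[0.0] * nx for _ in range(ny)]   # sum of weight * value
    wsum = [[0.0] * nx for _ in range(ny)]  # sum of weights

    # Convolution-based gridding: each sample contributes to every grid
    # cell inside the kernel radius, weighted by the kernel value.
    for k in order:
        x, y, v = xs[k], ys[k], values[k]
        i0 = max(math.ceil(x - radius), 0)
        i1 = min(math.floor(x + radius), nx - 1)
        j0 = max(math.ceil(y - radius), 0)
        j1 = min(math.floor(y + radius), ny - 1)
        for j in range(j0, j1 + 1):
            for i in range(i0, i1 + 1):
                d2 = (i - x) ** 2 + (j - y) ** 2
                if d2 <= radius * radius:
                    w = math.exp(-d2 / (2.0 * sigma * sigma))
                    acc[j][i] += w * v
                    wsum[j][i] += w

    # Normalize by the accumulated weights.
    return [
        [acc[j][i] / wsum[j][i] if wsum[j][i] > 0.0 else math.nan
         for i in range(nx)]
        for j in range(ny)
    ]
```

For example, `grid_convolve([1.5, 2.5], [2.0, 2.0], [0.0, 10.0], 5, 5)` places two samples at equal distance from the cell in row 2, column 2, so that cell receives their plain average. In a GPU version the inner accumulation is the part that would be parallelized, and the ordering step is what would keep neighbouring threads reading neighbouring samples.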


Gridding · Heterogeneous computing · Convolution · Data pipeline · Astroinformatics



The authors would like to thank Benjamin Winkel for providing the Cython code of the cygrid method.

This work is supported by the Joint Research Fund in Astronomy (U1731125, U1531111) under a cooperative agreement between the National Natural Science Foundation of China (NSFC) and the Chinese Academy of Sciences (CAS). This work is also supported by the Young Researcher Grant of the National Astronomical Observatories, Chinese Academy of Sciences.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Qi Luo (1)
  • Jian Xiao (2, Email author)
  • Ce Yu (1)
  • Chongke Bi (2)
  • Yiming Ji (1)
  • Jizhou Sun (1)
  • Bo Zhang (3)
  • Hao Wang (1)
  1. School of Computer Science and Technology, Tianjin University, Tianjin, China
  2. School of Computer Software, Tianjin University, Tianjin, China
  3. National Astronomical Observatories, Chinese Academy of Sciences, Beijing, China
