Abstract
This paper presents a novel graphics processing unit (GPU)-accelerated method for large-scale data processing in tomographic particle image velocimetry. The multiplicative algebraic reconstruction technique (MART) is used to reconstruct three-dimensional (3D) particle fields, and cross-correlation based on the fast Fourier transform is used to generate the displacement vectors. The Compute Unified Device Architecture (CUDA) C programming model is used to port the velocity field reconstruction from CPU code to GPU code to improve efficiency. For similar reconstruction tasks, a dedicated thread grid hierarchy is designed to construct the corresponding computational kernel functions, and each task is launched in a single thread. A modified pixel batch processing strategy is then used to manage GPU memory access. Subsequently, asynchronous stream concurrency is used to generate the velocity field with the GPU cuFFT library. A synthetic 3D experiment with a ring vortex is carried out to verify the accuracy and efficiency of the developed method. The parallel results agree well with the generated data and with conclusions reported in the literature. The speed-up ratio of the multi-core CPU (Intel® Xeon® Platinum 8168) implementation parallelized with OpenMP converges to 2.5× for MFG-MART and 3.0× for cross-correlation. Compared with a 24-core CPU implementation, a GPU (NVIDIA Tesla V100S, 32 GB) at maximum memory usage achieves a speed-up ratio of over 20× for parallel MFG-MART and 4× for concurrent cross-correlation. A measurement of the turbulent flow in a circular jet at a Reynolds number of 3,000 is used to examine the efficiency gain of the parallelized framework in a real experimental setting.
For the synthetic volume reconstruction of 700 × 700 × 140 voxels with cross-correlation using 41³-voxel windows at 75% overlap, and for the experimental volume reconstruction of 550 × 1100 × 550 voxels with cross-correlation using 32³-voxel windows at 50% overlap, one frame of the velocity field can be completed within 2 min in each case.
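The FFT-based cross-correlation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper correlates 3D interrogation windows with cuFFT on the GPU, whereas this example uses a 1D signal and a small pure-Python radix-2 FFT. The function names `fft` and `displacement` are hypothetical, and the signal length is assumed to be a power of two.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1.0 if inverse else -1.0
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def displacement(a, b):
    """Integer shift s that maximizes the circular cross-correlation
    r[s] = sum_n a[n] * b[n + s], computed as IFFT(conj(FFT(a)) * FFT(b))."""
    n = len(a)
    A, B = fft(a), fft(b)
    corr = fft([Bk * Ak.conjugate() for Ak, Bk in zip(A, B)], inverse=True)
    r = [c.real / n for c in corr]          # normalize the inverse transform
    peak = max(range(n), key=lambda k: r[k])
    # Map indices in the upper half to negative shifts (circular wrap-around).
    return peak if peak <= n // 2 else peak - n

# A small intensity profile and the same profile moved two samples to the right.
a = [0, 0, 1, 3, 1, 0, 0, 0]
b = [0, 0, 0, 0, 1, 3, 1, 0]
shift = displacement(a, b)  # → 2
```

The correlation peak gives the displacement of the second exposure relative to the first. In a 3D setting the same product of spectra is taken over volumetric windows, and sub-voxel peak fitting is usually applied on top of the integer peak.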
Abbreviations
- d: Scalar velocity (mm/s)
- D: Diameter of circular nozzle (mm)
- E(X_j, Y_j, Z_j): Light intensity distribution of voxels (−)
- I(x_i, y_i): Gray level of image pixels (−)
- l: Vortex width (mm)
- N_i: Number of voxels associated with each line of sight (−)
- Q: Reconstruction quality factor (−)
- R: Distance to the voxel-center ring (voxels)
- Re: Reynolds number (−)
- u, v, w: Three Cartesian components of velocity (mm/s)
- w_{i,j}: Weighting coefficient (−)
- μ: Scalar relaxation parameter, which must be ≤ 1 for MART
- ωD/U_0: Normalized vorticity (−)
- U_0: Free-stream or bulk velocity (m/s)
- ART: Algebraic reconstruction technique
- CPU: Central Processing Unit
- CUDA: Compute Unified Device Architecture
- FFTW: The Fastest Fourier Transform in the West
- GPU: Graphics Processing Unit
- MART: Multiplicative algebraic reconstruction technique
- MKL: Intel Math Kernel Library
- Tomo-PIV: Tomographic particle image velocimetry
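Using the symbols in the list above, the standard MART update multiplies each voxel intensity E(X_j, Y_j, Z_j) on the line of sight of pixel i by the ratio of the recorded gray level I(x_i, y_i) to the current projection, raised to the power μ·w_{i,j}. The following is a minimal sequential sketch of that textbook update, not the paper's MFG-MART variant or its GPU kernels. The function name `mart`, the dictionary-based storage of the weights, and the uniform initial guess `e0` are assumptions made for illustration.

```python
def mart(pixels, weights, n_voxels, mu=1.0, n_iter=5, e0=1.0):
    """Plain MART on a flattened voxel array.

    pixels[i]  : recorded gray level I of pixel i
    weights[i] : dict {j: w_ij} for the N_i voxels on line of sight i
    mu         : relaxation parameter (<= 1)
    """
    E = [e0] * n_voxels                      # uniform initial intensity field
    for _ in range(n_iter):
        for I_i, w_i in zip(pixels, weights):
            # Current projection of the volume onto pixel i.
            proj = sum(w * E[j] for j, w in w_i.items())
            if proj <= 0.0:
                continue                     # line of sight already empty
            ratio = I_i / proj
            for j, w in w_i.items():
                # Multiplicative correction: E_j <- E_j * ratio^(mu * w_ij)
                E[j] *= ratio ** (mu * w)
    return E

# Two voxels, each seen by exactly one pixel with unit weight.
E = mart([2.0, 0.0], [{0: 1.0}, {1: 1.0}], n_voxels=2)  # → [2.0, 0.0]
```

Because the correction is multiplicative, a pixel with zero gray level drives every voxel on its line of sight to zero and those voxels stay zero, which is what makes the reconstructed particle field sparse. The per-pixel loop is the part that a parallel implementation would distribute across threads.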
Acknowledgements
The authors gratefully acknowledge the financial support from the National Natural Science Foundation of China (11725209, 12002208) and the Natural Science Foundation of Shanghai (20ZR1425700).
Cite this article
Zeng, X., He, C. & Liu, Y. GPU-accelerated MART and concurrent cross-correlation for tomographic PIV. Exp Fluids 63, 91 (2022). https://doi.org/10.1007/s00348-022-03444-3
DOI: https://doi.org/10.1007/s00348-022-03444-3