Three-Dimensional Ultrasound Localization Microscopy with Bipartite Graph-Based Microbubble Pairing and Kalman-Filtering-Based Tracking on a 256-Channel Verasonics Ultrasound System with a 32 × 32 Matrix Array

Three-dimensional (3D) ultrasound localization microscopy (ULM) using a 2-D matrix probe and microbubbles (MBs) has recently been proposed to visualize microvasculature in three spatial dimensions beyond the ultrasound diffraction limit. However, 3D ULM has several limitations, including: (1) high system complexity, (2) complex MB flow dynamics in 3D, and (3) extremely long acquisition time that had to be addressed. To reduce the system complexity while maintaining high image quality, we used a sub-aperture process to reduce received channel counts. To address the second issue, a 3D bipartite graph-based method with Kalman filtering-based tracking was used in this study for MB tracking. An MB separation approach was incorporated to separate high concentration MB data into multiple, sparser MB datasets, allowing better MB localization and tracking for a limited acquisition time. The proposed method was first validated in a flow channel phantom, showing improved spatial resolutions compared with the contrasted enhanced power Doppler image. Then the proposed method was evaluated with an in vivo chicken embryo brain dataset. Results showed that the reconstructed 3D super-resolution image achieved a spatial resolution of around 52 μm (smaller than the wavelength of around 200 μm). A lower system complexity of 3D ULM has been proposed. In addition, our proposed 3D ULM provided the capability of 3D motion compensation and MB tracking. Microvessels that cannot be resolved clearly using localization only, can be well identified with the proposed method.


Introduction
Two-dimensional (2D) ultrasound localization microscopy (ULM) [1][2][3][4][5][6] has been proposed to achieve spatial resolution at the scale of micrometers while preserving the imaging penetration of conventional ultrasound. The concept of the 2D ULM is analog to the optical super-resolution microscopy techniques such as photo-activation localization microscopy [7], where a super-resolution image is reconstructed by localizing the centroids of spatially isolated microbubbles (MBs) and accumulating the MB centroids over thousands of ultrasound frames. In ULM, individual MB can be tracked over multiple ultrasound frames to provide the measurement of blood flow speed which is Doppler angle independent. However, it is still challenging to U-Wai Lok and Chengwu Huang have equally contributed to this work. accurately track fast-moving MBs for the low frame rate imaging, thus, a Markov-Chain-Monte-Carlo-Data-Association method [4] has been proposed to refine the tracking accuracy of MB signals and remove noisy signals. This method was successfully applied to image in-human breast tumor microvasculature, providing reliable multiparametric quantification of tumors [8,9]. We have also proposed a bipartite graph-based pairing approach [10] based on solving the partial assignment problem to robustly track the movement of MBs. This method is preferable for high frame rate imaging or low MB concentration and the imaging performance can be further improved with a Kalman filter-based tracking algorithm [11]. The feasibility of these methods has been previously demonstrated in in-human liver and kidney applications [12].
Despite the advantages of 2D ULM, low MB concentration was often used to ensure MBs were spatially isolated. This will lead to a long data acquisition time (up to several to ten minutes) to accumulate enough MBs to fully populate the microvessel lumen, posing challenges for clinical applications. To mitigate this issue, we have previously proposed an MB separation method [13] which can separate the higher MB concentration data into several subsets with sparser MB concentrations. Besides, other approaches have been proposed to achieve super-resolution ultrasound microvessel imaging with shorter acquisition times such as deconvolution [14,15], inpainting [16] and sparsity-based ultrasound super-resolution hemodynamic imaging (SUSHI) [17]. Though these approaches allowed the reduction of the acquisition time (e.g. injection of high MB concentration or in-paint the ULM image); they suffer from the uncertainty of the number of iterations, inaccurate estimated point spread function (PSF) and exhaustive parameter optimization to achieve an optimal image quality. Additionally, 2D deep learning based ULM (deep-ULM) approaches [18][19][20][21][22][23][24] have been actively developed recently to improve this issue. Deep-ULMs used neural networks to extract features of the MB signal during the training process and then make predictions to identify MB signals, resulting in fast recognition of MB signals and microvascular reconstruction even at high MB concentrations.
To prompt ULM from 2 to 3D, recent studies [25][26][27] have been proposed using a mechanical micro-step motor with a 1D transducer to generate 2-D super-resolved images and then combine all the 2D images to achieve a 3-D superresolved image. However, one of the critical issues using a 1D transducer is the out-of-plane MB movement (i.e., MB moves in and out of the 2D imaging plane), which will lead to bias in MB location and velocity estimation. Recently, with the use of a 2D matrix probe, volumetric ultrasound data can be collected to allow MB tracking in 3D, and thus, the out-of-plane motion issue in 2D imaging can potentially be compensated. However, 3D ULM remains challenging due to the high system complexity (e.g. four 256-channel systems may be used for a 1024-channel probe) and the demand for sophisticated MB localization and tracking to visualize the complex 3D flow dyanmics.
There are numerous studies for the motion registration (motion correction and estimation), localization, and tracking of microbubbles for 2D ULMs; however, only a few studies have been presented for the 3D ULMs using a matrix probe. In this study, we demonstrated the required modifications based on the methods we proposed in 2D ULMs and extend them to suit 3D ULMs, especially for the 3D motion registration and Kalman-filter-based microbubble tracking. Our first step was to extend the subpixel-based image registration method [43] to suit the 3D motion estimation. In the same way as 2D motions (lateral and axial motions) that can be estimated independently using the 2D cross-power spectrum, 3D motions can also be estimated using the 3D cross-power spectrum. In this study, we demonstrated the required mathematical model and showed how to compute the 3D cross power spectrum. Additionally, we used our proposed MB separation method [13] to separate original MB data into multiple datasets, each with sparser MBs, according to the speed and direction of MB movements to reduce the acquisition time. For the localization process, a 3D point spread function of the MB signal was first estimated using MB signals acquired from a water tank [14]. The centroids of spatially isolated MBs were then localized using the estimated PSF. Individual localized MBs were then paired using a 3D bipartite graph-based pairing approach extended from our previous method [10]. To achieve better MB tracking, 3D Kalman-based tracking was adopted in this study by extending the state-space and measurement models mentioned in [11] from 2 to 3D, and the corresponding algorithm for the 3D Kalman-based tracking was presented.
The remainder of this paper is organized as follows: Section 2 provides the methods and materials of the proposed method for 3D ULM. Section 3 presents the results of the proposed method on a flow channel phantom and a chicken embryo brain. Discussion is provided and conclusions are drawn in Sections 4 and 5.

Data Acquisition and Pre-processing
In this study, an 8 MHz 2D matrix probe (Vermon S.A., Tours, France) was used to acquire ultrasound RF data. This probe consists of 1024 channels, and the pitch size of each channel was around 300 µm. In the elevational direction, channel row numbers 9, 18, and 27 were missing for the purpose of wiring. With the 2D matrix probe connected to a Universal Transducer Adapter (UTA) 1024 MUX, four elements can be connected, which permits transmission of four channels with the same delay simultaneously. A Verasonics 256-channel system (Kirkland, WA, USA) was also connected to the UTA 1024 MUX. Moreover, the UTA 1024 MUX supports switching sub-apertures (each sub-aperture consists of 256 elements) during data reception, so the 1024-channel volumetric ultrasound RF data can be collected sequentially during four transmissions and receptions as shown in Fig. 1a and b. In addition, nine-angle coherent compounding plane-wave imaging along the lateral direction (lateral and elevation angles of (− 4°, 0°) to (4°, 0°) with (1°, 0°) interval) was performed for data acquisition with a transmitted frequency of 7.8 MHz. To acquire one compounded RF volume, 36 transmitted-received cycles (4 firings at each of 9 angles) were required, leading to a post-compounded volume rate of around 350 Hz.
The real-time B-mode imaging and data acquisition sequences were used to identify the two centered slices of the required location and to capture the required 3D ultrasound RF data. The software-based beamforming process was then performed using the pixel-based beamforming approach, which is based on the multi-core CPU architecture [44]. After beamforming, IQ data reconstructed from different tilted angles were summed together to form a single post-compounded IQ data.

Flow Channel Imaging
A 380 µm inner diameter customized flow channel phantom (Ismatec SC0415, Cole-Parmer Instrument Co., IL, USA) was used in this study. One side of the flow channel was connected to a syringe pump (Model NE-1010, New Era Pump Systems Inc., Farmingdale, NY, USA). Lumason MB suspension (Bracco Diagnostics Inc., Monroe Township, NJ, USA) with mean diameter range of 1.5-2.5 µm was diluted with saline to approximately 1/1000 times the original concentration (approximately 1.5 to 5.6 × 10 8 MBs/mL). Ultrasound RF data were acquired using a Verasonics Vantage ultrasound system (Verasonics Inc.,Kirkland, WA, USA) equipped with a 1024-channel 2D matrix probe (Vermon S.A., Tours, France), as described above. The transmit voltage was set as 15 V (one-side voltage). For the flow channel study, 2000 volumes of RF channel data were collected and transferred to the host computer for post-processing.

Ultrasound Imaging of Chicken Embryo Brain
A 12th-day of embryonic development chicken embryo was used in this study. The chicken embryo brain was scanned through the intact skull bone with a field-ofview (FOV) of 12.8 mm (lateral) by 12.8 mm (axial) by 12.8 mm (elevational). To inject MB, an 18 G × 1.5-inch beveled needle tip was attached to 8 cm of Tygon R-3603 laboratory tubing. A glass capillary needle was attached to the open end of the tubing. With the aid of a dissecting microscope, the glass capillary needle tip was manually cannulated into a high-order vein on the chicken embryo surface for contrast injection. A bolus of 70 μL Lumason MB (mean diameter range of 1.5-2.5 µm) at the original concentration (1.5 to 5.6 × 10 8 MBs/mL) was injected into the chicken embryo. Noted that MB signals' s intensity varies over time, the timing to data acquisition is a crucial factor for the post-processing. Therefore, after injection, the rSVD-based filtering technique [45] was performed for a quick view of filtered MB signals for every 10 s until observing the moving MBs. Finally, 22 s of ultrafast plane wave data acquisition was collected and saved for offline post-processing. The transmit voltage was set as 15 V (one-side voltage, with a mechanic index of around 0.2). No IACUC approval was necessary to perform the chicken embryo experiments presented in this manuscript since avian embryos are not considered to be live vertebrate animals according to the NIH public health service policy.

Motion Estimation and Correction
Long acquisitions are typically required to observe small vessels in ULM. As super-resolved images are reconstructed from the superposition of many localizations collected over time, motions between volumes reduce the resolution of the output images. Therefore, a 3D motion registration method was used to estimate and correct the motions in this study. For the motion estimation, we extended the sub-pixel-based motion estimation method [43] based on the Fourier shift property. The motion in the spatial domain is transformed in the Fourier domain as linear phase differences. This can be described as follows: Let f 1 and f 2 be two functions where f 2 (x,y,z) is the spatial shift of f 1 (x,y,z) The 3D discrete Fourier transform of (1) can be expressed as.
The normalized cross power spectrum is described as After computing the inverse Fourier transform of the cross power spectrum, the 3D movement can be estimated using the correlation-based method as described in [43] for lateral, elevational, and axial directions, separately. (1)

Tissue Clutter Filtering and Microbubble Separation
The post-compounded and motion-corrected IQ volumetric data were then passed through an SVD-based clutter filter for tissue clutter rejection. A 3D spatiotemporal singular value decomposition-based clutter filter [46] was applied to the post-compounded IQ data to extract moving MB signals from background tissue signals and stationary MB signals. The cutoffs of tissue clutter subspace were automatically selected using the lower-order singular value thresholding method described in [45,47]. Then, the filtered signals were passed through the noise equalization to mitigate the effect of spatial-dependent noise profile [48]. Subsequently, an MB separation method was applied to separate original MB data into multiple datasets, each with sparser MB concentration, according to the speed and direction of MB movements [13]. MB signals within the different speeds and motion directions have different Doppler frequenciesand can thus be extracted in different subsets. In this study, two subsets of MB signals separated from the high-dense were used.

3D Localization and Pairing
The MB signals in each subset were normalized by their amplitude and then interpolated to have a voxel size of around 24.7 μm × 24.7 μm × 24.7 μm using a 3D linear interpolation (interp3.m function in MATLAB). The envelope of interpolated MB signals was then thresholded by an intensity value to suppress noisy background. A system Gaussianfitting 3D point spread function [14] was derived and used to perform a 3D normalized cross-correlation with each frame of MB signals in each subset. The 3D cross-correlation coefficient maps were then thresholded with a pre-defined value and then the regional peaks of 3D normalized cross-correlation maps were identified as MB locations. A bipartition graph-based pairing approach with a partial assignment algorithm was used for MB tracking [10]. This algorithm enforces that the distance between two MB signals in two consecutive frames should be mutually minimal. To extend the pairing algorithm from 2 to 3D, the 3D distances of MB signals between two consecutive volumes instead of 2D distances were computed.Two MBs are paired together only if the distance of the mutual minimal between two consecutive frames. A detected MB was a reliable MB signal when it was paired in 10 consecutive volumes. Otherwise, the MB signal in the current volume will be discarded (See Fig. 2).

Smoother MB Trajectories with 3D Kalman
Filtering-Based Tracking The state model of a moving MB in 3D can be expressed as follows: where x(k), y(k) and z(k) denote the MB locations (lateral, elevational and axial respectively) at time k; dx(k), dy(k) and dz(k) represent their corresponding displacement difference, respectively; w and dw are the random perturbations that influence MB locations and velocities, respectively. Thus, the MB state at time k can be predicted with its previous state at time k − 1. Since the imaging system and MB localization algorithm introduce noise to the observation, the observed location of the MB at time k [m x (k), m y (k), m z (k)] is the real MB location [x(k), y(k), z(k)] plus noise n(k), where the observation model is The state transition matrix F performs the prediction model, and H is the observation matrix, which maps the state vector space into the measurement vector space. The fourth to sixth columns of H are given the numerical value of zero since only MB location information were used. The procedure of the Kalman filtering is shown as summarized as follows: To further improve the velocity tracking performance, the acceleration constraint is used to discard MBs with unexpected motion acceleration due to false pairing or localization, and the adaptive interpolation were used to in-paint the gap between the MB trajectory as shown in Fig. 3. The acceleration of an MB can be computed as where VR, v t+1 , v t , are the volume rate, the velocity at the time point t + 1 and t. The acceleration threshold a thr was set as where v is the mean velocity for the given MB trajectory. The MB trajectory was discarded if large acceleration beyond a thr was found. After the acceleration constraint, we performed an adaptive interpolation based on the local MB movement speed. An MB with high speed produces larger gaps between adjacent locations. Therefore, a larger interpolation factor is needed to in-paint the gap in between the MB positions (Fig. 3). On the other hand, a slow-moving MB has smaller gaps in between the MB trajectory, requiring a smaller interpolation factor.

Performance Evaluation
To compare the spatial resolutions, the vessel cross-sectional profiles were interpolated with a pixel resolution 10-times higher than the original cross-section vessel profiles in ULM. The full-width-half-maximums (FWHMs) a thr = 0.5 × v, Fig. 2 Ultrasound volumetric data were collected from a chicken embryo brain using a 2D matrix probe after microbubble injection were used to evaluate the interpolated cross-section profile of a single vessel for the super-resolved images.
To compare the contrast enhancement, the contrast ratio (CR) was used to evaluate the improvement with the proposed method in 3D as.
where the B mean is the mean intensity including blood vessels' signals and surrounding signals in the defined ROI. N mean is the mean background intensity in the defined ROI.

Flow Channel Phantom
3-D power Doppler and ULM images of the flow channel phantom are shown in Fig. 4a and b, respectively. As compared with the power Doppler image (Fig. 4a), the thickness of the 3D flow channel phantom with the proposed ULM (11) CR = 10 × log 10 B mean N mean , was reduced due to the improvement of the spatial resolution. To quantitatively investigate the improvement of spatial resolution, the elevational-axial slices and corresponding cross-sectional profiles are explored as shown in Fig. 5a-d. The white dashed lines and green dashed lines shown in Fig. 5a and b indicated the locations where the elevational and axial cross-sectional profiles are extracted to evaluate the FWHMs. As shown in Fig. 5c and d, the cross-sectional profiles indicate an axial and elevational FWHMs of 660 µm and 510 µm for the power Doppler image, while the counterparts of the proposed ULM (3D localization, pairing, Kalman-based tracking and adaptive interpolation) were around 310 µm and 280 µm.

3-D super-Resolved Image of Chicken Embryo Brain
3-D super-resolution images reconstructed by localization and the proposed method (localization, pairing, Kalman filtering and adaptive interpolation) are shown in Fig. 6a and b, respectively. As can be seen, a sharper 3D image can be achieved with the proposed method, which is expected because of the rejection of noises and the false MB signals using the proposed method.
To further investigate the resolution (FWHM) and the contrast enhancement (CR) with the proposed method, the lateral-axial projection images are performed where each projection image is the maximum intensity projection along the elevational direction. The super-resolution microvessel projection images without (localization only) and with the proposed method are shown in Fig. 7a and b. To highlight the resolution of the microvasculature image, a small region of the super-resolved image without the proposed method was magnified and shown in Fig. 7c and compared with the magnified super-resolution image with the proposed method in Fig. 7d. The FWHM of a microvessel obtained with the proposed method indicated by the blue arrows in Fig. 7e was about 52 µm while it cannot be identified using localization only.
The elevational-axial projection images are performed where each projection image is the maximum intensity projection along the lateral direction. The super-resolution microvessel projection images without (localization only) and with the proposed method are shown in Fig. 8a and b. To highlight the resolution of the microvasculature image, a small region of the super-resolved image without the proposed method was magnified and shown in Fig. 8c and compared with the magnified super-resolution image with the proposed method in Fig. 8d. The FWHM of a microvessel obtained with the proposed method indicated by the blue arrows in Fig. 8e was about 74 µm. Three ROIs were used as shown in the white (Region 1), green (Region 2), and yellow (Region 3) solid and dashed boxes to evaluate the CRs as shown in Fig. 9a and b. The corresponding CRs are listed in Table 1. The CRs obtained with the proposed method showed a gain of about 6.93 dB, 1.90 dB, and 6.22 dB for the region 1, 2, 3 as compared with those without the proposed method.
In Fig. 10a, b, and c, super-resolution microvessel projection images are shown with localization only (Fig. 10a, localization and pairing (Fig. 10b, and localization, pairing, Kalman-based tracking and adaptive interpolation (Fig. 10c. The 3D motion correction was applied to all these three images. Compared to the projection image with localization only (Fig. 10a), the unreliable MB signals can be discarded with the 3D pairing method (Fig. 10b), and MB trajectory can be further improved with the 3D Kalman-filtering-based tracking (Fig. 10 c). Figure 10d shows the super-resolution microvessel projection image with localization, pairing, Kalman-based tracking, and adaptive interpolation, but without the 3D motion correction. As compared with Fig. 10c, the resolution without the motion correction is slightly degraded.

Discussion
In this study, we demonstrated the feasibility of the proposed 3D ULM on a 256-channel ultrasound system with a 32 × 32 matrix probe. With the proposed method, better 3D super-resolution images can be achieved as compared with those using localization only and power Doppler image. The improvement is the rejection of unreliable MB signals by the 3D bi-partition pairing approach, and the smoothening of the MB trajectories by the 3D Kalman filtering. The validation of the proposed method was conducted with a flow channel phantom. From the results of the flow channel phantom study, the proposed method can image the axial and elevational cross-sectional FWHM profiles to nearly 310 µm and 280 µm, which is better than that of the power Doppler image (660 µm and 510 µm). The sub-wavelength resolution can also be observed in the projection images of In this study, the voxel size of the super-resolved microvessel images was around 24.7 μm × 24.7 μm × 24.7 μm (axial/ lateral/elevational). This spatial resolution can be further improved by using a smaller voxel size down to about 1/10 of the acoustic wavelength (that is, 19 μm for a 7.8 MHz center frequency). However, longer acquisition times will be required to fully populate the microvasculature [49,50]. Additionally, it should be noted that there is a physiological  limitation to image small microvessels, it might not be possible to image the whole capillary networks at the level of micrometers in around 22 s (used in this study) even the microbubble separation method was used.
This study used rSVD-based filtering to measure the intensity of MB signals every 10 s until the MB signals are observed, and around 22 s of MB data were acquired. A future study can compute the MB time intensity in a specific region over time to evaluate the best time (or time slot) to acquire data.
There are several drawbacks in our proposed 3D ULM. First, high computational power is required to handle enormous ultrasound data to reconstruct a 3D super-resolved image. The computational complexity will be increased as the number of subsets in the MB separation method. However, the increase of computational power will mitigate this drawback and increase the interest in this technique. The second drawback is the spatial resolution along the elevational direction. For the current system, we can only transmit a steered plane wave along the lateral direction using all 1024 channels. The spatial resolution along the elevational direction is limited since only 0-degree plane waves were transmitted along the elevational direction. One possible approach to increase the spatial resolution is to increase the number of transmitted (compounding) angles along with the elevational resolution. To transmit a steered plane wave along the elevational direction for all 1024 channels, each sub-aperture (32 × 8 channels) should transmit and receive the steered plane wave signals separately, and a total of 16 Table 1 CRs for different approaches for the chicken embryo brain dataset CR (Fig. 9a) CR (Fig. 9b) a. Region (1) 9.12 dB 16.05 dB b. Region (2) 5.59 dB 7.49 dB c. Region (3) 6.82 dB 13.04 dB acquisitions are required to acquire a volume dataset for a particular transmitted angle. This implies that there will be a fourfold reduction of the post-compounded volume rate when a steered plane wave is transmitted along the elevational direction for the current probe and system design. Additionally, the lateral resolution using our 2D matrix probe is poorer than that of conventional 1D ultrasound probes due to the small aperture size (nearly 1 cm × 1 cm). The localization precision may thus degrade due to the poor spatial resolution compared to 1D probes in the lateral and axial directions. Another important issue is the signal-tonoise ratio (SNR) of the 2D matrix probe, which is lower than that of 1D probes. The low SNR may be one of the issues that we cannot visualize smaller microvessels in this study. One possible approach to enhance SNR is to increase the number of transmitted (compounding) angles at the cost of lower volume rate. In addition, the SNR will be slightly increased with longer pulse cycle with the cost of the degradation of the axial resolution. Furthermore, noise reduction approaches, such as non-local mean denoising [51] or debiased noise-suppression method [52] can be applied to suppress the noise effect at the expense of computational cost. Furthermore, the field-of-view and the transmitted plane-wave angle were limited due to the grating lobe of this matrix probe. The pitch size of each channel of the 2D matrix probe was around 300 µm (wavelength around 200 µm); thus, a grating lobe occurs around 40° as a 0° plane-wave is transmitted. Higher grating lobes will occur in the elevational direction than in the lateral direction because of the three missing channels (numbers 9, 18, and 27 along the elevational direction). Grating lobes will result in localizing false-positive MB signals, degrading the image quality of the super-resolved image. To avoid grating lobes, the FOVs were set to around 12.8 mm by 12.8 mm by 12.8 mm, and small steered transmitted angles (− 4° to 4°) were used in this study. Furthermore, the inner diameter of the flow channel phantom was around 380 µm (wavelength around 200 µm), which is the smallest flow phantom that we have for the demonstration of the resolution improvement between the power Doppler image and the proposed ULM.
To better investigate the resolution of the proposed method, a smaller inner diameter flow channel phantom could be used in future study. Additionally, the current study only demonstrated the feasibility of the 3D motion estimation based on the extension of the sub-pixel-based image registration method using the 3D power spectrum, further investigation (e.g. simulation of tissue motions in 3D) should be performed to evaluate the performance of such method in the future study. Finally, our results (e.g. vessel size and distribution) did not compared with those obtained with other imaging modalities. Contrast-enhanced microCT using Microfil contrast agent [53] would have been one method of imaging brains in 3D. The comparison between 3D ULM and Contrast-enhanced microCT would be intriguing in the future study.

Conclusion
In this study, we demonstrated the proposed method could improve the image quality of 3D ULM as compared with that of 3D ULM using localization only and the 3D contrasted enhanced power Doppler image. The proposed 3D ULM provides a feasible motion compensation to mitigate the out-of-plane motion limitations in 2D ULM related to the absence of elevational information. The performance of the proposed 3D ULM was evaluated using flow channel phantoms and in vivo chicken embryo brain datasets. Microvessels could hardly be distinguished using localization only were better identified using our 3D pairing and Kalman-filtering-based tracking algorithms. To sum up, the feasibility of 3D super-resolution imaging was demonstrated, showing potential in clinical applications.