Speed-of-sound imaging using diverging waves

Purpose. Due to its safe, low-cost, portable, and real-time nature, ultrasound is a prominent imaging method in computer-assisted interventions. However, typical B-mode ultrasound images have limited contrast and tissue differentiation capability for several clinical applications. Methods. Recent introduction of imaging speed-of-sound (SoS) in soft tissues using conventional ultrasound systems and transducers has great potential in clinical translation providing additional imaging contrast, e.g., in intervention planning, navigation, and guidance applications. However, current pulse-echo SoS imaging methods relying on plane wave (PW) sequences are highly prone to aberration effects, therefore suboptimal in image quality. In this paper we propose using diverging waves (DW) for SoS imaging and study this comparatively to PW. Results. We demonstrate wavefront aberration and its effects on the key step of displacement tracking in the SoS reconstruction pipeline, comparatively between PW and DW on a synthetic example. We then present the parameterization sensitivity of both approaches on a set of simulated phantoms. Analyzing SoS imaging performance comparatively indicates that using DW instead of PW, the reconstruction accuracy improves by over 20% in root-mean-square-error (RMSE) and by 42% in contrast-to-noise ratio (CNR). We then demonstrate SoS reconstructions with actual US acquisitions of a breast phantom. With our proposed DW, CNR for a high contrast tumor-representative inclusion is improved by 42%, while for a low contrast cyst-representative inclusion a 2.8-fold improvement is achieved. Conclusion. SoS imaging, so far only studied using a plane wave transmission scheme, can be made more reliable and accurate using DW. The high imaging contrast of DW-based SoS imaging will thus facilitate the clinical translation of the method and utilization in computer-assisted interventions such as ultrasound-guided biopsies, where B-Mode contrast is often to low to detect potential lesions.


Introduction
Ultrasound (US) imaging is indispensable in computerassisted interventions from surgical navigation to operative guidance, thanks to its being a low cost, non-ionizing, portable, and real-time imaging modality. Typically, US is known as a B-Mode imaging method that maps echo amplitudes indicating local tissue reflectivity. Nevertheless, B-mode images do not necessarily provide sufficient contrast for certain anatomical structures and pathological conditions. B Richard Rau richard.rau@vision.ee.ethz.ch Orcun Goksel orcun.goksel@vision.ee.ethz.ch 1 Computer-assisted Applications in Medicine group, ETH Zurich, Zurich, Switzerland Elastography, for example, creates images of local tissue elasticity in terms of shear modulus which may indicate pathological state [35]. Speed-of-sound (SoS) and acoustic attenuation are other tissue parameters that are known to have valuable differentiation capability [2]. To characterize and map these acoustic properties, transmission-based computed tomography (CT) uses specialized US imaging setups for arrival time and power-loss computation with water-bath suspension of the breast anatomy [21,22]. Such transmissionmode US imaging systems do not rely on echoes, i.e. US reflections at tissue interfaces but rather record a transmitted signal directly with another transducer at an opposite location, e.g. in a ring structure. It was shown that with such setups quantitative assessment of SoS bears tremendous potential for breast cancer detection [19][20][21]. In comparison to shear-wave elastography, SoS was found to lead to a better ex vivo tissue differentiation [8] with high specificity for benign and malignant tumors [10,11,18]. However, in transmission-mode US imaging, a double-sided access and hence a water-bath suspension of the anatomical structure is required, e.g., with two opposing [20], ring shaped [5] or full 3D [7] transducer geometries. These require costly and non-portable systems and an additional technician to operate, with a limited application on only submersible body parts, e.g. the breast and the extremities.
Novel US contrast modalities as above and their tissue differentiation capability are highly relevant also for imageguided planning and navigation. For instance, for US-guided needle biopsies as for the breast, prostate, and liver lesions, the visibility of potentially suspected regions in the US images could allow to target those specific locations; in a real-time fashion potentially enabling sensitivity for lesions otherwise invisible in B-mode. Nevertheless, dedicated and bulky transmission imaging setups mentioned above preclude interventional applications of these novel contrast modalities, due to limitations to submersible anatomy. Even for the breast, carrying out interventions, such as biopsies, in a water bath and within the limited space of a bulky setup would be infeasible in the current clinical realm. Reliable imaging of such modalities with conventional clinical handheld US transducers is an essential step in enabling their interventional applications.
For hand-held imaging, the use of a passive reflector as an acoustic mirror and timing reference was proposed in [28,32] for SoS reconstruction from time-of-flight measurements to the reflector placed at a known distance from the transducer. This was later extended to imaging acoustic attenuation [25] and its spectral mapping [26] by referencing measurements to water-bath calibration of the reflector appearance. Obviating the need for a reflector, small misalignments between images acquired at different plane-wave (PW) angles were used in [14] to reconstruct SoS distribution using a Fourier domain reconstruction approach. In [31], SoS reconstruction in the spatial domain was shown to yield improved accuracy and less artifacts. In [34] it was proposed for PW transmits to adapt receive apertures dynamically when beamforming different image locations to minimize spatial point-spread function (PSF) variation, in order to improve displacement estimation used for SoS3 reconstruction. Deeplearning-based variational neural network approaches for inverse-problem of SoS have been demonstrated to yield fast and robust reconstructions in [3,39,40].
In clinical settings, several works have studied SoS imaging using transmission-mode and water-submerged systems, e.g. for breast tissue classification [17], solid mass differentiation [13], and imaging human-knee [41]. Using conventional transducers in pulse-echo mode, SoS has been studied clinically for quantifying muscle loss [30] and breast density [29], as well as for differential diagnosis of breast can-cer [27]. SoS maps can also help to correct for beamforming delays and hence to improve any other US imaging modality. Typical beamforming assuming a constant SoS computes incorrect delays, not only reducing B-mode image resolution but potentially also affecting any following image processing such as texture analysis, tumor classification, segmentation, image translation, and displacement estimation for elastography. With the knowledge of SoS distribution, such aberrations can be corrected as demonstrated in [1,15,24].
Despite promising studies, robust pulse-echo SoS reconstructions using conventional transducers are still challenging. Compared to differential diagnosis, where the real-time aspect is less of an essence and presegmented regions may potentially be used as priors [12], e.g. for quantification of region averages, interventional imaging and image-guided applications with real-time probe manipulation depend on robust image reconstructions without priors. State-of-theart SoS techniques using PW transmit sequences are shown herein to yield subobtimal imaging due to PSF distortions and consequent displacement measurement errors. To address this, we herein propose a transmit sequence with diverging waves (DW) to minimize aberration artifacts and thus yield improved reconstructions. Although PW sequences are known to allow for high frame-rate and high quality images [23,36], DW (also termed synthetic transmit aperture imaging) benefit from lower aberration effects and have been presented over the recent decades for several other ultrasound imaging modalities, including B-Mode, Doppler, and Vector Flow imaging [16,36]. We herein study the feasibility of utilizing DW for SoS imaging, comparatively to PW, also considering the effect of PSF centering via adapted receive apertures.

Motivation
To demonstrate the effects of a wavefront choice on aberration related artifacts and to motivate the use of diverging waves in this context, we first present a simulated example below (cf. Fig. 1) comparing PW and DW. Using the MATLAB toolbox k-Wave [37], we simulate different transmit schemes and record the spatio-temporal acoustic signal at each and every point in the entire imaging field-of-view (FOV). For both transmit settings, we run two simulations: one with an SoS inclusion and one for a homogeneous case with no SoS inclusions, in order to comparatively quantify the effect of aberrations introduced by the inclusion. Consider the wavefronts arriving at a certain depth (e.g., marked with the dashed line in Fig.1a). As expected, the wavefronts passing through the inclusion would arrive earlier at such depth, compared to a no-inclusion scenario, cf. Fig. 1b/b'. Besides such earlier arrival, one can observe the strong aberration effects below the edges of the inclusion for the PW case (shown with the arrows in Fig. 1b), mainly caused by diffraction. Such aberrations could aggravate when the echos are considered, and would largely hinder any post-processing such as delay estimation for SoS reconstruction. To demonstrate this, we perform here a delay estimation only for the transmit side using normalized cross-correlation (NCC) between spatiotemporal data at all image points for with-inclusion and noinclusion cases. Fig. 1c/c' show the estimated delays based on cross-correlation lags and Fig. 1d/d' show correlation errors (i.e. 1−CorrelationCoefficient) in delay estimation, resulting from PW and DW, respectively. As seen in Fig. 1d, delay estimation accuracy is quite low with PW transmits, compared to DW. This can be more generally stated via statistics from multiple PW and DW settings, shown in Fig.1e/e' as probability density functions, where the PW case is seen to have errors further away from zero. To better illustrate this, the cumulative distribution functions are plotted in Fig.1f, which indicates the number of highly aberrated readings that forestall accurate displacement estimation. As can be seen, given any NCC tolerance/threshold, DW would yield much superior displacement estimation than PW, for aberrations typical to expect in in-vivo tissue. For instance, for an NCC tolerance of minimum 0.99, 20% of PW readings would be below this threshold, while all DW readings would be within bounds -which is a large margin considering the single small inclusion given a large homogeneous FOV.
In the context of SoS imaging, such wavefront distortions lead to incoherent signal summations in receive beamforming, thus potentially corrupting displacement estimations, which in turn degrade SoS reconstruction. By reducing wavefront aberrations, DW can yield improved SoS imaging, as shown later in our experiments.

Methods
We herein use a limited-angle computed tomography (LA-CT) reconstruction method in the spatial domain, similarly to [31] with the adaptations described below. The fundamental imaging principle and an overview of the data processing is sketched out in Fig. 2. First, raw data is acquired based on a PW or DW sequence, both of which involve multiple transmits (Tx) and after each Tx a receive (Rx) recording of RF data on all element channels. For a DW Tx, a single element emits a narrow band-limited pulse. For a PW Tx, all elements emit such a pulse, with a fixed time-delay between neighbouring elements to angulate the wave-front, where necessary. Rx recordings from a set of multiple such transmissions (Tx) is the input herein for the reconstruction of an SoS frame. Then, separately for each Tx, these Rx signals are beamformed into spatial RF frames, between which apparent local displacements are estimated to be next used to reconstruct a SoS map.
Beamforming. To beamform with the received raw channel data from the PW or DW transmit sequences (cf. Fig. 2), we herein employ a conventional delay-and-sum algorithm. Delays are computed with an assumed constant SoS of 1500 m/s for the simulations and 1470m/s for the phantom data acquisition. For both transmit schemes and all RF frames, beamforming is performed on a fixed Cartesian grid aligned with the transducer surface, for a fixed sampling space for the subsequent displacement estimation between these frames.
We present results with two different beamforming choices: with full Rx aperture and with an adapted Rx aperture. In full Rx aperture case signals from all channels are fed into beamforming, while still subjected to dynamic aperture per imaging depth (F-number = 1), which results in Rx aperture staying centered above each beamformed image point. As Tx arrival directions to a point keep changing with each Tx, this yields a PSF varying between different transmits, impeding the subsequent displacement estimation. This is remedied with an adapted Rx aperture (Fig. 3a), centered for each beamformed image point such that the PSF between Tx events to be displacement estimated is aligned as described in [34]. Depending on the RX aperture, PSFs can be aligned at different angles for the same Tx event. Each Tx event here is beamformed with N psf = 3 PSF angle alignments: ψ psf = 0 • and ±15 • , similar to the settings in [24]. For all transmits, we utilize a fixed Cartesian beamforming grid of For the apparent displacement estimation between beamformed RF frames (cf. Fig. 2), we use a normalized crosscorrelation algorithm in the axial direction, similarly to [31]. Depending on the alignment of PSF in beamforming, the estimated displacements are then corrected in the corresponding Tx-Rx direction, i.e. by multiplying them by cos(ψ psf ). For a constant computational complexity and to keep the data input into the reconstruction constant, we herein compute the relative delay data for a fixed number of M = 9 combinations, yielding an apparent displacement vector of Δτ ∈ R M N x N z . Note that for the adapted Rx aperture case, each Tx sequence is beamformed with N psf = 3 PSF alignments, such that in this case Δτ ∈ R N psf M N x N z . For this case, the imaging fieldof-view where PSF alignment can be effectively applied is smaller than the full field-of-view, due to limited aperture of the transducer. For instance, in Fig. 3a where ψ psf = 0 • is illustrated, the regions on the further right cannot be imaged, because the corresponding Rx apertures fall outside the physical aperture of the transducer.
For the PW transmissions, the relative delays from an angle separation of Δφ (see Fig. 3b) are used in the SoS reconstruction. Nevertheless, the actual displacement estimations are performed using cross-correlation between PW angles with a smaller increment Δθ , in order to prevent speckle decorrelation and artefactual readings due to phase  Fig. 3 a Tx and Rx paths for two DWs i, j (adapted RX aperture). b PW (arrows indicating PW normals) with displacement tracking between Δθ. Relative delay data obtained by accumulating increments to an angular disparity Δφ PW . c DWs are created with a single channel. d A sample row of the differential path matrix L linking SoS distribution and relative delay data at pixel (x, z) (positive/negative values for path i/ j wrapping. To obtain the relative delay data for larger disparities Δφ, the delay readings from the Δθ increments are then accumulated. In the literature on pulse-echo SoS imaging, delay accumulations were performed for increments of Δθ = 0.5 • as in [14,32] or Δθ = 2 • as in [34]. We study and compare both these settings in our experiments later below. As the choice of Δφ highly affects final SoS reconstructions, we vary this parameter to find an optimal setting, as later presented in our results section. For the DW case, the relative delays are obtained by a frame selection as illustrated in Fig. 3c. Here, the delays are directly estimated using cross-correlation-based displacement tracking between consecutive single element transmissions separated by Δchannel. As can be expected, this setting highly affects the quality of the relative delays measurements: On the one hand, for a small element separation, e.g. using consecutive channels, apparent displacements can be below the tracking noise level, as also illustrated later in our experiments. On the other hand, for large element separations, coherent speckle pattern changes drastically, precluding displacement estimation. Given this tradeoff, an optimal element separation Δchannel is expected, which is studied later below in the experiments. Speed-of-Sound Reconstruction. Derivation of SoS maps is based on relative delay data (cf. Fig. 2) and an inverse problem formulated to reconstruct the slownessσ ∈ R N x N z on a N x × N z spatial grid, which is just the inverse of the SoS: The differential path matrix L ∈ R M N x N z ×N x N z here links the relative slowness distribution σ − σ 0 to the relative delay measurements; for instance, in Fig. 3d the delay measurement at pixel (x, z) is sensitive to SoS variation along the illustrated paths between the beamformed images j and i. The σ 0 describes the initial slowness, which was used to compute the delays of the beamformed RF data. The regularization matrix D together with the weight λ controls the amount of spatial smoothness and is essential due to ill-conditioning of L. D implements LA-CT specific image filtering aimed to suppress streaking artifacts along wave propagation directions via anisotropic weighting of horizontal, vertical and diagonal gradients. For the corresponding directions either a Sobel (horizontal and vertical) or a Roberts kernel (diagonal) is used. Similarly to [31], we herein utilize a κ = 0.9 anisotropic weighting. The optimization problem is solved using a limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm [4,6,9,33].
For computational efficiency, we restrict the number of relative delay data readings Δτ in eq. (1) to 10 4 , which are randomly selected from all R M N x N z (full Rx aperture) or R N psf M N x N z (adapted Rx aperture) recordings, respectively.

Materials and experiments
Numerical simulations. To evaluate how accurate SoS heterogeneities can be imaged using the above explained SoS imaging method based on PWs or DWs, we simulated a pulseecho scenario, where a linear transducer is simulated and the echos at each element are recorded. In total 28 SoS heterogeneity cases are simulated (see first rows in Fig.5a/b), which are divided into two subsets.
The first subset (cases 1-6 and 28) consists of seven defined shapes on a homogeneous background substrate of 1500 m/s. Elliptical and circular inclusions have a SoS contrast of either −2% (i.e. 1470 m/s) or +2% (i.e. 1530 m/s). The last case has two circular inclusions. The second subset    Fig. 2), a fully-developed speckle pattern is required, realized herein by increasing a random 10% set of the medium pixels by a slight density perturbation.
A linear array transducer is modeled with N c = 128 channels and a 300 μm pitch. We used transmit pulses of f c = 5 MHz center frequency with 3 half cycles. All simulations (including in Fig. 1) were run with a spatial discretization of 75μm pixels and a temporal resolution of 6.25 ns (i.e., 160 MHz sampling frequency) to allow for an accurate sampling of the wave propagation. Herein, for each case a full-matrix capture with multi-static transmission was simulated first and then recomposed into corresponding PWs or DWs using synthetic aperture Tx/Rx beamforming. To reduce high frequency artifacts, we additionally applied a 60% band-pass filter on the recomposed RF data.
For experiments, 81 PW angles (ranging between −20 • and 20 • with a step size of 0.5 • and Tukey apodization) were simulated via synthetic-aperture. For DW, we used single element multi-static transmissions (see Fig. 3a/b). All raw data was beamformed based on a constant SoS assumption of 1500m/s. Tissue-mimicking phantom. For data acquisition of the breast phantom (CIRS Multi-Modality Breast Biopsy and Sonographic Trainer, Model 073, CIRS Inc., Norfolk, VA, USA), unbeamformed RF data using a UF-760AG ultrasound system (Fukuda Denshi, Tokyo, Japan) were recorded with a FUT-LA385-12P linear array transducer (N c = 128 channels, 300 μm pitch and 4 half cycles pulses of f c = 5 MHz center frequency. To increase the signal-to-noise ratio, data was transmitted using Walsh-Hadamard coded pulses [38] with subsequent recomposition into angled PWs or DWs, Fig. 4 RMSE and CNR vs regularization weight, Δφ PW for PW, and Δchannel for DW. a Evaluation using full Rx aperture [14,31], and b adapted Rx aperture [24,34]. Blue bars indicate optimal parameter values for each of the 6 approaches  respectively. After recomposition, a 60% band-pass filter was applied. Beamforming was performed based on a constant SoS assumption of 1470m/s, which is the approximate nominal SoS value of the phantom substrate. Evaluation metrics. For a quantitative analysis of the SoS reconstructionĉ = 1/σ in simulation, we used Root-meansquared-error (RMSE = ĉ − c 2 2 /N ) and Contrast-tonoise ratio (CNR = 2(μ inc − μ bkg ) 2 /(σ 2 inc + σ 2 bkg )), given mean μ and variance σ 2 of (·) inc and (·) bkg , respectively denoting the region of the inclusion and the background. In the simulation datasets, the CNR was only computed for cases 1-18, where the inclusion had an SoS contrast is at least 15 m/s, i.e. 1% compared to the substrate. The background values were computed based on the whole substrate region for the simulated datasets.

Results and discussion
Simulation study. First, a sensitivity analysis with respect to major parametrization choices was performed for the corresponding transmission sequences (PW and DW). We used a simulated phantom dataset of 28 ground-truth SoS distributions, representative of different characteristics in inclusion shape, size and SoS contrast as well as background SoS variations. These datasets are then evaluated in terms of RMSE as   well as in terms of CNR, indicating how well the inclusions can be separated from the background, which is of major importance in the context tumor detection/characterization. The optimal parameters (Δφ, Δchannel and regularization weight λ) are then selected as follows: For each parameter combination, average RMSE and CNR across all sample images was computed, as also plotted in Fig. 4. These values were then ranked from best to worst (i.e., ascending order for RMSE, and descending for CNR), and the optimal parameter set (cf. Table 1) was chosen as the one minimizing the average rank of the two metrics. The results are also summarized in Table 1, where it can be seen that with DW the best results are achieved with an overall RMSE = 7.7 and 8.0, respectively, for full and adapted Rx aperture cases. For the PW case, the best achievable results are at least 1.7m/s on average poorer with RMSE = 9.4 and 10, respectively. The contrast is also substantially improved with DW to CNR = 7.5 and 14.7, respectively, with over 42% improvement compared to best case PW results of CNR = 3.4 and 10.3. Using the determined optimal parameter settings, the reconstructions of the 28 test images are shown in Fig. 5. The DW-based SoS reconstructions are seen to be superior to PW-based in almost all cases. Especially with the case 28 in Fig. 5 with both higher and lower inclusions, DW is seen to perform significantly superior, regardless of the choice of aperture. The adapted RX aperture is especially beneficial with DW in layered structures such as shown in cases 1 and 2. This may be thanks to the smaller RX aperture leading to a higher coherence in delayed signal summation when beamforming, thus improving displacement tracking and hence SoS reconstruction. Some reconstruction errors are seen when the inclusion is very deep (e.g., simulations 15 and 26), when there cannot be sufficiently many lag measurements within the field-of-view below an inclusion, to help drive its reconstruction. Similarly, when an inclusion is located to the sides (e.g., simulation 27), many Tx apertures may not cover it, again reducing lag measurements for its reconstruction. Accordingly, where possible, an inclusion should be imaged in the middle of the imaging field for optimal reconstructions [27,31]. For the PW cases, the best reconstructions are achieved using an angle accumulation of Δθ = 2 • and an adapted receive aperture, similarly to [34]. It is worthwhile to note that the adaptive receive aperture setting with a similar RMSE (of 10.0 vs. 9.4 m/s) compared to the full receive aperture setting, leads to substantial improvement in CNR (of 10.3 vs. 2.7). A similar trend is observed in the DW case with adapted vs. full receive aperture settings (RMSE: 8.0 vs. 7.7 m/s; CNR: 14.7 vs. 7.5). Notwithstanding the aperture differences, the overall SoS imaging is significantly improved using DW vs. PW. Improvements in average metrics is corroborated using a paired hypothesis test as shown in Fig. 6, where the DW methods (using either a full or adaptive aperture) are compared for each sample against all other PW methods. Using a full receive aperture setting in DW results in significant RMSE improvement compared to any other PW method, irrespective of the PW receive aperture setting. Furthermore, significant CNR improvement w.r.t. any PW method is indicated using DW with adaptive aperture.
Note that for PW we focus on the center part of the image and mask out 10% on both sides of the imaging region (Fig. 5), since the apodization of the angled PW causes significant artifacts in these image regions, as was also discussed in [27]. Accordingly, RMSE and CNR were computed in these shown central regions. For a fair comparison, this same region is also used for computing the DW metrics, even though this is not a limitation in DW imaging and such masking is not required as depicted in Fig. 5.
Beamforming was conducted assuming a constant 1500 m/s, although the actual background SoS differed sometimes over 15 m/s. Despite such deviations between actual and beamforming SoS, the reconstructions are seen to still perform relatively well, as can be seen, e.g. in case 8 with an inclusion as well as in cases 20 and 25 with nearly homogeneous SoS distributions. Indeed, these examples indicate that it is possible to use our estimated SoS values in the beamforming process, as shown in [24], and potentially extend this to further refine SoS reconstructions iteratively. This is relevant to real-case scenarios where the exact SoS values are not known a priori. Phantom experiment. CIRS breast phantom has stiff inclusions representing malignant solid masses with higher speed-of-sound (cf. Fig. 7g), and hypoechoic inclusions representing cysts (cf. Fig. 7g'), which have smaller SoS contrast with its surrounding. We reconstructed SoS maps using the optimal settings found in the previous section (cf. Table 1), since we modeled this probe and acquisition scheme in our simulations. to the wide range of experimental settings studied herein (i.e. varying contrast, inclusion size, inclusion shape, background SoS for both, simulated and phantom data), the derived optimal settings are hoped to generalize to a variety of applications and tissue types. For different imaging device and probe characteristics, these may however need to be reparametrized. SoS reconstructions using the different studied methods are shown in Fig. 7a-f and a'f'. Similar to a few cases in the simulation study, different methods can result in overall different absolute SoS offset, which is believed to be caused by strong aberration effects and corresponding inaccurate shifts delay estimations. The artifactual SoS offsets were represented in the simulation study by the RMSE, which was seen to be superior in the DW cases. Hence it can be assumed that the DW methods in the phantom study also result in more accurate absolute SoS estimations. Furthermore the DW approach is seen to substantially improve the detection of both the solid mass and the cyst, whereas with PW neither the inclusion shows contrast nor the background SoS appears consistent. This is also reflected by the CNR improvement as shown in the corresponding figure corners. For the solid mass, which has a high SoS contrast, the DW approach leads to a CNR improvement of more than 42%. For the more challenging case of the cyst-representative inclusion with lower SoS contrast, the improvement is even more substantial with a 2.8-fold better CNR.
Regarding the optimal choice of Rx aperture for the breast phantom datasets, it was observed that for PW adapted Rx apertures often lead to significantly improved results (except for the 2 • PW of cyst inclusion where the CNR is generally very low due to low contrast nature of this inclusion). For DW, however, adapted Rx apertures only yields a marginal improvement in CNR for the cyst case, while being inferior for the high contrast inclusion. The reason for this might be that for the full Rx aperture case, a higher regularization weight (λ full = 9 vs. λ adap = 5) was found to be optimal in our parametrization study. This leads to higher spatial smoothness and a reduced noise in the background, thus potentially boosting the CNR of high contrast masses. A high regularization would negatively affect the detectability of low contrast inclusions, as these are more prone to be smoothed out during reconstruction, which corroborates the observation in Fig. 7c'.
We herein compared DW and PW sequences that would theoretically yield a similar framerate, i.e. with the setting M = 9 a total of 18 transmit events needed with either method for one SoS frame (neglecting any compounding preprocessing or SNR-boosting steps that may be used for either method). With DW, although each Tx yields less measurements for reconstruction, due to limited aperture, this in turn speeds up computations for beamforming, time-delay measurements, and subsequent optimization. A major drawback of DW with a single element on a physical system would be the limited Tx energy and hence a low SNR. For the experimental example, we address this herein with Walsh-Hadamard coded pulses [38], which excite the tissue with sufficient power while they can be mapped linearly to any single element combination (DW Tx event) retrospectively. Since such a coded imaging approach may reduce framerates for in-vivo applications, an alternative way of inducing DWs with high SNR and without loss of frame-rate would be to utilize virtual source transmit, where a DW is formed using multiple transducer elements.

Conclusion
We have presented herein the use of diverging waves (DW) in pulse-echo SoS image reconstruction, studying it comparatively to existing plane waves (PW) approaches. Analyzing the wavefront aberrations with PW and DW insonifications, DW was seen to cause less aberration artifacts that lead to inaccuracies in displacement estimation. This improved delay estimation applies irrespective of chosen linear-path forward model and ray discretization assumptions for SoS reconstruction. Motivated by this, we have studied a set of numerical phantoms, observing that the quantitative accuracy (RMSE) of SoS reconstructions is over 20% improved by using DW compared to PW. Even more pronounced are the improvements in inclusion contrasts, where CNR led to an improvement of over 42% with DW. These results are corroborated by an actual ultrasound acquisition of a breast phantom, where CNR improvements of more than 42% and 280% are achieved with DW for, respectively, high and low contrast inclusions.
Diverging waves in this work are generated without loss of generality using a single element transmission yielding circular wavefronts. Nevertheless, the presented method after minor adjustments of L matrix paths and beamforming delays is also applicable for multiple-element transmission using a virtual source approach and also to non-circular wavefronts, which would allow to increase the echo SNR. With our findings SoS imaging based on conventional ultrasound systems can be substantially improved, paving the way for translating SoS imaging into the clinic. Quantitative SoS imaging and its improvements as presented herein are not only valuable in diagnostic and interventional imaging, but would also help improve many other ultrasound-based modalities by correcting aberrations, such as improved beamforming for higher-quality B-mode images as presented in [24].
Funding Open Access funding provided by ETH Zurich. Funding was provided by the Swiss National Science Foundation and Innosuisse.

Declarations
Conflict of interest The authors, namely Richard Rau, Dieter Schweizer, Valery Vishnevskiy and Orcun Goksel, declare no conflict of interest.

Ethics approval and Informed consent
Being based on simulated data and phantoms, no ethical approval nor consent was required for this work.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/.