Summary of Key Findings
To the best of our knowledge, this Challenge presents the first report of the total (“real-world”) variability in aneurysm WSS as predicted by image-based aneurysm CFD, at least as practiced ca. 2015. It shows that there was appreciable variability in the prediction of aneurysm WSS, driven by the broad variety of strategies employed among participating teams for segmentation, boundary conditions, and CFD. Lumen geometries were highly variable in their morphology, extents and degrees of smoothing, yet while sac WSS magnitudes did vary substantially among teams (sometimes by orders of magnitude) there appeared to be more consensus regarding sac WSS patterns and relative ranking of cases after normalizing to the parent artery WSS.
Among the factors we could quantify objectively from the submitted data, input parameters like parent artery inflow rates and Reynolds numbers showed non-negligible case-average variabilities (23 and 26%, respectively), which resulted in variabilities of output hemodynamic parameters that could be higher (e.g., AWSS, 48%) or lower (e.g., AWSS*, 18%). The former is consistent with the fact that sac WSS should be proportional to flow rate, which is why normalizing to parent artery WSS, i.e., the latter AWSS*, typically reduces variability.
Since normalizing essentially renders the WSS patterns a function of the parent artery Reynolds number, it is interesting that high variability of Re resulted in lower overall variability of AWSS*. This echoes a point made at least as early as 2005,8 namely, that aneurysm flow patterns are relatively robust to variations in flow rate (i.e., Re). (However, see “Looking Beyond IQR and CoD” section below for further discussion of this point.) This is encouraging in light of the fact that even good-faith estimations of inflow rates are probably in error relative to the actual—and usually unknown—patient-specific flow rates.10 With that said, we feel obliged to remind the reader that sac WSS dynamics, and especially high-frequency WSS fluctuations, may be more susceptible to variability in Re.26
Visually, there did not seem to be much difference in the variabilities of high vs. medium vs. low experience teams, which was reflected in the lack of significant differences in medians across experience levels. With the exception of the choice of solver (Ansys) and inlet location (ICA), high-experience teams did not show any more consensus about their image-based CFD pipelines than did the less experienced teams.
Intra-team Variability
Although the present study was not designed to systematically separate the influence of segmentation variability from boundary condition or solver variability, we note that two teams (19 and 35) each submitted two CFD datasets which differed only in terms of segmentation and/or smoothing, i.e., the inflow/outflow schemes and CFD solution strategies were the same within each team. For (high-experience) Team 19, automated vs. more intensive manual segmentations were performed, also with differences in the number and lengths of outflow branches. For (low experience) Team 35, two different segmentation software tools were used.
As reported in Table 4, segmentation generally had a small influence on case-average MCA diameter, although for Team 35 differences could be as high as 11% for individual cases. Differences in case-average inflow characteristics were less than 10%; however, for individual cases, the imposed flow rate or Re could differ by as much as 38% (Team 19, Case 5). For Team 19, there was a 45% difference in case-average calculated MCA WSS between the two segmentations (driven by nearly 80% differences for Cases 2 and 5), which is comparable to the inter-team CoD = 46% reported in Table 2. For Team 35, however, segmentation had a less dramatic, albeit still non-negligible (20%), effect on MCA WSS. Nevertheless, again for individual cases, MCA WSS could differ between segmentations by up to 65% (Case 5).
Table 4 Intra-team variability for input and output parameters, based on team case-average data.
Absolute values of sac WSS differed appreciably between the two segmentations for Team 19 (42% for AWSS, 56% for MWSS, both driven largely by differences for Cases 2 and 5), but these were reduced to 4% and 12% by normalization, suggesting that much of this difference could be attributed to differences in parent artery (inflow) characteristics. For Team 35, sac WSS hardly differed between the two segmentations, except for a 60% difference in LSA, which could be attributed to its already-near-zero values. Taken together, these results indicate that even minor differences in segmentation may non-negligibly affect the commonly reported hemodynamic parameters, especially those based on absolute WSS, and thus intra-team variability may appreciably contribute to the inter-team variability.
Reported Vs. Computed Quantities
As part of the Challenge, teams were asked to report their prescribed inflow rates and sac-averaged WSS for all five cases. Since some teams imposed inflow at the ICA, we were required to calculate parent artery (MCA) flow rates from their submitted velocity field data, as described in the Methods. For teams with MCA inlets, we also calculated their MCA flow rates from their CFD velocity fields, for quality control purposes.
As Fig. 9a shows, there was generally excellent agreement between the reported and calculated MCA flow rates although, for 5 of the 16 teams that reported MCA flow rates, the calculated flow rates disagreed by more than 10%. For Team 8 this could be attributed to outflow from side branches included between the MCA inlet (where their reported flow rates were imposed) and the distal MCA (where our flow rates were calculated). Team 2 imposed plug velocity profiles on what turned out to be the coarsest tetrahedral meshes of any team, and without any boundary layer elements, so it is possible that the flow rates actually imposed may have been less than the nominal ones reported. Team 5 reported 2 mL/s for all five cases, but appears to have imposed 1 mL/s for Case 5. Regarding Teams 10 and 17, we note that they were among a handful of teams that did not submit vector velocity fields, requiring us to estimate flow rates from their provided velocity magnitudes rather than from through-plane velocities as we did for other teams; however, as noted in the Methods, this should not have introduced any significant bias.
Figure 9b shows that, for the 22 teams that reported their own AWSS values, there was generally good agreement with the AWSS that we calculated based on a consistent sac clipping plane, suggesting that the impact of sac delineation was generally negligible, at least for AWSS. Nevertheless, for a few teams (3, 24, 35a, 36) the reported AWSS averaged 1.5–3× higher than our calculated value. (Interestingly, Team 35’s other submission (35b) showed no such discrepancy.) Conversely, Team 2 reported AWSS values that averaged about 4× lower than what we calculated from their WSS data. The largest discrepancy, however, was for Team 34, which reported AWSS averaging 2.2 Pa, but for which we calculated AWSS averaging 0.012 Pa from their WSS data, a nearly 200× difference. We initially suspected that this might be a discrepancy in the units of the WSS field provided, but their MCA WSS (calculated from the same WSS surface data) averaged 3.7 Pa, well within what other teams reported.
Outlier and/or Inconsistent Data
According to published phase-contrast MRI measurements of nearly 100 adults, cycle-averaged blood flow rates in the MCA are 2.43 ± 0.52 mL/s,50 suggesting a 95th percentile range (i.e., roughly ± 2 SD) of 1.39–3.47 mL/s. Four teams (2, 14, 17, and 34) were up to 25% above this range, and one team (36) was 30% below. This may not, however, reflect a lack of experience—these teams had a mix of experience levels, from high to low—or knowledge of cerebrovascular flow rates. Three of the teams (2, 14, and 36) provided no specific rationale for their choice of flow rates; however, one team (34) did note that they chose to perform steady flow simulations corresponding to peak-systolic velocity conditions, which was not unreasonable in light of the focus of the Challenge on WSS variability in the context of predicting rupture status. On the other hand, for (high-experience) Team 17, CFD models were segmented proximal to the ICA terminus, but anterior cerebral artery (ACA) branches were not included. This team appeared to impose inflow rates consistent with those for the ICA, meaning that the one third of flow typically directed to the ACA50 was instead directed into the MCA.
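The range quoted above follows directly from the reported mean and SD. As a minimal sketch, assuming that "percent above/below this range" is measured relative to the nearer bound (our assumption; the function names are hypothetical), the arithmetic can be checked as:

```python
# Cycle-averaged MCA flow rate from the cited phase-contrast MRI data (mL/s).
MEAN_FLOW = 2.43
SD_FLOW = 0.52


def two_sd_range(mean, sd, k=2.0):
    """Return (low, high) for mean +/- k * SD."""
    return mean - k * sd, mean + k * sd


def percent_outside(q, low, high):
    """Signed percent by which q lies beyond the range, relative to the
    nearer bound (assumed definition); 0.0 if q is inside the range."""
    if q > high:
        return 100.0 * (q - high) / high
    if q < low:
        return 100.0 * (q - low) / low  # negative when below the range
    return 0.0


low, high = two_sd_range(MEAN_FLOW, SD_FLOW)
print(f"approx. 95% range: {low:.2f}-{high:.2f} mL/s")

# Team 2's case-average MCA flow rate (4.34 mL/s) as an example.
print(f"Team 2: {percent_outside(4.34, low, high):.0f}% above the range")
```

Under this definition, Team 2's 4.34 mL/s sits about 25% above the upper bound, in line with the "up to 25%" stated in the text.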
These teams with outlier flow rates also tended to be outliers for hemodynamic parameters. Looking first at MCA WSS (Fig. 4f), Team 2 had values averaging 37 Pa, which was ~ 5× the median and ~ 2× higher than any other team. While this team did have the highest case-average MCA flow rates (4.34 mL/s), their predicted Poiseuille WSS of 12.8 Pa was not nearly as much of an outlier according to Fig. 4e. Instead, the high MCA WSS appears to have been due to this team’s use of plug velocity profile with a relatively short MCA inlet length, whereas most other teams with short MCA segments imposed fully-developed velocity profiles. On the other hand, Team 34, which similarly imposed plug velocity profiles onto CFD models with relatively short MCA inlet lengths, had comparable Poiseuille WSS (10.7 Pa), but, counter-intuitively, had lower MCA WSS values of only 3.7 Pa (in fact the only team for which this happened), further hinting at a possible inconsistency in the provided WSS surface data (more about this below).
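For readers wishing to reproduce a Poiseuille WSS estimate, the standard fully-developed pipe-flow relation is τ = 32μQ/(πD³). The sketch below is illustrative only: the viscosity (0.0035 Pa·s) and the 2.3 mm diameter are assumed values chosen for the example, not parameters reported by any team, and a case-average Poiseuille WSS would in practice be averaged over per-case values rather than computed from case-average inputs.

```python
import math


def poiseuille_wss(flow_ml_s, diameter_mm, viscosity_pa_s=0.0035):
    """Wall shear stress (Pa) for fully-developed (Poiseuille) flow in a
    straight tube: tau = 32 * mu * Q / (pi * D**3).

    The default viscosity is an assumed, typical value for blood.
    """
    q = flow_ml_s * 1e-6      # mL/s -> m^3/s
    d = diameter_mm * 1e-3    # mm   -> m
    return 32.0 * viscosity_pa_s * q / (math.pi * d ** 3)


# Hypothetical example: 4.34 mL/s through an assumed 2.3 mm MCA segment.
tau = poiseuille_wss(4.34, 2.3)
print(f"Poiseuille WSS: {tau:.1f} Pa")

# Ratio of a CFD-computed parent artery WSS to the Poiseuille estimate;
# values well above 1 are consistent with undeveloped (plug-like) inflow.
print(f"Computed/Poiseuille ratio for Team 2: {37.0 / 12.8:.1f}")
```

Because τ scales with 1/D³, even a modest difference in segmented diameter changes the Poiseuille estimate appreciably, which is one reason the diameter here should be read as a placeholder.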
Turning attention to Fig. 7, the highest AWSS was consistently provided by (medium experience) Team 2; however, their AWSS* values were comparable to those of other teams, which, as noted in the previous section, could be explained by Team 2’s high MCA WSS. At the other extreme, (low experience) Team 34 had AWSS averaging 0.012 Pa, ~ 400× lower than the median case-average AWSS. (This is not inconsistent with a recent meta-analysis, which reported ~ 100× differences in WSS levels across the aneurysm CFD literature.5) Consequently, this team’s LSA and LSA* values were also consistently outliers, close to 1.0. This would seem to suggest a possible inconsistency in the units of the provided WSS surface data, yet case-average MWSS for this team was 2.9 Pa, “only” ~ 20× lower than the median MWSS value.
This is not to say that only inexperienced teams contributed outlier results. Per Fig. 7a, one high-experience team (17) contributed some of the highest AWSS values for Cases 1 and 3, well in excess of any of the other high-experience teams, likely due to their outlier high flow rates as discussed above. At the other end of the scale, Teams 37 (high experience) and 38 (medium experience) had AWSS values at least 5× lower than the median case-average AWSS, likely due to their flow rates (1.42 and 1.62 mL/s, respectively), which were at the low end of the spectrum. As a result, these teams were consistently among the outliers for LSA and LSA*. That rank-ordering of cases by the hemodynamic parameters (i.e., Fig. 8) improved consensus suggests that, even if a team over- or underestimated flow rates or WSS, as long as it was being done consistently, the relative ordering of cases by some WSS parameter could be more robust.
Finally, we do not mean to single out some of the above teams as the only outliers. Considering the 5 aneurysm cases and 14 (inflow, outflow, and sac) parameters investigated in the present study, every team had data points outside of the 10th–90th percentile range (i.e., “outliers”) for at least one of those 70 comparisons, and all teams were outside the IQR for at least 14 of those 70 comparisons. We do note, however, that low-experience teams contributed 43% of the “outlier” data points, compared to 40 and 17% from medium- and high-experience teams, respectively. This is out of proportion to the respective 32, 47 and 21% of all data points contributed by low-, medium- and high-experience teams, and would seem to suggest that, while we found no significant difference in the data across experience levels, low-experience teams were more likely to contribute outlier data.
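One simple way to express this disproportion is the ratio of each group's share of "outlier" data points to its share of all data points, where a ratio above 1 indicates over-representation. The following sketch uses only the percentages quoted above; the ratio itself is our illustrative construct, not a statistic reported in the study.

```python
# Percent of "outlier" data points and of all data points, by experience level
# (values taken from the text above).
outlier_share = {"low": 43, "medium": 40, "high": 17}
data_share = {"low": 32, "medium": 47, "high": 21}

# Ratio > 1 means the group contributed more outliers than its share of data.
representation = {
    level: outlier_share[level] / data_share[level] for level in outlier_share
}

for level, ratio in representation.items():
    print(f"{level:>6}-experience: {ratio:.2f}")
```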
Looking Beyond IQR and CoD
In this study, we focused on IQR and CoD as standard descriptive statistics for datasets having non-parametric distributions. This, however, makes it more difficult to compare against the standard deviations (SD) and coefficients of variation (i.e., CoV = SD/mean) typically reported in the literature (albeit often without testing for normality). To give some context, CoD was 23% for case-averaged MCA flow rates, which could be considered negligible or at least tolerable in light of an early report that ± 25% variations in flow rate had only a modest impact on aneurysm flow patterns.8 This, however, ignores the fact that IQR and CoD include, by definition, only half of the 28 datasets.
Expanding to the 10th and 90th percentiles (the “whiskers” in Figs. 4 and 7) brings in 22 of the 28 datasets. The resulting inter-decile range for MCA flow rates is 2.2× greater, corresponding to a percent variability of 44%. Similarly, for case-averaged AWSS and AWSS*, the inter-decile ranges were 2.2× and 3.1× wider than their respective IQRs, corresponding to percent variabilities of 85 and 63%, vs. their respective CoDs of 48 and 18%. We therefore recommend some caution in relying solely on IQR and CoD as measures of variability, since they will tend to paint a more optimistic picture of the breadth of the variability. A good rule of thumb for our data would seem to be that 2 × IQR or 2 × CoD encompass the variability of most teams.
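To make the comparison concrete, the sketch below computes both sets of statistics for an arbitrary sample. It assumes that CoD is the quartile coefficient of dispersion, (Q3 − Q1)/(Q3 + Q1), and that the inter-decile "percent variability" is its analogue, (P90 − P10)/(P90 + P10); neither formula is spelled out in this section, so both should be read as assumptions. The function name and the sample values are hypothetical.

```python
import statistics


def dispersion_summary(values):
    """IQR- and inter-decile-based dispersion measures for a sample.

    Assumed definitions:
      CoD                = (Q3 - Q1) / (Q3 + Q1)
      decile variability = (P90 - P10) / (P90 + P10)
    """
    # 99 cut points; cuts[i - 1] is the i-th percentile (linear interpolation).
    cuts = statistics.quantiles(list(values), n=100, method="inclusive")
    p10, q1, q3, p90 = cuts[9], cuts[24], cuts[74], cuts[89]
    iqr = q3 - q1
    idr = p90 - p10
    return {
        "iqr": iqr,
        "cod": iqr / (q3 + q1),
        "idr": idr,
        "idr_over_iqr": idr / iqr,
        "decile_variability": idr / (p90 + p10),
    }


# Hypothetical right-skewed sample (e.g., flow rates in mL/s), not study data.
sample = [1.4, 1.6, 1.9, 2.0, 2.1, 2.2, 2.4, 2.6, 3.0, 3.6, 4.3]
for name, value in dispersion_summary(sample).items():
    print(f"{name}: {value:.2f}")
```

The inter-decile range always contains the IQR, so `idr_over_iqr` is at least 1; how far it exceeds 1 indicates how much of the spread the IQR and CoD leave out.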
Caveats
As noted in the Introduction, the aim of this Challenge was decidedly not to separate the impact of the various (and often interacting) input variabilities on output hemodynamic parameters. We attempted this only where we could objectively characterize input parameters like inflow rates or outflow divisions. Those findings seemed to suggest a prominent role for inflow variability on the variability of the chosen hemodynamic parameters, but we cannot say with authority to what extent segmentation or CFD solver/settings variability may have contributed. We also cannot say to what extent inlet location vs. choice of inflow power law may have impacted the variability in prescribed flow rates.43 Finally, in choosing a consistent location for the parent artery segment, from which we derived the MCA velocity, Re, and normalizing WSS, we obscured a potential contribution to the real-world variability in those input parameters, and in the normalizing of absolute hemodynamic parameters.
Because of the underlying objective of understanding CFD variability in the context of rupture status/risk assessment, we did not require pulsatile simulations, and focused only on the most-common integrated or point-wise hemodynamic parameters, for which steady flow is anyway considered a good proxy for time-averaged pulsatile flow.35 Thus, our findings cannot be extrapolated to applications where the spatiotemporal fluctuations of WSS may be of interest, e.g., oscillatory shear index (OSI),49 spectral power index,26 etc. In those cases, the impact of flow rate pulsatility (and CFD solver settings28) cannot be overlooked, especially since, as noted in the “Results”, teams that did perform pulsatile CFD employed a wide variety of flow waveform shapes.
We also remind the reader that the reported variabilities are predicated on medians derived from the submitted teams; however, it is not at all clear that the majority should rule. First, while the 26 teams span a wide range of expertises and strategies, their distribution may not be representative of the aneurysm CFD community or published studies as a whole. For example, our Challenge did not attract participants from some of the most well-published aneurysm CFD groups. Second, what constitutes “truth” in image-based aneurysm CFD remains an open question.24 Even if we were to eliminate variability in segmentations, boundary conditions and CFD solutions, medical imaging can introduce its own distortions, and patient-specific input parameters like flow rates are usually not known, and are anyway subject to their own inherent physiological variations.
Finally, although this Challenge did involve a large amount of data, it was still based on “only” five aneurysms of bifurcation type from a particular cerebrovascular territory. Some caution must therefore be exercised before extrapolating these findings too broadly.