Sampleability assessments were applied to a decreasing number of ROIs as the mission progressed, as some were eliminated while others proved worthy of further study. Later assessments had access to more and higher-resolution data than earlier assessments. Here we describe the progression of the assessments, which techniques and algorithms were used, and the data sources that fed them (Fig. 14).
Global Sampleability – Selecting Regions of Interest
The global sampleability assessment was focused on finding ROIs and then making relative comparisons between them, with the aim of identifying those most likely to have abundant sampleable material that later imaging would reveal. Globally, particles were mapped completely for diameters 8 m and larger (a “completeness limit”; DellaGiustina et al. 2019), which eliminated only a small fraction of the surface area. The mapped locations of these particles similarly indicated only a few regions with a high density of very large boulders (Walsh et al. 2019).
The key factors available for quantitative analysis were tilts derived from the global DTM and the locations of the mapped particles, all of which were large enough to be hazards and to frustrate sampling efficiency. Therefore, other methods were used to identify 50 ROIs:
- Detailed visual inspection,
- Crowdsourced inspection of the global basemap by the OSIRIS-REx science team using the CosmoQuest tool,
- Machine learning (Cambioni et al. 2019), and
- Algorithmic extraction of regions with particularly low tilts that were not covered by large mapped boulders.
Visual inspection of images and mosaics by individual team members identified numerous apparently smooth and low-tilt ROIs that were often small craters with diameters \(\sim10\text{--}30~\text{m}\); these sites carry the label DL or BB. The entire OSIRIS-REx science team was “crowd-sourced” to inspect a global mosaic with a 21 cm ground sample distance (Bennett et al. 2021), facilitated by an internal deployment of the citizen science platform CosmoQuest for uniform image display and mapping. CosmoQuest is an online citizen science platform in which small sections of images are displayed and mapped with simple annotation tools, such as lines, dots, or circles (Gay and Lehan 2020). To survey for ROIs, the global mosaic was split into 3,385 individual small-format images with 20% overlap that were then displayed in uniform fashion via the CosmoQuest platform. The outputs were circles drawn over regions that, by eye, appeared smoother than their surroundings. The OSIRIS-REx science team found a large number of possible ROIs using CosmoQuest; the largest and most commonly mapped regions were extracted from this analysis and carry the label CQ.
The locations of the CQ ROIs showed a slight bias for the northern hemisphere. Although geologic differences do exist between the two hemispheres (Daly et al. 2020), there was some concern that mapping fatigue was responsible because the images were displayed for all users in the same order, starting in the northern hemisphere. To combat this possible effect, a customized machine learning algorithm was used (Cambioni et al. 2019) that was originally developed for automatic classification and mapping of geologic features (Wagstaff et al. 2013). It was trained on 36 images of Bennu terrain previously mapped as smooth, rough, or unknown (Cambioni et al. 2019). This effort identified three new ROIs (indicated by the label ML) and found, generally, a more even distribution of smooth regions between the hemispheres than did the human mappers, suggesting that fatigue did play a role in science team mapping quality.
Finally, combinations of tilt metrics (tilt variation and mean relative tilt) identified regions that were very flat over long and short baselines. Such ROIs carry the labels EX and TM.
From this collection of ROIs, 50 were selected for further study by the Site Selection Board, a collection of representatives from the science, operations, and leadership elements of the team (Lauretta et al. 2021). The 50 were intended to represent a wide range of terrain types (e.g., small craters, flat depressions within larger craters, or surfaces of large flat boulders), detection methods, and latitudes and longitudes. Following the selection of the 50 initial ROIs, visual inspection quickly eliminated many because more than 50% of their surface area was covered by particles larger than 0.5 cm, narrowing the list to 16 sites. These top 16 sites were spread globally and had a wide range of surface areas, from \(11~\text{m}^{2}\) to more than \(400~\text{m}^{2}\) (Table 2 and Fig. 15).
Table 2 The size and coordinates of the top 16 ROIs

Global Sampleability – Assessing the Top 16 ROIs
The top 16 ROIs that emerged from the global ROI search were then subject to more rigorous quantitative analysis. As shown above, these ROIs spanned an order of magnitude in surface area, which was problematic for a few reasons. First, making a relative comparison between BB21 with over \(200~\text{m}^{2}\) and CQ09 with only \(20.4~\text{m}^{2}\) is not reasonable without including knowledge of spacecraft deliverability capabilities. There are numerous \(20.4~\text{m}^{2}\) regions inside BB21, some of which may have better quantitative unresolved material scores than CQ09. Second, at that time in the mission, the as-built spacecraft deliverability capabilities were still being tested, and it was not yet known what the final deliverability ellipse sizes would be or how they might vary with latitude. If the deliverability ellipse was only tens of centimeters, then it was possible that the best few square meters in a small ROI were superior to any few square meters anywhere else on the asteroid.
In fact, the stark roughness of Bennu and lack of clear deposits of fine-grained regolith prompted a change in mission strategy that dramatically altered the expectations for deliverability uncertainties. Instead of using the planned LIDAR-based navigation strategy to deliver the spacecraft to the surface of Bennu for sampling, the mission used an autonomous optical navigation system, called Natural Feature Tracking (NFT), which improved deliverability accuracies from around 25 m down to 5–8 m (Olds et al. 2022; Lauretta et al. 2021), depending on the specific location on the asteroid (NFT is similar to Terrain Relative Navigation; see Farley et al. 2020). Therefore, as the sampleability analysis progressed, ROIs were re-mapped and then analyzed with a limiting area of \(r = 5~\text{m}\) (\(\sim78.5~\text{m}^{2}\)). More than one \(r = 5~\text{m}\) area was analyzed for two of the largest ROIs (DL15 and DL06). This process led to the elimination of some sites (BB22, CQ03, CQ09, CQ15, DL11, EX01, EX15, TM24), as expanding their surface area to this minimum size led to the inclusion of hazards or an overwhelming amount of resolved unsampleable material.
Particle mapping was performed for each remaining ROI using ArcMap tools to quantify the fractions of resolved versus unresolved material (Burke et al. 2021). The particles in each ROI were mapped by multiple individuals, and the inputs from different mappers were clustered and combined into a single list of particle locations and lengths for each ROI (Burke et al. 2021).
Calculations of unresolved versus resolved fractional area for each ROI were performed using ArcMap. A minimum particle size of 30 cm was used for the calculation to balance the wide range of completeness limits for each site among ROIs imaged under different conditions. Unresolved versus resolved fractional areas were calculated for entire ROIs and for \(r=5~\text{m}\) regions within two of the largest ROIs (DL15 and DL06).
A tilt score was generated for each of the remaining ROIs using 15-cm DTMs. The mean of relative tilts for all facets within a radius of 3 m of the center of each site was collected, and each facet’s value was converted into an efficiency between 0 and 1 based on the tilt function. The average of those efficiency values was then recorded as the tilt score for the site. The effect of this process was to put tilt values into the correct scale of their potential impact on the outcome (a 0–1 efficiency factor), whereby sites with average values of \(1^{\circ}\) or \(10^{\circ}\) would both be found to have efficiencies of 1.0, but the fall-off with slopes beyond \(14^{\circ}\) is severe.
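The tilt-score computation described above can be sketched as follows. The exact shape of the mission's tilt-to-efficiency function is not reproduced here, so the linear fall-off beyond the \(14^{\circ}\) knee and the \(35^{\circ}\) cutoff used below are placeholder assumptions:

```python
def tilt_efficiency(tilt_deg, knee_deg=14.0, cutoff_deg=35.0):
    """Map a facet's relative tilt (degrees) to a 0-1 collection efficiency.

    The knee at ~14 deg follows the text; the linear fall-off and the
    35-deg cutoff are placeholder assumptions standing in for the
    mission's empirically derived tilt function.
    """
    if tilt_deg <= knee_deg:
        return 1.0
    if tilt_deg >= cutoff_deg:
        return 0.0
    return (cutoff_deg - tilt_deg) / (cutoff_deg - knee_deg)

def tilt_score(facet_tilts_deg):
    """Average per-facet efficiency over the facets within r = 3 m of a site center."""
    effs = [tilt_efficiency(t) for t in facet_tilts_deg]
    return sum(effs) / len(effs)

# Facets at 1 deg and 10 deg both map to efficiency 1.0, while tilts
# beyond the knee are penalized.
print(tilt_score([1.0, 10.0]))  # 1.0
```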
The analyses found that numerous ROIs had more than 60% unresolved area with a 30-cm particle completeness limit, and that three had \(>75\%\) unresolved material (DL15, DL09, and DL06; Table 3). The tilt scores varied between 0.55 and 0.83 and did not correlate with the unresolved fractional area scores. These scores were not meant to be combined quantitatively, but rather ingested and analyzed separately.
Table 3 The tabulated values for unresolved area, a tilt score, and the names used for the four sites that were selected as part of the Final Four sites. The list of sites does not include some that were removed when the minimum surface area was set to \(78~\text{m}^{2}\) (BB22, CQ03, CQ09, CQ15, DL11, EX01, EX15, TM24)

The downselection from the top 16 ROIs to the final four candidate sites for further reconnaissance reflects the importance of the unresolved material calculation, but also that it was not the sole consideration. The selected sites included two in the top three of unresolved fractional area scores, DL15 and DL06 (formally re-named Nightingale and Osprey, respectively); one with a moderate unresolved area score but the best tilt score, EX07 (re-named Sandpiper); and one that was not among the top scorers in either respect but had unresolved material clearly clustered in the center of a small crater, CQ13 (re-named Kingfisher). This range of candidate sites was chosen because of a desire to have a variety of terrains for the higher-resolution imaging campaigns to come, which would reveal details that had been extrapolated earlier in the selection process.
Site-Specific Sampleability – Downselection from Final Four to Primary and Backup Sample Sites
The first local reconnaissance campaign, Recon A (Table 1), acquired images optimal for particle mapping at each of the final four candidate sample sites. The Recon A imaging campaign, with pixel scales of \(\sim1~\text{cm}\), dramatically improved knowledge of the particle sizes and locations at each site. However, OLA data collected during the Orbital B mission phase, which preceded Recon A, led to an increase in the fidelity of the asteroid DTM (Daly et al. 2020) and facilitated the identification of locations within each of the final four sites that would serve as the nominal targeted spot for each. This selection of targeted spots was primarily driven by optimizing deliverability and safety considerations relative to the terrain, and included some qualitative assessments based on the location of mapped particles and hazards.
Having nominal locations to specifically target at each site changed the calculations and assessments in two ways. First, by targeting a single spot, or facet, the properties around that facet could be weighted with respect to their distance to better represent the distribution of material around the site (as discussed in Sect. 4.2.4).
The average pixel scale for Recon A imaging was 1 cm, with average phase angles at each site ranging from \(30.99^{\circ}\) to \(43.04^{\circ}\). Particles were counted in a region around the targeted spot within a radius equal to 3 times the semi-major axis of the deliverability uncertainty ellipse, with an additional 2 m added for flexibility in the analysis of the spatial distribution of material. This resulted in circular regions with radii between 7.35 m and 13.1 m, which spanned three to six individual images for the four sites. Between 5,111 and 17,867 particles were mapped at each of the four locations (Burke et al. 2021).
The tilts were derived from local DTMs of the candidate sample sites constructed with OLA data (Daly et al. 2020) from the v13 global DTM with 5-cm facet sizes. These used the safety tilt (Sect. 3.2) and were relative to the approach vector at each targeted location at each sample site. The deliverability ellipses used for weighting included the semi-major axes, semi-minor axes, and ellipse orientation. The semi-major axes varied from 1.838 to 2.592 m (Berry et al. 2020).
This calculation was first performed with a limit on the smallest particle size of 16 cm (Table 4), as determined by analysis of the differential particle distributions that was valid at all four sites (Burke et al. 2021) to allow a fair comparison between them. (This calculation was also performed with no minimum particle size; the results were later used to help refine the search within each site for the optimal facet, but not for comparison between sites.)
Table 4 The final calculated unresolved material scores for each of the final four candidate sample sites. The unresolved (16 cm) value is the percentage of facets within the analyzed region at each site that were unresolved for a minimum particle size of 16 cm. When the proximity mask was applied, all of the values improved, as many facets were near other unresolved facets. These two values were then weighted by the tilt score and the deliverability ellipse

The particle mask with a minimum particle size of 16 cm for each site was then processed again to determine, for each facet, whether there was an unresolved facet within a radius of 10.5 cm (the radius of the opening of the TAGSAM head). This was dubbed the “proximity mask” because even if the targeted facet itself was covered by a resolved particle, unresolved material would still be accessible to the TAGSAM orifice. It also illustrated the distribution of material, specifically unresolved material, throughout a sample site. For example, whether unresolved material was spatially clustered or distributed could increase or decrease the proximity mask results depending on where within the site the targeted facet was located.
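A minimal sketch of the proximity-mask logic, assuming facet centers projected onto a local 2-D tangent plane (an assumption made here for brevity; the mission worked directly on DTM facets):

```python
import math

TAGSAM_RADIUS = 0.105  # m, radius of the TAGSAM head opening

def proximity_mask(centers, unresolved):
    """For each facet, True if an unresolved facet (possibly itself)
    lies within one TAGSAM opening radius of its center.

    centers: list of (x, y) facet centers in meters on a local plane.
    unresolved: parallel list of booleans from the particle mask.
    """
    unresolved_pts = [c for c, u in zip(centers, unresolved) if u]
    mask = []
    for cx, cy in centers:
        near = any(math.hypot(cx - ux, cy - uy) <= TAGSAM_RADIUS
                   for ux, uy in unresolved_pts)
        mask.append(near)
    return mask

# A resolved facet 8 cm from an unresolved one still passes the mask.
centers = [(0.0, 0.0), (0.08, 0.0), (0.5, 0.0)]
unresolved = [True, False, False]
print(proximity_mask(centers, unresolved))  # [True, True, False]
```

For large facet counts, a fixed-radius neighbor query (e.g., a k-d tree) would replace the brute-force loop, but the scoring logic is the same.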
The next processing step weighted the facets in two ways. First, a tilt weighting was implemented considering the facet’s tilt efficiency using the facet tilt function (Sect. 3.2). This weighting scheme was simple: an unresolved facet was weighted from 1 to 0 by its tilt efficiency. A facet with an expected collection efficiency of 0 due to a high facet tilt should not be counted in a sum of unresolved material – that is, it might be unresolved and covered with sampleable material, but its high tilt makes it unsampleable. Candidate sites studied at this point in the mission had low facet tilts, and thus this weighting primarily provided redundancy on particle masking, as it mostly altered the weighting for facets on the edges of irregularly shaped rocks that had not been perfectly masked (this effect is visible around the edges of rocks in Fig. 16).
The second weighting was by distance from the targeted spot of the sample site using the deliverability ellipse. This weighting takes into account the location within the sample site of unresolved and low-tilt facets relative to a single targeted spot. This was done for unresolved material and tilt-weighted unresolved material masking.
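The distance weighting might be sketched as an elliptical Gaussian centered on the targeted spot. Treating the deliverability semi-axes as 1-sigma values and working on a flat 2-D plane are both assumptions of this sketch:

```python
import math

def ellipse_weight(dx, dy, a, b, theta):
    """Gaussian weight for a facet offset (dx, dy) in meters from the
    targeted spot, given deliverability ellipse semi-axes a, b (treated
    here as 1-sigma values, an assumption) and orientation theta (rad)."""
    # Rotate the offset into the ellipse's principal frame.
    xp = dx * math.cos(theta) + dy * math.sin(theta)
    yp = -dx * math.sin(theta) + dy * math.cos(theta)
    return math.exp(-0.5 * ((xp / a) ** 2 + (yp / b) ** 2))

def deliverability_weighted(scores, offsets, a, b, theta=0.0):
    """Weighted mean of per-facet scores (e.g., the unresolved or
    tilt-weighted unresolved masks) around a single targeted spot."""
    ws = [ellipse_weight(dx, dy, a, b, theta) for dx, dy in offsets]
    return sum(w * s for w, s in zip(ws, scores)) / sum(ws)
```

With this weighting, an unresolved facet at the targeted spot dominates the score over one several meters away, capturing the clustering effect discussed in the text.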
The sites Nightingale and Osprey had the highest fraction of unresolved facets (Table 4). The proximity mask increased the unresolved fractions dramatically at all sites, pushed Osprey ahead of Nightingale, and brought their unresolved fractions closer to each other. Deliverability and tilt weighting improved scores for sites Nightingale and Sandpiper owing to very low tilts and centrally clustered unresolved material, and decreased the score for Osprey owing to high tilts due to terrain and poor clustering of unresolved material (Table 4).
The downselection to the primary and backup sites also took into account the safety and deliverability assessments, which became closely related owing to the switch in navigation strategy from LIDAR to NFT and the development of a Hazard Map (Olds et al. 2022; Enos et al. 2020; Lauretta et al. 2021). The Hazard Map utilized DTMs to identify specific features or regions of a potential sample site that could be hazardous to the spacecraft. The integration of the Hazard Map with NFT allowed for a wave-off and early backaway burn if the software predicted contact with a previously identified and mapped hazard (Lauretta et al. 2021). However, an early wave-off would alter the sample site owing to the close proximity of the backaway thrusters, and thus would incur significant costs in time and resources to re-plan for attempted sampling at an alternative site. Therefore, the final calculations in site selection primarily balanced the chances of making safe contact against the chances of touching a sampleable spot in the sample site. Although Nightingale had a slightly lower chance of safe contact than Osprey, its higher sampleability suggested that any contact was more likely to be successful. For this reason, Nightingale was selected as the prime sample site and Osprey as the backup sample site.
Site-Specific Sampleability – Primary and Backup Sites
The Recon C campaign (Table 1), which imaged Nightingale and Osprey at pixel scales of \(<0.5~\text{cm}\), further increased knowledge of the particles at each site. This spatial scale allowed us to estimate a collected volume of sampleable material (Sect. 3), as some particles \(\leq 2~\text{cm}\) were resolved. The time-consuming nature of the particle counting process at such a high resolution meant that only the most central regions of each site could be fully analyzed (Burke et al. 2021). The analyzed region for each site was designed to cover 80% of the 2-sigma deliverability ellipse (radius of 4.23 m at Nightingale and 3.02 m at Osprey). As described in Burke et al. (2021), this final stage of particle counting produced a list of particles that included their lengths, centers, and end points, referenced to the standardized OLA-generated DTM for each sample site (v18 for Nightingale and v20 for Osprey).
At both sites, the minimum and maximum particle sizes for each facet in the DTM were identified as the sizes of the smallest and largest particles whose centers were within a TAGSAM opening radius (10.5 cm) of the facet center. This analysis was intended to record the particles accessible to the TAGSAM head at each facet.
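The per-facet accessibility query can be sketched as below, again assuming a simplified 2-D particle list of (x, y, length) tuples standing in for the DTM-referenced catalog:

```python
import math

TAGSAM_RADIUS = 0.105  # m, radius of the TAGSAM head opening

def accessible_size_range(facet_center, particles):
    """Return (min, max) length of particles whose centers fall within
    one TAGSAM opening radius of the facet center, or (None, None) if
    the facet is unresolved (no mapped particle nearby).

    particles: list of (x, y, length) in meters -- a simplified stand-in
    for the mapped particle list referenced to the site DTM.
    """
    fx, fy = facet_center
    lengths = [L for x, y, L in particles
               if math.hypot(x - fx, y - fy) <= TAGSAM_RADIUS]
    if not lengths:
        return (None, None)
    return (min(lengths), max(lengths))

# A 2 cm and a 30 cm particle lie within reach of the origin facet;
# the 50 cm particle a meter away does not.
parts = [(0.05, 0.0, 0.02), (0.0, 0.09, 0.3), (1.0, 0.0, 0.5)]
print(accessible_size_range((0.0, 0.0), parts))  # (0.02, 0.3)
```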
The sampleability algorithm requires a PSFD power-law slope for each facet to connect with the regolith simulants used in laboratory testing (Bierhaus et al. 2018). A meaningful fit to a particle population requires many more particles (\(\gtrsim 100\)) above the completeness limit (\(\sim4~\text{cm}\)) than were typically found on a single facet or within a 10.5 cm radius of a facet. Therefore, all particles within a radius of 1.5 m were used to determine a PSFD power-law slope for each facet; this distance was selected after testing analysis outcomes at each site with a number of possibilities and reflects a balance between the size of the search region and the facets with the lowest total number of particles available for fitting. Although this averaged over a much larger area than the size of the TAGSAM head, it uncovered trends across sites that made meaningful differences in the calculations (Fig. 17). At Nightingale, the power-law slope solutions ranged from −2.8 to −1.2 across the sample site (mean error of 0.007). At Osprey, the solutions ranged from −2.5 to −1.4 (mean error of 0.009).
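One way to fit a per-facet PSFD slope from the particles within the 1.5 m search radius is the standard maximum-likelihood estimator for a continuous power law. This is a stand-in for illustration, not necessarily the fitting procedure the team used:

```python
import math

def psfd_slope(lengths, xmin=0.04):
    """Estimate the differential PSFD power-law slope from particle
    lengths (m) above the completeness limit xmin (~4 cm), using the
    maximum-likelihood estimator for a continuous power law
    (a stand-in for the mission's fitting procedure).

    Returns the slope of dN/dL ~ L**slope (negative).
    """
    xs = [L for L in lengths if L >= xmin]
    if len(xs) < 100:
        # Mirrors the text: a meaningful fit needs >~100 particles
        # above the completeness limit.
        raise ValueError("too few particles for a meaningful fit")
    alpha = 1.0 + len(xs) / sum(math.log(L / xmin) for L in xs)
    return -alpha
```

In practice, this would be evaluated once per facet using all particles within 1.5 m of the facet center, yielding slope maps like those described for Nightingale (−2.8 to −1.2) and Osprey (−2.5 to −1.4).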
The particle counts were also directly used in calculating the expected decrease in sampling efficiency due to rock tilt. As described in Sect. 3.4, the measured location and size of each particle are used to estimate the decrease in sampling efficiency owing to the possible tilting of the TAGSAM head or obstruction of its opening. The average value was 0.2954 for Nightingale and 0.2218 for Osprey (Fig. 18).
With the finer resolution of the Recon C dataset, the particle counts included many with lengths shorter than 2 cm (i.e., ingestible by TAGSAM and sampleable), so that the facets could be classified in three ways: covered by particles larger than 2 cm, hosting at least one particle \(<2~\text{cm}\), or unresolved (no particles visible for mapping). The difference between the primary and backup sample sites became pronounced in this calculation, as 47% of the facets at Nightingale had a mapped particle smaller than 2 cm or remained unresolved, compared with only 25% of the facets at Osprey. A total of 9,833 particles smaller than 2 cm were counted at Nightingale and 9,037 at Osprey (Fig. 19).
Unresolved facets were not initially accounted for in the sampleability algorithm and offered no quantitative way to alter the predicted collection amount (Sect. 3). The nominal sampleability algorithm ingests particle properties, including minimum particle size. Some facets, as described above, have no particle mapped on them, possibly indicating the presence of particles below the pixel scale of the image. Mapping experience demonstrated that one- and two-pixel-sized particles could be mapped (although mapping was not complete to these sizes), and previous global and reconnaissance analyses demonstrated that unresolved facets, when imaged at higher resolution, typically revealed particles at smaller sizes (see Burke et al. 2021 for examples). Therefore, an extra dimension was added to the subsequent analyses, whereby minimum particle size was also calculated with unresolved facets considered to have particles with length equal to the pixel scale. A minimum particle size of 0.38 cm was used for both Nightingale and Osprey because it was the larger pixel scale of the two sites. At Nightingale, 95.9% of all facets were within 10.5 cm of an unresolved facet, and thus a large swath of the sample site was considered to have this minimum particle size.
With particle properties for each facet (minimum, maximum, and PSFD), facet tilts from the DTM, and rock tilts, the site-specific sampleability algorithm to predict collection volume was deployed at each facet of the Nightingale and Osprey sites. The primary calculation became a \(2\times2\) grid, with calculations being made for low-mobility versus high-mobility scoring (Sect. 3.1: Fig. 4), and then for resolved particles only versus all particles (where unresolved facets are considered to have a minimum particle size equal to image pixel scale). For each site, the following statistics were tracked (all assuming Bennu’s bulk density of \(1190~\text{kg}\,\text{m}^{-3}\)):
- Mean and median predicted collection amounts for all facets at the site,
- Fraction of facets predicting collection amounts above the mission requirement of 60 g, and
- The deliverability-weighted value for the central targeted facet.
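Under the stated bulk-density assumption, the statistics tracked for one cell of the \(2\times2\) scoring grid can be sketched as follows; the input arrays are hypothetical per-facet outputs of the sampleability algorithm:

```python
def site_statistics(predicted_grams, weights, requirement_g=60.0):
    """Summary statistics for one cell of the 2x2 scoring grid
    (mobility regime x resolved-only vs. all particles).

    predicted_grams: per-facet predicted collection mass (g).
    weights: per-facet deliverability weights around the targeted spot.
    """
    n = len(predicted_grams)
    ordered = sorted(predicted_grams)
    mean = sum(predicted_grams) / n
    median = (ordered[n // 2] if n % 2 else
              0.5 * (ordered[n // 2 - 1] + ordered[n // 2]))
    # Fraction of facets meeting the 60 g mission requirement.
    frac_ok = sum(g >= requirement_g for g in predicted_grams) / n
    # Deliverability-weighted value for the targeted spot.
    weighted = (sum(w * g for w, g in zip(weights, predicted_grams))
                / sum(weights))
    return {"mean": mean, "median": median,
            "frac_above_req": frac_ok,
            "deliverability_weighted": weighted}
```

Running this once per cell of the grid reproduces the structure of Tables 5 and 6, with per-site values such as the 81.7 g and 181.5 g deliverability-weighted predictions reported for Nightingale.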
Tables 5 and 6 show that when using low-mobility scoring and only resolved particles, more collected sample mass is predicted for Osprey in all scoring metrics (the upper left of the scoring grid). This is largely due to the heavy dependence on the minimum particle size in the sampleability algorithm, and Osprey had similar numbers of mapped particles 2 cm and smaller over a smaller area than Nightingale.
Table 5 Matrix of sampleability predictions for Nightingale. The mean and median of all facets within the sample site are provided with an assumption of a material bulk density of \(1190~\text{kg}\,\text{m}^{-3}\). The percentage of facets at the site that predicted 60 grams of collected sample is tabulated. Finally, the deliverability-weighted value for the central spot in the sample site is provided

Table 6 As for Table 5, but for Osprey

However, Nightingale had more unresolved material, and that material was more centrally located. The latter property is evident in the deliverability-weighted scores for Nightingale always being higher than the mean facet value for the entire site, indicating that the unresolved material was centrally clustered. Meanwhile, at Osprey, the opposite is true for the cases utilizing unresolved material. Therefore, in both scoring regimes that considered unresolved material, Nightingale had a substantial advantage. The key metrics were the deliverability-weighted averages over all of the facets around the targeted spots, which predicted 81.7 g and 181.5 g for low-mobility and high-mobility scoring, respectively (Table 5). Both of these values are above the mission requirement of at least 60 g. Only 64.1% of all facets for the low-mobility scoring predicted at least 60 g of collected sample, whereas 82.3% did for high-mobility scoring. Thus, for the most optimistic combination of unresolved particle consideration and mobility scoring, Nightingale satisfied the mission requirement as a sample site. Osprey was close, with 78.7% of all facets achieving 60 g or more of predicted collection in the same scoring regime and a deliverability-weighted value of 121.7 g (Table 6).