Introduction

A single photon avalanche diode (SPAD) can detect individual photons and measure their time of arrival with high temporal resolution1,2. A SPAD array is a camera sensor where each pixel is a SPAD3,4. The SPAD array’s compatibility with the CMOS manufacturing process has enabled its multiple-megapixel resolution and low production costs, leading to its adoption in many consumer devices such as smart phones, tablets, low-light video cameras, and drones. Used as a video camera, a SPAD camera can eliminate motion blur, capture fast moving and poorly lit objects, and capture extreme dynamic ranges5,6,7. SPAD cameras are deployed in various active imaging scenarios, including LiDAR8,9,10,11, non-line-of-sight imaging12, and fluorescence lifetime microscopy13.

A scintillator is a transparent material that emits visible light fluorescence when incident ionizing radiation deposits energy, which we refer to as an interaction or a scintillation event. Interaction location, time, energy deposition, and fluorescence decay time are salient measurements for downstream tasks such as localizing the radiation source or characterizing the type and energy of radiation. For example, interaction location and energy deposition measurements are used in the Compton camera, a device that localizes a radiation source in the open environment14,15,16. Interaction timing measurements are used in detecting the coincidence of annihilation photons in positron emission tomography (PET)17,18. Particle types can be discriminated through pulse-shape discrimination by measuring the interaction’s fluorescence decay time19,20. In this paper, we propose a new method to locate interactions in a thick, monolithic scintillator. A thick scintillator is most appropriate for detecting gamma-rays due to their deep penetration properties. It provides a larger volume where interactions can occur, making interactions more likely but also more challenging to locate.

Current commercial sensors for measuring light from scintillation events tend to have either high spatial or temporal resolution, but not both. The silicon photomultiplier (SiPM) array is a commonly adopted sensor that is coupled to the scintillator’s surface and measures the unfocused light distribution originating from the interaction. The SiPM array is high in temporal resolution and can discern events from individual particles at high incidence rates. However, its limited number (~8 × 8) of large-area (~mm²) readout channels makes the sensor itself low in spatial resolution. An alternative is to use high-resolution cameras, such as the EMCCD camera, to measure events. The EMCCD camera focuses scintillation light onto a high-spatial-resolution sensor (~100 kilopixel to ~1 megapixel) to generate high-resolution images of scintillation events. However, the EMCCD camera has low temporal resolution that limits its ability to discern interactions from individual particles at high incidence rates. Its low frame rates of up to about 100 frames per second (fps) are insufficient for performing advanced techniques, and a frame may contain events from many particles. Furthermore, existing scintillator-sensor designs are typically specialized for specific tasks and are inflexible and expensive. This raises the questions: what benefits could be gained from a sensor that is high in both spatial and temporal resolution, and what might the scintillator-sensor design consist of?

A SPAD array has high spatial and temporal resolution that allows it to excel in measuring events whose photons are localized in space and time on the sensor. High array resolution, small pixel pitch (~μm), and individual pixel readout make it possible to localize a photon’s arrival on the sensor with pixel-pitch resolution without accumulating noise from other pixels, as happens in a SiPM’s binned-pixel readout scheme. If light from an event arrives on a small area of the sensor, the sensor’s high spatial resolution allows the captured signal to be spatially denoised, for example by rejecting dark counts or other events separated in space. Therefore, a lens is required to focus light to perform spatial denoising, which differentiates a SPAD camera from a SPAD array or other sensors that do not adopt a lens. Similarly, high temporal resolution provides denoising capability through the time domain. If the light from an event is incident on the SPAD array over a short period of time, the SPAD array’s high temporal resolution allows the event to be temporally separated from dark counts or other events that occur at different times. Thus, a SPAD camera provides denoising capabilities through its high spatial and temporal resolution.

We propose to use a SPAD camera to capture images of individual interactions in a thick, monolithic scintillator and localize them in 3D. We term this monolithic scintillator-SPAD camera configuration as MOSSC. Along with the hardware setup, we introduce an algorithm to estimate the 3D location of an interaction based on the amount of defocus blur in its image. An overview of the method is illustrated in Fig. 1.

Fig. 1: An overview of the image capture and processing pipeline.

A radiation source randomly emits gamma-rays over time that may interact in the scintillator. Scintillation photons are emitted from an interaction’s location. The single photon avalanche diode (SPAD) camera captures a sequence of frames, in which a frame may contain an interaction from one gamma-ray. In post-processing, the algorithm detects whether a frame contains an interaction, denoises the frame, localizes the interaction’s photons, and estimates the interaction’s 3D position in the scintillator. Frame pixels are enlarged for visualization purposes.

MOSSC has the potential to take measurements of interactions with levels of accuracy and resolution that are competitive with other designs. However, demonstrating high-resolution measurements of interactions (i.e., position, timing, and energy) is not the goal of this work. Achieving high-resolution measurements requires sufficient light collection, which depends on factors such as the SPAD array’s photon detection efficiency (PDE), the number of photons emitted during an interaction, the distance between the interaction and the lens, and the lens’ f-number. The spatial resolution for measuring interactions also depends on the chosen optical configuration.

There is limited previous work on measuring scintillation events with SPAD arrays. Unlike the SPAD camera, the configurations considered in these works are lensless and have SPAD arrays coupled to the scintillator’s surface. SPAD arrays have been modeled and simulated to study their properties in the context of PET21,22. Tétrault et al.23 develop a data acquisition system for a low-resolution SPAD array (6 channels of 22 × 22 pixels) for PET and test it experimentally on monolithic LYSO scintillators. Franks et al.24 demonstrate 2D particle tracking of electrons through scintillator fibers coupled to the SwissSPAD2, a high-resolution SPAD array (512 × 512)25. The fibers act as waveguides that direct light to the SPAD array on one end of the scintillator, where pixels are mapped to individual fibers. In this configuration, the spatial resolution for locating scintillation events is limited to the diameter of a fiber, and the event’s depth cannot be estimated.

SiPM-based designs are lensless and couple the SiPM array to the scintillator’s surface. SiPM arrays are low in spatial resolution, typically up to 16 × 16 readout channels where each channel is ~1 × 1 mm in dimension. The 3D position of an interaction in a monolithic scintillator can be estimated based on the shape of the scintillation light distribution measured on the SiPM array26. Instead, MOSSC focuses scintillation light to a sharper light distribution that is measured on a high-resolution array with small pixel pitch. Pixelated scintillators can be adopted to improve spatial resolution in the SiPM configuration27,28. They come with higher cost, lower spatial efficiency, and lower temporal resolution and require dual-ended readouts for determining the depth of an interaction. The SPAD array’s ability to read out each individual pixel is the key feature that distinguishes it from a SiPM array. Currently, SPAD array resolution is higher than the SiPM array readout-channel resolution, with SPAD arrays deployed as camera sensors. However, as the resolution of SPAD arrays and SiPM arrays continues to increase, their distinguishing features may begin to converge or overlap in the future. SPAD arrays may require binning pixels together due to limited data bandwidth, and SiPM arrays may be developed into high-resolution, small-area readout channels. Regardless of whether such a sensor is a SPAD array or SiPM array, a lens that focuses light is needed to take advantage of the sensor’s high resolution. Otherwise, since an interaction emits light isotropically, spatial intensity variations at the sensor are subtle and have only low-spatial-frequency content. A lensless, high-resolution array would just over-sample this largely spatially uniform signal.

Previous works use cameras that are too slow to capture individual interactions and do not demonstrate 3D interaction localization. One application of imaging interactions with CMOS and CCD cameras is in radiography, where ionizing radiation is attenuated by an object and detected in a scintillator. The scintillator’s glow is captured over long exposures to form an image of the object29,30,31,32. Often, the scintillator is thin (<1 mm), and its entire thickness is considered in focus31,32,33. A scintillator thickness that extends beyond the camera’s depth of field adds blur to images that can be reduced computationally in post-processing34,35,36. EMCCD cameras have been used to image scintillation events in thin scintillators and perform particle tracking33,37,38,39. EMCCD sensors have also been coupled to the scintillator in a lensless configuration40,41 and through fiber optic tapers42,43. However, they have low frame rates and cannot image events from individual particles at high incidence rates.

Flat-panel thin film transistor (TFT) arrays can be configured to measure interactions either indirectly through scintillation light or directly through charges from a semiconductor detector and are commonly deployed in radiography44. The flat-panel TFT array is a large-area (>100 × 100 mm) and high-resolution (~100 μm pixel pitch) sensor. It can measure interactions with high spatial resolution in a thin scintillator (<1 mm thickness) that is coupled to the sensor in a lensless design45,46. However, its readout rates (~60 fps) are unsuitable for applications that require imaging individual particles or high temporal resolution47,48,49.

The Timepix chip can read out the time of arrival and time over threshold of individual particles50,51,52. However, external amplification from an intensifier is required to overcome pixel noise and detect single photons of light53. Timepix cameras use “hybrid pixel detectors” in which a silicon optical sensor is bump-bonded to a Timepix chip54,55. In these cameras, a single photon is amplified to an avalanche of electrons in a micro-channel plate that is sent to a thin scintillator or phosphor screen. The hybrid pixel detector then detects enough light from the phosphor screen to achieve single photon sensitivity. In contrast, amplification from a single photon to an electron avalanche occurs internally in a SPAD camera’s pixel. Timepix cameras have high spatial (256 × 256) and temporal (~1 ns) resolution and can image individual scintillation events56,57,58,59. However, they are a highly specialized class of camera, and their temporal resolution is limited by the phosphor screen’s emission decay time. The SPAD camera is more compact and belongs to a general-purpose class of camera.

The large area picosecond photodetector (LAPPD) is a microchannel plate-based sensor consisting of microstrip anodes from which signal is read out to localize and time the arrival of single photons60. It has been used in many nuclear physics experiments since the formation of the LAPPD Collaboration in 2009. The current commercial version of the LAPPD has  ~50 ps temporal resolution and  ~1 mm spatial resolution over an area of about 200 × 200 mm61. Many designs optimizing various aspects of the LAPPD during its development have been demonstrated. Some achievements include a 20 μm spatial resolution and  <1 ns temporal resolution over a 100 × 100 mm area, and a 50 μm spatial resolution and  <200 ps temporal resolution over a 200 × 200 mm area62. For comparison, a more recent SPAD array achieves 6.39 μm pixel pitch and 100 ps temporal resolution over a 13.2 × 9.9 mm area63. This SPAD array is launched in a commercial low-light camera (Canon MS-500). The LAPPD can offer high spatial and temporal resolutions that are competitive with SPAD arrays, and our proposed method for measuring events could work on such a sensor. However, the commercial LAPPD is currently sold in low volumes, and its  ~1 mm spatial resolution is low compared to that of a SPAD array64.

In the papers above, defocus blur due to scintillator thickness and the camera’s limited depth of field is treated as an undesirable effect to be minimized or circumvented. In contrast, we use defocus blur to determine the depth of individual interactions to obtain 3D positions.

Hardware experiments demonstrate MOSSC’s sensitivity to 3D shifts in the interactions’ spatial distribution. Simulations are adjusted to experimental data and demonstrate MOSSC’s 3D event localization capability. The SPAD camera used in experiments is the SwissSPAD225. Experiments are conducted with gamma-rays, but the method is applicable to any radiation detectable by a scintillator. In the future, incorporating emerging improvements in SPAD PDE, array size, and resolution together with more efficient light-trapping geometries may enable a qualitatively different class of versatile radiation detector designs.

Methods

Optical setup

We model an interaction as a point source of light and its image as a disk, known as the circle of confusion (CoC). This assumes the interaction’s recoiled electron’s path length is small relative to the camera’s field of view (FOV), and that the lens is thin and ideal. The disk’s diameter depends on the interaction’s depth, following the depth-defocus relationship.

The optical setup is illustrated in Fig. 2. A lens with diameter A and focal length f is placed at a distance d from the scintillator’s surface. The scintillator has an index of refraction n > 1. The SPAD camera’s focal plane is set to the apparent bottom of the scintillator or below, accounting for the scintillator’s index of refraction. The distance from the lens to the focal plane is S1, and the distance from the lens to the SPAD array is S2. An interaction occurs in the scintillator at a distance S4 from the lens, with its apparent location at a distance S3 from the lens. Light is emitted from the interaction’s location and refracts at the scintillator’s surface. A small fraction of the total light emitted is focused to a spot on the SPAD array with diameter c.

Fig. 2: A diagram illustrating optical parameters in the hardware setup and circle of confusion model.

A lens with diameter A is placed at a distance d from the scintillator’s surface. S1 is the distance from the lens to the focal plane. S2 is the distance from the lens to the SPAD array. S3 is the distance from the lens to the interaction’s apparent location along the optical axis. S4 is the distance from the lens to the interaction’s true location along the optical axis. c is the diameter of the circle of confusion of light incident on the SPAD array. The red arrows denote the propagation of light that refracts out of the scintillator, passes through the lens, and arrives on the SPAD array.

Algorithm: detecting an interaction

The following algorithm for detecting which frames contain an interaction is only applicable to experimental data. In experiments, a frame is assumed to contain either no interactions or interactions from only one gamma-ray. The presence of an interaction is distinguished from noise based on elevated counts in a certain sized region of a frame. A frame consists of binary pixels that are individually activated either by a scintillation photon or a dark count. Noise arises from dark counts, which is when a pixel falsely detects a photon arrival. The number of dark counts that occur over time in a pixel is known as the pixel dark count rate (DCR). A small portion of pixels have an elevated DCR compared to the rest and are zeroed out. The pixel DCR is affected by temperature. Due to the extremely low light-level environment, keeping the noise level low and constant is critical. This requires operating at low, constant temperatures.

First, the noise level at the current experimental temperature is measured. Images are taken in the dark with no gamma-ray source present. These frames are denoted as dark frames. Then, data is captured with the gamma-ray source present.

After the capture is performed, the algorithm runs as follows. The distribution of the maximum counts in any k × k region of pixels in a frame is computed over all dark frames over a chosen set of k’s. Using this distribution, a threshold, Tk, on the number of counts in a k × k region is set to detect the presence of an interaction. Then, all frames captured in the presence of the gamma-ray source are processed. If a frame contains any k × k region with counts at or above the corresponding Tk, the frame is passed to the algorithm for locating interactions. This frame is denoted as a detection frame.
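The window test described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function names and the `thresholds` dictionary (mapping k to T_k) are assumptions, and a frame is taken to be a 2D binary array whose smaller side is at least k.

```python
import numpy as np

def max_window_count(frame: np.ndarray, k: int) -> int:
    """Largest number of counts found in any k x k window of a binary frame."""
    # Summed-area table with a zero border, so each window sum is four lookups.
    sat = np.zeros((frame.shape[0] + 1, frame.shape[1] + 1), dtype=np.int64)
    sat[1:, 1:] = frame.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    sums = sat[k:, k:] - sat[:-k, k:] - sat[k:, :-k] + sat[:-k, :-k]
    return int(sums.max())

def is_detection_frame(frame: np.ndarray, thresholds: dict) -> bool:
    """True if any k x k window reaches its threshold T_k.

    `thresholds` is a hypothetical {k: T_k} mapping built from dark frames.
    """
    return any(max_window_count(frame, k) >= t_k for k, t_k in thresholds.items())
```

The summed-area table keeps the cost per frame independent of k, which matters when several window sizes are evaluated over millions of frames.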

Algorithm: locating an interaction

After detecting the presence of an interaction in a frame, the algorithm attempts to remove dark counts. Scintillation photon clusters within the CoC are distinguished from dark counts using the following method inspired by identifying star clusters from a field of background stars65. Overall, this method identifies dark counts by thresholding their distance to the nearest photon and retaining a cluster of close photons, assuming that the scintillation photons are focused to within a circle. The frame’s minimum spanning tree (MST) is computed in which the nodes are all nonzero pixels, and an edge is the distance in units of pixels between two nodes. Edges in the MST above a chosen threshold length Tedge are removed, leaving separate sets of connected nodes. The set with the most nodes is retained as the interaction’s photon cluster, and the remaining nodes are discarded as dark counts. This denoising procedure is illustrated in Fig. 3a. If the remaining number of counts after denoising is less than the lowest Tk, this frame is discarded and the interaction location is not estimated. Otherwise, this frame is kept and denoted as a denoised frame. A spherical 2D Gaussian (μ, σ²) is fit to the counts’ coordinates with the EM algorithm. The interaction’s centroid and diameter are μ and c = sσ, respectively, where s is a scaling factor. The detected photons may not fill the entire span of the CoC, and there may be dark counts that were not removed during denoising. A Gaussian is used to prevent overfitting to these cases.
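The MST-based denoising step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `points` is assumed to be a non-empty N × 2 array of nonzero-pixel coordinates (for example from `np.argwhere(frame)`), the MST is built with a simple O(N²) Prim's algorithm, and the function names are hypothetical.

```python
import numpy as np

def mst_edges(points: np.ndarray):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j, length) edges."""
    n = len(points)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = np.linalg.norm(points - points[0], axis=1)  # distance of each node to the tree
    parent = np.zeros(n, dtype=int)                    # tree node achieving that distance
    edges = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((int(parent[j]), j, float(best[j])))
        in_tree[j] = True
        d = np.linalg.norm(points - points[j], axis=1)
        closer = (~in_tree) & (d < best)
        best[closer] = d[closer]
        parent[closer] = j
    return edges

def largest_cluster(points: np.ndarray, t_edge: float) -> np.ndarray:
    """Drop MST edges longer than t_edge and keep the largest connected set."""
    labels = np.arange(len(points))  # union-find parent array

    def find(a):
        while labels[a] != a:
            labels[a] = labels[labels[a]]  # path halving
            a = labels[a]
        return a

    for i, j, length in mst_edges(points):
        if length <= t_edge:
            labels[find(i)] = find(j)
    roots = np.array([find(a) for a in range(len(points))])
    return points[roots == np.bincount(roots).argmax()]
```

Cutting MST edges longer than the threshold yields the same groups as single-linkage clustering at that distance, so isolated dark counts end up in their own small sets.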

Fig. 3: Examples of raw frames and their denoised result taken from experimental data.

Denoising is performed with Tedge = 46 pixels, and the diameter is estimated with s = 3.7 using the algorithm described in the “Algorithm: locating an interaction” section in Methods (GMM-Loc). The red circle in a frame represents the interaction’s estimated centroid and diameter. Frame pixels are enlarged for visualization purposes. (a) Example shown with the intermediate steps of the denoising algorithm. This example also illustrates the algorithm’s bias. (b) Examples of raw frames and the result of denoising and localization.

The algorithm can be generalized to handle m interactions in a frame by retaining the m largest sets of connected nodes after removing edges from the MST. Then, a Gaussian mixture model (GMM) with m components is fit to obtain each interaction’s centroid and diameter.

After obtaining the interaction’s centroid and diameter in a frame, its 3D position in the scintillator is computed. An interaction’s spot size on the SPAD array is related to its distance from the focal plane. Assuming a thin lens, this is quantified by the CoC equation

$$c=A\,\frac{|S_{3}-S_{1}|}{S_{3}}\,\frac{f}{S_{1}-f} \tag{1}$$

Parameters are described in the “Optical setup” section in Methods and illustrated in Fig. 2. By substituting \(f=\frac{1}{1/S_{1}+1/S_{2}}\) (thin lens equation), Equation (1) can be rearranged into

$$S_{3}=\frac{1}{1/S_{1}+c/(A S_{2})} \tag{2}$$
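The rearrangement from Equation (1) to Equation (2) can be checked step by step. The sketch below assumes S3 ≤ S1, that is, the interaction's apparent location lies between the lens and the focal plane, so that |S3 − S1| = S1 − S3.

$$\frac{1}{f}=\frac{1}{S_{1}}+\frac{1}{S_{2}}\;\Rightarrow\;\frac{f}{S_{1}-f}=\frac{S_{2}}{S_{1}}$$

$$c=A\,\frac{S_{1}-S_{3}}{S_{3}}\,\frac{S_{2}}{S_{1}}=A S_{2}\left(\frac{1}{S_{3}}-\frac{1}{S_{1}}\right)\;\Rightarrow\;\frac{1}{S_{3}}=\frac{1}{S_{1}}+\frac{c}{A S_{2}}$$

As a sanity check, c = 0 gives S3 = S1, so a perfectly focused spot corresponds to an interaction whose apparent location is on the focal plane.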

The absolute value can be removed because the focal plane is at the apparent bottom of the scintillator or beyond, so there is only one direction where defocus blur can increase. S1, S2, and A are known. c is measured using the Gaussian fit as described above and converted from units of pixels to units of mm using the SPAD array’s pixel pitch. Interaction depth follows from solving for S3 and correcting for the scintillator’s index of refraction: S4 = d + (S3 − d)n. This yields the interaction’s z-coordinate, zint, in world coordinates. The z-axis is parallel to the optical axis and centered with the lens. An interaction’s centroid μ = (ximage, yimage) in pixel coordinates is measured and converted to physical world coordinates in units of mm. Perspective projection is applied in the following form:

$$x_{\mathrm{image}}=-S_{2}\frac{x_{\mathrm{int}}}{S_{3}}\qquad y_{\mathrm{image}}=-S_{2}\frac{y_{\mathrm{int}}}{S_{3}} \tag{3}$$

ximage, yimage, and S3 are measured as described above, and S2 is known. The interaction’s remaining world coordinates xint and yint are solved for.
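The thin-lens localization chain, Equations (2) and (3) plus the refraction correction, can be written as three small functions. This is an illustrative sketch with hypothetical function and argument names. All lengths are assumed to be in mm, with the blur diameter and image coordinates already converted from pixels using the pixel pitch. The final conversion from S4 to zint depends on where the world origin is placed and is not shown.

```python
def apparent_distance(c_mm: float, s1: float, s2: float, aperture: float) -> float:
    """Equation (2): apparent lens-to-interaction distance S3 from blur diameter c."""
    return 1.0 / (1.0 / s1 + c_mm / (aperture * s2))

def true_distance(s3: float, d: float, n: float) -> float:
    """Refraction correction S4 = d + (S3 - d) * n for a scintillator of index n."""
    return d + (s3 - d) * n

def world_xy(x_image: float, y_image: float, s3: float, s2: float):
    """Invert Equation (3): x_int = -x_image * S3 / S2, and likewise for y."""
    return -x_image * s3 / s2, -y_image * s3 / s2
```

For example, with the made-up values S1 = 100, S2 = 50, A = 25, and c = 3.125, Equation (2) gives S3 = 1/(0.01 + 0.0025) = 80.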

Optical calibration

In practice, due to the low light-level environment and the short distance between the interactions and the lens, the lens used in the hardware setup is thick and has a small f-number. The above method based on the thin lens approximation cannot be used as is. Rather, a diameter-to-distance relationship is physically measured for the lens and its focus in a calibration procedure as follows. A pinhole is placed where the apparent end of the scintillator would be relative to the lens, accounting for the scintillator’s index of refraction. With other lights turned off, a light is shined through the pinhole and imaged, approximating the image of a point source at that distance. The pinhole is then translated closer to the lens, and this procedure is repeated. As the pinhole is translated closer to the lens, the image’s diameter increases due to defocus blur. The images’ diameters are visually determined, and a correspondence between diameter and depth is established. Given a diameter, an interaction’s apparent depth (S3) is looked up in this correspondence, applying any required linear interpolation. Thus, S3 is obtained while bypassing Equation (2). xint, yint, and zint can then be solved for as above.
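The lookup with linear interpolation can be sketched with `numpy.interp`. The function name and the calibration arrays are assumptions for illustration. Real values would come from the pinhole measurements, and the sketch assumes the diameter changes monotonically with depth over the calibrated range.

```python
import numpy as np

def depth_from_diameter(c, cal_diameters, cal_depths) -> float:
    """Look up apparent depth for a measured diameter by linear interpolation.

    cal_diameters / cal_depths are paired calibration measurements (hypothetical
    inputs here). np.interp needs increasing x values, so the pairs are sorted
    by diameter first. Diameters outside the calibrated range are clamped to
    the nearest calibrated depth.
    """
    diam = np.asarray(cal_diameters, dtype=float)
    depth = np.asarray(cal_depths, dtype=float)
    order = np.argsort(diam)
    return float(np.interp(c, diam[order], depth[order]))
```

Clamping at the ends of the table is a property of `np.interp`. Extending the calibration range, as done for offsets below the scintillator's apparent bottom, avoids relying on it.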

Experimental hardware and algorithm configuration

The hardware components used in the experiments consist of a 50 × 50 × 70 mm CsI(Tl) scintillator (Luxium Solutions), 1 μCi Co-60 (1.17, 1.33 MeV) gamma-ray source (Spectrum Techniques), f = 25 mm, ∅25.4 mm plano-convex lens (Thorlabs LA1951-A), and the SwissSPAD225. The radioactive material is approximately 3.1 mm in diameter and lies centered in the disk containing it, 2.8 mm away from the disk’s surface. The scintillator has an index of refraction of n = 1.79, a decay constant of 1 μs, and a light yield of 54,000 photons/MeV. It was chosen for its high light yield and emission spectrum that closely matches the SwissSPAD2’s PDE. All sides of the scintillator are polished and do not have any glass windows or protective housing.

Images are captured with the SwissSPAD2 using 256 × 496 pixels and operating at a 7 V excess bias. This provides a peak photon detection probability (PDP) of 50% at 520 nm wavelength, which closely matches the CsI(Tl) scintillator’s maximum emission wavelength of 550 nm. The fill factor is 10.5%, yielding a peak PDE of 5.25%. The SwissSPAD2 is set to capture single-bit frames to detect single scintillation photons from interactions of individual gamma-rays in individual frames. The SwissSPAD2 detects the arrival of a photon on a pixel within a frame’s exposure period, but not its time of arrival. For these experiments, images are captured in global shutter mode with a frame period of 65 μs (15.4 kfps) to allow for continuous transfer of long captures over USB 3.0 to the PC. Gating is applied to shorten a frame’s effective exposure time. The gate length should be long enough to provide a buffer time for randomly incident gamma-rays to arrive and to capture all the light emitted over the duration of the scintillator’s fluorescence decay. However, it should also be short enough to minimize dark counts and prevent multiple gamma-rays from being observed in one frame. A gate length of 10 μs is chosen to image the scintillator with a 1 μs decay constant. The entire apparatus is mounted inside a custom made light-tight box to keep ambient light out. The top 5% of pixels with the highest DCR are zeroed out.

The procedure described in the “Optical calibration” section in Methods is performed prior to data collection with a 50 μm pinhole (Thorlabs P50S), diffuser (Thorlabs DG10-1500-MD), two neutral density filters (Thorlabs NE10A-B), and a light source (SugarCUBE LED Illuminator). The lens is mounted with its plano surface 58 ± 5 mm from the SPAD array and facing away from the SPAD array. The pinhole’s initial reference point is set 39 ± 5 mm away from the lens mount’s edge, corresponding to the apparent bottom of the scintillator. The depth-diameter correspondence is measured by translating the pinhole in 2.54 ± 0.1 mm increments over a range of −5.08 mm to +15.24 mm from the initial reference point along the z-axis. The positive direction is toward the lens. Interactions theoretically cannot occur at z < 0 mm, but depth-diameter correspondences are measured there in case the localization algorithm outputs a diameter smaller than what corresponds to z = 0 mm. These interactions are included in the results. The optics are set such that the image diameter corresponding to the origin is not small. This is done to avoid photon pile-up. Supplementary Fig. 1 shows a plot of the offsets from the pinhole’s initial reference point and the corresponding image diameters.

Data collection is performed in an ambient temperature range of 0 to 3 degrees Celsius. Captures of 458,752 frames are taken five times in succession, followed by a three-minute pause during which the SwissSPAD2 is turned off. This procedure is repeated until the desired number of frames is captured. Pauses are implemented to prevent the operating temperature and DCR from rising.

To maximize light collection, the SPAD camera is mounted with the plano side of the lens as close as possible to and facing toward the scintillator. An air gap of 2.7 ± 0.5 mm exists between the scintillator and the lens. Figure 4a, b show the hardware setup and world origin used for experiments. The origin in the world coordinate system is the intersection of the lens’ optical axis and the scintillator’s bottom surface. The scintillator’s bottom surface is opposite the lens, and its top surface is adjacent to the lens. The coordinates are in units of mm. The x-axis is parallel to the camera major axis (496 pixels), and the y-axis is parallel to the camera minor axis (256 pixels). The z-axis represents the depth dimension and is directed through the scintillator toward the camera.

Fig. 4: The experimental setup and the world coordinate system axes.

(a) A side view of the hardware inside the light-tight box. (b) A top-down view of the scintillator and camera.

To demonstrate MOSSC’s 3D sensitivity to shifts in the distribution of interactions, the gamma-ray source is placed at various positions with respect to the camera’s FOV, and the distributions of measured interaction locations are compared. The distribution of interaction locations is expected to shift in the same direction as the source due to source proximity and the inverse square law. The surface of the disk containing the radioactive material is placed flush against the scintillator. Thus, the radioactive material’s true position is about 2.8 mm away from the scintillator’s surface. However, the direction of the source’s shifts is the most relevant information in regard to observing shifts in the interactions’ spatial distribution. We use the center of the source-disk’s surface as the reference point in world coordinates to describe the source’s placement. We perform two separate experiments with the same optical calibration. In the “x-y sensitivity” experiment, we test sensitivity along the x-y plane by shifting the source along the x-axis at the bottom of the scintillator. In the “z sensitivity” experiment, we test sensitivity along the z-axis by shifting the source along the side of the scintillator parallel to the z-axis.

At the start of an experiment, we capture a total of 9,175,040 dark frames with no source present to establish the experiment’s noise level. Then, data is collected with the gamma-ray source present. Data processing is performed after all capturing is finished. The noise-interaction thresholds Tk are set for each k × k region size by taking the 10th highest counts in a region out of all 9,175,040 dark frames and adding 1. The 10th highest is chosen to prevent fitting to any irregular outlier frames. The k values we use are 31, 51, 71, 91, 111, 131, 151, 191, and 231. The lowest Tk is 5 counts for k = 31 in both experiments. See Supplementary Table 1 for all Tk values.
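The threshold rule above can be sketched as follows. This is an illustration only: `dark_max_counts` is a hypothetical array holding, for one window size k, the maximum k × k count of each dark frame, and the sketch assumes that tied values count as separate entries when ranking.

```python
import numpy as np

def noise_threshold(dark_max_counts, rank: int = 10) -> int:
    """T_k = (rank-th highest per-frame maximum window count in dark frames) + 1.

    Assumption: ties are ranked as separate entries, so the rank-th highest is
    simply element rank-1 of the values sorted in descending order.
    """
    values = np.sort(np.asarray(dark_max_counts))[::-1]
    return int(values[rank - 1]) + 1
```

Taking the 10th highest value rather than the maximum means that up to nine irregular dark frames can exceed the threshold without raising it.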

In the localization algorithm, the Gaussian scaling factor s is set to 3.7, based on visual inspection of circles that outline the photon clusters’ areas. The edge length threshold Tedge significantly biases the diameter of photon clusters during denoising. We run the algorithm on the collected data using different values of Tedge. We set Tedge such that the mean z-coordinate over all measured interactions is closest to 10 mm when the source is placed at z = 10 mm in the z sensitivity experiment. We report results using Tedge = 46 pixels. Figure 3b contains examples of captured frames containing an interaction and the result of denoising with these parameter values.

Results

Experiments

x-y sensitivity

The center of the source-disk’s surface is placed at (−12.7, 0, 0), (0, 0, 0), and (12.7, 0, 0) (±2) mm, with the lens in the center of the scintillator’s top surface. 27,525,120 frames are captured for each source position. The mean interaction location and number of interactions detected are (−0.57, 0.04, 9.88) mm with 430 interactions, (−0.02, −0.06, 9.14) mm with 1455 interactions, and (0.48, 0.05, 9.70) mm with 384 interactions, respectively for each source position. Interaction locations are shown in Fig. 5a–c. The distributions of counts in a frame are reported in Table 1.

Fig. 5: Plots of measured interaction locations for different gamma-ray source placements in the experiments.

The top row shows measured interaction locations of the x-y sensitivity experiment projected to the x-y plane with the gamma-ray source’s x-coordinate at (a) −12.7 mm, (b) 0 mm, and (c) 12.7 mm. The bottom row shows measured interaction locations of the z sensitivity experiment projected to the x-z plane with the gamma-ray source’s z-coordinate at (d) 0 mm, (e) 10 mm, and (f) 20 mm. The mean location and number of measured interactions in each plot are (a) (−0.57, 0.04, 9.88) mm, 430 interactions; (b) (−0.02, −0.06, 9.14) mm, 1455 interactions; (c) (0.48, 0.05, 9.70) mm, 384 interactions; (d) (0.30, −0.10, 9.64) mm, 472 interactions; (e) (0.47, −0.06, 9.76) mm, 1012 interactions; (f) (0.38, −0.08, 11.35) mm, 1346 interactions. Panels a–c test shifts in the mean x-coordinate, and panels d–f test shifts in the mean z-coordinate.

Table 1 The distributions of counts in a frame observed in both experiments

z sensitivity

The lens is placed such that its edge protrudes 3.7 ± 3 mm over the edge of the scintillator’s top surface. This corresponds to the lens’ center located 9 ± 3 mm from the edge of the scintillator’s top surface. This is done to keep the camera’s FOV close to the gamma-ray source, which is placed flush against the scintillator’s surface parallel to the z-axis. Thus, the center of the source-disk’s surface is placed at (9, 0, 0), (9, 0, 10), and (9, 0, 20) (±2) mm. 18,350,080 frames are captured for each source position. The mean interaction location and number of interactions detected are (0.30, −0.10, 9.64) mm with 472 interactions, (0.47, −0.06, 9.76) mm with 1012 interactions, and (0.38, −0.08, 11.35) mm with 1346 interactions, respectively for each source position. Interaction locations are shown in Fig. 5d–f. The distributions of counts in a frame are reported in Table 1.

A summary of the counts in detection and denoised frames over both the x-y and z sensitivity experiments is visualized in histograms in Supplementary Fig. 3.

Simulations

Monte Carlo simulations are performed using the Geant4 library to demonstrate MOSSC’s 3D localization capability66. In this section, the centroid and diameter of an interaction’s image are estimated in two ways. One approach uses the GMM with s = 3.7 as described in the “Algorithm: locating an interaction” section in Methods and used in experiments. The other approach estimates the centroid as the photons’ mean pixel coordinates and the radius as the distance from the centroid to the farthest photon. We refer to these two methods as GMM-Loc and Exact-Loc, respectively. Then, S3 is computed using Equation (2) instead of using a depth-diameter correspondence as done in the experiments. Exact-Loc can only be applied to single-interaction events and noise-free frames. GMM-Loc can be applied to single or double-interaction events and to noisy or noise-free frames. Frames with fewer than 5 counts are discarded. Simulation parameters are adjusted to approximate the experimental data’s photon counts and FOV.
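The Exact-Loc estimate described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `exact_loc` and the example frame are hypothetical, and a frame is represented simply as a list of (row, col) pixel coordinates of detected photons.

```python
import math

def exact_loc(photon_pixels):
    """Centroid = mean pixel coordinates; radius = distance to the farthest photon."""
    n = len(photon_pixels)
    if n == 0:
        raise ValueError("frame contains no photons")
    cx = sum(p[0] for p in photon_pixels) / n
    cy = sum(p[1] for p in photon_pixels) / n
    radius = max(math.hypot(p[0] - cx, p[1] - cy) for p in photon_pixels)
    return (cx, cy), 2.0 * radius  # centroid and diameter, both in pixels

# Hypothetical noise-free frame (made-up coordinates, for illustration only).
frame = [(10, 10), (10, 14), (14, 10), (14, 14), (12, 12)]
centroid, diameter = exact_loc(frame)
```

Because the radius is set by the single farthest photon, any dark count in the frame would inflate the estimate, which is consistent with Exact-Loc being restricted to noise-free frames.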

The world origin is the intersection of the camera’s optical axis and the scintillator’s surface opposite the lens, like in the experiments. The scintillator is 10 × 10 × 70 mm with its long edge parallel to the z-axis, spanning from z = 0 mm to z = 70 mm. Its material is CsI with an index of refraction set to 1.79. A thin lens with 7 mm diameter and 25 mm focal length is placed centered at (0, 0, 72.7) mm. The sensor is placed at z = 126.4 mm, corresponding to a focal plane offset 5 mm from the scintillator’s apparent bottom toward the  − z direction. The simulated FOV using this optical configuration is found to approximately match the experimental FOV and is used in all simulations. Verification that the simulated FOV and experimental FOV approximately match is described in Supplementary Notes 2.2. Supplementary Fig. 2 shows the simulation’s depth-diameter correspondence. In all simulated datasets, gamma-ray energies are 1 MeV, and the scintillator’s light yield is 300,000 photons/MeV. Then, the counts in a frame are adjusted by scaling them down by a fixed factor to approximate the counts observed in experimental data. How a frame’s counts are calibrated to approximate experimental data is described in Supplementary Notes 2.3.
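The stated focal plane offset can be checked with the thin-lens equation. The sketch below assumes a paraxial apparent depth (true depth divided by the index of refraction) and an air gap between the lens and the scintillator; all variable names are illustrative.

```python
N_CSI = 1.79          # index of refraction used in the simulation
F_MM = 25.0           # thin-lens focal length (mm)
LENS_Z = 72.7         # lens center z-coordinate (mm)
SENSOR_Z = 126.4      # sensor z-coordinate (mm)
SCINT_TOP_Z = 70.0    # scintillator surface facing the lens (mm)

image_dist = SENSOR_Z - LENS_Z                          # lens-to-sensor distance
object_dist = 1.0 / (1.0 / F_MM - 1.0 / image_dist)     # thin-lens equation
air_gap = LENS_Z - SCINT_TOP_Z                          # lens to scintillator surface
apparent_bottom = air_gap + SCINT_TOP_Z / N_CSI         # paraxial assumption
offset = object_dist - apparent_bottom                  # focal plane beyond apparent bottom
print(f"focal plane lies {offset:.2f} mm beyond the apparent bottom")
```

Under these assumptions the in-focus object plane falls a little under 5 mm beyond the scintillator's apparent bottom (as seen from the lens), in line with the offset quoted above.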

Two SPAD arrays are simulated: the SwissSPAD2 (5.25% max PDE) and the SPAD array in the MS-500 sold by Canon (69.4% max PDE)25,63. We set the photon down scaling factor to 0.06923 to approximate the SwissSPAD2’s experimental counts. We approximate MS-500’s PDE as 10 times that of the SwissSPAD2. Thus, MS-500’s photon down scaling factor is set to 0.6923. The SwissSPAD2 is simulated as 256 × 512 pixels over 4.8 × 9.5 mm to approximate the experimental configuration. MS-500 is simulated as 1550 × 1925 pixels over 9.9 × 12.3 mm.

Geant4 simulates the data related to interactions in the scintillator and the propagation of photons up to their incidence on the thin lens. The scintillator’s optical surface is simulated using the Unified model with a polished dielectric-dielectric interface and reflectivity set to 1 (no absorption). Outside of Geant4, we propagate photons through the thin lens and onto the sensor’s nearest pixel to generate the simulated image. This allows us to easily simulate an event’s image on different sensors with the same underlying event and light propagation data produced by Geant4. We simulate different sensors by changing the pixel pitch, array resolution, and photon scaling factor. A photon is not included in the image if it is not incident on the sensor after propagating through the lens. A pixel’s value is 1 if one or more photons are incident on it in a frame. To reduce simulation runtime and storage size, the simulated scintillator’s width is 10 × 10 mm so that interactions are constrained closer to within the camera’s FOV. Additionally, scintillation photons are cut from the simulation if they internally reflect in the scintillator or if their origin is located at z > 25 mm.
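The post-Geant4 step can be illustrated as follows. This is a sketch under assumptions rather than the code used in this work: rays are described by their position and slopes (dx/dz, dy/dz) at the lens plane, the lens is treated with the paraxial thin-lens ray transfer, the sensor is centered on the optical axis, and the assignment of 256 columns to the 4.8 mm side is our guess. All function names are hypothetical.

```python
import math

def through_thin_lens(x, y, tx, ty, f_mm, lens_to_sensor_mm):
    """Paraxial thin lens: bend the ray at the lens plane, then propagate to the sensor."""
    tx_out = tx - x / f_mm
    ty_out = ty - y / f_mm
    return x + lens_to_sensor_mm * tx_out, y + lens_to_sensor_mm * ty_out

def to_pixel(xs, ys, width_mm, height_mm, n_cols, n_rows):
    """Return the (row, col) containing the sensor point, or None if it misses the sensor."""
    col = math.floor((xs + width_mm / 2) / (width_mm / n_cols))
    row = math.floor((ys + height_mm / 2) / (height_mm / n_rows))
    if 0 <= col < n_cols and 0 <= row < n_rows:
        return row, col
    return None

def binary_frame(rays, f_mm, lens_to_sensor_mm, width_mm, height_mm, n_cols, n_rows):
    """A pixel is set if one or more photons land on it; multiplicity is discarded."""
    hit = set()
    for x, y, tx, ty in rays:
        xs, ys = through_thin_lens(x, y, tx, ty, f_mm, lens_to_sensor_mm)
        pix = to_pixel(xs, ys, width_mm, height_mm, n_cols, n_rows)
        if pix is not None:
            hit.add(pix)
    return hit

# Two identical rays share one pixel; the third ray misses the sensor entirely.
rays = [(0.0, 0.0, 0.01, 0.005), (0.0, 0.0, 0.01, 0.005), (0.0, 0.0, 0.2, 0.0)]
frame = binary_frame(rays, 25.0, 53.7, 4.8, 9.5, 256, 512)
```

Changing only the sensor arguments (dimensions and resolution) re-images the same ray list on a different array, which mirrors the reuse of a single Geant4 dataset across sensors.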

Depth estimation biases

There are several sources of bias that affect interaction depth estimation. A phenomenon that we term refraction blur affects an interaction’s spot size on the SPAD array. Following Snell’s law, photons with higher incidence angles on the scintillator’s surface refract at higher angles as they exit the scintillator. Therefore, due to the photons’ non-uniform angles of refraction, the captured photons do not appear to diverge from a point origin in the scintillator, and the diameter of an interaction’s image will be slightly larger than that of a theoretical point source. Photons that lie toward the outer perimeter of the interaction’s spot size on the SPAD array are shifted radially outward compared to those from a point source without refraction. If refraction blur is not accounted for, the CoC model biases the predicted interaction’s location to a higher z-coordinate toward the lens due to a larger observed spot size. However, the CoC model is already imperfect because it assumes idealities such as a thin lens and a constant, wavelength-independent index of refraction. It also assumes an interaction is a point source of light. Refraction bias is present by default with the scintillator’s index of refraction of 1.79. A lack of refraction bias is simulated by setting the scintillator’s index of refraction to 1.
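A short calculation shows why refracted rays do not share a common apparent origin. The sketch below applies Snell's law at a flat scintillator-air interface and traces the exiting ray back to the surface normal through the source; the function name `apparent_depth` and the 60 mm example depth are illustrative assumptions.

```python
import math

def apparent_depth(true_depth_mm, theta_in_deg, n=1.79):
    """Depth below the exit surface from which a refracted ray appears to originate."""
    theta_in = math.radians(theta_in_deg)
    s = n * math.sin(theta_in)
    if s >= 1.0:
        return None  # total internal reflection: the photon does not exit
    if theta_in == 0.0:
        return true_depth_mm / n  # paraxial limit
    theta_out = math.asin(s)
    return true_depth_mm * math.tan(theta_in) / math.tan(theta_out)

# A source 60 mm below the surface, viewed along rays of increasing internal angle.
for angle in (0, 5, 10, 20, 40):
    print(angle, apparent_depth(60.0, angle))
```

The apparent depth shrinks as the internal angle grows, so steeper rays seem to come from a point closer to the lens. This matches the direction of the bias described above.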

The amount of light collection biases depth estimation. If few photons are detected in a frame, those photons may not span the entirety of the CoC and bias the estimated z-coordinate to a lower value. We refer to this as low-counts bias. Low-counts bias is simulated using frames with experimentally adjusted photon counts for the sensor in use. Lack of low-counts bias (high counts) is simulated by retaining all photons in a frame without applying a downscaling factor.
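Low-counts bias can be reproduced with a toy Monte Carlo model. The sketch assumes photons fall uniformly within a disk of unit radius (an idealized CoC) and measures the radius as the distance from the sample centroid to the farthest photon; the function names, trial counts, and seed are arbitrary choices.

```python
import math
import random

def farthest_photon_radius(n_photons, rng, true_radius=1.0):
    """Sample photons uniformly in a disk; return the farthest distance from their centroid."""
    pts = []
    for _ in range(n_photons):
        r = true_radius * math.sqrt(rng.random())   # sqrt gives uniform area density
        phi = 2.0 * math.pi * rng.random()
        pts.append((r * math.cos(phi), r * math.sin(phi)))
    cx = sum(p[0] for p in pts) / n_photons
    cy = sum(p[1] for p in pts) / n_photons
    return max(math.hypot(x - cx, y - cy) for x, y in pts)

def mean_radius(n_photons, trials=500, seed=0):
    rng = random.Random(seed)
    return sum(farthest_photon_radius(n_photons, rng) for _ in range(trials)) / trials

few = mean_radius(5)                  # frames near the 5-count threshold
many = mean_radius(500, trials=100)   # high-count frames
print(f"mean radius with 5 photons: {few:.3f}, with 500 photons: {many:.3f}")
```

With few photons the measured radius falls short of the true radius on average, so a depth inferred from the diameter is pulled toward the focal plane, that is, toward lower z in this configuration.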

Other sources of bias include the denoising algorithm and the method used to estimate an interaction’s diameter, such as GMM-Loc or Exact-Loc.

We report the bias as the difference in the estimated and true z-coordinate of an interaction over a range of true z-coordinates. The datasets for measuring bias are generated by emitting 25,000 gamma-rays from (0, 0, −5) mm in the z direction and retaining frames of single-interaction events containing at least 5 counts. Combinations of different biases are shown in Fig. 6 simulating the SwissSPAD2. Similar plots are generated for MS-500 and reported in Supplementary Fig. 4.

Fig. 6: Depth estimation bias simulated with the SwissSPAD2.

(a–e) Combinations of different biases. Each plot’s median and variance taken over all samples is reported for biases that are approximately depth-independent. The orange line is a 100-sample running average.

When approximating the bias present in the experiments, we simulate experimental photon counts and dark counts and perform the same denoising and localization algorithms as in the experiments. The number of dark counts added to a frame is drawn from a Poisson distribution with a mean of 2, which is the median dark count per frame in the z sensitivity experiment (shown in Table 1). Dark counts are uniformly randomly added to the frame’s pixels. We perform the experiment’s denoising algorithm with Tedge = 46 and localization algorithm (GMM-Loc) with s = 3.7. The resulting bias is reported in Fig. 6e.
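The dark-count injection can be sketched as below. A frame is represented as a set of (row, col) pixels, and the Poisson sampler uses Knuth's multiplication method because the Python standard library has no built-in Poisson generator; the function names are hypothetical and this is not the authors' code.

```python
import math
import random

def poisson_sample(mean, rng):
    """Knuth's method; adequate for small means such as 2."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def add_dark_counts(frame, n_rows, n_cols, rng, mean_dark=2.0):
    """Return a copy of a binary frame with uniformly placed dark counts added."""
    noisy = set(frame)
    for _ in range(poisson_sample(mean_dark, rng)):
        noisy.add((rng.randrange(n_rows), rng.randrange(n_cols)))
    return noisy

rng = random.Random(0)
noisy_frame = add_dark_counts({(100, 120), (101, 118)}, 512, 256, rng)
```

Since pixels are binary, a dark count that lands on an already-triggered pixel leaves the frame unchanged.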

Spatial resolution and accuracy

Interaction localization performance is quantified by accuracy and spatial resolution. Spatial resolution full-width-half-maximum (FWHM) is computed as 2.355σs, where σs is the standard deviation of the differences in the estimated and true locations. Accuracy is reported in terms of error, which is the distance between an interaction’s estimated and true location. FWHM and error are calculated in the x and z coordinates separately.

The simulation consists of directing gamma rays in the  − x direction originating from (50, 0, 5), (50, 0, 10), (50, 0, 15), and (50, 0, 20) mm. 25,000 gamma rays are emitted from each origin. Events with only one interaction are retained, and the rest are discarded. A frame’s photon counts are downscaled to approximate experimental data. Frames with fewer than 5 counts are discarded. Interactions are separated into intervals of 0.5 mm along the x-axis from x = −3.5 to 3.5 mm for each depth (z = 5, 10, 15, 20 mm). We report the spatial resolution FWHM and average error for interactions in each interval of space.
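The per-interval metrics can be computed as in the following sketch for a single coordinate. It assumes the population standard deviation and a mean absolute difference for the error, neither of which is specified in the text, and the function names are illustrative.

```python
import math
import statistics

def bin_index(x_true_mm, lo=-3.5, hi=3.5, width=0.5):
    """Index of the 0.5 mm interval containing x_true_mm, or None if outside [lo, hi)."""
    if not lo <= x_true_mm < hi:
        return None
    return math.floor((x_true_mm - lo) / width)

def fwhm_and_error(estimated, true):
    """FWHM = 2.355 * std of (estimated - true); error = mean absolute difference."""
    diffs = [e - t for e, t in zip(estimated, true)]
    return 2.355 * statistics.pstdev(diffs), statistics.fmean(abs(d) for d in diffs)

def per_interval_metrics(x_true, x_est):
    """Group interactions by their true x interval and score each group."""
    groups = {}
    for t, e in zip(x_true, x_est):
        b = bin_index(t)
        if b is not None:
            tru, est = groups.setdefault(b, ([], []))
            tru.append(t)
            est.append(e)
    return {b: fwhm_and_error(est, tru) for b, (tru, est) in sorted(groups.items())}
```

The same routine would be called once per depth (z = 5, 10, 15, 20 mm) and once per coordinate to fill the panels of an error and resolution plot.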

Figure 7 reports error and spatial resolution from noise-free data simulated with the SwissSPAD2 and its experimentally adjusted photon counts. Interactions are localized using Exact-Loc. The bias in this simulation corresponds to that of refraction, low counts, no noise, and Exact-Loc illustrated in Fig. 6b. Therefore, a bias of 0.19 mm is subtracted from the final z-coordinate estimate. Supplementary Fig. 5 reports error and spatial resolution for MS-500. A bias of 0.93 mm observed in Supplementary Fig. 4b is subtracted from the final depth estimate.

Fig. 7: Interaction localization error and spatial resolution from noise-free data on the simulated SwissSPAD2 with experimentally adjusted photon counts.

Localization is performed with Exact-Loc. Error and spatial resolution full-width-half-maximum (FWHM) are reported separately along the x and z coordinates over a range of x and z intervals in the scintillator. Intervals are 0.5 mm in length across the x-axis.

Figure 8 reports error and spatial resolution on data that replicates the experimental conditions. Noisy frames are simulated with the SwissSPAD2 and its experimentally adjusted photon counts. Interactions are denoised and localized using the same methods as in the experiments. This corresponds to the bias shown in Fig. 6e.

Fig. 8: Interaction localization error and spatial resolution from data simulating the SwissSPAD2 and the experiment’s photon counts, dark counts, denoising algorithm, and localization algorithm.

The denoising algorithm uses Tedge = 46, and the localization algorithm is GMM-Loc with s = 3.7. Error and spatial resolution full-width-half-maximum (FWHM) are reported separately along the x and z coordinates over a range of x and z intervals in the scintillator. Intervals are 0.5 mm in length across the x-axis.

Uncertainty in mean interaction position

To verify that the shifts in the mean position observed in the experiments are greater than the statistical uncertainty, the uncertainty in the estimated mean position must be measured. Since the true mean position in the experiments cannot be known, we use simulations that replicate experimental conditions to approximate uncertainty in the estimated mean position. 100,000 gamma-rays are randomly emitted uniformly over an upper hemisphere originating from (0, 0, −0.5) mm. Counts in a frame are adjusted for the SwissSPAD2’s experimental photon counts. We only use single-interaction events that contain an interaction within (x, y, z) ∈ (−3:3, −3:3, 0:25) mm to better constrain them in the camera’s FOV. The rest are discarded. GMM-Loc is tested on noisy frames with the same denoising and localization algorithms as in the experiments (Tedge = 46, s = 3.7), corresponding to the bias in Fig. 6e. The number of dark counts added to a frame is drawn from a Poisson distribution with a mean of 2. Note that true and estimated mean locations are computed only over interactions that emitted enough photons to be detected with a minimum of 5 counts in a frame. This results in 672 frames containing an interaction. 670 frames are randomly split into 10 groups of 67 frames each. In each group, the true and estimated mean 3D positions are calculated, and their difference (estimated mean − true mean) is recorded. The mean and standard deviation of this difference over the 10 groups are −0.011 ± 0.012 mm in the x-coordinate and −1.401 ± 0.278 mm in the z-coordinate.
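The grouping procedure can be sketched for one coordinate as follows. The data here are synthetic stand-ins (an offset of −1.0 mm and unit Gaussian noise, chosen arbitrarily), not the simulated dataset of this work, and the use of the sample standard deviation across groups is an assumption.

```python
import random
import statistics

def grouped_mean_difference(true_vals, est_vals, n_groups=10, group_size=67, seed=0):
    """Randomly split frames into groups and summarize (estimated mean - true mean)."""
    idx = list(range(len(true_vals)))
    random.Random(seed).shuffle(idx)
    idx = idx[: n_groups * group_size]   # e.g. 670 of 672 frames
    diffs = []
    for g in range(n_groups):
        members = idx[g * group_size : (g + 1) * group_size]
        true_mean = statistics.fmean(true_vals[i] for i in members)
        est_mean = statistics.fmean(est_vals[i] for i in members)
        diffs.append(est_mean - true_mean)
    return statistics.fmean(diffs), statistics.stdev(diffs)

# Synthetic example: 672 interactions with a made-up bias and noise level.
rng = random.Random(1)
true_z = [rng.uniform(0, 25) for _ in range(672)]
est_z = [t - 1.0 + rng.gauss(0, 1.0) for t in true_z]
mean_diff, spread = grouped_mean_difference(true_z, est_z)
```

The returned spread plays the role of the statistical uncertainty in the mean position for groups of 67 interactions.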

The standard deviations above represent measures of statistical uncertainty in the mean position. The mean x-coordinate in the x-y sensitivity experiment shifted by 1.05 mm when the source was shifted from (−12.7, 0, 0) mm to (12.7, 0, 0) mm. This shift is about 87 times greater than the standard deviation in the estimated mean x-coordinate. The mean z-coordinate in the z sensitivity experiment shifted by 1.71 mm when the source was shifted from (9, 0, 0) mm to (9, 0, 20) mm. This shift is about 6 times greater than the standard deviation in the estimated z-coordinate. Therefore, it is unlikely that the magnitudes of the shifts in the experiments fall within the uncertainty of the mean position, confirming that sensitivity to 3D shifts in the mean positions is being observed.
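The comparison above is simple arithmetic on the reported mean positions and standard deviations, reproduced here for transparency. The variable names are ours; the inputs are the rounded values quoted in the text, so the ratios are approximate.

```python
# Mean x-coordinates from the x-y sensitivity experiment (mm).
x_shift = 0.48 - (-0.57)     # source moved from x = -12.7 mm to x = 12.7 mm
# Mean z-coordinates from the z sensitivity experiment (mm).
z_shift = 11.35 - 9.64       # source moved from z = 0 mm to z = 20 mm

sigma_x, sigma_z = 0.012, 0.278   # simulated standard deviations (mm)

ratio_x = x_shift / sigma_x
ratio_z = z_shift / sigma_z
print(f"x shift: {x_shift:.2f} mm, {ratio_x:.1f} standard deviations")
print(f"z shift: {z_shift:.2f} mm, {ratio_z:.1f} standard deviations")
```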

Compton camera

A Compton camera estimates the position and energy deposition of two interactions from individual gamma-rays to perform Compton backprojection. For an interaction with true energy E, the energy resolution FWHM as a percentage is 2.355σe/E, where σe is the standard deviation of the estimates of E under Gaussian statistics. Photon count measurements follow Poisson statistics. That is, a measurement with n photon counts has a standard deviation of \(\sqrt{n}\) counts. Therefore, a measurement’s energy resolution FWHM based on photon counts is \(\frac{2.355\sqrt{n}}{n}\). This is valid for large n where the Poisson distribution approximates the Gaussian. Energy resolution is poor for small n where this approximation does not hold, regardless.
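The counting-statistics limit can be evaluated directly. The helper below is an illustrative sketch of the expression above, evaluated at n = 10 and n = 100 counts as examples; the function name is hypothetical.

```python
import math

def energy_resolution_fwhm(n_counts):
    """Fractional FWHM energy resolution from Poisson counting statistics."""
    if n_counts <= 0:
        raise ValueError("n_counts must be positive")
    return 2.355 * math.sqrt(n_counts) / n_counts

for n in (10, 100, 1000):
    print(n, f"{100 * energy_resolution_fwhm(n):.1f}%")
```

Because the expression reduces to 2.355/√n, a tenfold increase in detected photons improves the resolution by a factor of about 3.16.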

We test MOSSC’s feasibility as a monolithic, single-crystal Compton camera simulated on the SwissSPAD2 and MS-500 with adjusted photon counts. The positions of an interaction pair are estimated using GMM-Loc with m = 2 components. No dark counts are added. GMM-Loc’s depth-dependent bias, which is the running mean shown in Fig. 6d for the SwissSPAD2 and Supplementary Fig. 4d for the MS-500, is subtracted from an interaction’s estimated depth. We apply a 74.5% FWHM energy resolution to the SwissSPAD2 dataset and 23.6% FWHM to the MS-500 dataset. These energy resolutions correspond to measurements of n = 10 counts and n = 100 counts, respectively. An interaction’s type (scattering or absorption) is not assumed. Events where both interactions are scatterings are included. Backprojection for one gamma-ray assumes that the higher-energy interaction is the scattering interaction and the other is the absorption. Simple backprojection is performed. The world origin is set to the center of the scintillator’s volume. Both interactions’ true positions must occur within (x, y, z) ∈ (−3:3, −3:3, −35:−10) mm to better constrain them in the camera’s FOV. Otherwise, the event is discarded. 100,000 gamma rays are randomly emitted toward the scintillator from (500, 0, 0) mm, corresponding to (0, 0) degrees in azimuth and elevation from the origin. This results in 484 double-interaction frames for the SwissSPAD2 dataset and 755 double-interaction frames for the MS-500 dataset. Note that Compton backprojection is typically performed on the order of 10,000 events or more67.
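For a single event, the backprojection cone follows from standard Compton kinematics. The sketch below is not the reconstruction code used here: it assumes the incident energy equals the sum of the two deposits (full absorption, which does not hold when both interactions are scatterings), takes the higher-energy deposit as the scatter as stated above, and uses hypothetical names and example values.

```python
import math

ELECTRON_REST_MEV = 0.511

def compton_cone(pos_a, e_a, pos_b, e_b):
    """Return (apex, axis unit vector, half-angle in radians), or None if invalid."""
    # Assumption from the text: the higher-energy deposit is treated as the scatter.
    if e_a >= e_b:
        scatter, e_scatter, absorb, e_absorb = pos_a, e_a, pos_b, e_b
    else:
        scatter, e_scatter, absorb, e_absorb = pos_b, e_b, pos_a, e_a
    if e_absorb <= 0.0:
        return None
    e_total = e_scatter + e_absorb   # assumes the gamma ray is fully absorbed
    cos_theta = 1.0 - ELECTRON_REST_MEV * (1.0 / e_absorb - 1.0 / e_total)
    if not -1.0 <= cos_theta <= 1.0:
        return None                  # energies inconsistent with Compton kinematics
    axis = [s - a for s, a in zip(scatter, absorb)]
    norm = math.sqrt(sum(c * c for c in axis))
    if norm == 0.0:
        return None
    return scatter, [c / norm for c in axis], math.acos(cos_theta)

# Hypothetical event: deposits of 0.6 MeV and 0.4 MeV at two positions (mm).
cone = compton_cone((0.0, 0.0, -20.0), 0.6, (-2.0, 1.0, -25.0), 0.4)
```

Simple backprojection would then accumulate, over all events, the directions that lie on each cone surface into an azimuth-elevation histogram.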

Source reconstruction fails with the SwissSPAD2 and succeeds with the MS-500, as shown in Fig. 9. This demonstrates that the monolithic, single-crystal Compton camera is feasible with MOSSC using next-generation SPAD arrays. The 23.6% energy resolution simulated with the MS-500 is relatively poor compared to state-of-the-art Compton cameras. Future work may consider different scintillator geometries and optical configurations to improve light collection and energy resolution.

Fig. 9: Compton backprojection reconstruction result for the simulated MS-500 with 23.6% energy resolution.

The reconstruction is plotted over azimuth and elevation angles. The source’s true location is (0, 0)°. Reconstruction intensity is defined by the colorbar.

Discussion

Experiments

Experimental results show that the distribution of measured interaction locations shifts in the same direction as the gamma-ray source both laterally and in depth. This demonstrates that MOSSC is sensitive to 3D shifts in the interactions’ spatial distribution. A more controlled experiment involving a collimated gamma-ray source with high enough activity and energy would be required to demonstrate interaction localization capability with a real system. Even then, the true location of interactions cannot be known due to possible scattering. Instead, we demonstrate localization capability and performance through simulation with a known ground truth.

The mean x and z coordinates are used as metrics to identify a shift in the distribution of measured interaction locations in the lateral and depth directions, respectively. They are not expected to match the source’s coordinates, and the source’s position may be outside the camera’s FOV. Since interactions can only be measured in the camera’s FOV, the shift in the mean x and z coordinates must be small. The source cannot be considered a point source relative to the length scales in this experiment, which may also contribute to smaller shifts in the measured interactions’ mean coordinates. The radioactive material’s 3.1 mm diameter is not negligible relative to the camera’s FOV (approximately 3 × 6 × 23 mm as outlined by interactions in Fig. 5) and 10 mm shifts in source position.

Fig. 10: Images of in-focus interactions simulated with MS-500 and its adjusted photon counts.

Frames are cropped and pixels are enlarged for visualization purposes.

Sensitivity to interaction location is nonuniform along the depth dimension. This is evident from the z sensitivity experiment, where the number of measured interactions increases as the gamma-ray source is shifted up the z-axis. The intensity of light incident on the lens depends on the distance to the light source, following the inverse square law. This results in higher photon counts from interactions at higher z positions, which makes them more likely to be detected than interactions at lower z positions. This may also explain why the shift in the measured interactions’ mean z coordinate is small when the source is shifted from z = 0 mm to z = 10 mm. The mean z-coordinate will be biased toward a higher value if interactions closer to z = 0 mm are missed more frequently. The shift in the mean z-coordinate is evident when the source is shifted from z = 10 mm to z = 20 mm. Capturing a higher signal may reduce this nonuniformity and extend the depth at which interactions can be measured.

Interaction localization performance can be adjusted based on the chosen optical configuration. For example, the depth of interaction (DOI) resolution improves if the optics are chosen such that the diameter-depth correspondence, as in Supplementary Fig. 1, has a faster rate of change. This is because a small difference in an interaction’s depth would correspond to a larger difference in image diameter. Therefore, uncertainty in measuring the image diameter would have a smaller impact on the depth estimate. In other words, an optical configuration with a smaller depth of field has a better DOI resolution but a smaller range of depths where interactions can be measured. Aperture size is one optical parameter that can be adjusted to change the depth of field and DOI resolution. Changing the aperture size comes with other tradeoffs in terms of light collection. Increasing the SPAD array’s physical dimensions, together with array resolution to keep the pixel pitch constant, would allow a larger range of depths to be measured before the image’s diameter covers the entire span of the array. The optical configuration used in the experiments may have a low DOI resolution that contributes to the small shifts in the measured interactions’ mean z-coordinate.
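The tradeoff can be made concrete with a textbook thin-lens CoC calculation. The sketch below is not Equation (2) or the measured correspondence: it assumes the simulated geometry described in Results (7 mm aperture, 25 mm focal length, lens at z = 72.7 mm, sensor at z = 126.4 mm, scintillator surface at z = 70 mm, index 1.79) and a paraxial apparent depth. The function name is hypothetical.

```python
def coc_diameter_mm(z_mm, aperture_mm=7.0, f_mm=25.0, lens_z=72.7,
                    sensor_z=126.4, scint_top_z=70.0, n=1.79):
    """Thin-lens circle-of-confusion diameter on the sensor for a point source at depth z."""
    obj = (lens_z - scint_top_z) + (scint_top_z - z_mm) / n   # apparent object distance
    if obj <= f_mm:
        raise ValueError("object inside the focal length: no real image")
    img = f_mm * obj / (obj - f_mm)                           # thin-lens image distance
    return aperture_mm * abs(img - (sensor_z - lens_z)) / img

for z in (0.0, 10.0, 20.0):
    d = coc_diameter_mm(z)
    slope = (coc_diameter_mm(z + 0.5) - coc_diameter_mm(z - 0.5)) / 1.0  # mm per mm
    print(f"z = {z:4.1f} mm: diameter = {d:.2f} mm, rate of change = {slope:.3f}")
```

In this model the diameter scales linearly with the aperture, so a larger aperture steepens the diameter-depth curve (better DOI resolution) while the image reaches the edge of a fixed-size array at a shallower depth.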

Table 1 summarizes the overall signal and noise levels in the experiments. The distribution of counts in detection frames with the source present (signal) is distinct from the distribution of counts in dark frames with no source present (noise), as set by the noise-interaction thresholds Tk. The distribution of counts is generally concentrated around the median for each category. The 95th and 99th percentiles provide information on how many counts are in the highest-count frames excluding outliers.

The lateral symmetry of the interactions’ spherical distribution is observed when the source is placed at (0, 0, 0) mm in the x-y sensitivity experiment and aligned in the center of the camera’s FOV. As expected for this source placement, the observed mean x and y coordinates are close to 0 mm, and the number of interactions is greater than when the source is at (−12.7, 0, 0) mm and (12.7, 0, 0) mm. This is expected because the source is closest to (and within) the camera’s FOV. In this configuration, we estimate that roughly 11,150 interactions should occur over the total exposure time of the captured frames in the camera’s FOV, while we observe 1455 interactions. See Supplementary Notes 2.1 for the derivation. This discrepancy is likely due to multiple phenomena, the effects of which are amplified from low light collection. Scintillation photons may not be detected if the interaction occurs near the end of the frame’s gate period and the gate ends before all the scintillation photons are emitted. Our algorithm might only be detecting interactions with high energy depositions that emit the most photons and is less likely to detect interactions farther from the lens.

We adopt the CoC model as an approximation of the camera’s true point spread function (PSF) due to low experimental photon counts. In reality, a thick lens’ PSF is not a uniform disk but contains regions of different densities and aberrations. Photons may arrive in lower density regions of the PSF far from the higher density center. Whether these photons are removed as noise depends on Tedge and may negatively influence the computed interaction diameter in either case. Tedge’s bias on estimating an interaction’s diameter is evident in Fig. 3a. The raw frame in Fig. 3a shows an interaction with high photon counts spread throughout the frame, which indicates the interaction is likely centered with the lens and has a high z-coordinate. However, denoising with Tedge = 46 estimates the interaction has a small diameter and is off to the side. Setting Tedge to a higher value would denoise and localize interactions with a large diameter more accurately, but it would also remove fewer dark counts and bias diameter estimates to larger values. Analogously for a smaller Tedge, smaller diameter interactions would be localized more accurately, but more scintillation photons would be removed and diameters biased to a smaller value. Also, based on the distributions in Table 1, our algorithm with Tedge = 46 is likely removing scintillation photons in addition to dark counts. In the x-y experiment, the median number of dark counts in a frame is 1. However, the median number of counts in a frame decreases by 3, from 9 counts in detection frames to 6 counts in denoised frames. Sufficient light collection may make it possible to measure the image’s photon densities and match them to the true PSF. This may locate interactions more accurately than with the CoC model.

The temporal resolution for measuring an interaction is the gate length of a frame, which is set to 10 μs. However, the frame period is set to 65 μs (15.4 kfps), so there is an effective dead time of 55 μs per frame. Both the gate length and frame period are adjustable in the SwissSPAD2, with a frame period as short as 10.24 μs. The gate length can be shortened, but this may cause scintillation photons to be cut-off temporally due to the scintillator’s 1 μs decay constant. A higher temporal resolution for measuring interactions could be achieved using a SPAD camera that can time the arrival of photons rather than only detecting arrivals in a frame.
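The timing figures above, together with the frame counts reported in Results, imply the following frame rate, dead time, duty cycle, and total gate-open time. This is a bookkeeping sketch; the variable names are ours.

```python
GATE_US = 10.0            # gate length per frame (microseconds)
FRAME_PERIOD_US = 65.0    # frame period (microseconds)

frame_rate_kfps = 1e3 / FRAME_PERIOD_US       # thousands of frames per second
dead_time_us = FRAME_PERIOD_US - GATE_US      # time per frame with the gate closed
duty_cycle = GATE_US / FRAME_PERIOD_US        # fraction of time the sensor is live

frames_xy = 27_525_120    # frames per source position, x-y sensitivity experiment
frames_z = 18_350_080     # frames per source position, z sensitivity experiment
exposure_xy_s = frames_xy * GATE_US * 1e-6    # total gate-open time (s)
exposure_z_s = frames_z * GATE_US * 1e-6

print(f"{frame_rate_kfps:.1f} kfps, dead time {dead_time_us:.0f} us, duty cycle {duty_cycle:.1%}")
print(f"gate-open time: {exposure_xy_s:.1f} s (x-y), {exposure_z_s:.1f} s (z)")
```

The sensor is live for only about 15% of the acquisition, so shortening the frame period toward the gate length would raise the fraction of interactions that can be captured.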

Experiments are performed around freezing to reduce noise because the captured signal is low. The SwissSPAD2 without microlenses is not intended to be deployed in a real system. Next generation SPAD arrays with higher light efficiencies will improve the signal to noise ratio such that this method can work under normal conditions.

The low-light environment poses a challenge in this work. In addition to capturing a low signal, low photon counts constrain our experiments to adopting thick, high-curvature lenses while imaging at short distances. This decreases focusing quality and the FOV in the scintillator. Newer SPAD arrays will provide better performance in measuring interactions. The SwissSPAD2 full array is 9.6 × 9.5 mm, has 262 kilopixels, a 10.5% fill factor, and a 5.25% peak PDE. The SwissSPAD2’s fill factor could be improved to about 50% by installing microlenses, which would provide about 5 times more light collection. As 3D-stacking becomes more common in CMOS chip manufacturing, higher fill factors can be achieved. For example, a more recently developed SPAD array (Canon MS-500) is 13.2 × 9.9 mm, has 3.2 megapixels, a 100% fill factor, and a 69.4% peak PDE63. This would yield a larger FOV and about 13 times more light than what is captured in this work using the same procedure. Increasing light collection would also allow lens constraints to be relaxed to better approximate a thin lens. While keeping photon counts constant, the lens could be moved farther from the scintillator to capture a larger FOV. The lens’ aperture could be reduced or its focal length increased. Other than using better sensor technology, future work to improve light collection includes optimizing scintillator shapes and using reflective surfaces to collect light exiting the sides of the scintillator.

Simulations

Simulations demonstrate that MOSSC is able to locate interactions in 3D. The approximate accuracy and spatial resolution that the experimental configuration could achieve with the SwissSPAD2 if dark counts are correctly denoised or absent are shown in Fig. 7. Figure 7 shows that localization performance with the chosen optical configuration is better in the lateral dimension than in the depth dimension. Localization performance decreases slightly as interactions occur at higher z-coordinates with higher amounts of defocus blur. Supplementary Fig. 5 shows the approximate performance that could be obtained with MS-500 with improvements in both the x and z-coordinates.

Localization performance in the z-coordinate worsens when replicating experimental conditions in simulation. Figure 8 reports performance when simulating the SwissSPAD2, adjusted photon counts, dark counts, and the denoising and localization algorithms used in the experiments. Figures 6e and 8 show that the denoising and localization algorithms bias the estimated z-coordinate to lower values with increasing magnitude for interactions that occur in higher z-coordinates. Essentially, the algorithms tend to estimate a certain depth based on the values of Tedge in the denoising algorithm and s in GMM-Loc. The values of Tedge and s can be changed to shift the bias, but the bias will still vary over depth. Comparing Fig. 6d, e shows that Tedge has a greater impact on this bias than s. Performance can be improved by using a SPAD array with higher PDE, considering different optical configurations, and using better denoising and localization models.

As stated in the “Experiments” section in Discussion, the chosen optical configuration affects the measurements’ accuracies and spatial resolutions. For the chosen experimental and simulated configuration, the superior localization performance observed in x compared to z shown in Figs. 7 and 8 may be a result of having a small lateral FOV and a long depth of field.

The effect of a sensor’s PDE on localization performance is evident when comparing Fig. 6a, b. Low photon counts bias the estimated z-coordinate to a lower value and increase variance. The low-counts bias and variance are reduced with MS-500’s higher light efficiency, observed when comparing Supplementary Fig. 5a, b.

A bias from refraction blur of about 0.3 mm is observed when comparing Fig. 6a, c. A thorough calibration procedure involving controlled interaction locations can account for refraction blur.

Figure 6a, c show elevated levels of bias in the z-coordinate range from about 0 to 2 mm followed by a fast decrease. Interactions in this depth range are near the focal plane. Therefore, an image may be thin and stretched along the recoiled electron’s path instead of a disk with a small diameter, causing increased localization error using the proposed CoC model. The recoiled electron’s path can be observed, from which particle-tracking information could be gained. Figure 10 shows images of in-focus interactions simulated with MS-500’s adjusted photon counts. These images are generated with the same optical configuration used in other simulations, except the sensor has been shifted such that the focal plane corresponds to z = 0 mm. Interactions in Fig. 10 occur between z = 0 and 1 mm.

Future outlook

We anticipate that next-generation SPAD cameras will provide MOSSC with competitive interaction measurement performance. We envision the SPAD camera to be adopted in designs for measuring multiple interactions from individual gamma-rays in a single, monolithic scintillator. Measuring double-interaction events from individual gamma-rays in a monolithic scintillator is challenging for lensless designs due to light from different interactions overlapping on the sensor. Most monolithic scintillator designs for PET assume that only one interaction occurs26. The monolithic scintillator-SiPM array configuration suffers from light-sharing among its readout channels from multiple interactions, which degrades measurements’ spatial resolution. MOSSC can reduce or eliminate this overlap of light and measure multi-interaction events. This allows for a monolithic, single-crystal Compton camera to be constructed, as demonstrated in simulation. Monolithic scintillators offer benefits in cost, timing resolution, and detector efficiency compared to pixelated scintillators.

The image of a Compton scattering event that occurs in the SPAD camera’s focal plane may contain blurring that results from scintillation photons being emitted along the recoiled electron’s path. See Fig. 10 for examples of simulated in-focus interactions. The amount of blurring depends on the path length of the electron compared to the size of the camera’s FOV, which is set by the chosen optical configuration. Lateral movement of the electron along the focal plane produces blur that elongates the image laterally, while movement along the depth dimension produces defocus blur. The farther the interaction occurs from the focal plane, the more defocus blur dominates and particle-tracking information is lost. When the interaction is exactly in focus, the spatial resolution for measuring an interaction would be limited to the sum of the electron’s track length and the sharpness of the PSF at the focal plane. In this case, electron-tracking information could be gained. MOSSC could potentially compete as an electron-tracking Compton camera68,69. Imaging in-focus alpha particle trajectories has been demonstrated in a thin scintillator using a microscope objective on an EMCCD camera39.

Cherenkov photons may be emitted instantaneously in the shape of a cone oriented along the direction of the recoiled electron with an opening angle related to the electron’s velocity. Due to instantaneous emission, their times of arrival on the sensor would be nearly identical. Therefore, the SPAD camera’s high temporal resolution can separate them from other counts, such as fluorescence photons or dark counts. The ability to identify Cherenkov photons focused through a lens onto a high resolution array could potentially allow the Cherenkov cone to be reconstructed. If the Cherenkov cone can be reconstructed, electron-tracking information can be gained for an electron-tracking Compton camera. In addition, the interaction could be timed down to the SPAD array’s time-tagging resolution (<100 ps). Even if the Cherenkov cone cannot be reconstructed, identifying Cherenkov photons could be used to time the interaction, and the fluorescence photons would be used to measure the interaction’s location and energy.