GPU Supported Simulation of Transition-Edge Sensor Arrays

We present numerical simulations of full transition-edge sensor (TES) arrays utilizing graphical processing units (GPUs). With the support of GPUs, it is possible to perform simulations of large pixel arrays to assist detector development. Comparisons with TES small-signal and noise theory confirm the representativity of the simulated data. In order to demonstrate the capabilities of this approach, we present its implementation in xifusim, a simulator for the X-ray Integral Field Unit, a cryogenic X-ray spectrometer on board the future Athena X-ray observatory.


Introduction
Superconducting transition-edge sensors (TES) are cryogenic energy sensors with applications as single-photon detectors from the near infrared through gamma rays [1,2]. We present simulation software for detectors based on arrays of TESs where we implement a generic mathematical model of the TES electrothermal system. The software is also part of xifusim, a simulator we are developing for the X-ray Integral Field Unit (X-IFU) instrument [3] on board the future Athena X-ray observatory [4] to be launched in the early 2030s. The X-IFU is a cryogenic X-ray spectrometer 1 3 that operates a large array of TESs. The current baseline configuration consists of a hexagonal array of more than 3000 TES pixels that will provide spatially resolved high-resolution spectroscopy from 0.2 to 12 keV with an energy resolution of 2.5 eV FWHM up to 7 keV. Numerical simulations of the detector allow us to study its performance under various operating conditions and to provide feedback to the detector development during design before entering the construction phase. The aim of xifusim is to provide a representative simulation of the full detection pipeline of the X-IFU including all relevant detector physics and the behavior of the readout chain. Our programming philosophy is to keep the software flexible to allow a quick implementation of new features and customizations. As such, we implemented a modular code design that separates the functionality of the software into independent and interchangeable blocks as shown in Fig. 1.
The input of the simulator is a list of X-ray photon impacts containing energies, arrival times and impact positions on the pixel array. If one wants to simulate an observation of an astrophysical source, such a list can also be generated with the Simulation of X-ray Telescopes (SIXTE) software package, 1 a generic Monte Carlo-based simulation toolkit for observations with astronomical instruments [6,7]. The output of xifusim is a list of records that contains the digitized signal of all detected current pulses during the simulation. Event reconstruction algorithms can further analyze these records to retrieve the photon energies from the pulses [8]. xifusim has been derived from the tessim tool [9], a TES simulator developed as part of SIXTE. Our software is written in C++ and runs on a standard computer under Linux and macOS.
In this contribution, we focus on the first part of the pipeline, the TES array simulation. Section 2 describes the TES model that we implement in our software. In Sect. 3, we provide details about the implementation and program structure. To enable long simulations of large pixel arrays, we have also implemented a version of the code that  Fig. 1 The data flow in xifusim. A list of photon impacts is propagated to the TES array where the responses of the individual pixels are calculated. Their signal is amplified in a set of SQUIDs, either using a simple, fast SQUID model or a model implementing the nonlinear SQUID response and baseband feedback [5] ensured by the digital readout electronics (DRE). An analog-digital converter (ADC) maps the measured current into a digital signal which is passed to a trigger that detects the individual pulses in the datastream and writes them to the output file (Color figure online) utilizes a GPU for acceleration if available. Finally, we show first verification efforts for our simulation output.

Model Description
The TES model we implement is based on the TES theory detailed in [1]. Figure 2 shows a diagram of the electrothermal system consisting of the TES itself and a Thevenin-equivalent representation of the bias circuit with a bias voltage V and load resistor R L in series with an inductance L and the TES. Here, as well as in xifusim, we simulate a DC equivalent of the AC bias circuit foreseen for the X-IFU [10] where each TES is coupled to an LC filter via a transformer coupling with turns ratio a. In this circuit, the effective inductance and resistance seen by the TES are L = 2L filter ∕a 2 and R L = R para ∕a 2 , where L filter is the LC filter inductance and R para combines additional parasitic resistances in the circuit. The factor 2 in the effective inductance accounts for the fact that under AC bias, the TES sees two sidebands of an RLC filter when our DC equivalent model simulates a single LR filter band. In this simplification, we are simulating the AC TES as if it reacted only to the RMS level of the current flowing through it. Models that fully take into account AC effects in TESs are in development [11,12], but they are outside the scope of this contribution. The thermal behavior is driven by the Joule heating in the TES, the signal power P in and the power P b lost due to the heat flow from the TES to the bath through the thermal link. In this electrothermal system, the evolution of the time-dependent temperature, T(t), and current, I(t), in the TES is described by two coupled differential equations [1], Fig. 2 Diagram of the electrothermal system that we implement in our software, consisting of the Thevenin-equivalent representation of the bias circuit coupled to the TES (after [1]). An equivalent bias voltage V is applied to a load resistor R L , an inductance L, and the TES. We assume the TES to have a heat capacity C, which is connected to a heat bath at temperature T b through a weak thermal link with thermal conductance G (Color figure online) where R TES (T, I) is a function that describes the shape of the TES resistance surface.
In this contribution, we assume a linear resistance model around the operating point resistance R 0 , temperature T 0 and current I 0 given by where the steepness of the transition is described by the logarithmic temperature sensitivity 0 and current sensitivity 0 at the bias point. The power flow to the heat bath is modeled using a power-law dependence [1], with a temperature exponent n (in general ~ 3) and a material-and geometrydependent parameter K.

Simulation of TES Arrays
For a two-dimensional detector array of TES-based pixels, our simulator predicts the current signal I(t) in each pixel of the array during a given time interval [t start , t stop ] , based on a list of photon impacts with arrival times, energies and positions on the array. Since the performance of a real detector is greatly affected and limited by various noise processes, we also include several noise sources in our simulation. We can output any other system state variables required, such as the evolution of the pixel resistances or temperatures. Furthermore, individual parts of the TES model can be exchanged for another representation as needed. For example, one could read the resistance from a different RTI-surface model or extend the differential equation system to describe the TES as a resistively shunted junction [12]. We can also include temperature fluctuations of the heat bath due to cosmic ray hits on the detector wafer [13].

Implementation Details
First, the physical parameters of all pixels and their position within the array are read from an external file. The program then performs a numerical integration of Eqs. (1) and (2) at times t j = t start + j t , j = 1, 2, … , until t stop is reached, where t is the step size of the integrator. Currently, we use a standard fourth-order Runge-Kutta integrator taken from the boost C++ library odeint. 2 Before each integration step, the list of photon impacts is checked for photons that would impact on the array during the next time step. The absorption of a photon is assumed to happen with instantaneous thermalization and modeled as a delta function impulse in the affected pixel. If two or more photons impact the same pixel during one time step, their energies are summed (pile-up). Figure 3 shows an example output stream for a four-pixel array during a 30 s simulation where we use the current best estimate X-IFU pixel parameters. Noise is modeled as Gaussian noise by adding normally distributed random numbers to the electrical and thermal differential equation at every time step with an appropriate variance and normalization that takes the step size of the integrator into account. In our simulation, we assume the following spectral densities for the white noise sources injected in the differential equation system [1,14,15]: We include Johnson noise of the TES [Eq. (6)] and load resistor [Eq. (7)] as well as thermal fluctuation noise between the TES and heat bath [Eq. (5)], where M P is a factor we introduced to scale the phonon noise level to match measurements. We also include noise from the bias line [Eq. (8)] and an excess Johnson noise parameter [Eq. (9)] which is based on empirical characterization to represent noise internal to the TES that is not fully understood as of yet. These white noise levels are updated at each time step to simulate the non-stationarity of the system. We note that standard Runge-Kutta methods are formally not suited for the integration of differential equations containing stochastic terms. As an alternative, we are investigating numerical algorithms specifically developed for stochastic differential equations [16]. First, results show a similar behavior between the two methods.
The run time of our software on a single-core processor depends mainly on the step size chosen for the integrator and the number of pixels in the array. To simulate one pixel for 1 s with a step size of t = 6.4 × 10 −6 s (foreseen final sampling rate for the X-IFU) takes about a quarter of a second. The run time scales linearly with the step size and number of pixels. Such run times are sufficient to study the behavior of a configuration on short timescales. Longer simulations for large detectors consisting of thousands of pixels such as the X-IFU, however, would not be feasible in this framework. For such applications, we have also implemented a version of our code that utilizes a GPU for acceleration.

GPU Implementation
While CPUs typically consist of only few cores optimized for execution of sequential programs, GPUs are highly parallel manycore processors designed to execute thousands of tasks concurrently [17]. Since in our pixel-based simulation, the individual signals can be computed mainly independently, a GPU is perfectly suited to simulate a 3000 pixels array. We have implemented the GPU version using the CUDA parallel computing platform and programming model by NVIDIA. 3 The left panel of Fig. 4 shows a run time comparison between the two versions where we simulate different pixel array sizes for 1 s each time. On a single-core processor, the run time increases linearly with the number of pixels simulated, whereas on the GPU, the run time stays almost constant, leading to a speed increase by a factor of 3000 for the largest configuration, which is exactly the anticipated behavior. For the Athena X-IFU with more than 3000 pixels, this reduces the computation time of a full observation with several kiloseconds exposure from a couple of months to a few hours.

Verification of the Simulation Output
We investigated different means to verify our simulation output. As a first confirmation of our implementation, we have compared our simulated pulses with the TES small-signal model [1]. By linearizing Eqs. (1) and (2) around the equilibrium, one can approximate their solution for small signals. As illustrated in Fig. 5, the simulation matches with this model for low photon energies. For higher photon energies, the pulses start to deviate since the small-signal approximation just scales linearly with energy, while the simulation takes the nonlinearities of the system into account.
The right panel of Fig. 4 shows a comparison between the power spectral density (PSD) of the current noise as derived in [1] with the linear equilibrium ansatz and the PSD of a simulated current stream. The noise sources included in this comparison are thermal fluctuation noise, amplifier noise and electrical Johnson noise of the TES and load resistor. We find that the PSD of our simulated stream matches the theoretically predicted values very well. The next step in our verification efforts will be comparisons with laboratory measurements.

Conclusions
We have presented a simulation software for detectors that are made up of TESbased pixel arrays. In our program, we implement a generic model of a TES described by the coupled electrical and thermal circuits and numerically calculate the pixel signals during incident photon impacts. The software is currently used in the xifusim program, a simulator for the Athena X-IFU instrument. In order to study the behavior of large pixel arrays like the X-IFU, we have also implemented a version of our code that uses a GPU for acceleration. We find that our simulation output compares as expected with the TES small-signal model and predicted noise levels. Comparisons with measured data will be performed next.