1 Introduction

In many particle physics experiments, the data acquisition system (DAQ) monitors the signals from the detector and initiates recording of data when a pre-defined trigger condition is met. The DAQ has only a short time window in which to determine, based on the detector data, the variables on which the trigger condition is based. Hence, the decision about when to trigger must be based on an imperfect estimate of the quantity of interest (such as the event energy), which can be calculated with the required speed [1] and without the use of calibration inputs, which are often only determined after a dataset is recorded.

Consider a situation where the DAQ bases the trigger decision on an online variable x. Data read-out is triggered whenever \(x \geqslant \theta _x\), so the trigger efficiency in x is a step function \(\epsilon (x) = \Theta (x-\theta _x)\). In offline data analysis, the variable y is calculated based on the same event information, but using more elaborate algorithms and calibration inputs. The y variable is therefore a more precise indicator of the trigger quantity. Of interest for analysis is the trigger efficiency as a function of y, \(\epsilon (y)\).

As an example, the DEAP-3600 dark matter detector uses a constant threshold trigger [2]. Data read-out is triggered whenever the signal intensity from the light detectors, which is related to the event energy, passes a fixed threshold. In offline analysis, the signal intensity is converted into the total number of photoelectrons, using calibration constants that account for gain differences between light detectors and for changes in gain over time. This offline variable measures the event energy more precisely, but near the trigger threshold, the efficiency as a function of this variable is no longer a simple step function.

A number of methods exist to determine \(\epsilon (y)\) by way of a calibration dataset. These rely on measuring directly or indirectly the true rate of events at each value of y, so that by comparison with the rate obtained after the trigger, the trigger efficiency can be calculated.

Obtaining a trigger efficiency calibration is not always possible. The calibration data could be corrupt, it could be impossible to record calibration data due to electronics or physics constraints, or the calibration could drift with time faster than calibration datasets can be taken. In such cases, the efficiency curve might still be recovered or verified provided the value of x for each event was recorded or can be obtained offline.

In the DEAP-3600 detector, for example, recording a calibration dataset takes approximately 48 hours, so performing regular trigger efficiency calibrations reduces the lifetime of the detector for dark-matter search data. The trigger efficiency changes the shape of the spectra used to obtain the energy calibration. It also changes the shape of the distributions of certain backgrounds in a pulse-shape discrimination parameter, which cannot be modelled correctly unless a trigger efficiency correction is applied [3]. Therefore, monitoring the trigger efficiency on an ongoing basis is crucial to some analysis efforts.

2 General principle and illustration of the method

Assume a dataset that contains the values of x and y for each event. For concreteness, we say that both are variables for the event energy. We assume that the recorded events have a continuous spectrum in both x and y in the region relevant to the trigger.

Consider the histogram of x versus y for many events, \(I(x,y)\). Because x and y are variables describing the same quantity, they are correlated and the data will form a ‘band’ in this 2-dimensional space. The data has a spectrum in the x parameter, \(I(x) = \int _{-\infty }^\infty I(x,y) dy\), and a spectrum in the y parameter, \(I(y) = \int _{-\infty }^\infty I(x,y) dx\).

To illustrate how to obtain \(\epsilon (y)\) from a dataset, we create data in a toy Monte Carlo (MC) simulation with a spectrum \(I(x) = 10x\), a trigger threshold \(\theta _x = 100\), and a resolution such that the shape of the y distribution for events of the same x is a skewed Gaussian. The \(I(x,y)\) histogram for the simulated events is shown in Fig. 1a. The functional form of the relation between x and y is not typically known a priori in real data, and the method developed here does not rely on such knowledge, so we limit ourselves in this section to information that can be obtained from the data. The skewed Gaussian case will be analysed mathematically in Sect. 3.2.

Fig. 1

Simulated data using a skewed Gaussian resolution function with \(\mu = 0.2x\), \(\sigma = 0.2\), and \(\lambda = 0.7\) (see Eq. (10)). The spectrum is \(I(x) = 10x\). (a) The \(I(x,y)\) distribution as it might be measured in an experiment. (b) The \(I_{x}(x,y)\) distribution, where each bin in \(I(x,y)\) is normalized such that \(I(x) = 1\) for \(x \geqslant \theta _x\)

Fig. 2

For the same MC data as Fig. 1. (a) The \(I_x(y)\) distribution is fit with a straight line in a region far away from the trigger threshold, and with the analytic description of the turn-on shape, which is known in this case. (b) \(I_x(y)\) scaled by the plateau height results in the trigger efficiency curve (zoomed in compared to panel a to better show the turn-on region)

Figure 1a shows what the real detector data might look like. To obtain the trigger turn-on curve in y, the following steps are taken:

  1. Normalize the \(I(x,y)\) histogram such that \(I(x) = 1\) for \(x \geqslant \theta _x\). This can be achieved by dividing each bin (x,y) in the histogram by I(x). We denote the histogram normalized in x as \(I_x(x,y)\). It is shown in Fig. 1b. After this normalization, \(I_x(x)\) is equal to the efficiency curve \(\epsilon (x) = \Theta (x-\theta _x)\).

  2. Now consider the spectrum in y, that is \(I_x(y) = \int _{-\infty }^\infty I_x(x,y) dx\), illustrated in Fig. 2a. For values of y where all possible values of x are above the trigger threshold (approximately at \(y\geqslant 30\) in this example), a constant plateau arises. For values of y where some of the possible x values are below the trigger threshold, the spectrum is diminished by that fraction of x values which lies below the threshold.

  3. The height of the plateau is determined by a fit with a straight line (red dashed line in Fig. 2a).

  4. Finally, \(I_x(y)\) is divided by the plateau height. The resulting curve, shown in Fig. 2b, is an estimate of the trigger efficiency turn-on function.
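The four steps can be sketched on a toy dataset as follows. This is a minimal illustration under our own assumptions, not the analysis code used for the figures: the function names `toy_events` and `efficiency_from_data` are hypothetical, the resolution is a plain Gaussian rather than the skewed Gaussian of Fig. 1, the binning is arbitrary, and the plateau height is taken as the mean of the bins in an assumed plateau window instead of a line fit.

```python
import math
import numpy as np

# Toy parameters (illustrative assumptions, not values from a real detector).
A, SIGMA, THETA_X = 0.2, 3.0, 100.0

def toy_events(n, rng):
    """Draw (x, y) pairs: spectrum I(x) ~ x on [0, 400], Gaussian resolution in y."""
    x = 400.0 * np.sqrt(rng.random(n))   # inverse-CDF sampling of a linear spectrum
    y = rng.normal(A * x, SIGMA)         # offline variable, y ~ N(a*x, sigma)
    keep = x >= THETA_X                  # step-function trigger in the online variable
    return x[keep], y[keep]

def efficiency_from_data(x, y, x_edges, y_edges, plateau=(30.0, 50.0)):
    """Steps 1-4: normalize in x, project onto y, estimate the plateau, rescale."""
    counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    # I(x): all recorded events per x bin, including those outside the y range.
    n_x = np.histogram(x, bins=x_edges)[0].astype(float)[:, None]
    i_x = np.divide(counts, n_x, out=np.zeros_like(counts), where=n_x > 0)  # step 1
    i_x_of_y = i_x.sum(axis=0)                                              # step 2
    centers = 0.5 * (y_edges[:-1] + y_edges[1:])
    in_plateau = (centers >= plateau[0]) & (centers <= plateau[1])
    height = float(i_x_of_y[in_plateau].mean())                             # step 3
    return centers, i_x_of_y / height, height                               # step 4

rng = np.random.default_rng(1)
x, y = toy_events(1_000_000, rng)
centers, eff, height = efficiency_from_data(
    x, y, np.arange(0.0, 402.0, 2.0), np.arange(0.0, 61.0, 1.0))

# Error-function turn-on expected for this Gaussian toy, for comparison.
model = np.array([0.5 * (1.0 + math.erf((c - A * THETA_X) / (math.sqrt(2.0) * SIGMA)))
                  for c in centers])
```

Note that I(x) is computed from all recorded events in each x bin, not only from those inside the displayed y range; otherwise the normalization of the high-x bins would be biased.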

Because in the simulation the functional dependence is known, the functional shape of the turn-on curve can be fit to the data. This is the blue solid line in Fig. 2a, b.

The crucial feature necessary for this method to work is the constant plateau in \(I_x(y)\); in other words, step 1 must force the spectrum \(I_x(y)\) to be uniform in the absence of the trigger. A constant plateau arises when \(I_x(y)\) does not depend on x or y until the trigger threshold is introduced, that is, \(\hat{I}_x(y) = c\), where the hat indicates the absence of the trigger and c is a constant plateau height. Any value in the actual \(I_x(y)\) histogram not equal to c then indicates the influence of the trigger, and the difference \(c - I_x(y)\) is proportional to the number of events missing at y due to the trigger efficiency (see footnote 1). The conditions necessary to obtain a constant plateau will be discussed in Sect. 5.

3 Validation

In the following two sections, we analytically demonstrate the validity of this method for two common resolution functions and discuss the conditions that must be met to obtain a constant plateau region.

3.1 Gaussian example

In this section, the method of determining the trigger efficiency is discussed mathematically for Gaussian resolution functions. The offline variable y follows a Gaussian distribution for any given value of x. The shape parameters of the Gaussian distribution are functions of x, and the data has some spectrum N(x) (see footnote 2):

$$\begin{aligned} I(x,y) = {\left\{ \begin{array}{ll} N(x) \frac{1}{\sqrt{2\pi }\sigma (x)} e^{-(y-\mu (x))^2/(2\sigma (x)^2)} &{}\quad (x \geqslant \theta _x)\\ 0 &{}\quad (x < \theta _x)\\ \end{array}\right. } \end{aligned}$$
(1)

Division by N(x) gives \(I_x(x,y)\) which is by construction already normalized such that \(I_x(x) = 1\) above the trigger threshold. To obtain \(I_x(y)\), an assumption must be made about the shape parameters. In the simplest case, \(\sigma (x) = \sigma \) and \(\mu (x) = a\cdot x\). Thus:

$$\begin{aligned} I_x(x,y) = {\left\{ \begin{array}{ll} \frac{1}{\sqrt{2\pi }\sigma } e^{-(y-ax)^2/(2\sigma ^2)} &{}\quad (x \geqslant \theta _x) \\ 0 &{}\quad (x < \theta _x) \end{array}\right. } \end{aligned}$$
(2)

We temporarily ignore the trigger condition to study the spectrum in y (indicated by the hat).

$$\begin{aligned} \hat{I}_x(y)&= \int _{-\infty }^\infty \frac{1}{\sqrt{2\pi }\sigma } e^{-(y-a\cdot x)^2/(2\sigma ^2)} dx \end{aligned}$$
(3)
$$\begin{aligned}&= \frac{1}{a} \end{aligned}$$
(4)

We find that the spectrum in y is constant if no trigger condition is applied. This shows that the critical condition for the method to work is met: for values of y well above the trigger region, \(I_x(y)\) forms a constant plateau.
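Equations (3) and (4) can be checked numerically. The sketch below integrates the Gaussian over x with a simple trapezoid rule for a few values of y; the function name `plateau_height`, the integration range, and the parameter values \(a = 0.2\), \(\sigma = 3\) are our own illustrative choices.

```python
import numpy as np

def plateau_height(y, a, sigma, x_min=-2000.0, x_max=2000.0, n=400_001):
    """Trapezoid-rule estimate of Eq. (3): integral over x of N(y; a*x, sigma)."""
    x = np.linspace(x_min, x_max, n)
    g = np.exp(-(y - a * x) ** 2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(x)))

# The result should not depend on y and should equal 1/a.
heights = [plateau_height(y, a=0.2, sigma=3.0) for y in (0.0, 20.0, 50.0)]
```

For these parameters each entry of `heights` agrees with \(1/a = 5\) to numerical precision, independent of y.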

The analytic shape of the turn-on curve can be obtained by including the trigger condition in the integral, where H denotes the Heaviside step function (written \(\Theta \) in Sect. 1):

$$\begin{aligned} I_x(y)&= \int _{-\infty }^\infty \frac{1}{\sqrt{2\pi }\sigma } e^{-(y-a\cdot x)^2/(2\sigma ^2)} H(x-\theta _x)dx \end{aligned}$$
(5)
$$\begin{aligned}&= \frac{1}{2a}\left[ {{\,\mathrm{erf}\,}}{\left( \frac{y-a\theta _x}{\sqrt{2}\sigma }\right) } + 1 \right] \end{aligned}$$
(6)

The integral in Eq. (5) is formally equal to a convolution of a Gaussian with a step function. This curve describes the fraction of events that pass the trigger for each value of y, relative to the plateau height \(\frac{1}{a}\) that is reached for \(y \gg a\theta _x\). It can be turned into the trigger turn-on curve by scaling such that the plateau is at 1:

$$\begin{aligned} \epsilon (y)&= I_x(y)\cdot a \end{aligned}$$
(7)
$$\begin{aligned}&= \frac{1}{2}\left[ {{\,\mathrm{erf}\,}}{\left( \frac{y-a\theta _x}{\sqrt{2}\sigma }\right) } + 1 \right] \end{aligned}$$
(8)

Figure 3 illustrates Eq. (2) (panel a), Eq. (6) (panel b), and Eq. (8) (panel c). Parameters used are \(a=0.2\), \(\sigma = 3\) and \(\theta _x = 100\). This is not a Monte Carlo simulation; the respective equations are evaluated numerically here.
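As a cross-check of Eq. (8), the closed form can be compared with a direct numerical evaluation of Eq. (5) scaled by a. The function names and the integration grid below are hypothetical choices made for this sketch; the parameters are those quoted for Fig. 3.

```python
import math
import numpy as np

def turn_on_gauss(y, a, sigma, theta_x):
    """Eq. (8): error-function turn-on for constant sigma and mu(x) = a*x."""
    return 0.5 * (math.erf((y - a * theta_x) / (math.sqrt(2) * sigma)) + 1.0)

def turn_on_numeric(y, a, sigma, theta_x, x_max=2000.0, n=200_001):
    """a times Eq. (5), integrating only over x >= theta_x (trapezoid rule)."""
    x = np.linspace(theta_x, x_max, n)
    g = np.exp(-(y - a * x) ** 2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return a * float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(x)))

A, SIGMA, THETA_X = 0.2, 3.0, 100.0
ys = [10.0, 17.0, 20.0, 23.0, 30.0]
pairs = [(turn_on_gauss(y, A, SIGMA, THETA_X),
          turn_on_numeric(y, A, SIGMA, THETA_X)) for y in ys]
```

At \(y = a\theta _x = 20\) the efficiency is exactly 0.5, as expected for a symmetric resolution function.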

Fig. 3

The Gaussian example (Eq. (2)) for \(a=0.2\), \(\sigma = 3\). (a) The distribution of x versus y. (b) The y-profile (black line), together with the calculated level of the plateau (pink dashed). (c) The profile scaled by the plateau height (black), which represents the trigger efficiency. The blue dashed line is a step function convolved with a Gaussian, with function parameters taken at the trigger threshold

In a more realistic case, both the mean and the width of the y distribution at a given x vary with x: \(\sigma (x) = b\cdot x\) and \(\mu (x) = a\cdot x\) so that

$$\begin{aligned} I_x(y) = \int _{-\infty }^\infty \frac{1}{\sqrt{2\pi }bx} e^{-(y-a\cdot x)^2/(2(bx)^2)} dx \end{aligned}$$
(9)

This integral cannot be solved analytically. The numeric solution for \(a = 0.2\) and \(b = 0.015\) is shown in Fig. 4. Panel (b) shows that a constant plateau exists; the plateau height is determined by a fit to the histogram between \(y = 30\) and \(y = 50\). The trigger turn-on curve in panel (b) is overlaid with the model from Eq. (8) with \(\sigma = \sigma (\theta _x)\).
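The numeric evaluation can be sketched as follows, reading the quoted parameters as \(a = 0.2\) and \(b = 0.015\) (our assumption) and applying the trigger by integrating only over \(x \geqslant \theta _x\). The function name `i_x_of_y` and the integration grid are illustrative choices, not taken from the original analysis.

```python
import numpy as np

def i_x_of_y(y, a, b, theta_x, x_max=2000.0, n=400_001):
    """Eq. (9) restricted to x >= theta_x, with mu = a*x and sigma = b*x."""
    x = np.linspace(theta_x, x_max, n)
    sig = b * x
    g = np.exp(-(y - a * x) ** 2 / (2 * sig**2)) / (np.sqrt(2 * np.pi) * sig)
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(x)))

A, B, THETA_X = 0.2, 0.015, 100.0
# Values of y well above the turn-on region, where a plateau is expected.
plateau = [i_x_of_y(y, A, B, THETA_X) for y in (30.0, 40.0, 50.0)]
```

For these parameters the three values coincide to high precision and lie within about a percent of \(1/a = 5\), consistent with a flat plateau.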

Fig. 4

The Gaussian example with \(\mu = 0.2x\) and \(\sigma = 0.015x\). (a) x vs y. (b) The efficiency curve in y

This method does not produce proper efficiency curves in all situations. Figures 5 and 6 show situations in which it does not work, namely when at least one of the mean or the sigma functions is a polynomial of degree greater than 1. In both figures, the trigger efficiency model from Eq. (8) is drawn to illustrate the differences.

Fig. 5

The Gaussian example with \(\mu = 0.2x\) and \(\sigma = 0.07x + 2\cdot 10^{-5} x^2\). No constant plateau arises, hence the method is not applicable and panel (b) does not represent the trigger efficiency

Fig. 6

The Gaussian example with \(\mu = 0.2x + 5\cdot 10^{-5}x^2\) and \(\sigma = 0.015x\). No plateau arises, hence the method is not applicable and panel (b) does not represent the trigger efficiency

3.2 Skewed Gaussian example

To show that this method is not limited to Gaussian distributions, we repeat the calculation for skewed Gaussian (also called exponentially modified Gaussian (EMG)) resolution functions:

$$\begin{aligned}&I_x(x, y) = \frac{\lambda (x)}{2} e^{\frac{\lambda (x)}{2}(2\mu (x)+\lambda (x)\sigma (x)^2-2y)} \nonumber \\&\quad \cdot {{\,\mathrm{erfc}\,}}\left( \frac{\mu (x) + \lambda (x)\sigma (x)^2 - y}{\sqrt{2}\sigma (x)} \right) H(x-\theta _x) \end{aligned}$$
(10)

In the simplest case, \(\sigma (x) = \sigma \), \(\lambda (x) = \lambda \), and \(\mu (x) = a\cdot x\), shown in Fig. 7. The y-axis projection without the trigger condition is then (see footnote 3 and Appendix 1):

$$\begin{aligned}&\hat{I}_x(y; a, \sigma , \lambda ) = \int _{0}^\infty \frac{\lambda }{2} e^{\frac{\lambda }{2}(2ax+\lambda \sigma ^2-2y)}\nonumber \\&\qquad \qquad \qquad \qquad \cdot {{\,\mathrm{erfc}\,}}\left( \frac{ax + \lambda \sigma ^2 - y}{\sqrt{2}\sigma } \right) dx \end{aligned}$$
(11)
$$\begin{aligned}&\quad = \frac{1}{a} \end{aligned}$$
(12)

Again, a constant plateau height is expected in the absence of the trigger.

The analytic shape of the trigger turn-on curve is again obtained by including the trigger condition:

$$\begin{aligned}&I_x(y; a, \sigma , \lambda , \theta _x) = \int _{0}^{\infty } I_x(x, y) H(x-\theta _x) dx \end{aligned}$$
(13)
$$\begin{aligned}&= \frac{1}{2a} \left[ 1 - e^{\frac{\lambda }{2}\left( 2a\theta _x+\lambda \sigma ^2-2y\right) } \right. \nonumber \\&\cdot {{\,\mathrm{erfc}\,}}\left( \frac{\sigma }{\sqrt{2}}\left( \lambda + \frac{a\theta _x-y}{\sigma ^2}\right) \right) \nonumber \\&\quad \left. + {{\,\mathrm{erf}\,}}\left( \frac{1}{\sqrt{2}\sigma }\left( y-a\theta _x\right) \right) \right] \end{aligned}$$
(14)

and dividing by the plateau height

$$\begin{aligned} \epsilon (y) = a \cdot I_x(y; a, \sigma , \lambda , \theta _x) \end{aligned}$$
(15)

Figure 7 shows the EMG model for values of \(\mu = 0.2x\), \(\sigma = 3\), \(\lambda = 0.7\), and \(\theta _x = 100\). The turn-on curve, Eq. (15), is drawn as well. Function parameters are taken at the trigger threshold (i.e. this is not a fit).
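The closed form of Eqs. (14) and (15) can be checked against a direct numerical integration of Eq. (10) over \(x \geqslant \theta _x\). The sketch below assumes \(a = 0.2\), \(\sigma = 3\), \(\lambda = 0.7\), and \(\theta _x = 100\); the function names, the finite upper integration limit, and the grid are our own illustrative choices.

```python
import math
import numpy as np

def emg_turn_on(y, a, sigma, lam, theta_x):
    """Eq. (15): a times Eq. (14), the EMG resolution convolved with a step."""
    expo = math.exp(0.5 * lam * (2 * a * theta_x + lam * sigma**2 - 2 * y))
    tail = math.erfc(sigma / math.sqrt(2) * (lam + (a * theta_x - y) / sigma**2))
    core = math.erf((y - a * theta_x) / (math.sqrt(2) * sigma))
    return 0.5 * (1.0 - expo * tail + core)

_erfc = np.vectorize(math.erfc)  # element-wise erfc without extra dependencies

def emg_turn_on_numeric(y, a, sigma, lam, theta_x, x_max=600.0, n=100_001):
    """a times the integral of Eq. (10) over theta_x <= x <= x_max (trapezoid rule)."""
    x = np.linspace(theta_x, x_max, n)
    dens = (0.5 * lam * np.exp(0.5 * lam * (2 * a * x + lam * sigma**2 - 2 * y))
            * _erfc((a * x + lam * sigma**2 - y) / (math.sqrt(2) * sigma)))
    return a * float(np.sum(0.5 * (dens[1:] + dens[:-1]) * np.diff(x)))

A, SIGMA, LAM, THETA_X = 0.2, 3.0, 0.7, 100.0
ys = [10.0, 18.0, 20.0, 25.0, 35.0, 50.0]
pairs = [(emg_turn_on(y, A, SIGMA, LAM, THETA_X),
          emg_turn_on_numeric(y, A, SIGMA, LAM, THETA_X)) for y in ys]
```

The upper limit `x_max` only needs to be large enough that the resolution function at `x_max` no longer overlaps the y values of interest; much larger limits risk overflow in the exponential factor.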

Fig. 7

Skewed Gaussian example with \(\mu = 0.2x\), \(\sigma = 3\), and \(\lambda = 0.7\). (a) x vs y. (b) The normalized y profile. Also shown (blue dashed) is a skewed Gaussian convolved with a step function, with function parameters taken at the trigger threshold

Figure 8 shows the skewed Gaussian example for \(\mu = 0.2x\), \(\sigma = 0.015x\), and \(\lambda = 0.7\). The efficiency curve again shows a plateau and the shape is described by Eq. (15), but here the parameters of the turn-on curve were obtained from a fit, which gives \(\mu = 19.8\), \(\sigma = 2.0\), and \(\lambda = 0.705\). The parameters that determine the shape therefore differ slightly from the parameters of the skewed Gaussian at the trigger threshold.

Fig. 8

Skewed Gaussian example with \(\mu = 0.2x\), \(\sigma = 0.015x\), and \(\lambda = 0.7\). (a) x vs y. (b) The normalized y profile. Also shown (blue dashed) is a fit to the curve with Eq. (15)

As previously seen in the Gaussian example, the method fails if \(\mu \) or \(\sigma \) is a polynomial of degree greater than 1 in x. It also fails if \(\lambda \) is not constant, though if the dependence of \(\lambda \) on x is not strong, an approximately flat plateau region is obtained.

4 Uncertainties

The normalization of the spectrum in x adds correlated uncertainties to the statistical uncertainties of each bin in the \(I_x(y)\) histogram. The plateau level must then be determined by a fit, which introduces an uncertainty in the ‘true’ number of events. The final efficiency curve or histogram comes with a complicated mixture of correlated and uncorrelated, statistical and systematic uncertainties. Furthermore, because the ‘true’ number of events is only an estimate, the efficiency histogram can have values greater than 1.

Some assumptions can be made to simplify the uncertainties. The relative uncertainty on the total number of events in each x bin is always smaller than the relative statistical uncertainty on the events in any single (x, y) bin, because the x bin contains all events of its (x, y) bins. Thus, the correlated uncertainties can be neglected in the \(I_x(y)\) histogram.

The efficiency histogram can be fit with an analytic function where the maximum value is constrained to 1. Then, confidence regions can be obtained in the usual way by varying the fit parameters within their uncertainties.

The uncertainty on the plateau height must be treated as a systematic uncertainty.
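One simple way to propagate the plateau-height uncertainty, shown here only as a sketch and not as the procedure used in this work, is to rescale \(I_x(y)\) with the plateau height shifted up and down by its uncertainty. The function name `efficiency_band` and the numbers in the usage line are hypothetical.

```python
import numpy as np

def efficiency_band(i_x_of_y, plateau_height, plateau_err):
    """Scale I_x(y) by the plateau height and by the height shifted by +/- its error."""
    i = np.asarray(i_x_of_y, dtype=float)
    central = i / plateau_height
    low = i / (plateau_height + plateau_err)    # larger plateau -> lower efficiency
    high = i / (plateau_height - plateau_err)   # smaller plateau -> higher efficiency
    return central, low, high

# Made-up I_x(y) values and plateau estimate, for illustration only.
central, low, high = efficiency_band([0.5, 2.5, 5.0], plateau_height=5.0, plateau_err=0.1)
```

The upper edge of the band exceeds 1 in the plateau region, which illustrates why the efficiency histogram can take values greater than 1.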

5 Discussion

The method introduced here allows an estimation of the trigger efficiency turn-on curve without the use of calibration data. This method will have a larger uncertainty than a typical efficiency calibration for the same amount of data used. However, since ‘physics’ datasets are often much larger than calibration datasets, this method can result in a more precise estimate.

The method works if a number of assumptions are true: (1) The trigger bases the trigger decision solely on an online parameter x, and the value of x is known for each event. (2) The trigger curve in x is a step function; that is, the efficiency is known to be 0 for \(x < \theta _x\) and 1 for \(x \geqslant \theta _x\). (3) The existing data covers the full available parameter space in the trigger turn-on region, and far enough into the plateau region to estimate the plateau height. (4) The distribution of the offline parameter y for events with the same x, \(I(y;x=const)\), has the same functional form at all values of x (or at least for all values of x in the critical turn-on region and far enough into the plateau region that the plateau height can be obtained). (5) The \(I(x,y)\) histogram shows a linear dependence of y on x, and the width of the \(I(y;x=const)\) distribution is a polynomial of order \(\leqslant 1\) in x. Non-Gaussian distributions will have additional requirements on the distribution shape parameters. These do not have to be explicitly determined; if this method produces a flat plateau region, the conditions are met.

Condition (1) is typically met in particle physics experiments. We note that if the trigger variable x is not recorded for each event, it can often be reconstructed by programming an offline analysis algorithm that reproduces the trigger module algorithm.

Condition (2) must be met such that the result of this method is in fact a trigger efficiency. This method determines the efficiency curve of y with regard to x, not the efficiency of y with regard to the actual trigger. But if the trigger efficiency in x is a step function with values of either 0 or 1, the curve obtained is the trigger efficiency in y. If the trigger efficiency in x is not a step function, then it must additionally be obtained another way before the trigger efficiency in y can be determined. A typical situation where this is the case would be one where the trigger is pre-scaled or otherwise known to approach a value different from one.

Condition (3) means that the physics data recorded must contain a sufficient number of events with values of x near the trigger threshold. If it does not, presumably, the trigger turn-on curve is not of interest to begin with.

Condition (4) is the only one that is not trivial to verify based just on the physics data. An unchanging functional form of the \(I(y;x=const)\) distribution is a reasonable assumption in most cases, but should if possible be checked using a traditional efficiency calibration approach. If \(I(y;x=const)\) is known analytically, such that the shape of the turn-on curve can be obtained by convolution with a step function, then the shape of the data should be well described by this analytic turn-on curve. If the shapes do not match, it would be an indication that condition (4) is not met.

We showed analytically that this method works for certain forms of Gaussian and skewed Gaussian resolution functions. We expect that it will work for many realistic distributions \(I(x,y)\), as long as \(I(y;x=const)\) tends to 0 at both tails. This can be understood intuitively: at the integration borders \(y = -\infty \) and \(y = \infty \), the distribution is 0. The value of x determines where inside the integration region the distribution peaks (through \(\mu = \mu (x)\)), but since the integration runs from minus to plus infinity, the location of the distribution on the y-axis is not relevant.
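For the special case of a resolution function whose shape does not change with x, the shift argument can be written out explicitly. As a simplifying assumption of ours, suppose the normalized distribution depends on x only through a shift, \(I_x(x,y) = f(y - ax)\) with \(a > 0\) and \(\int _{-\infty }^\infty f(u)\, du = 1\). The substitution \(u = y - ax\) then gives

$$\begin{aligned} \hat{I}_x(y)&= \int _{-\infty }^\infty f(y - ax)\, dx = \frac{1}{a}\int _{-\infty }^\infty f(u)\, du = \frac{1}{a} \end{aligned}$$

independent of y and of the detailed shape of f. This reproduces the plateau height \(\frac{1}{a}\) found in Eqs. (4) and (12), but it does not cover shape parameters that vary with x, such as \(\sigma (x) = b\cdot x\).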

The events used to obtain the trigger efficiency calibration do not need to be signal events. Taking the DEAP-3600 detector as an example again, events from a high-rate background, the beta decay of \(^{39}\mathrm Ar\), are used to obtain the trigger efficiency calibration.

6 Summary

We have presented a method to obtain the trigger efficiency turn-on curve for a physics dataset. This method uses only the physics data itself; that is, it does not require calibration data. It is based on several assumptions that are fulfilled for many types of experiments, but at least one of which is difficult to verify without a calibration. Therefore, this method is particularly well suited to tracking the efficiency turn-on curve over time. It can be verified against a calibration at any point of data recording and, once verified, be used to obtain the trigger efficiency curve over time, for example if calibration parameters drift faster than it is reasonable to record calibrations.

It can also be used to find out where full efficiency is reached, even if the precise shape of the turn-on is not reliable because the method was not verified. This can be useful when a dataset must be analyzed for which no other calibration is available, to at least find out in which region the recorded data is reliable.