In this section, we study the detection of a deterministic gravitational-wave signal h(t; θ) of the general form given by Eq. (32) and the estimation of its parameters θ using the maximum-likelihood (ML) principle. We assume that the noise n(t) in the detector is a zero-mean, Gaussian, and stationary random process. When the gravitational-wave signal h(t; θ) is present, the data in the detector are x(t; θ) = n(t) + h(t; θ). The parameters θ = (a, ξ) of the signal (32) split into extrinsic (or amplitude) parameters a and intrinsic ones ξ.
The \({\mathcal F}\)-statistic
For the gravitational-wave signal h(t; a, ξ) of the form given in Eq. (32) the log likelihood function (46) can be written as
$$\log \Lambda [x;{\rm{a}},\xi ] = {{\rm{a}}^{\rm{T}}} \cdot {\rm{N}}[x;\xi ] - {1 \over 2}{{\rm{a}}^{\rm{T}}} \cdot {\rm{M(}}\xi) \cdot {\rm{a,}}$$
(59)
where the components of the n × 1 column matrix N and the n × n square matrix M are given by
$${N_k}[x;\xi ]: = (x\vert {h_k}(t;\xi)),\,{M_{kl}}(\xi): = ({h_k}(t;\xi)\vert {h_l}(t;\xi)),\,k,l = 1, \ldots, n.$$
(60)
The ML equations for the extrinsic parameters a, ∂ logΛ[x; a, ξ]/∂a = 0, can be solved explicitly to show that the ML estimators â of the parameters a are given by
$${\rm{\hat a}}[x;\xi ] = {\rm{M}}{(\xi)^{- 1}} \cdot {\rm{N}}[x;\xi ].$$
(61)
Replacing the extrinsic parameters a in Eq. (59) by their ML estimators â, we obtain the reduced log likelihood function,
$${\mathcal F}[x;\xi ]: = \log \Lambda [x;{\rm{\hat a}}[x;\xi ],\xi ] = {1 \over 2}{\rm N}{[x;\xi ]^{\rm{T}}} \cdot {\rm{M}}{(\xi)^{- 1}} \cdot {\rm{N}}[x;\xi ],$$
(62)
that we call the \({\mathcal F}\)-statistic. The \({\mathcal F}\)-statistic depends nonlinearly on the intrinsic parameters ξ of the signal, but it does not depend on the extrinsic parameters a.
The procedure to detect the gravitational-wave signal of the form (32) and estimate its parameters consists of two parts. The first part is to find the (local) maxima of the \({\mathcal F}\)-statistic (62) in the intrinsic parameter space. The ML estimators \({\hat \xi}\) of the intrinsic parameters ξ are those values of ξ for which the \({\mathcal F}\)-statistic attains a maximum. The second part is to calculate the estimators â of the extrinsic parameters a from the analytic formula (61), where the matrix M and the correlations N are computed for the intrinsic parameters ξ replaced by their ML estimators \({\hat \xi}\) obtained in the first part of the analysis. We call this procedure maximum-likelihood detection. See Section 4.6 for a discussion of the algorithms to find the (local) maxima of the \({\mathcal F}\)-statistic.
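As a concrete illustration, here is a minimal numpy sketch of this two-part procedure for a single value of ξ. It is not the implementation of any particular pipeline: the filter array, the sampling step, and the one-sided spectral density S0 are illustrative assumptions, and the inner product (x|y) is approximated by its white-noise, time-domain form.

```python
import numpy as np

def inner(x, y, dt, S0):
    # White-noise approximation of the inner product (x|y) ~ (2/S0) * int x(t) y(t) dt
    return 2.0 / S0 * np.sum(x * y) * dt

def fstat_and_amplitudes(x, filters, dt, S0):
    """F-statistic (62) and ML amplitude estimators (61) for one value of xi.

    `filters` is a hypothetical (n, len(x)) array holding the templates h_k(t; xi).
    """
    n = filters.shape[0]
    Nvec = np.array([inner(x, filters[k], dt, S0) for k in range(n)])   # Eq. (60)
    M = np.array([[inner(filters[k], filters[l], dt, S0)
                   for l in range(n)] for k in range(n)])               # Eq. (60)
    a_hat = np.linalg.solve(M, Nvec)                                    # Eq. (61)
    F = 0.5 * Nvec @ a_hat                                              # Eq. (62)
    return F, a_hat
```

In a search one would evaluate such a routine on a grid of intrinsic parameters and keep the grid points at which F is maximal.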
Targeted searches
The \({\mathcal F}\)-statistic can also be used in the case when the intrinsic parameters are known. An example of such an analysis, called a targeted search, is the search for a gravitational-wave signal from a known pulsar. In this case, assuming that the gravitational-wave emission follows the radio timing, the phase of the signal is known from pulsar observations and the only unknown parameters of the signal are the amplitude (or extrinsic) parameters a [see Eq. (30)]. To detect the signal one calculates the \({\mathcal F}\)-statistic for the known values of the intrinsic parameters and compares it to a threshold [67]. When a statistically-significant signal is detected, one then estimates the amplitude parameters from the analytic formula (61).
In [109] it was shown that the maximum-likelihood \({\mathcal F}\)-statistic can be interpreted as a Bayes factor with a simple, but unphysical, amplitude prior (and an additional unphysical sky-position weighting). Using a more physical prior based on an isotropic probability distribution for the unknown spin-axis orientation of emitting systems, a new detection statistic (called the \({\mathcal B}\)-statistic) was obtained. Monte Carlo simulations for signals with random (isotropic) spin-axis orientations show that the \({\mathcal B}\)-statistic is more powerful (in terms of its expected detection probability) than the \({\mathcal F}\)-statistic. A modified version of the \({\mathcal F}\)-statistic that can be more powerful than the original one has been studied in [20].
Signal-to-noise ratio and the Fisher matrix
The detectability of the signal h(t; θ) is determined by the signal-to-noise ratio ρ. In general it depends on all the signal’s parameters θ and can be computed from [see Eq. (47)]
$$\rho (\theta) = \sqrt {(h(t;\theta)\vert h(t;\theta))}.$$
(63)
The signal-to-noise ratio for the signal (32) can be written as
$$\rho ({\rm{a}},\xi) = \sqrt {{{\rm{a}}^{\rm{T}}} \cdot {\rm{M}}(\xi) \cdot {\rm{a}}},$$
(64)
where the components of the matrix M(ξ) are defined in Eq. (60).
The accuracy of estimation of the signal’s parameters is determined by the Fisher information matrix Γ. The components of Γ in the case of Gaussian noise can be computed from Eq. (58). For the signal given in Eq. (32) the signal’s parameters (collected into the vector θ) split into extrinsic and intrinsic parameters: θ = (a, ξ), where a = (a1, …, an) and ξ = (ξ1, …, ξm). It is convenient to distinguish between extrinsic and intrinsic parameter indices. Therefore, we use calligraphic lettering to denote the intrinsic parameter indices: \({\xi _{\mathcal A}},{\mathcal A} = 1, \ldots,m\). The matrix Γ has dimension (n + m) × (n + m) and it can be written in terms of four block matrices for the two sets of parameters a and ξ,
$$\Gamma ({\rm a},\xi) = \left({\begin{array}{*{20}c} {{\Gamma _{{\rm{aa}}}}(\xi)} & {{\Gamma _{{\rm a}\xi}}({\rm a},\xi)} \\ {{\Gamma _{{\rm a}\xi}}{{({\rm a},\xi)}^{\rm{T}}}} & {{\Gamma _{\xi \xi}}({\rm a},\xi)} \\ \end{array}} \right),$$
(65)
where Γaa is an n × n matrix with components \((\partial h/\partial {a_i}\vert \partial h/\partial {a_j})\) (i, j = 1, …, n), Γaξ is an n × m matrix with components \((\partial h/\partial {a_i}\vert \partial h/\partial {\xi _\mathcal A})\) (i = 1, …, n, \({\mathcal A} = 1, \ldots, m\)), and finally Γξξ is an m × m matrix with components \((\partial h/\partial {\xi _\mathcal A}\vert \partial h/\partial {\xi _\mathcal B})\) (\({\mathcal A},{\mathcal B} = 1, \ldots, m\)).
We introduce two families of auxiliary n × n square matrices \({{\rm{F}}_{(\mathcal A)}}\) and \({{\rm{S}}_{({\mathcal A}{\mathcal B})}}\) (\({\mathcal A},{\mathcal B} = 1, \ldots, m\)), which depend on the intrinsic parameters ξ only (the indices \({\mathcal A},{\mathcal B}\) within parentheses serve here as matrix labels). The components of the matrices \({{\rm{F}}_{(\mathcal A)}}\) and \({{\rm{S}}_{({\mathcal A}{\mathcal B})}}\) are defined as follows:
$${F_{(\mathcal{A})ij}}(\xi): = \left({{h_i}(t;\xi)\left\vert {{{\partial {h_j}(t;\xi)} \over {\partial {\xi _\mathcal{A}}}}} \right.} \right),\quad i,j = 1, \ldots, n,\quad \mathcal{A} = 1, \ldots, m,$$
(66)
$${S_{({\mathcal A}{\mathcal B})ij}}(\xi): = \left({{{\partial {h_i}(t;\xi)} \over {\partial {\xi _{\mathcal A}}}}\left\vert {{{\partial {h_j}(t;\xi)} \over {\partial {\xi _{\mathcal B}}}}} \right.} \right),\quad i,j = 1, \ldots, n,\quad {\mathcal A},{\mathcal B} = 1, \ldots, m.$$
(67)
Making use of the definitions (60) and (66)–(67) one can write the more explicit form of the matrices Γaa, Γaξ, and Γξξ,
$${\Gamma _{aa}}(\xi) = {\rm{M}}(\xi),$$
(68)
$${\Gamma _{a\xi}}({\rm{a}},\xi) = ({{\rm{F}}_{(1)}}(\xi) \cdot {\rm{a}} \cdots {{\rm{F}}_{(m)}}(\xi) \cdot {\rm{a}}),$$
(69)
$${\Gamma _{\xi \xi}}({\rm{a}},\xi) = \left({\begin{array}{*{20}c} {{{\rm{a}}^{\rm{T}}} \cdot {{\rm{S}}_{(11)}}(\xi) \cdot {\rm{a}}} & \cdots & {{{\rm{a}}^{\rm{T}}} \cdot {{\rm{S}}_{(1m)}}(\xi) \cdot {\rm{a}}} \\ \vdots & \ddots & \vdots \\ {{{\rm{a}}^{\rm{T}}} \cdot {{\rm{S}}_{(m1)}}(\xi) \cdot {\rm{a}}} & \cdots & {{{\rm{a}}^{\rm{T}}} \cdot {{\rm{S}}_{(mm)}}(\xi) \cdot {\rm{a}}} \\ \end{array}} \right).$$
(70)
The notation introduced above means that the matrix Γaξ can be thought of as a 1 × m row matrix made of the n × 1 column matrices \({{\rm{F}}_{(\mathcal A)}} \cdot {\rm{a}}\). Thus, the general formula for a component of the matrix Γaξ is
$${({\Gamma _{{\rm{a}}\xi}})_{i\mathcal{A}}} = {({{\rm{F}}_{(\mathcal{A})}} \cdot {\rm{a}})_i} = \sum\limits_{j = 1}^n {F_{(\mathcal{A})ij}}\,{a_j}, \quad i = 1, \ldots, n, \quad \mathcal{A} = 1, \ldots, m.$$
(71)
The general component of the matrix Γξξ is given by
$${({\Gamma _{\xi \xi}})_{{\mathcal A}{\mathcal B}}} = {{\rm a}^{\rm T}} \cdot {{\rm{S}}_{(\mathcal{A}\mathcal{B})}} \cdot {\rm a} = \sum\limits_{i = 1}^n \sum\limits_{j = 1}^n {S_{({\mathcal A}{\mathcal B})ij}}\,{a_i}{a_j}, \quad {\mathcal A},{\mathcal B} = 1, \ldots, m.$$
(72)
The covariance matrix C, which approximates the expected covariances of the ML estimators of the parameters θ, is defined as Γ−1. Applying the standard formula for the inverse of a block matrix [90] to Eq. (65), one gets
$${\rm C}({\rm{a}},\xi) = \left({\begin{array}{*{20}c} {{{\rm C}_{{\rm{aa}}}}({\rm{a}},\xi)} & {{{\rm C}_{{\rm{a}}\xi}}({\rm{a}},\xi)} \\ {{{\rm C}_{{\rm{a}}\xi}}{{({\rm{a}},\xi)}^T}} & {{{\rm C}_{\xi \xi}}({\rm{a}},\xi)} \\ \end{array}} \right),$$
(73)
where the matrices Caa, Caξ, and Cξξ can be expressed in terms of the matrices Γaa = M, Γaξ, and Γξξ as follows:
$${{\rm{C}}_{{\rm{aa}}}}({\rm{a}},\xi) = {\rm{M(}}\xi)^{- 1} + {\rm{M(}}\xi)^{- 1} \cdot {\Gamma _{{\rm{a}}\xi}}({\rm{a,}}\xi) \cdot \bar \Gamma {({\rm{a,}}\xi)^{- 1}} \cdot {\Gamma _{{\rm{a}}\xi}}{({\rm{a,}}\xi)^{\rm{T}}} \cdot {\rm{M(}}\xi)^{- 1},$$
(74)
$${{\rm{C}}_{{\rm{a}}\xi}}({\rm{a}},\xi) = - {\rm{M}}{(\xi)^{- 1}}\cdot{\Gamma _{{\rm{a}}\xi}}({\rm{a}},\xi)\cdot\bar \Gamma {({\rm{a}},\xi)^{- 1}},$$
(75)
$${{\rm{C}}_{\xi \xi}}({\rm{a}},\xi) = \bar \Gamma {({\rm{a}},\xi)^{- 1}}.$$
(76)
In Eqs. (74)–(76) we have introduced the m × m matrix:
$$\bar \Gamma ({\rm{a,}}\xi): = {\Gamma _{\xi \xi}}({\rm{a,}}\xi) - {\Gamma _{{\rm{a}}\xi}}{({\rm{a}},\xi)^{\rm{T}}} \cdot {\rm{M(}}\xi)^{- 1} \cdot {\Gamma _{{\rm{a}}\xi}}({\rm{a,}}\xi).$$
(77)
We call the matrix \({\bar \Gamma}\) (which is the Schur complement of the matrix M) the projected Fisher matrix (onto the space of intrinsic parameters). Because the matrix \({\bar \Gamma}\) is the inverse of the intrinsic-parameter submatrix Cξξ of the covariance matrix C, it expresses the information available about the intrinsic parameters that takes into account the correlations with the extrinsic parameters. The matrix \({\bar \Gamma}\) is still a function of the putative extrinsic parameters.
We next define the normalized projected Fisher matrix (an m × m square matrix)
$${\bar \Gamma _n}({\rm{a}},\xi): = {{\bar \Gamma ({\rm{a,}}\xi)} \over {\rho {{({\rm{a,}}\xi)}^2}}},$$
(78)
where ρ is the signal-to-noise ratio. Making use of the definition (77) and Eqs. (71)–(72) we can show that the components of this matrix can be written in the form
$${({\bar \Gamma _n}({\rm{a}},\xi))_{\mathcal{A}\mathcal{B}}} = {{{{\rm{a}}^{\rm{T}}} \cdot {{\rm{A}}_{(\mathcal{A}\mathcal{B})}}(\xi) \cdot {\rm{a}}} \over {{{\rm{a}}^{\rm{T}}} \cdot {\rm{M}}(\xi) \cdot {\rm{a}}}},\quad \mathcal{A},\mathcal{B} = 1, \ldots, m,$$
(79)
where \({{\rm{A}}_{({\mathcal A}{\mathcal B})}}\) is the n × n matrix defined as
$${{\rm{A}}_{(\mathcal{A}\mathcal{B})}}(\xi): = {{\rm{S}}_{(\mathcal{A}\mathcal{B})}}(\xi) - {{\rm{F}}_{(\mathcal{A})}}{(\xi)^{\rm{T}}} \cdot {\rm{M}}{(\xi)^{- 1}} \cdot {{\rm{F}}_{(\mathcal{B})}}(\xi),\quad \mathcal{A},\mathcal{B} = 1, \ldots, m.$$
(80)
From the Rayleigh principle [90] it follows that the minimum value of the component \({({{\bar \Gamma}_n}({\rm{a}},\xi))_{{\mathcal A}{\mathcal B}}}\) is given by the smallest eigenvalue of the matrix \({{\rm{M}}^{- 1}} \cdot {{\rm{A}}_{({\mathcal A}{\mathcal B})}}\). Similarly, the maximum value of the component \({({{\bar \Gamma}_n}({\rm{a}},\xi))_{{\mathcal A}{\mathcal B}}}\) is given by the largest eigenvalue of that matrix.
Because the trace of a matrix is equal to the sum of its eigenvalues, the m × m square matrix \({\tilde \Gamma}\) with components
$${(\tilde \Gamma (\xi))_{\mathcal{A}\mathcal{B}}}: = {1 \over n}{\rm{Tr}}\left({\rm{M}}{(\xi)^{- 1}} \cdot {{\rm{A}}_{(\mathcal{A}\mathcal{B})}}(\xi)\right),\quad \mathcal{A},\mathcal{B} = 1, \ldots, m,$$
(81)
expresses the information available about the intrinsic parameters, averaged over the possible values of the extrinsic parameters. Note that the factor 1/n is specific to the case of n extrinsic parameters. We shall call \({\tilde \Gamma}\) the reduced Fisher matrix. This matrix is a function of the intrinsic parameters alone. We shall see that the reduced Fisher matrix plays a key role in the signal processing theory that we present here. It is used in the calculation of the threshold for statistically significant detection and in the formula for the number of templates needed to do a given search.
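To make Eqs. (80)–(81) concrete, a short numpy sketch follows; the inputs M, F_(A), and S_(AB) are assumed to have been computed beforehand from the inner products (60), (66), and (67) for a given ξ.

```python
import numpy as np

def reduced_fisher(M, F, S):
    """Reduced Fisher matrix, Eq. (81).

    M : (n, n) matrix of Eq. (60)
    F : list of m matrices F_(A) of Eq. (66), each (n, n)
    S : nested m x m list of matrices S_(AB) of Eq. (67), each (n, n)
    """
    n, m = M.shape[0], len(F)
    Minv = np.linalg.inv(M)
    Gtilde = np.empty((m, m))
    for A in range(m):
        for B in range(m):
            A_AB = S[A][B] - F[A].T @ Minv @ F[B]       # Eq. (80)
            Gtilde[A, B] = np.trace(Minv @ A_AB) / n    # Eq. (81)
    return Gtilde
```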
For the case of the signal
$$h(t;{A_0},{\phi _0},\xi) = {A_0}g(t;\xi)\cos (\phi (t;\xi) - {\phi _0}),$$
(82)
the normalized projected Fisher matrix \({{\bar \Gamma}_n}\) is independent of the extrinsic parameters A0 and ϕ0, and it is equal to the reduced Fisher matrix \({\tilde \Gamma}\) [102]. The components of \({\tilde \Gamma}\) are given by
$${\tilde \Gamma _{\mathcal{A}\mathcal{B}}} = {({\Gamma _0})_{\mathcal{A}\mathcal{B}}} - {{{{({\Gamma _0})}_{{\phi _0}\mathcal{A}}}{{({\Gamma _0})}_{{\phi _0}\mathcal{B}}}} \over {{{({\Gamma _0})}_{{\phi _0}{\phi _0}}}}},$$
(83)
where Γ0 is the Fisher matrix for the signal g(t; ξ) cos (ϕ(t; ξ) − ϕ0).
False alarm and detection probabilities
False alarm and detection probabilities for known intrinsic parameters
We first present the false alarm and detection probabilities when the intrinsic parameters ξ of the signal are known. In this case the \({\mathcal F}\)-statistic is a quadratic form of the random variables that are correlations of the data. As we assume that the noise in the data is Gaussian and the correlations are linear functions of the data, \({\mathcal F}\) is a quadratic form of Gaussian random variables. Consequently the \({\mathcal F}\)-statistic has a distribution related to the χ2 distribution. One can show (see Section III B in [65]) that for the signal given by Eq. (30), \(2{\mathcal F}\) has a χ2 distribution with n degrees of freedom when the signal is absent, and a noncentral χ2 distribution with n degrees of freedom and noncentrality parameter equal to the square of the signal-to-noise ratio when the signal is present.
As a result the pdfs p0 and p1 of the \({\mathcal F}\)-statistic, when the intrinsic parameters are known and when respectively the signal is absent or present in the data, are given by
$${p_0}(\mathcal{F}) = {{{\mathcal{F}^{n/2 - 1}}} \over {(n/2 - 1)!}}\exp (- \mathcal{F}),$$
(84)
$${p_1}(\rho,\mathcal{F}) = {{{{(2\mathcal{F})}^{(n/2 - 1)/2}}} \over {{\rho ^{n/2 - 1}}}}{I_{n/2 - 1}}\left({\rho \sqrt {2\mathcal{F}}} \right)\exp \left({- \mathcal{F} - {1 \over 2}{\rho ^2}} \right),$$
(85)
where n is the number of degrees of freedom of the χ2 distributions and In/2−1 is the modified Bessel function of the first kind and order n/2 − 1. The false alarm probability PF is the probability that \({\mathcal F}\) exceeds a certain threshold \({{\mathcal F}_0}\) when there is no signal. In our case we have
$${P_{\rm{F}}}({\mathcal{F}_0}): = \int\nolimits_{{\mathcal{F}_0}}^\infty {{p_0}(\mathcal{F})d\mathcal{F} = \exp (- {\mathcal{F}_0})\sum\limits_{k = 0}^{n/2 - 1} {{{{\mathcal{F}_0}^k} \over {k!}}.}}$$
(86)
The probability of detection PD is the probability that \({\mathcal F}\) exceeds the threshold \({{\mathcal F}_0}\) when a signal is present and the signal-to-noise ratio is equal to ρ:
$${P_{\rm{D}}}(\rho, {\mathcal{F}_0}): = \int\nolimits_{{\mathcal{F}_0}}^\infty {p_1}(\rho, \mathcal{F}){\rm{d\mathcal{F}}}.$$
(87)
The integral in the above formula can be expressed in terms of the generalized Marcum Q-function [132, 58], \({P_{\rm{D}}}(\rho, {{\mathcal F}_0}) = Q(\rho, \sqrt {2{{\mathcal F}_0}})\). We see that when the noise in the detector is Gaussian and the intrinsic parameters are known, the probability of detection of the signal depends on a single quantity: the optimal signal-to-noise ratio ρ.
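Because \(2{\mathcal F}\) follows the (noncentral) χ2 distributions quoted above, Eqs. (86)–(87) can be checked with standard library routines; a small sketch with illustrative numbers:

```python
from scipy import stats

n = 4        # degrees of freedom (number of extrinsic parameters)
F0 = 10.0    # assumed threshold on the F-statistic
rho = 6.0    # assumed optimal signal-to-noise ratio

P_F = stats.chi2.sf(2 * F0, df=n)               # Eq. (86)
P_D = stats.ncx2.sf(2 * F0, df=n, nc=rho**2)    # Eq. (87), the generalized Marcum Q
print(P_F, P_D)
```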
False alarm probability for unknown intrinsic parameters
Next we return to the case in which the intrinsic parameters ξ are not known. Then the statistic \({\mathcal F}[x;\xi ]\) given by Eq. (62) is a certain multiparameter random process called a random field (see the monographs [5, 6] for a comprehensive discussion of random fields). If the vector ξ has only one component, the random field is simply a random process. For random fields we define the autocovariance function \({\mathcal C}\) just in the same way as we define such a function for a random process:
$$\mathcal{C}(\xi, {\xi {\prime}}): = {{\rm{E}}_0}[\mathcal{F}[x;\xi ]\mathcal{F}[x;{\xi {\prime}}]] - {{\rm{E}}_0}[\mathcal{F}[x;\xi ]]{{\rm{E}}_0}[\mathcal{F}[x;{\xi {\prime}}]],$$
(88)
where ξ and ξ′ are two values of the intrinsic parameter set, and E0 is the expectation value when the signal is absent. One can show that for the signal (30) the autocovariance function \({\mathcal C}\) is given by
$${\mathcal C}(\xi, {\xi \prime}) = {1 \over 2}{\rm{Tr}}\left({{\rm{Q(}}\xi, {\xi {\prime}}) \cdot {\rm{M(}}{\xi {\prime}})^{- 1} \cdot {\rm{Q(}}\xi,{\xi {\prime}})^{\rm{T}} \cdot {\rm{M(}}\xi)^{- 1}} \right),$$
(89)
where Q is an n × n matrix with components
$${\rm{Q(}}\xi, {\xi {\prime}})_{ij}: = ({h_i}(t;\xi)\vert {h_j}(t;{\xi {\prime}})),\,i,j = 1, \ldots n.$$
(90)
Obviously Q(ξ, ξ) = M(ξ), and therefore \({\mathcal C}(\xi, \xi) = n/2\).
One can estimate the false alarm probability in the following way [68]. The autocovariance function \({\mathcal C}\) tends to zero as the displacement Δξ = ξ′ − ξ increases (it is maximal for Δξ = 0). Thus we can divide the parameter space into elementary cells such that in each cell the autocovariance function \({\mathcal C}\) is appreciably different from zero. The realizations of the random field within a cell will be correlated (dependent), whereas realizations of the random field in different cells are almost uncorrelated (independent). Thus, the number of cells covering the parameter space gives an estimate of the number of independent realizations of the random field.
We choose the elementary cell with its origin at the point ξ to be a compact region with boundary defined by the requirement that the autocovariance \({\mathcal C}(\xi, {\xi \prime})\) between the origin ξ and any point ξ′ at the cell’s boundary equals half of its maximum value, i.e., \({\mathcal C}(\xi, \xi)/2\). Thus, the elementary cell is defined by the inequality
$$\mathcal{C}(\xi, {\xi {\prime}}) \leq {1 \over 2}\mathcal{C}(\xi, \xi) = {n \over 4},$$
(91)
with ξ at the cell’s center and ξ′ on the cell’s boundary.
To estimate the number of cells we perform a Taylor expansion of the autocovariance function up to second-order terms:
$$\mathcal{C}(\xi ,\xi \prime ) \cong \frac{n}{2} + \sum\limits_{\mathcal{A} = 1}^m {{{\left. {\frac{{\partial \mathcal{C}(\xi ,\xi \prime )}}{{\partial {\xi _\mathcal{A}}\prime }}} \right|}_{\xi \prime = \xi }}} \Delta {\xi _\mathcal{A}} + \frac{1}{2}\sum\limits_{\mathcal{A},\mathcal{B} = 1}^m {{{\left. {\frac{{{\partial ^2}\mathcal{C}(\xi ,\xi \prime )}}{{\partial {{\xi '}_\mathcal{A}}\partial {{\xi '}_\mathcal{B}}}}} \right|}_{\xi \prime = \xi }}} \Delta {\xi _\mathcal{A}}\Delta {\xi _\mathcal{B}}.$$
(92)
As \({\mathcal C}\) attains its maximum value when ξ − ξ′ = 0, we have
$${\left. {{{\partial {\mathcal C}(\xi ,{\xi \prime})} \over {\partial {{\xi '}_{\mathcal A}}}}} \right|_{{\xi \prime} = \xi}} = 0,\quad{\mathcal A} = 1, \ldots ,m.$$
(93)
Let us introduce the symmetric matrix G with components
$${G_{\mathcal{A}\mathcal{B}}}(\xi ): = - {\left. {\frac{1}{{2\mathcal{C}(\xi ,\xi )}}\frac{{{\partial ^2}\mathcal{C}(\xi ,\xi \prime )}}{{\partial {{\xi '}_\mathcal{A}}\partial {{\xi '}_\mathcal{B}}}}} \right|_{\xi \prime = \xi }},\quad \mathcal{A},\mathcal{B} = 1, \ldots ,m.$$
(94)
Then the inequality (91) for the elementary cell can approximately be written as
$$\sum\limits_{\mathcal{A},\mathcal{B} = 1}^m {{G_{\mathcal{A}\mathcal{B}}}(\xi)\Delta {\xi _\mathcal{A}}\Delta {\xi _\mathcal{B}} \leq{1 \over 2}.}$$
(95)
It is interesting to find a relation between the matrix G and the Fisher matrix. One can show (see [78], Appendix B) that the matrix G is precisely equal to the reduced Fisher matrix \({\tilde \Gamma}\) given by Eq. (81).
If the components of the matrix G are constant (i.e., they are independent of the values of the intrinsic parameters ξ of the signal), the above inequality defines the interior of a hyperellipsoid in the m-dimensional Euclidean space ℝm, where m is the number of intrinsic parameters. The m-dimensional Euclidean volume Vcell of the elementary cell given by Eq. (95) equals
$${V_{{\rm{cell}}}} = {{{{(\pi/2)}^{m/2}}} \over {\Gamma (m/2 + 1)\sqrt {\det \,{\rm{G}}}}},$$
(96)
where Γ denotes the Gamma function. We estimate the number Ncells of elementary cells by dividing the total Euclidean volume V of the m-dimensional intrinsic parameter space by the volume Vcell of one elementary cell, i.e., we have
$${N_{{\rm{cells}}}} = {V \over {{V_{{\rm{cell}}}}}}.$$
(97)
The components of the matrix G are constant for the signal h(t; A0, ϕ0, ξ) = A0 cos (ϕ(t; ξ) − ϕ0), provided the phase ϕ(t; ξ) is a linear function of the intrinsic parameters ξ.
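For a flat (constant-G) parameter space, Eqs. (96)–(97) are a two-line computation; in the sketch below the metric G and the parameter-space volume V are made-up numbers, not taken from any concrete search.

```python
import numpy as np
from scipy.special import gamma

G = np.diag([2.0e4, 5.0e6])   # assumed constant 2 x 2 matrix G
m = G.shape[0]                # number of intrinsic parameters
V = 1.0                       # Euclidean volume of the intrinsic parameter space

V_cell = (np.pi / 2) ** (m / 2) / (gamma(m / 2 + 1)
          * np.sqrt(np.linalg.det(G)))     # Eq. (96)
N_cells = V / V_cell                       # Eq. (97)
```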
To estimate the number of cells in the case when the components of the matrix G are not constant, i.e., when they depend on the values of the intrinsic parameters ξ, one replaces Eq. (97) by
$${N_{{\rm{cells}}}} = {{\Gamma (m/2 + 1)} \over {{{(\pi/2)}^{m/2}}}}\int\nolimits_V {\sqrt {\det {\rm{G(}}\xi)} {\rm{d}}V.}$$
(98)
In this formula the matrix G is interpreted as a metric on the parameter space. This interpretation appeared for the first time in the context of gravitational-wave data analysis in the work by Owen [102], where an analogous integral formula was proposed for the number of templates needed to perform a search for gravitational-wave signals from coalescing binaries.
The concept of number of cells was introduced in [68] and it is a generalization of the idea of an effective number of samples introduced in [46] for the case of a coalescing binary signal.
We approximate the pdf of the \({\mathcal F}\)-statistic in each cell by the pdf \({p_0}({\mathcal F})\) of the \({\mathcal F}\)-statistic when the parameters are known [it is given by Eq. (84)]. The values of the \({\mathcal F}\)-statistic in each cell can be considered as independent random variables. The probability that \({\mathcal F}\) does not exceed the threshold \({{\mathcal F}_0}\) in a given cell is \(1 - {P_F}({{\mathcal F}_0})\), where \({P_F}({{\mathcal F}_0})\) is given by Eq. (86). Consequently the probability that \({\mathcal F}\) does not exceed the threshold \({{\mathcal F}_0}\) in all the Ncells cells is \({[1 - {P_F}({{\mathcal F}_0})]^{{N_{{\rm{cells}}}}}}\). Thus, the probability \(P_F^T\) that \({\mathcal F}\) exceeds \({{\mathcal F}_0}\) in one or more cells is given by
$$P_{\rm{F}}^T({\mathcal{F}_0}) = 1 - {[1 - {P_{\rm{F}}}({\mathcal{F}_0})]^{{N_{{\rm{cells}}}}}}.$$
(99)
By definition, this is the false alarm probability when the phase parameters are unknown. The number of false alarms NF is given by
$${N_{\rm{F}}} = {N_{{\rm{cells}}}}P_{\rm{F}}^T({\mathcal{F}_0}).$$
(100)
A different approach to the calculation of the number of false alarms using the Euler characteristic of level crossings of a random field is described in [65].
It was shown (see [39]) that for any finite \({{\mathcal F}_0}\) and Ncells, Eq. (99) provides an upper bound for the false alarm probability. Also in [39] a tighter upper bound for the false alarm probability was derived by modifying a formula obtained by Mohanty [92]. The formula amounts essentially to introducing a suitable coefficient multiplying the number Ncells of cells.
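Eq. (99) is easily inverted numerically for the threshold \({{\mathcal F}_0}\) that yields a prescribed overall false alarm probability; a minimal sketch with illustrative numbers (4 degrees of freedom, 10^12 cells, 1% false alarm probability):

```python
from scipy import stats
from scipy.optimize import brentq

def total_false_alarm(F0, n, N_cells):
    P_F = stats.chi2.sf(2 * F0, df=n)        # Eq. (86)
    return 1.0 - (1.0 - P_F) ** N_cells      # Eq. (99)

n, N_cells, target = 4, 1.0e12, 0.01
F0 = brentq(lambda F: total_false_alarm(F, n, N_cells) - target, 1.0, 200.0)
```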
Detection probability for unknown intrinsic parameters
When the signal is present in the data a precise calculation of the pdf of the \({\mathcal F}\)-statistic is very difficult because the presence of the signal makes the data’s random process non-stationary. As a first approximation we can estimate the probability of detection of the signal when the intrinsic parameters are unknown by the probability of detection when these parameters are known [it is given by Eq. (87)]. This approximation assumes that, when the signal is present, the true values of the intrinsic parameters fall within the cell where the \({\mathcal F}\)-statistic has a maximum. The higher the signal-to-noise ratio ρ, the better this approximation.
Number of templates
To search for gravitational-wave signals we evaluate the \({\mathcal F}\)-statistic on a grid in parameter space. The grid has to be sufficiently fine such that the loss of signals is minimized. In order to estimate the number of points of the grid, or in other words the number of templates that we need to search for a signal, the natural quantity to study is the expectation value of the \({\mathcal F}\)-statistic when the signal is present.
Thus, we assume that the data x contains the gravitational-wave signal h(t; θ) defined in Eq. (32), so x(t; θ) = h(t; θ) + n(t). The parameters θ = (a, ξ) of the signal consist of extrinsic parameters a and intrinsic parameters ξ. The data x will be correlated with the filters hi(t; ξ′) (i = 1,…, n) parameterized by the values ξ′ of the intrinsic parameters. The \({\mathcal F}\)-statistic can thus be written in the form [see Eq. (62)]
$$\mathcal{F}[x(t;{\rm{a,}}\xi {\rm{);}}{\xi {\prime}}] = {1 \over 2}{\rm{N}}{[x(t;{\rm{a}},\xi);{\xi {\prime}}]^{\rm{T}}} \cdot {\rm{M(}}{\xi {\prime}})^{- 1}\cdot{\rm{N}}[x(t;{\rm{a}},\xi);{\xi {\prime}}],$$
(101)
where the matrices M and N are defined in Eqs. (60). The expectation value of the \({\mathcal F}\)-statistic (101) is
$${\rm{E}}[\mathcal{F}[x(t;{\rm{a}},\xi);{\xi \prime}]] = {1 \over 2}(n + {{\rm{a}}^{\rm{T}}} \cdot {\rm{Q}}(\xi, {\xi {\prime}}) \cdot {\rm{M}}{({\xi {\prime}})^{- 1}} \cdot {\rm{Q}}{(\xi, {\xi {\prime}})^{\rm{T}}} \cdot {\rm{a}}),$$
(102)
where the matrix Q is defined in Eq. (90). Let us rewrite the expectation value (102) in the following form,
$${\rm{E}}[{\mathcal F}[x(t;{\rm{a}},\xi);{\xi \prime}]] = {1 \over 2}(n + \rho {({\rm{a}},\xi)^2}{\mathcal{C}_{\rm{n}}}({\rm{a}},\xi, {\xi {\prime}})),$$
(103)
where ρ is the signal-to-noise ratio and where we have introduced the normalized correlation function \({{\mathcal C}_{\rm{n}}}\),
$${{\mathcal C}_{\rm{n}}}({\rm{a}},\xi, {\xi \prime}): = {{{{\rm{a}}^{\rm{T}}} \cdot {\rm{Q}}(\xi, {\xi \prime})\cdot{\rm{M}}{{({\xi \prime})}^{- 1}}\cdot{\rm{Q}}{{(\xi, {\xi \prime})}^{\rm{T}}}\cdot{\rm{a}}} \over {{{\rm{a}}^{\rm{T}}}\cdot{\rm{M}}(\xi)\cdot{\rm{a}}}}.$$
(104)
From the Rayleigh principle [90] it follows that the minimum value of the normalized correlation function is equal to the smallest eigenvalue of the matrix M(ξ)−1 · Q(ξ, ξ′) · M(ξ′)−1 · Q(ξ, ξ′)T, whereas the maximum value is given by its largest eigenvalue. We define the reduced correlation function \({\mathcal C}\) as
$$\mathcal{C}(\xi, {\xi {\prime}}): = {1 \over 2}{\rm{Tr}}({\rm{M}}{(\xi)^{- 1}} \cdot {\rm{Q}}(\xi, {\xi {\prime}}) \cdot {\rm{M}}{({\xi {\prime}})^{- 1}} \cdot {\rm{Q}}{(\xi, {\xi {\prime}})^{\rm{T}}}).$$
(105)
As the trace of a matrix equals the sum of its eigenvalues, the reduced correlation function \({\mathcal C}\) is equal to the average of the eigenvalues of the matrix M(ξ)−1 · Q(ξ, ξ′) · M(ξ′)−1 · Q(ξ, ξ′)T. In this sense we can think of the reduced correlation function as an “average” of the normalized correlation function. The advantage of the reduced correlation function is that it depends only on the intrinsic parameters ξ, and thus it is suitable for studying the number of grid points on which the \({\mathcal F}\)-statistic needs to be evaluated. We also note that the reduced correlation function \({\mathcal C}\) precisely coincides with the autocovariance function \({\mathcal C}\) of the \({\mathcal F}\)-statistic given by Eq. (89).
As in the calculation of the number of cells, in order to estimate the number of templates we perform a Taylor expansion of \({\mathcal C}\) up to second-order terms around the true values of the parameters, and we obtain an equation analogous to Eq. (95),
$$\sum\limits_{\mathcal{A},\mathcal{B} = 1}^m {{G_{\mathcal{A}\mathcal{B}}}\Delta {\xi _\mathcal{A}}\Delta {\xi _\mathcal{B}} = 1 - {C_0},}$$
(106)
where G is given by Eq. (94). By arguments identical to those in deriving the formula for the number of cells we arrive at the following formula for the number of templates:
$${N_t} = {1 \over {{{(1 - {C_0})}^{m/2}}}}{{\Gamma (m/2 + 1)} \over {{\pi ^{m/2}}}}\int\nolimits_V {\sqrt {\det G(\xi)} {\rm d}V.}$$
(107)
When C0 = 1/2 the above formula coincides with the formula for the number Ncells of cells, Eq. (98). Here we would like to place the templates sufficiently closely so that the loss of signals is minimized. Thus 1 − C0 needs to be chosen sufficiently small. The formula (107) for the number of templates assumes that the templates are placed in the centers of hyperspheres and that the hyperspheres fill the parameter space without holes. In order to have a tiling of the parameter space without holes we can place the templates in the centers of hypercubes, which are inscribed in the hyperspheres. Then the formula for the number of templates reads
$${N_t} = {1 \over {{{(1 - {C_0})}^{m/2}}}}{{{m^{m/2}}} \over {{2^m}}}\int\nolimits_V {\sqrt {\det {\rm{G}}(\xi)} {\rm{d}}V.}$$
(108)
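A short numerical comparison of the two template counts, Eqs. (107) and (108), for made-up values of the dimension m, the correlation level C0, and the metric integral:

```python
import numpy as np
from scipy.special import gamma

m = 2                      # number of intrinsic parameters
C0 = 0.97                  # assumed correlation level of the grid
metric_integral = 1.0e8    # assumed value of the integral of sqrt(det G) over V

N_sphere = (gamma(m / 2 + 1) / np.pi ** (m / 2)
            / (1 - C0) ** (m / 2) * metric_integral)   # Eq. (107)
N_cube = (m ** (m / 2) / 2 ** m
          / (1 - C0) ** (m / 2) * metric_integral)     # Eq. (108)
```

For m = 2 the hypercube-based count (108) exceeds the sphere-based estimate (107) by a factor of π/2 ≈ 1.57, the price of a hole-free tiling.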
For the case of the signal given by Eq. (34) our formula for the number of templates is equivalent to the original formula derived by Owen [102]. Owen [102] has also introduced a geometric approach to the problem of template placement involving the identification of the Fisher matrix with a metric on the parameter space. An early study of the template placement for the case of coalescing binaries can be found in [121, 45, 26]. Applications of the geometric approach of Owen to the case of spinning neutron stars and supernova bursts are given in [33, 16].
Covering problem
The problem of how to cover the parameter space with the smallest possible number of templates, such that no point in the parameter space lies further away from a grid point than a certain distance, is known in mathematical literature as the covering problem [38]. This was first studied in the context of gravitational-wave data analysis by Prix [ ]. The maximum distance of any point to the next grid point is called the covering radius R. An important class of coverings are lattice coverings. We define a lattice in m-dimensional Euclidean space ℝm to be the set of points including 0 such that if u and v are lattice points, then also u + v and u − v are lattice points. The basic building block of a lattice is called the fundamental region. A lattice covering is a covering of ℝm by spheres of covering radius R, where the centers of the spheres form a lattice. The most important quantity of a covering is its thickness Θ defined as
$$\Theta: = {{{\rm{volume}}\,{\rm{of}}\,{\rm{one}}\,m-{\rm{dimensional}}\,{\rm{sphere}}} \over {{\rm{volume}}\,{\rm{of}}\,{\rm{the}}\,{\rm{fundamental}}\,{\rm{region}}}}.$$
(109)
In the case of a two-dimensional Euclidean space the best covering is the hexagonal covering, and its thickness is ≃ 1.21. For dimensions higher than 2 the best covering is not known. However, we know the best lattice covering for dimensions m ≤ 23. These are the \(A_m^{\ast}\) lattices, which have thicknesses \({\Theta _{A_m^{\ast}}}\) equal to
$${\Theta _{A_m^{\ast}}} = {V_m}\sqrt {m + 1} {\left({{{m(m + 2)} \over {12(m + 1)}}} \right)^{m/2}},$$
(110)
where Vm is the volume of the m-dimensional sphere of unit radius. The advantage of an \(A_m^{\ast}\) lattice over the hypercubic lattice grows exponentially with the number of dimensions.
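Eq. (110) reproduces the thickness values quoted in this section; a small sketch comparing \(A_m^{\ast}\) with the hypercubic lattice \(\mathbb{Z}^m\), whose covering radius is the standard value \(\sqrt{m}/2\):

```python
import numpy as np
from scipy.special import gamma

def V_unit_sphere(m):
    return np.pi ** (m / 2) / gamma(m / 2 + 1)

def theta_A_star(m):     # Eq. (110)
    return V_unit_sphere(m) * np.sqrt(m + 1) \
           * (m * (m + 2) / (12.0 * (m + 1))) ** (m / 2)

def theta_cubic(m):      # hypercubic covering, radius sqrt(m)/2
    return V_unit_sphere(m) * (np.sqrt(m) / 2) ** m

for m in (2, 3, 4):
    print(m, theta_A_star(m), theta_cubic(m))
# m = 2: 1.21 (hexagonal); m = 3: 1.46 vs 2.72; m = 4: 1.77 vs 4.93
```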
For the case of gravitational-wave signals from spinning neutron stars a 3-dimensional grid was constructed [18]. It consists of prisms with hexagonal bases. Its thickness is around 1.84, which is much better than the cubic grid, whose thickness is approximately 2.72, but worse than the best 3-dimensional lattice covering, whose thickness is around 1.46.
In [19] a grid was constructed in the 4-dimensional parameter space spanned by frequency, frequency derivative, and sky position of the source, for the case of an almost monochromatic gravitational-wave signal originating from a spinning neutron star. The starting point of the construction was an \(A_4^{\ast}\) lattice of thickness ≃ 1.77. The grid was then constrained so that the nodes of the grid coincide with Fourier frequencies. This allowed the use of a fast Fourier transform (FFT) to evaluate the maximum-likelihood \({\mathcal F}\)-statistic efficiently (see Section 4.6.2). The resulting lattice is only 20% thicker than the optimal \(A_4^{\ast}\) lattice.
Efficient 2-dimensional banks of templates suitable for directed searches (in which one assumes that the position of the gravitational-wave source in the sky is known, but one does not assume that the wave’s frequency and its derivative are a priori known) were constructed in [104]. All grids found in [104] enable usage of the FFT algorithm in the computation of the \({\mathcal F}\)-statistic; they have thicknesses 0.1–16% larger than the thickness of the optimal 2-dimensional hexagonal covering. The construction of the grids exploited the dependence of the grid on the position of the observational interval with respect to the origin of the time axis. The use of FFT algorithms with nonstandard frequency resolutions, achieved by zero-padding or folding the data, was also discussed.
The above template placement constructions are based on a Fisher matrix with constant coefficients, i.e., they assume that the parameter manifold is flat. The generalization to curved Riemannian parameter manifolds is difficult. An interesting idea to overcome this problem is to use stochastic template banks where a grid in the parameter space is randomly generated by some algorithm [89, 57, 86, 119].
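The core of such a stochastic construction fits in a few lines. The following toy sketch, under assumptions of a flat metric G, a unit-square parameter space, and C0 = 0.97, keeps a random candidate only if it is sufficiently distant, in the metric sense, from all templates already accepted:

```python
import numpy as np

rng = np.random.default_rng(0)
G = np.diag([50.0, 200.0])      # assumed (constant) metric on the parameter space
r2_max = 1.0 - 0.97             # squared covering radius, 1 - C0

bank = []
for _ in range(5000):           # number of random proposals (illustrative)
    p = rng.uniform(0.0, 1.0, size=2)
    # accept p only if its metric distance^2 to every stored template exceeds r2_max
    if all((p - q) @ G @ (p - q) > r2_max for q in bank):
        bank.append(p)
```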
Suboptimal filtering
To extract gravitational-wave signals from the detector’s noise one very often uses filters that are not optimal. We may have to choose an approximate, suboptimal filter because we do not know the exact form of the signal (this is almost always the case in practice), or in order to reduce the computational cost and to simplify the analysis. In the case of the signal of the form given in Eq. (32) the most natural and simplest way to proceed is to use as the detection statistic the \({\mathcal F}\)-statistic in which the filters h′k(t; ζ) (k = 1, …, n) are approximate ones instead of the optimal filters hk(t; ξ) (k = 1, …, n) matched to the signal. In general the functions h′k(t; ζ) will be different from the functions hk(t; ξ) used in optimal filtering, and also the set of parameters ζ will be different from the set of parameters ξ of the optimal filters. We call this procedure suboptimal filtering and we denote the suboptimal statistic by \({{\mathcal F}_{\rm{s}}}\). It is defined as [see Eq. (62)]
$${\mathcal{F}_{\rm{s}}}[x;\zeta ]: = {1 \over 2}{{\rm{N}}_{\rm{s}}}{[x;\zeta ]^{\rm T}} \cdot {{\rm{M}}_{\rm{s}}}{(\zeta)^{- 1}} \cdot {{\rm{N}}_{\rm{s}}}[x;\zeta ],$$
(111)
where the data-dependent n × 1 column matrix Ns and the square n × n matrix Ms have components [see Eq. (60)]
$${N_{{\text{s}}\;i}}[x;\zeta ]: = (x(t)|{h'_i}(t;\zeta )),\quad {M_{{\text{s}}ij}}(\zeta ): = ({h'_i}(t;\zeta )|{h'_j}(t;\zeta )),\quad i,j = 1, \ldots ,n.$$
(112)
We need a measure of how well a given suboptimal filter performs. To find such a measure we calculate the expectation value of the suboptimal statistic \({{\mathcal F}_{\rm{s}}}\) in the case where the data contains the gravitational-wave signal, i.e., when x(t; a, ξ) = n(t) + h(t; a, ξ). We get
$${\rm{E}}[{{\mathcal{F}}_s}[x(t;{\rm{a}},\xi);\zeta ]] = {1 \over 2}(n + {{\rm{a}}^{\rm T}} \cdot {\rm{Q}} _{\rm{s}}(\xi, \zeta) \cdot {{\rm{M}}_{\rm{s}}}{(\zeta)^{- 1}} \cdot {\rm Q}_{\rm{s}}{(\xi, \zeta)^{\rm{T}}} \cdot \rm{a}),$$
(113)
where we have introduced the matrix Qs with components
$${Q_{{\text{s}}ij}}(\xi ,\zeta ): = ({h_i}(t;\xi )|{h'_j}(t;\zeta )),\quad i,j = 1, \ldots ,n.$$
(114)
Let us rewrite the expectation value (113) in the following form,
$${\rm{E}}[{{\mathcal F}_{\rm{s}}}[x(t;{\rm{a}},\xi);\zeta ]] = {1 \over 2}\left({n + \rho {{({\rm{a}},\xi)}^2}{{{{\rm{a}}^T} \cdot {{\rm{Q}}_{\rm{s}}}(\xi, \zeta) \cdot {{\rm{M}}_{\rm{s}}}{{(\zeta)}^{- 1}} \cdot {{\rm{Q}}_{\rm{s}}}{{(\xi, \zeta)}^T} \cdot {\rm{a}}} \over {{{\rm{a}}^{\rm{T}}} \cdot M(\xi) \cdot {\rm{a}}}}} \right),$$
(115)
where ρ is the optimal signal-to-noise ratio [given in Eq. (64)]. This expectation value reaches its maximum equal to (n + ρ2)/2 when the filter is perfectly matched to the signal. Therefore, a natural measure of the performance of a suboptimal filter is the quantity FF defined by
$${\rm{FF}}(\xi): = \max\limits_{({\rm{a}},\zeta)} \sqrt {{{{{\rm{a}}^{\rm{T}}} \cdot {{\rm{Q}}_{\rm{s}}}(\xi, \zeta) \cdot {{\rm{M}}_{\rm{s}}}{{(\zeta)}^{- 1}} \cdot {{\rm{Q}}_{\rm{s}}}{{(\xi, \zeta)}^{\rm{T}}} \cdot {\rm{a}}} \over {{{\rm{a}}^{\rm{T}}} \cdot {\rm{M}}(\xi) \cdot {\rm{a}}}}}.$$
(116)
We call the quantity FF the generalized fitting factor. From the Rayleigh principle it follows that the generalized fitting factor is the square root of the largest eigenvalue of the matrix M(ξ)−1 · Qs(ξ, ζ) · Ms(ζ)−1 · Qs(ξ, ζ)T, maximized over the parameters ζ of the suboptimal filters.
In the case of a gravitational-wave signal given by
$$s(t;{A_0},\xi) = {A_0}h(t;\xi),$$
(117)
the generalized fitting factor defined above reduces to the fitting factor introduced by Apostolatos [13]:
$$\mathrm{FF}(\xi) = \max\limits_\zeta {{(h(t;\xi)\vert {h{\prime}}(t;\zeta))} \over {\sqrt {(h(t;\xi)\vert h(t;\xi))} \sqrt {({h{\prime}}(t;\zeta)\vert {h{\prime}}(t;\zeta))}}}.$$
(118)
The fitting factor is the ratio of the maximal signal-to-noise ratio that can be achieved with suboptimal filtering to the signal-to-noise ratio obtained when we use a perfectly matched, optimal filter. We note that for the signal given by Eq. (117), FF is independent of the value of the amplitude A0.
For the case of a signal of the form
$$s(t;{A_0},{\phi _0},\xi) = {A_0}\cos (\phi (t;\xi) + {\phi _0}),$$
(119)
where ϕ0 is a constant phase, the maximum over ϕ0 in Eq. (118) can be obtained analytically. Moreover, assuming that the spectral density of the noise is constant over the bandwidth of the signal and that cos ϕ(t; ξ) oscillates rapidly over the observation time, the fitting factor is approximately given by
$${\rm{FF}}(\xi) \cong \max\limits_\zeta {1 \over {{T_0}}}{\left[ {{{\left({\int\nolimits_0^{{T_0}} {\cos (\phi (t;\xi) - {\phi {\prime}}(t;\zeta)){\rm{d}}t}} \right)}^2} + {{\left({\int\nolimits_0^{{T_0}} {\sin (\phi (t;\xi) - {\phi {\prime}}(t;\zeta)){\rm{d}}t}} \right)}^2}} \right]^{1/2}}.$$
(120)
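As an illustration, Eq. (120) can be evaluated on a grid of filter parameters. In the sketch below the toy linear-chirp phase model, the true parameter ξ, and the grid of ζ values are all assumptions made for the example.

```python
import numpy as np

T0, dt = 100.0, 0.01
t = np.arange(0.0, T0, dt)

def phase(par):
    # toy phase model (assumed): linear chirp with chirp parameter `par`
    return 2 * np.pi * (10.0 * t + 0.5 * par * t ** 2)

xi = 1.0e-2                                   # true chirp parameter (assumed)
zetas = np.linspace(0.5e-2, 1.5e-2, 2001)     # grid of filter parameters

def ff_of(z):
    # bracket of Eq. (120), including the 1/T0 normalization
    dphi = phase(xi) - phase(z)
    c = np.sum(np.cos(dphi)) * dt
    s = np.sum(np.sin(dphi)) * dt
    return np.hypot(c, s) / T0

FF = max(ff_of(z) for z in zetas)
```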
In designing suboptimal filters one faces the issue of how small a fitting factor one can accept. A popular rule of thumb is to accept FF = 0.97. Assuming that the amplitude of the signal, and consequently the signal-to-noise ratio, decreases inversely proportionally to the distance from the source, the volume within which signals can be detected scales like the cube of the range, so FF = 0.97 corresponds to a loss of 1 − 0.97³ ≈ 10% of the signals that would be detected by a perfectly matched filter.
Proposals for good suboptimal (search) templates for the case of coalescing binaries are given in [35, 134], and for the case of spinning neutron stars in [65, 18].
Algorithms to calculate the \({\mathcal F}\)-statistic
The two-step procedure
In order to detect signals we search for threshold crossings of the \({\mathcal F}\)-statistic over the intrinsic parameter space. Once we have a threshold crossing we need to find the precise location of the maximum of \({\mathcal F}\) in order to estimate accurately the parameters of the signal. A satisfactory procedure is the two-step procedure. The first step is a coarse search where we evaluate \({\mathcal F}\) on a coarse grid in parameter space and locate threshold crossings. The second step, called a fine search, is a refinement around the region of parameter space where the maximum identified by the coarse search is located.
There are two methods to perform the fine search. One is to refine the grid around the threshold crossing found by the coarse search [94, 92, 134, 127], and the other is to use an optimization routine to find the maximum of \({\mathcal F}\) [65, 78]. As initial values for the optimization routine we use the parameter values found by the coarse search. There are many maximization algorithms available. One useful method is the Nelder–Mead algorithm [79], which does not require computation of the derivatives of the function being maximized.
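A minimal sketch of the fine-search step with scipy’s Nelder–Mead implementation; `fstat` below is a toy surrogate standing in for any routine that evaluates Eq. (62) at a given point ξ of the intrinsic parameter space.

```python
import numpy as np
from scipy.optimize import minimize

def fstat(xi):
    # toy surrogate with a known peak at xi = (0.3, 0.3); a real search
    # would evaluate the F-statistic of Eq. (62) here
    return 10.0 - np.sum((xi - 0.3) ** 2)

xi_coarse = np.array([0.25, 0.35])   # best point from the coarse grid (assumed)
result = minimize(lambda xi: -fstat(xi), xi_coarse, method="Nelder-Mead")
xi_fine = result.x                   # refined ML estimators of xi
```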
Evaluation of the \({\mathcal F}\)-statistic
Usually the grid in parameter space is very large and it is important to calculate the optimum statistic as efficiently as possible. In special cases the \({\mathcal F}\)-statistic given by Eq. (62) can be further simplified. For example, in the case of coalescing binaries \({\mathcal F}\) can be expressed in terms of convolutions that depend on the difference between the time-of-arrival (TOA) of the signal and the TOA parameter of the filter. Such convolutions can be efficiently computed using FFTs. For continuous sources, like gravitational waves from rotating neutron stars observed by ground-based detectors [65] or gravitational waves from stellar-mass binaries observed by space-borne detectors [78], the detection statistic \({\mathcal F}\) involves integrals of the general form
$$\int\nolimits_0^{{T_0}} {x(t)m(t;\omega, \tilde \xi)} \exp (i\omega {\phi _{\bmod}}(t;\tilde \xi))\exp (i\omega t){\rm{d}}t,$$
(121)
where \({\tilde \xi}\) are the intrinsic parameters excluding the frequency parameter ω, m is the amplitude modulation function, and ωϕmod the phase modulation function. The amplitude modulation function is slowly varying compared to the exponential terms in the integral (121). We see that the integral (121) can be interpreted as a Fourier transform (and computed efficiently with an FFT), if ϕmod = 0 and if m does not depend on the frequency ω. In the long-wavelength approximation the amplitude function m does not depend on the frequency. In this case, Eq. (121) can be converted to a Fourier transform by introducing a new time variable tb [124],
$${t_{\rm{b}}}(t;\tilde \xi): = t + {\phi _{\bmod}}(t;\tilde \xi).$$
(122)
Thus, in order to compute the integral (121), for each set of the intrinsic parameters \({\tilde \xi}\) we multiply the data by the amplitude modulation function m, resample according to Eq. (122), and perform the FFT. In the case of LISA detector data, when the amplitude modulation m depends on frequency, we can divide the data into several band-passed data sets, choosing the bandwidth of each set to be sufficiently small so that the change of m exp(iωϕmod) over the band is small. In the integral (121) we can then use, as the value of the frequency in the amplitude and phase modulation functions, the maximum frequency of the band of the signal (see [78] for details).
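A minimal numpy sketch of this resampling technique; the data array, the amplitude modulation m(t), and the phase modulation ϕmod(t) are assumed inputs, and the interpolation is the simplest (linear) choice rather than the one used in any particular pipeline.

```python
import numpy as np

def demodulated_spectrum(x, t, m_of_t, phi_mod):
    """Evaluate the integral (121) for all frequencies at once via one FFT."""
    tb = t + phi_mod(t)                       # new time variable, Eq. (122)
    y = x * m_of_t(t)                         # apply the amplitude modulation
    # resample y onto a uniform grid in t_b (linear interpolation)
    tb_uniform = np.linspace(tb[0], tb[-1], t.size)
    y_resampled = np.interp(tb_uniform, tb, y)
    return np.fft.rfft(y_resampled)           # Fourier transform over t_b
```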
Accuracy of parameter estimation
Fisher-matrix-based assessments
The Fisher matrix has been extensively used to assess the accuracy of estimation of astrophysically-interesting parameters of different gravitational-wave signals. For ground-based interferometric detectors, the first calculations of the Fisher matrix concerned gravitational-wave signals from inspiralling compact binaries (made of neutron stars or black holes) in the leading-order quadrupole approximation [50, 76, 63] and from quasi-normal modes of the Kerr black hole [48].
Cutler and Flanagan [41] initiated the study of the implications of the higher-order post-Newtonian (PN) phasing formula as applied to the parameter estimation of inspiralling binary signals. They used the 1.5PN phasing formula to investigate the problem of parameter estimation, both for spinning and non-spinning binaries, and examined the effect of the spin-orbit coupling on the estimation of parameters. The effect of the 2PN phasing formula was analyzed independently by Poisson and Will [106] and Królak, Kokkotas and Schäfer [75]. In both cases the focus was to understand the leading-order spin-spin coupling term appearing at the 2PN level when the spins were aligned perpendicularly to the orbital plane. Compared to [75], [106] also included a priori information about the magnitude of the spin parameters, which then leads to a reduction in the rms errors in the estimation of mass parameters. The case of a 3.5PN phasing formula was studied in detail by Arun et al. [17]. Inclusion of 3.5PN effects leads to an improved estimate of the binary parameters. Improvements are relatively smaller for lighter binaries. More recently the Fisher matrix was employed to assess the errors in estimating the parameters of nonspinning black-hole binaries using the complete inspiral-merger-ring-down waveforms [7].
Various authors have investigated the accuracy with which the LISA detector can determine binary parameters including spin effects. Cutler [40] determined LISA’s angular resolution and evaluated the errors of the binary masses and distance considering spins aligned or anti-aligned with the orbital angular momentum. Hughes [60] investigated the accuracy with which the redshift can be estimated (if the cosmological parameters are derived independently), and considered the black-hole ring-down phase in addition to the inspiralling signal. Seto [128] included the effect of finite armlength (going beyond the long-wavelength approximation) and found that the accuracy of the distance determination and angular resolution improve. This happens because the response of the instrument when the armlength is finite depends strongly on the location of the source, which is tightly correlated with the distance and the direction of the orbital angular momentum. Vecchio [140] provided the first estimate of parameters for precessing binaries when only one of the two supermassive black holes carries spin. He showed that modulational effects decorrelate the binary parameters to some extent, resulting in a better estimation of the parameters compared to the case when spins are aligned or anti-aligned with the orbital angular momentum. Hughes and Menou [61] studied a class of binaries, which they called “golden binaries,” for which the inspiral and ring-down phases could be observed with good enough precision to carry out valuable tests of strong-field gravity. Berti, Buonanno and Will [29] have shown that inclusion of non-precessing spin-orbit and spin-spin terms in the gravitational-wave phasing generally reduces the accuracy with which the parameters of the binary can be estimated. This is not surprising, since the parameters are highly correlated, and adding parameters effectively dilutes the available information.
An extensive study of the accuracy of parameter estimation for continuous gravitational-wave signals from spinning neutron stars was performed in [64]. In [129] Seto used the Fisher matrix to study the possibility of determining distances to rapidly rotating isolated neutron stars by measuring the curvature of the wave fronts.
Comparison with the Cramér-Rao bound
In order to test the performance of the maximization method of the \({\mathcal F}\)-statistic it is useful to perform Monte Carlo simulations of the parameter estimation and compare the simulated variances of the estimators with the variances calculated from the Fisher matrix. Such simulations were performed for various gravitational-wave signals [73, 26, 65, 36]. In these simulations one observes that, above a certain signal-to-noise ratio, called the threshold signal-to-noise ratio, the results of the Monte Carlo simulations agree very well with the calculations of the rms errors from the inverse of the Fisher matrix. However, below the threshold signal-to-noise ratio they differ by a large factor. This threshold effect is well known in signal processing [139]. There exist more refined theoretical bounds on the rms errors that explain this effect, and they were studied in the context of the gravitational-wave signals from coalescing binaries [98].
Use of the Fisher matrix in the assessment of accuracy of the parameter estimation has been critically examined in [138], where a criterion has been established for the signal-to-noise ratio above which the inverse of the Fisher matrix approximates well the covariances of the parameter estimators. In [148, 142] the errors of ML estimators of parameters of gravitational-wave signals from nonspinning black-hole binaries were calculated analytically using a power expansion of the bias and the covariance matrix in inverse powers of the signal-to-noise ratio. The first-order term in this covariance matrix expansion is the inverse of the Fisher information matrix. The use of higher-order derivatives of the likelihood function in these expansions makes the error prediction sensitive to the secondary lobes of the pdf of the ML estimators. Conditions for the validity of the Cramér-Rao lower bound are discussed in [142] as well, and some new features in regions of the parameter space so far not explored are predicted (e.g., that the bias can become the most important contributor to the parameter errors for high-mass systems with masses 200 M⊙ and above).
There exists a simple model that explains the deviations from the covariance matrix and reproduces well the results of the Monte Carlo simulations (see also [25]). The model makes use of the concept of the elementary cell of the parameter space that we introduced in Section 4.3.2. The calculation given below is a generalization of the calculation of the rms error for the case of a monochromatic signal given by Rife and Boorstyn [116].
When the values of parameters of the template that correspond to the maximum of the functional \({\mathcal F}\) fall within the cell in the parameter space where the signal is present, the rms error is satisfactorily approximated by the inverse of the Fisher matrix. However, sometimes, as a result of noise, the global maximum is in the cell where there is no signal. We then say that an outlier has occurred. In the simplest case we can assume that the probability density of the values of the outliers is uniform over the search interval of a parameter, and then the rms error is given by
$$\sigma _{{\rm{out}}}^2 = {{{\Delta ^2}} \over {12}},$$
(123)
where Δ is the length of the search interval for a given parameter. The probability that an outlier occurs will be higher the lower the signal-to-noise ratio is. Let q be the probability that an outlier occurs. Then the total variance σ² of the estimator of a parameter is the weighted sum of the two errors
$${\sigma ^2} = \sigma _{{\rm{out}}}^2q + \sigma _{{\rm{CR}}}^2(1 - q),$$
(124)
where σCR is the rms error calculated from the covariance matrix for a given parameter. One can show [65] that the probability q can be approximated by the following formula:
$$q = 1 - \int\nolimits_0^\infty {{p_1}(\rho, \mathcal{F}){{\left({\int\nolimits_0^\mathcal{F} {{p_0}(y){\rm{d}}y}} \right)}^{{N_{{\rm{cells}}}} - 1}}{\rm{d}}\mathcal{F},}$$
(125)
where p0 and p1 are the pdfs of the \({\mathcal F}\)-statistic (for known intrinsic parameters) when the signal is absent or present in the data, respectively [they are given by Eqs. (84) and (85)], and where Ncells is the number of cells in the intrinsic parameter space. This model is in good, but not perfect, agreement with the rms errors obtained from the Monte Carlo simulations (see [65]). There are clearly other reasons for deviations from the Cramér-Rao bound as well. One important effect (see [98]) is that the functional \({\mathcal F}\) has many local subsidiary maxima close to the global one. Thus, for a low signal-to-noise ratio the noise may promote a subsidiary maximum to a global one.
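The outlier model of Eqs. (123)–(125) amounts to a one-dimensional numerical integral; a sketch with scipy, using the χ2 facts of Section 4.3.1 (the values of n, ρ, Ncells, the search range Δ, and σCR are illustrative):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

n, rho, N_cells = 4, 8.0, 1.0e6

def integrand(F):
    p1 = 2 * stats.ncx2.pdf(2 * F, df=n, nc=rho ** 2)   # pdf (85) of F
    P0 = stats.chi2.cdf(2 * F, df=n)                    # CDF of the pdf (84)
    return p1 * P0 ** (N_cells - 1)

q = 1.0 - quad(integrand, 0.0, 200.0)[0]                # Eq. (125)

Delta, sigma_CR = 1.0, 1.0e-3     # assumed search range and Cramér-Rao error
sigma2 = (Delta ** 2 / 12) * q + sigma_CR ** 2 * (1 - q)   # Eqs. (123)-(124)
```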
Upper limits
Detection of a signal is signified by a large value of the \({\mathcal F}\)-statistic that is unlikely to arise from the noise-only distribution. If instead the value of \({\mathcal F}\) is consistent with pure noise with high probability, we can place an upper limit on the strength of the signal. One way of doing this is to take the loudest event obtained in the search and solve the equation
$${P_{\rm{D}}}({\rho _{{\rm{UL}}}},{{\mathcal F}_{\rm{L}}}) = \beta$$
(126)
for the signal-to-noise ratio ρUL, where PD is the detection probability given by Eq. (87), \({{\mathcal F}_{\rm{L}}}\) is the value of the \({\mathcal F}\)-statistic corresponding to the loudest event, and β is a chosen confidence [23, 2]. Then ρUL is the desired upper limit with confidence β.
When gravitational-wave data do not conform to a Gaussian probability density assumed in Eq. (87), a more accurate upper limit can be obtained by injecting the signals into the detector’s data and thereby estimating the probability of detection PD [4].
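Since PD is monotonically increasing in ρ, solving Eq. (126) is a one-dimensional root find; a sketch with illustrative numbers:

```python
from scipy import stats
from scipy.optimize import brentq

n, F_L, beta = 4, 9.5, 0.95   # dof, loudest-event F-statistic, confidence

def P_D(rho):
    return stats.ncx2.sf(2 * F_L, df=n, nc=rho ** 2)     # Eq. (87)

rho_UL = brentq(lambda r: P_D(r) - beta, 1.0e-3, 50.0)   # Eq. (126)
```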