Latent likelihood ratio tests for assessing spatial kernels in epidemic models

One of the most important issues in the critical assessment of spatio-temporal stochastic models for epidemics is the selection of the transmission kernel used to represent the relationship between infectious challenge and spatial separation of infected and susceptible hosts. As the design of control strategies is often based on an assessment of the distance over which transmission can realistically occur, and estimation of this distance is very sensitive to the choice of kernel function, it is important that models used to inform control strategies can be scrutinised in the light of observation in order to elicit possible evidence against the selected kernel function. While a range of approaches to model criticism exists, the field remains one in which the need for further research is recognised. In this paper, building on earlier contributions by the authors, we introduce a new approach to assessing the validity of spatial kernels, the latent likelihood ratio tests, which use likelihood-based discrepancy variables to compare the fit of competing models. We compare the capacity of this approach to detect model mis-specification with that of tests based on infection-link residuals. We demonstrate that the new approach can be used to formulate tests with greater power than infection-link residuals to detect kernel mis-specification, particularly when the degree of mis-specification is modest. These new tests avoid the use of a fully Bayesian approach, which may introduce undesirable complications related to computational complexity and prior sensitivity.

Electronic supplementary material: The online version of this article (10.1007/s00285-020-01529-3) contains supplementary material, which is available to authorized users.


Introduction
Selection of spatial kernel functions in spatio-temporal epidemic models is a question of paramount practical importance. It is recognised [26,11] that predictions regarding the speed of epidemic spread or propensity for transmission over long distances are very sensitive to the choice of spatial kernel function. The control of epidemics such as foot and mouth disease (FMD) in the UK [18,2,5,6,7,16,22,3,28,29,32] or citrus canker in the USA [23,14,15] has proved controversial on account of the removal of healthy hosts as part of the strategy. Such strategies have been informed by mathematical models in which the choice of spatial kernel has been a factor in determining a 'culling radius' (for example [18,6]). Methods for model criticism and comparison are therefore much needed to ensure that, as far as possible, such decisions can be supported and defended in the light of available evidence.
Although several approaches to model criticism for epidemic models exist, in the epidemic context many of these suffer from certain difficulties which motivate the development of further approaches. In [13] the approaches commonly used are reviewed. These include Bayes factors and Bayesian model selection, posterior predictive p-values, latent classical tests and the use of the DIC, including missing-data variants. One recommendation from [13] is that it is prudent to follow the advice of Box [4] that one should test selectively for those forms of misspecification which are most strongly suspected and design specific tests for this purpose. This is the approach taken throughout this paper, where we formulate latent likelihood ratio tests [29,31] for kernel misspecification and compare their sensitivity with that of the infection-link residuals test introduced in [20]. Both of these methods are examples of latent classical testing, an approach which fuses Bayesian and classical thinking by having a Bayesian observer impute the result of a classical goodness-of-fit test applied to a latent process, where the process and the test can be specified flexibly to maximise the chance of detecting the suspected misspecification, should it be present. The approach differs from a purely Bayesian one, in which modes of misspecification are accommodated through the process of Bayesian model expansion. One reason for not adopting this latter approach is that inference for relatively simple epidemic models using partial observation is already a complex process. We therefore seek model comparison methods that can be utilised without increasing the dimension of the models to which Bayesian methods are applied. Accordingly, the methods we present can be integrated into analyses without increasing the complexity of the fundamental Bayesian computations.
We will consider stochastic models for an infectious disease spreading through a closed population of spatially-distributed hosts, exemplified by the spatio-temporal Susceptible-Exposed-Infectious-Removed (SEIR) model. It will be assumed that the locations of hosts are known and fixed. Under this model, the host population at time $t$ is partitioned into subsets $S(t)$, $E(t)$, $I(t)$ and $R(t)$. Hosts in $S(\cdot)$ are susceptible to infection, hosts in $E(\cdot)$ have been infected but are not yet able to transmit, hosts in $I(\cdot)$ can pass on infection, while hosts in $R(\cdot)$ have been removed (e.g. by death, hospitalisation, or the acquisition of immunity) and play no further part in the epidemic. A susceptible individual at coordinates $x$ at time $t$ becomes exposed at a rate

$$\alpha + \beta \sum_{y \in I(t)} K(x, y, \kappa),$$

where $I(t)$ comprises sites infectious at time $t$, $\alpha$ and $\beta$ are primary and secondary infection rates, and $\kappa$ parametrises the spatial kernel function $K(\cdot)$. For convenience, we identify hosts with their location. The choice of $K$ greatly influences the design of control strategies, for example those based on ring-culling. A longer-tailed kernel may suggest the use of a larger culling radius and vice versa. Sojourn times in the E and I classes are modelled using appropriate distributions such as Gamma or Weibull distributions. We will denote by $\theta$ the vector of model parameters formed from $\alpha$, $\beta$, $\kappa$ supplemented by parameters specifying the distributions of sojourn times in E and I. This flexible framework can accommodate complexity arising e.g. from host heterogeneity as appropriate [25,17].
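As a minimal sketch of the exposure rate just described, the following Python function evaluates a primary rate plus a kernel-weighted sum over infectious sites. The function names, the argument layout and the exponential kernel used in the example are illustrative assumptions and are not taken from the paper.

```python
import math

def exposure_rate(x, infectious, alpha, beta, kappa, kernel):
    """Rate at which a susceptible host at location x becomes exposed.

    Assumed form: alpha + beta * sum over infectious sites y of K(d(x, y), kappa),
    where d is Euclidean distance. `kernel` is any function of (distance, kappa).
    """
    return alpha + beta * sum(kernel(math.dist(x, y), kappa) for y in infectious)

def exp_kernel(d, kappa):
    # Illustrative (assumed) kernel: exponential decay with distance.
    return math.exp(-kappa * d)

# Example: one susceptible at the origin, two infectious sites at distances 5 and 10.
rate = exposure_rate((0.0, 0.0), [(3.0, 4.0), (0.0, 10.0)],
                     alpha=0.01, beta=0.5, kappa=0.1, kernel=exp_kernel)
```

With no infectious sites the rate reduces to the primary rate alone, which is a quick sanity check on any implementation of this form.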
When data $y$ contain partial information (e.g. removals or 'snapshots' of $I(t)$ using imperfect diagnostic tests), data-augmented Bayesian analysis is now a standard tool for investigating $\pi(\theta \mid y)$ via $\pi(\theta, z \mid y)$, where $z$ incorporates unobserved transitions and, possibly, graphs of infectious contacts. Computations are often effected using reversible-jump MCMC or particle filtering [19]. In this paper, we will assume that observations include times and locations of all transitions from E to I and from I to R, so that the subsets $I(t)$ and $R(t)$ are observed but individuals in $S(t)$ cannot be distinguished from those in $E(t)$. We therefore specify $z$ to incorporate the times and locations of the unobserved transitions from S to E (termed exposure events) and use MCMC to sample from $\pi(\theta, z \mid y)$. As the number of exposure events is not uniquely determined by the data, the state-space for $(\theta, z)$ comprises components of varying dimension, requiring the use of reversible-jump methods. It is straightforward to apply the methods used for this class of models to snapshot data.
The rest of the paper is organised as follows. In Section 2, we discuss the general features of the latent classical testing framework before describing how functional-model representations of epidemic models have been used in the specification of infection-link residuals [20]. In Section 3, we explicitly formulate new latent classical tests for detecting kernel misspecification using likelihood ratios, where the ratio is based on a complete or partial parameter likelihood. In Section 4, we apply the tests to simulated data, comparing the ability or 'power' of the likelihood-based and infection-link residual tests to detect kernel misspecification in several scenarios. Conclusions are summarised in Section 5.

Latent classical testing and residual construction
Throughout we consider the situation where a Bayesian observer B observes the outcome $y$ of an experiment for which they have proposed a statistical model $\pi_0(y \mid \theta)$, where beliefs regarding the parameter vector $\theta$ are represented by the prior distribution $\pi_0(\theta)$. We suppose that the likelihood $\pi_0(y \mid \theta)$ may not necessarily be tractable, a situation which typically applies in the case of a partially observed epidemic. Now let $r$ be some process varying jointly with $y$ and suppose that we have a model $\pi(y, r \mid \theta)$ for which the marginal model $\pi(y \mid \theta)$ coincides with $\pi_0(y \mid \theta)$. Suppose that the model $\pi(r \mid \theta)$ is tractable. Then in the latent classical framework, the model $\pi_0(y \mid \theta)$ is assessed by having B impute the result of a classical test of the model $\pi(r \mid \theta)$ which is carried out by a classical observer C of $r$. The evidence found by C against $\pi(r \mid \theta)$, for example as summarised by a P-value $p(r; \theta)$, can be considered as evidence against the joint model $\pi(y, r \mid \theta)$.
In [13] it is discussed how the roles of B and C above are analogous to those of the Freudian ego and superego, with model formulation and fitting being done by the former in the Bayesian framework and model criticism by the latter in the classical framework. The conclusions of the analysis carried out in this 'dual-observer' framework are necessarily presented via B's posterior distribution of C's P-value, $\pi(p(r, \theta) \mid y)$, from which B can extract natural measures of lack of fit such as $\Pr(p(r, \theta) < \alpha \mid y)$ for some suitably small $\alpha$. Note that this approach can be viewed as an extension of the framework of posterior predictive checking. The main differences lie in the use of latent processes to specify the p-value and in the consideration of the entire distribution of p-values as opposed to its mean as captured by a posterior predictive p-value [21]. As noted by Meng, '... every problem is a missing data problem...' and the approach we take exploits this. Suppose that $\pi_j(y, r_j \mid \theta)$, $j = 1, \ldots, k$, represent models for the joint distribution of $(y, r_j)$, all of which specify the same marginal model $\pi_0(y \mid \theta)$ and share a common parameter prior distribution $\pi(\theta)$. Then observation of $y$ alone carries no information on the relative validity of these models. That is, $y$ carries exactly the same evidence against every model with marginal $\pi_0(y \mid \theta)$. Therefore, the latent process $r$ can be designed to yield a test tailored to detecting the suspected form of misspecification. This facility is exemplified by the construction of infection-link residuals [20].

Infection-link residuals
The starting point is to construct a functional-model representation of the epidemic process. In this formalism the complete data $x = (y, z)$, and hence the observations $y$, are represented as a deterministic function $x = h(r, \theta)$ of $\theta$ and some unobserved process $r$ with fixed distribution independent of $\theta$. This means that $r$ can be treated as a residual process, and tests for compliance with the specified distribution can be applied to the imputed realisations of $r$. Such an approach fits well for epidemic models, where sampling from $\pi(\theta, r \mid y)$ is often possible using Markov chain Monte Carlo methods.
In [20] a functional model for a spatio-temporal SEIR model is presented where the process $r$ is composed of four independent i.i.d. $U(0, 1)$ sequences, $r_1, r_2, r_3, r_4$. Consider the mapping $x = h_\theta(r_1, r_2, r_3, r_4)$, where $x$ records the time and nature of every event occurring during the epidemic. Details can be found in [20]. The time of each subsequent infection event is determined from the process $r_1 = \{r_{1j}, j \ge 1\}$, while processes $r_3$ and $r_4$ specify the quantiles of the sojourn periods in the E and I classes respectively for each infection. The infection-link residual sequence (to which tests are applied) $r_2 = \{r_{2j}, j \ge 1\}$ determines the particular I-S pair responsible for each infection event. Given the time of the $j$th infection, $t_j$, we identify the set of I-S links and order these in ascending order of magnitude. The particular link causing the $j$th infection is selected by considering the cumulative sum of the ordered link weights and identifying the first link where this cumulative sum exceeds the value $r_{2j} W$, where $W$ denotes the sum of the weights over this set of links. It is straightforward to explore the joint posterior $\pi(\theta, r_1, r_2, r_3, r_4 \mid y)$. If the kernel function $K$ has been misspecified (for example by underestimating the propensity for long-range transmission by assuming an exponentially bounded form when a power-law relation is more appropriate, see Fig. 1), then when the process $r_2$ is imputed, some systematic deviation from a $U(0, 1)$ distribution should be anticipated. In [20] p-values were imputed from an Anderson-Darling test [1] applied to $r_2$ and it was demonstrated that the approach can detect kernel misspecification in simulated data sets. In this paper we investigate whether it is possible to improve on the sensitivity of the ILR tests using likelihood-based methods.
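The link-selection rule described above can be sketched in a few lines of Python. The forward map follows the cumulative-sum description in the text. The inverse map, which draws a residual uniformly from the interval of values that would select a known link, is an assumption made here for illustration; the construction actually used is detailed in [20]. All function names are hypothetical.

```python
import random

def select_link(weights, r):
    """Forward map: index of the I-S link selected by residual r in (0, 1).

    Links are taken in ascending order of weight; the first link whose
    cumulative weight exceeds r * W is chosen, where W is the total weight.
    """
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    total = sum(weights)
    cum = 0.0
    for i in order:
        cum += weights[i]
        if cum > r * total:
            return i
    return order[-1]  # guard against rounding when r is very close to 1

def impute_residual(weights, chosen, rng=random):
    """Assumed inverse map: draw r uniformly from the interval selecting `chosen`."""
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    total = sum(weights)
    cum = 0.0
    for i in order:
        lower = cum / total
        cum += weights[i]
        if i == chosen:
            return rng.uniform(lower, cum / total)
    raise ValueError("chosen index is not a valid link")
```

If the weights are computed under the correct kernel, residuals imputed in this way are uniform on (0, 1); a misspecified kernel distorts the interval widths and hence the distribution of the residuals.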

Latent likelihood ratio tests for model comparison
This general approach has been followed previously in [31] where results of an ANOVA test applied to viraemic measurements taken on a host population, partitioned by depth in an unobserved infection graph, were imputed.While the ILR test is targeted at generic forms of model inadequacy (misspecification of the tail properties of a spatial kernel), latent likelihood ratio tests demand that a more specific alternative model is identified.
To test the validity of model $M_0$ with likelihood $\pi_0(y \mid \theta)$ against a simple alternative model $M_1$ (i.e. with no free parameters) $\pi_1(y)$, it is natural for B to impute C's conclusion from a classical test based on a likelihood ratio $T(y, \theta) = \pi_0(y \mid \theta) / \pi_1(y)$. For epidemic models and data, $\pi_0(y \mid \theta)$ and $\pi_1(y)$ would typically be intractable. Nevertheless, B can impute $(\theta, x)$, where $x$ represents an appropriate latent process, and the conclusion of C's test based on a likelihood ratio $T(x, \theta) = \pi_0(x \mid \theta) / \pi_1(x)$, so long as $\pi_0(x \mid \theta)$ and $\pi_1(x)$ are tractable. It is straightforward to extend the idea to a generalised likelihood ratio test (GLRT) when the alternative model is composite by replacing $\pi_1(x)$ with $\pi_1(x \mid \hat{\theta}_1)$, where $\hat{\theta}_1$ is the maximum likelihood estimate (MLE) of the parameter $\theta_1$ in the alternative model.
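Suppose that, given partial information $y$, we use data-augmented MCMC to explore $\pi_0(\theta, x \mid y)$, where $x$ comprises the times and nature of all transitions within an observation window $(0, T_{\max})$. The latent likelihood ratio test may be implemented as an addendum to this analysis as follows.

Step 1. Draw $(\theta^{(i)}, x^{(i)})$ from $\pi_0(\theta, x \mid y)$.
Step 2. Compute the test statistic $T(x^{(i)}, \theta^{(i)}) = \pi_0(x^{(i)} \mid \theta^{(i)}) / \pi_1(x^{(i)} \mid \hat{\theta}_1)$, where $\hat{\theta}_1$ maximises $\pi_1(x^{(i)} \mid \theta_1)$.
Step 3. Compute the p-value $p(\theta^{(i)}, x^{(i)}) = \Pr\big(T(x', \theta^{(i)}) < T(x^{(i)}, \theta^{(i)})\big)$, where $x' \sim \pi_0(x \mid \theta^{(i)})$.

By repeating these steps within a standard MCMC analysis, a sample from $\pi(p(\theta, x) \mid y)$ can be obtained.

Note that we do not assume nesting of models that might allow asymptotic results on sampling distributions of likelihood ratios to be applied. For each sampled pair $(\theta^{(i)}, x^{(i)})$ we may estimate the p-value by simulation. The simplest approach is to estimate the posterior expectation of the p-values as follows: for each $i$, simulate a random draw $x'$ from $\pi_0(x \mid \theta^{(i)})$, obtain the MLE $\hat{\theta}_1'$ by maximising $\pi_1(x' \mid \theta_1)$, and compute $T' = \pi_0(x' \mid \theta^{(i)}) / \pi_1(x' \mid \hat{\theta}_1')$. An estimate of the posterior mean of $\pi(p(\theta, x) \mid y)$ is obtained from the frequency with which $T' < T(x^{(i)}, \theta^{(i)})$. This quantity provides some information on the strength of the evidence against the modelling assumptions.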
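The simulation-based estimate of the posterior expected p-value can be sketched generically as below. This is a sketch under stated assumptions rather than the authors' implementation: the posterior draws are taken as given (for example from an existing MCMC run), the statistic is compared on the log scale (which does not change the ordering), and all function and argument names are hypothetical. The toy models at the end are not epidemic models and are included only to make the sketch runnable.

```python
import math
import random

def expected_llr_pvalue(posterior_draws, simulate_m0, loglik_m0, max_loglik_m1):
    """Estimate the posterior mean p-value with one replicate per posterior draw.

    posterior_draws: iterable of (theta, x) pairs sampled from pi_0(theta, x | y)
    simulate_m0(theta): one draw x' from pi_0(x | theta)
    loglik_m0(x, theta): log pi_0(x | theta)
    max_loglik_m1(x): log pi_1(x | theta_1) maximised over theta_1
    """
    draws = list(posterior_draws)
    hits = 0
    for theta, x in draws:
        t_obs = loglik_m0(x, theta) - max_loglik_m1(x)      # log T(x, theta)
        x_rep = simulate_m0(theta)                          # x' ~ pi_0(. | theta)
        t_rep = loglik_m0(x_rep, theta) - max_loglik_m1(x_rep)
        hits += t_rep < t_obs                               # event {T' < T}
    return hits / len(draws)

# Toy illustration (hypothetical models):
# M0: x_i ~ Exponential(rate theta); M1: x_i ~ Uniform(0, b), with MLE b = max(x).
def toy_loglik_m0(x, theta):
    return len(x) * math.log(theta) - theta * sum(x)

def toy_max_loglik_m1(x):
    return -len(x) * math.log(max(x))

def toy_simulate_m0(theta, n=20, rng=random.Random(1)):
    return [rng.expovariate(theta) for _ in range(n)]
```

Small values of the returned frequency indicate that the imputed statistic is unusually low relative to replicates generated under the fitted model, which is the direction of evidence against M0 used in the text.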

Imputation and reinforcement
The above test may appear to have the potential to be targeted at specific forms of misspecification, but some caveats should be noted. When B imputes the latent process $x$ in order to specify a tractable classical test, they appeal to the modelling assumptions underlying $M_0$. It follows that imputation will reinforce these assumptions to an extent dependent on the amount of imputed information. For example, if the imputed $x$ included not only unobserved quantities $x_1$ from the present experiment but also a further $m - 1$ independent replicates $x_2, \ldots, x_m$ then, for large $m$, $\pi(p(\theta, x) \mid y) \approx U(0, 1)$ for a large range of tests, since the test result would be increasingly dominated by the imputed replicates.
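The reinforcement effect can be illustrated with a toy Gaussian example, which is an assumption introduced here and does not appear in the paper. Take M0 to be N(0, 1) and the simple alternative M1 to be N(1, 1), with a single observed value y and m - 1 further replicates imputed from M0. The likelihood ratio test rejects M0 for large sums, so the p-value is 1 - Φ(sum / √m), and its posterior expectation has the closed form 1 - Φ(y / √(2m - 1)). The sketch below checks the closed form against Monte Carlo.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def expected_p_closed_form(y, m):
    """Posterior expected p-value when m - 1 replicates are imputed from M0."""
    return 1.0 - PHI(y / math.sqrt(2 * m - 1))

def expected_p_monte_carlo(y, m, n_draws=20000, seed=0):
    """Monte Carlo version: impute the replicates, then average the p-values."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # The sum of m - 1 imputed N(0, 1) replicates is N(0, m - 1).
        s = y + math.sqrt(m - 1) * rng.gauss(0.0, 1.0)
        total += 1.0 - PHI(s / math.sqrt(m))
    return total / n_draws
```

For an observation such as y = 2.5, which is strong evidence against M0 when tested directly, the expected p-value rises towards 0.5 as m grows, showing how the imputed replicates dilute the evidence.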
To understand the impact of imputation more formally, we consider the simple situation where B's prior distribution $\pi_0(\theta)$ places all belief on a single value $\theta_0$, giving a density $\pi_0(x)$ for the latent process $x$. We assume that the alternative model, $M_1$, is simple (i.e. has no free parameters) with sampling distribution $\pi_1(x)$. Suppose now that B observes $y = f(x)$, so that $x$ is an augmented version of $y$, and imputes $x$ via $\pi_0(x \mid y)$. They then impute the p-value, $p_x$, computed by C from an LRT applied to $x$. Suppose that B summarises their posterior belief regarding C's evidence against $\pi_0$ by the quantity $\gamma_{x,\alpha}(y) = \pi_0(p_x < \alpha \mid y)$ for some suitably small $\alpha$. A natural analogue of power for B would be the expectation of $\gamma_{x,\alpha}(y)$ under the alternative hypothesis, that is

$$\beta_x = \int \gamma_{x,\alpha}(y)\, \pi_1(y)\, dy.$$

Note that when $x \equiv y$ the quantity $\gamma_{y,\alpha}(y)$ is an indicator function and $\beta_y$ is the power of the uniformly most powerful test obtained using the Neyman-Pearson Lemma. Then we have the following result.

Proposition 3.1. For $x$, $y$, $M_0$, $M_1$ as described above, $\beta_x \le \beta_y$.
Proof. The most powerful classical test of level $\alpha$ of $M_0$ versus $M_1$ that can be applied to the imputed $x$ is based on the ratio $\pi^i_0(x) / \pi^i_1(x)$, where $\pi^i_0$ and $\pi^i_1$ represent the sampling densities of the imputed $x$ under $M_0$ and $M_1$ respectively. Now $\pi^i_0(x) = \pi_0(y)\pi_0(x \mid y) = \pi_0(x)$ while $\pi^i_1(x) = \pi_1(y)\pi_0(x \mid y)$, so that

$$\frac{\pi^i_0(x)}{\pi^i_1(x)} = \frac{\pi_0(y)}{\pi_1(y)}.$$

Therefore, an LRT applied directly to $y$ is equivalent to an LRT applied to $x$ when $x \sim \pi^i_0(x)$ and $x \sim \pi^i_1(x)$ are used as the sampling densities of $x$ under the competing hypotheses. We denote by $p_y$ the resulting p-value. Now, for the latent likelihood ratio test, B imputes the result of C's likelihood ratio test applied to the imputed $x$, where C's test is based on the test statistic

$$\frac{\pi_0(x)}{\pi_1(x)}$$

with the associated p-value, $p_x$. By the Neyman-Pearson Lemma, the power of this test cannot exceed that of the optimal test. We therefore have that, for any given value $\alpha$,

$$\Pr(p_x < \alpha) \le \Pr(p_y < \alpha),$$

where $x \sim \pi^i_1(x)$ and $y \sim \pi_1(y)$ on the left and right-hand sides respectively. Note that the left-hand side is $\beta_x$, since drawing $y \sim \pi_1(y)$ and then imputing $x \sim \pi_0(x \mid y)$ gives $x \sim \pi^i_1(x)$, while the right-hand side is $\beta_y$. This completes the proof.

Now suppose more generally that $M_0$ uses an arbitrary prior $\pi_0(\theta)$, while $M_1$ remains simple, and define $\beta_x(\theta)$ and $\beta_y(\theta)$ in the obvious way. Then $\beta_y(\theta) \ge \beta_x(\theta)$ with prior probability one, so that B views with certainty the LRT applied directly to $y$ as giving the more powerful test of $M_0$ against $M_1$.
In the above proof the inequality $\beta_x \le \beta_y$ arises from the disparity between $\pi^i_1(x) = \pi_1(y)\pi_0(x \mid y)$, the sampling density of the imputed $x$ under $M_1$, and $\pi_1(x)$. We show that this disparity, as characterised using Kullback-Leibler (KL) divergence, increases as the amount of imputation grows. Suppose that $y = f(x)$ and $x = g(z)$, so that $z$ represents the outcome of an experiment that is even more informative than $x$. Consider again the case of simple hypotheses for $M_0$ and $M_1$, with $\pi_0$ and $\pi_1$ denoting the respective sampling densities of the quantities concerned.
If the test is based on the imputed $z$ then the optimal test statistic uses the distribution of this imputed $z$ and is therefore the ratio $\pi_0(z) / \big(\pi_1(y)\pi_0(z \mid y)\big)$. We use $\pi^i_0$ and $\pi^i_1$ to denote the sampling densities of imputed quantities under the respective hypotheses. Note that $\pi^i_0 = \pi_0$. We now consider the Kullback-Leibler divergence between $\pi^i_1(z)$ and $\pi_1(z)$.
This can be calculated as

$$\mathrm{KL}\big(\pi^i_1(z), \pi_1(z)\big) = \int \pi^i_1(z) \log \frac{\pi_0(x \mid y)}{\pi_1(x \mid y)}\, dz + \int \pi^i_1(z) \log \frac{\pi_0(z \mid x)}{\pi_1(z \mid x)}\, dz.$$

The first integral above is the KL divergence between the density $\pi^i_1(z) = \pi_1(y)\pi_0(x \mid y)\pi_0(z \mid x)$ and the density $\pi_1(y)\pi_1(x \mid y)\pi_0(z \mid x)$. Suppose that the latter is used on the denominator in a ratio test statistic applied to the imputed $z$. Then this ratio is clearly $\pi_0(x) / \pi_1(x)$, where $x$ is the imputed value, and the power of the test corresponds to that of a latent likelihood ratio test applied to $x$. The second integral above is itself a KL divergence greater than zero. It follows that $\mathrm{KL}\big(\pi^i_1(z), \pi_1(z)\big) > \mathrm{KL}\big(\pi^i_1(z), \pi_1(y)\pi_1(x \mid y)\pi_0(z \mid x)\big)$.
In the light of this increasing divergence, we may suspect that the power of an LRT that uses $\pi_1(z)$ on the denominator may be less than that of a test using $\pi_1(y)\pi_1(x \mid y)\pi_0(z \mid x)$ or, equivalently, a latent likelihood ratio test applied directly to the imputed $x$. When seeking a suitable latent process $x$, it may be prudent to minimise the extent of imputation and, consequently, the degree of reinforcement of the model under test. That is, if $y$ is specified by $x$ which, in turn, is specified by $z$, then, assuming the likelihoods $\pi_0(x \mid \theta)$ and $\pi_1(x \mid \theta)$ are tractable, $x$ should be preferred to $z$ as the choice for the latent process.

Latent likelihood tests for kernel assessment
We now return to the situation where we wish to assess the validity of the choice of transmission kernel for a spatio-temporal SEIR model for an emerging epidemic based on partial data $y$. We construct latent likelihood ratio tests and compare their ability to detect misspecification with that of the ILR test.
We make the following assumptions. Bayesian observer B proposes an SEIR model for an emerging epidemic of the form described in Section 1. The model, $M_0$, incorporates a transmission kernel $K_0(d, \kappa_0)$ and a prior $\pi_0(\theta_0)$ is assigned to the parameter vector $\theta_0 = (\alpha, \beta, \kappa_0, \theta_E, \theta_I)$. Observer C criticises this model, suspecting that an alternative transmission kernel $K_1(d, \kappa_1)$ may be more appropriate. All other aspects of the alternative model $M_1$ coincide with $M_0$. We denote the parameter in $M_1$ by $\theta_1 = (\alpha, \beta, \kappa_1, \theta_E, \theta_I)$. Since $M_1$ is treated by Observer C using frequentist methods, no prior for $\theta_1$ need be specified.
We consider two forms of latent likelihood test, based on full and partial likelihood respectively, which differ in terms of the amount of information imputed for the test.

Full-trajectory LLRT
This analysis is achieved through B investigating $\pi_0(\theta_0, x \mid y)$, where $x$ is the complete trajectory of the epidemic (the waiting times and locations of the exposure, infection and removal events, not considering the infection tree). The MCMC algorithm used to do this is standard (for example, [10,24,30,29,8,12,5,27,23]) and is summarised in Electronic Appendix 1. For each sample $(\theta_0, x)$, the MLE $\hat{\theta}_1$ is computed using the optimisation routine described in Electronic Appendix 2, and the algorithm is implemented as in Section 3. The test statistic used is the full likelihood ratio, as detailed in Step 2 in Section 3.

Partial LLRT
In this setting, Observer B investigates $\pi_0(\theta_0, x \mid y)$ but Observer C is provided only with $\theta_0$ and $z$, where $z$ incorporates, for each exposure event $j$:
• the sets of locations of susceptible and infectious individuals, $S(t_j-)$ and $I(t_j-)$, immediately prior to the time of the event, $t_j$;
• the location of the exposed individual, $x_j \in S(t_j-)$.
The times or even the order of the exposure events are not included in $z$, though some restrictions on the latter will follow from $z$. Let $G_0(\theta_0, z)$ be defined by

$$G_0(\theta_0, z) = \prod_j \frac{\alpha + \beta \sum_{y \in I(t_j-)} K_0(d(x_j, y), \kappa_0)}{\alpha\, |S(t_j-)| + \beta \sum_{x \in S(t_j-)} \sum_{y \in I(t_j-)} K_0(d(x, y), \kappa_0)},$$

where $|S(t_j-)|$ denotes the cardinality of $S(t_j-)$ and $d(x, y)$ is the distance between hosts at $x$ and $y$. An analogous partial likelihood for $M_1$ with kernel function $K_1$ and parameter $\theta_1$ is given by

$$G_1(\theta_1, z) = \prod_j \frac{\alpha + \beta \sum_{y \in I(t_j-)} K_1(d(x_j, y), \kappa_1)}{\alpha\, |S(t_j-)| + \beta \sum_{x \in S(t_j-)} \sum_{y \in I(t_j-)} K_1(d(x, y), \kappa_1)}.$$

Then, if $\hat{\theta}_1$ maximises $G_1(\theta_1, z)$, we can define a partial likelihood ratio statistic

$$T_{\mathrm{partial}}(\theta_0, z) = \frac{G_0(\theta_0, z)}{G_1(\hat{\theta}_1, z)}.$$

This statistic is used in place of the full likelihood ratio in Step 2 in Section 3. We may motivate the partial LLRT from the perspective of reinforcement. The partial LLRT requires that only $\theta_0$ and $z$ are imputed by B for its calculation. Thus, the impact of reinforcement of $M_0$ may be lessened. Moreover, if detection of a possibly misspecified kernel is the goal, then $T_{\mathrm{partial}}(\theta_0, z)$ is a statistic which 'focuses' on this aspect of the model. It is therefore possible that the partial LLRT, at least in some circumstances, may be more effective in eliciting evidence of a misspecified kernel than the full-likelihood LLRT. Moreover, the partial LLRT is a natural comparator for the ILR test used in [20], as both tests utilise the same information. For both of these tests, $(\theta_0, z)$ is necessary and sufficient for computation of the test result.
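A partial likelihood of this kind can be evaluated on the log scale as in the sketch below. It assumes that each factor is the probability that the exposed host, rather than any other susceptible, is the one exposed at that event, given the susceptible and infectious sets just before the event. The data layout and all names are hypothetical.

```python
import math

def log_partial_likelihood(events, alpha, beta, kappa, kernel):
    """Log of the assumed partial likelihood G(theta, z).

    events: list of (x_j, susceptibles, infectious), where x_j is the location
    of the host exposed at event j and the two lists hold the locations in
    S(t_j-) and I(t_j-). `kernel` is a function of (distance, kappa).
    """
    total = 0.0
    for x_j, susceptibles, infectious in events:
        def challenge(x):
            # alpha + beta * sum over infectious y of K(d(x, y), kappa)
            return alpha + beta * sum(
                kernel(math.dist(x, y), kappa) for y in infectious)
        # Denominator equals alpha * |S| + beta * double sum over S and I.
        denominator = sum(challenge(x) for x in susceptibles)
        total += math.log(challenge(x_j)) - math.log(denominator)
    return total

def exp_kernel(d, kappa):
    # Illustrative (assumed) kernel: exponential decay with distance.
    return math.exp(-kappa * d)
```

A log partial likelihood ratio is then the difference between this quantity under the fitted kernel and its maximum over the alternative kernel's parameters, with the maximisation left to whichever optimiser is available.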
In the next section we use a simulation study to consider the ability of the ILR test and the two LLRTs to detect misspecification of the transmission kernel in a spatio-temporal epidemic model.

Simulation study
In keeping with the assumptions of [20], we assume that the observations $y$ record the transitions from E to I and from I to R, but that exposure events are not recorded. Epidemics are simulated in an initially totally susceptible population uniformly distributed over a square region of size 2000 × 2000 units. Both primary and secondary infection are present and an exponentially decaying spatial kernel function of the form

$$K(x, y, \kappa) = \exp(-\kappa\, |x - y|)$$

is assumed, where $x$ and $y$ denote the positions of two hosts. We assume that the sojourn times in the E and I classes follow Gamma distributions with means $\mu_E$, $\mu_I$ and variances $\sigma^2_E$, $\sigma^2_I$ respectively. These parameters are based on those used in the simulation study of the ILR test in [20], to allow comparison with the simulation study therein. Starting from an entirely susceptible population, the epidemic is simulated until complete infection of the population. Four different parameter sets are used: a baseline scenario, and the same parameter set with $\alpha$, $\beta$ and $\kappa$ respectively increased to twice the baseline value. The baseline set of parameter values, and the modified values, are given in Table 1.
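Since the sojourn-time distributions are specified through their means and variances, a simulator needs to convert these to the shape and scale of a Gamma distribution. The conversion (shape = mean² / variance, scale = variance / mean) is standard; the function names below are hypothetical, and the sketch uses Python's `random.gammavariate`, whose two arguments are the shape and the scale.

```python
import random

def gamma_shape_scale(mean, variance):
    """Convert a (mean, variance) pair to Gamma (shape, scale) parameters."""
    return mean ** 2 / variance, variance / mean

def draw_sojourn(mean, variance, rng=random):
    """Draw one sojourn time from a Gamma distribution with the given moments."""
    shape, scale = gamma_shape_scale(mean, variance)
    return rng.gammavariate(shape, scale)
```

For example, a mean of 5 and a variance of 2.5 correspond to shape 10 and scale 0.5.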
For each simulated epidemic, each test was applied using 3 different observation windows corresponding to the intervals up to which 100%, 70% or 40% of the population was observed to be infected.The likelihood-based tests only allow for estimation of the posterior expectation of an imputed p-value, and we therefore use the posterior expectation as the summary measure of evidence for all the tests (even though the full posterior can be explored for the ILR test).
To each simulated data set $y$ we fit two separate misspecified models with isotropic kernel functions, one of power-law form and one of Gaussian form, each a function of $d$, the Euclidean distance between $x$ and $y$. In the former case, infective challenge decreases according to a power-law, while in the latter case the Gaussian kernel is exponentially bounded. Informally, we may consider the first kernel to represent a more severe degree of misspecification, in comparison to the real exponential kernel, than the second one. We may anticipate that tests should find more evidence against the assumptions when the power-law kernel is fitted. The fitted model, whose adequacy is to be tested, will be referred to as $M_0$.
The simulated data are generated in all data-sets from an exponential kernel. This model is referred to as $M_1$, and will be the model that $M_0$ is compared against in the LLR tests. This exponential kernel is given by

$$K_1(d, \kappa_1) = \exp(-\kappa_1 d).$$

In all cases we use non-informative prior distributions for the parameters in the fitted model as follows. A uniform prior Unif$(0, M)$ was used for $\alpha$, $\mu_E$, $\sigma^2_E$, $\mu_I$, $\sigma^2_I$, where $M \approx 1.7 \times 10^{308}$ is the computer limit for double precision floating-point numbers in C++.
Separate prior distributions were used for the other parameters.

The results of the simulation study are presented in Table 2 and in Figures 2, 3, 4, 5 and 6. Some obvious trends can be seen.
• The ILR-based test consistently finds very strong evidence against the model when the power-law kernel is wrongly fitted. However, no evidence emerges when the exponentially bounded, Gaussian kernel is fitted to the observations, with the mean p-value being close to 0.5. This suggests that the ILR test may be insensitive to misspecification if the degree of discrepancy is modest. When the data are simulated using the larger value of κ (so that secondary infection tends to occur over short range) and the power-law kernel is fitted, the evidence against the assumptions is strongest when only 40% of the hosts are infected. This may be due to the short-range secondary infection being most apparent during the early stages of the epidemic, where the pattern of infection is clearly formed from isolated foci (caused by primary infection) surrounded by clustered secondary infections. As a result, residuals from the early stage of the epidemic (when the potential choice of exposure locations is widest) may display the greatest evidence against the assumed model, and inclusion of residuals from later in the epidemic may serve to dilute this evidence. Simulated epidemics are presented in electronic Appendix 3 to illustrate this point.
as posterior predictive checking. In particular, we have compared the ability of the infection-link residuals introduced by Lau et al. [20] to detect kernel misspecification with that of tests based on latent likelihood ratios. The simulation study uses data in which the transitions into the I and R states are observed, but can be easily adapted to snapshot data, data with under-reporting and other forms of data censoring [9], where epidemic model selection is often hindered by computational complexity.
The results demonstrate that the former approach performs well when the degree of model misspecification is high, that is, when a power-law kernel is assumed when the true kernel is exponential, but is unable to detect evidence when the true and assumed kernels are qualitatively more similar. On the other hand, a test based on a full latent likelihood is able to elicit evidence of the more subtle misspecification.
The results point to an interesting phenomenon regarding the use of classical tests applied to imputed processes. Since the additional data are imputed using the misspecified model, it need not follow that basing the testing on more data leads to more power. In certain cases, the ILR methodology applied only to the emergent phase of the epidemic provides more evidence of discrepancy than when the full trajectory is used. This in turn leads to the question of how best to design a latent experiment. How can one use prior belief on model parameters to predict (before data are considered) which form of latent test will be best able to detect a suspected mode of misspecification? Answering this question is a challenge which we seek to address in ongoing work. Nevertheless, we suggest that the techniques presented in this paper can offer readily implementable ways of checking model assumptions while avoiding the complexities and instabilities associated with a purely Bayesian approach.

Figure 1: Diagram of motivation for the infection link residual (ILR) r2k

Figure 5: Comparison of Latent Likelihood Ratio (LLR) test to Infection Link Residuals test: Bar chart of the expected posterior p-values obtained for the data set β × 2, where "Pow" denotes that a power law kernel was fitted and "Gauss" denotes that a Gaussian kernel was fitted. The simulated data were observed up to the time such that a set percentage of the population became infectious. This percentage is in brackets.

Figure 6: Comparison of Latent Likelihood Ratio (LLR) test to Infection Link Residuals test: Bar chart of the expected posterior p-values obtained for the data set κ × 2, where "Pow" denotes that a power law kernel was fitted and "Gauss" denotes that a Gaussian kernel was fitted. The simulated data were observed up to the time such that a set percentage of the population became infectious. This percentage is in brackets.

Table 1: Parameters used in the generation of the simulated data-sets used in Section 5.

Table 2: Comparison of Latent Likelihood Ratio (LLR) test to Infection Link Residuals test: data-set, M0 tested and estimated expected p-values from the infection link residuals test, LLR (full likelihood) and LLR (partial likelihood).

Comparison of Latent Likelihood Ratio (LLR) test to Infection Link Residuals test: Bar chart of the expected posterior p-values obtained for the data set generated with the original parameters, but with a new random seed for the coordinates of the hosts, where "Pow" denotes that a power law kernel was fitted and "Gauss" denotes that a Gaussian kernel was fitted. The simulated data were observed up to the time such that a set percentage of the population became infectious. This percentage is in brackets.

Comparison of Latent Likelihood Ratio (LLR) test to Infection Link Residuals test: Bar chart of the expected posterior p-values obtained for the data set generated with the original parameters, where "Pow" denotes that a power law kernel was fitted and "Gauss" denotes that a Gaussian kernel was fitted. The simulated data were observed up to the time such that a set percentage of the population became infectious. This percentage is in brackets.

Comparison of Latent Likelihood Ratio (LLR) test to Infection Link Residuals test: Bar chart of the expected posterior p-values obtained for the data set α × 2, where "Pow" denotes that a power law kernel was fitted and "Gauss" denotes that a Gaussian kernel was fitted. The simulated data were observed up to the time such that a set percentage of the population became infectious. This percentage is in brackets.