7.1 Introduction

In fields such as the biological sciences, animal science, and agronomy, a common outcome of interest is the time at which an event occurs. The main characteristic of these data is that the subjects or experimental units are usually observed for different periods of time until the event of interest occurs. These events may be adverse, such as the death of an experimental unit or the cessation of lactation, or positive, such as conception in a female under a particular treatment or the onset of estrus in a female undergoing hormone treatment, among others. Because these response variables are nonnegative and typically right-skewed, a normal distribution is often a poor choice for modeling the time at which the event of interest occurs. The exponential, log-normal, gamma, Weibull, and other more complex distributions tend to be better choices for modeling these phenomena.

Fitting a generalized linear mixed model (GLMM) is a good option for analyzing these phenomena because the conditional distribution of the response, given the random effects, can be chosen to have properties suited to this type of data. In this vein, it is conventional to speak of survival data and survival analysis, regardless of the nature of the event. Similar data also arise when measuring the time to complete a task, such as walking 50 meters, passing an agronomy exam, performing a sensory evaluation of coffee, and so on. The purpose of this chapter is to provide the reader with the essential language of linear models and the connection between GLMMs and survival analysis.

7.2 Generalized Linear Mixed Models with a Gamma Response

The gamma family of distributions describes continuous, nonnegative, right-skewed values. A gamma distribution has two positive parameters, α (shape) and β (scale), and its probability density function is given by:

$$ f\left(y;\alpha, \beta \right)=\frac{1}{\varGamma \left(\alpha \right){\beta}^{\alpha }}{y}^{\alpha -1}{e}^{-y/\beta },\quad y\ge 0 $$

where \( \varGamma \left(\alpha \right)={\int}_0^{\infty }{t}^{\alpha -1}{e}^{-t} dt \) is the gamma function (Casella and Berger 2002). The mean and variance of a gamma random variable are E[Y] = αβ = μ and \( \mathrm{Var}\left[Y\right]=\alpha {\beta}^2={\mu}^2/\alpha \), respectively. This density function can be rewritten in terms of the mean μ and the scale parameter ϕ = 1/α:

$$ f\left(y;\mu, \phi \right)=\frac{1}{\varGamma \left(\frac{1}{\phi}\right){\left(\mu \phi \right)}^{1/\phi }}{y}^{\frac{1}{\phi }-1}{e}^{-y/\left(\mu \phi \right)},\quad y\ge 0. $$
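The equivalence of the two parameterizations can be checked numerically. The following Python sketch is only an illustration and is not part of the SAS analyses in this chapter; the function names and the values of α and β are hypothetical. It evaluates the density in both forms, using β = μϕ and α = 1/ϕ, and confirms that the variance αβ² equals ϕμ².

```python
import math

def gamma_pdf_shape_scale(y, alpha, beta):
    """Gamma density with shape alpha and scale beta."""
    return y ** (alpha - 1) * math.exp(-y / beta) / (math.gamma(alpha) * beta ** alpha)

def gamma_pdf_mean_phi(y, mu, phi):
    """Same density written with mean mu and scale parameter phi = 1/alpha."""
    a = 1.0 / phi
    return y ** (a - 1) * math.exp(-y / (mu * phi)) / (math.gamma(a) * (mu * phi) ** a)

# Hypothetical parameter values, chosen only for illustration.
alpha, beta = 4.0, 5.0
mu, phi = alpha * beta, 1.0 / alpha          # mu = 20, phi = 0.25

# Both forms give the same density at every evaluation point.
for y in (1.0, 10.0, 25.0, 60.0):
    assert math.isclose(gamma_pdf_shape_scale(y, alpha, beta),
                        gamma_pdf_mean_phi(y, mu, phi), rel_tol=1e-12)

# Var[Y] = alpha * beta^2 = phi * mu^2
variance = alpha * beta ** 2
assert math.isclose(variance, phi * mu ** 2)
```

Writing the density in terms of μ and ϕ is what allows the mean to be linked to the linear predictor while ϕ is estimated as a single scale parameter.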

7.2.1 CRD: Estrus Induction in Pelibuey Ewes

Estrus induction in ewes is a very common practice on livestock farms and at research centers. In this experiment, a researcher applied gonadotropin-releasing hormone (GnRH), equine chorionic gonadotropin (eCG), and progesterone (P4) delivered through a controlled internal drug-releasing (CIDR) intravaginal device to Pelibuey ewes (n = 78), with type of lambing (single, double, or triple) as the treatments. To ensure that all animals were in good condition during the experiment, the ewes received the same zootechnical management and feeding. The ewes were synchronized on the same day under a single synchronization protocol. Table 7.1 presents the sources of variation and degrees of freedom.

Table 7.1 Sources of variation and degrees of freedom

The variables evaluated in this experiment were the time of onset and the duration of estrus (yij), in hours, according to the type of lambing. The variability among ewes in weight, age, and body condition must be taken into account in the analysis. The data from this experiment can be found in Appendix 1 of this book (Data: Pelibuey Sheep). The components of the gamma GLMM are as follows:

$$ {\displaystyle \begin{array}{c}\mathrm{Distributions}:{y}_{ij}\mid \tau {(r)}_{ij}\sim \mathrm{Gamma}\left({\mu}_{ij},\phi \right);i=1,2,3;j=1,\cdots, {r}_i.\\ {}\tau {(r)}_{ij}\sim N\left(0,{\sigma}_{\tau \left(\mathrm{animal}\right)}^2\right)\end{array}} $$
$$ \mathrm{Linear}\ \mathrm{predictor}:{\eta}_{ij}=\mu +{\tau}_i+\tau {(r)}_{ij} $$
$$ \mathrm{Link}\ \mathrm{function}:\log \left({\mu}_{ij}\right)={\eta}_{ij} $$

where ηij is the linear predictor for treatment i (type of birth: single, double, or triple) in ewe j, μ is the overall mean, τi is the fixed effect due to type of birth (treatment), and τ(r)ij is the random effect of ewe j within type of birth i, with \( \tau {(r)}_{ij}\sim N\left(0,{\sigma}_{\tau \left(\mathrm{animal}\right)}^2\right) \).

The following GLIMMIX program fits the model:

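proc glimmix nobound method=laplace;
  class animal birthtype;
  model Inestro = birthtype / dist=gamma link=log;  /* gamma response, log link */
  random birthtype(animal);                         /* ewe-within-treatment random effect */
  lsmeans birthtype / lines ilink;                  /* ilink: means on the data scale */
run;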

Part of the results is reported in Table 7.2.

Table 7.2 Results of the analysis of variance

Part (a) of Table 7.2 shows the estimated variance component for ewes within type of birth \( \left({\hat{\sigma}}_{\mathrm{birthtype}\left(\mathrm{animal}\right)}^2=-0.0157\left(\pm 0.0837\right)\right) \) as well as the scale parameter \( \left(\hat{\phi}=0.06668\right) \). The negative variance estimate is possible because the nobound option removes the lower bound of zero on the variance component.

Table 7.2 (b) shows the results of the hypothesis tests for type III fixed effects, which indicate that there is a statistically significant effect of treatment (type of birth) on the time of onset and duration of ewe estrus.

The last two columns of Table 7.3, labeled “Mean” and “Standard error,” correspond to the means (μij) on the data scale for the ewes’ mean onset and duration of estrus with their respective standard errors. For example, the mean time to onset of estrus in single-birth ewes was 26.87 ± 1.78 hours, whereas for double- and triple-birth ewes, it was 21.37 ± 0.98 and 21.1 ± 0.95, respectively. On the other hand, the average time (in hours) of estrus duration was longer in double- and triple-birth ewes (14.46 ± 1.69 and 16.56 ± 1.63, respectively) compared to single-birth ewes (5.38 ± 0.81).

Table 7.3 Means and standard errors on the model scale (“Estimate” column) and the data scale (“Mean” column) for the onset and duration of estrus in Pelibuey ewe lambs

7.2.2 Randomized Complete Block Design (RCBD): Itch Relief Drugs

A total of 10 male volunteer patients between 20 and 30 years of age participated as a study group to compare 7 treatments (Trts) (5 drugs, 1 placebo, and 1 no drug) for itch relief. Since each subject responds differently to each drug and each subject received a different treatment on each of the 7 study days, each subject can be considered a block. Treatment assignment was randomized across days. Except on the drug-free day, subjects were administered the treatment intravenously, and itching was then induced on their forearms using an effective itch stimulus called cowage. The duration of itching, in seconds, was recorded. The data are shown in Table 7.4.

Table 7.4 Time taken to get rid of the itch

From left to right, the drugs used were papaverine = Papv, morphine = Morp, aminophylline = Amino, pentobarbital = Pent, and tripelennamine = Tripel.

The analysis of variance table (Table 7.5) shows the sources of variation and degrees of freedom for this experiment.

Table 7.5 Sources of variation and degrees of freedom

The components of the GLMM with a gamma response are as follows:

$$ {\displaystyle \begin{array}{c}\mathrm{Distributions}:{y}_{ij}\mid {r}_j\sim \mathrm{Gamma}\left({\mu}_{ij},\phi \right);i=1,\cdots, 7;j=1,\cdots, 10.\\ {}{r}_j\sim N\left(0,{\sigma}_{\mathrm{patient}}^2\right)\end{array}} $$
$$ \mathrm{Linear}\ \mathrm{predictor}:{\eta}_{ij}=\mu +{r}_j+{\tau}_i $$
$$ \mathrm{Link}\ \mathrm{function}:\log \left({\mu}_{ij}\right)={\eta}_{ij} $$

where ηij is the linear predictor for treatment i in block j, μ is the overall mean, rj is the random effect of the patient with \( {r}_j\sim N\left(0,{\sigma}_{\mathrm{patient}}^2\right) \), and τi is the fixed effect due to treatment.

Note that although the canonical link for the exponential and gamma distributions is the inverse of the mean, gamma and exponential GLMMs most often use the log link (link = log), which is computationally more stable and was used in this and in the previous analysis.
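One reason the log link is more stable can be seen in a small numerical sketch. The Python code below is only an illustration and is not part of the SAS analyses; the function names and all numeric values are hypothetical. It shows that the inverse link returns a negative "mean" whenever the linear predictor is negative, whereas the log link always returns a positive mean. It also shows a first-order (delta-method) approximation for the standard error of a mean back-transformed from the log scale, which is assumed here to be comparable to what the ilink option reports.

```python
import math

def mean_inverse_link(eta):
    """Canonical (inverse) link: mu = 1 / eta, a valid mean only when eta > 0."""
    return 1.0 / eta

def mean_log_link(eta):
    """Log link: mu = exp(eta), positive for every real eta."""
    return math.exp(eta)

def data_scale_se(eta_hat, se_eta):
    """Delta-method standard error of exp(eta_hat): exp(eta_hat) * se_eta."""
    return math.exp(eta_hat) * se_eta

# Hypothetical linear-predictor values, including a negative one.
for eta in (-0.5, 0.5, 3.0):
    print(eta, mean_inverse_link(eta), mean_log_link(eta))

# Hypothetical model-scale estimate and standard error.
eta_hat, se_eta = 3.0, 0.05
mu_hat = mean_log_link(eta_hat)          # mean on the data scale
se_mu = data_scale_se(eta_hat, se_eta)   # approximate SE on the data scale
```

Because random effects and fixed effects are added on the scale of the linear predictor, nothing prevents ηij from becoming negative during estimation, and the log link keeps the implied mean time positive in that case.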

The following GLIMMIX syntax fits a GLMM for a randomized complete block design.

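proc glimmix nobound method=laplace;
  class Patient Trt;
  model y = Trt / dist=gamma link=log;  /* gamma response, log link */
  random Patient;                       /* patients act as blocks */
  lsmeans Trt / lines ilink;            /* ilink: means on the data scale */
run;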

The fit statistic of the conditional model (Pearson's chi-square/DF = 0.08) as well as the variance component (Patient) and the scale parameter \( \left(\hat{\phi}\right) \) of the model indicate that the gamma model adequately describes the dataset (Table 7.6 parts (a) and (b)). The analysis of variance (Table 7.6 part (c)) indicates that there is a highly significant difference among treatments in the mean duration of itching (P = 0.0030).

Table 7.6 Results of the analysis of variance

The scatter in the plot of the residuals versus the linear predictor (top left of Fig. 7.1) suggests that the variance is homogeneous. The histogram (upper right) shows a nearly symmetrical pattern with little skewness. Furthermore, the plot of residuals versus quantiles (bottom left) shows no marked deviations, indicating that the fit is adequate. Finally, the box plot (bottom right) shows that the residuals are centered near zero and range between −0.5 and 0.75.

Fig. 7.1 Conditional residuals. Panels: residual versus linear predictor (scatterplot), histogram of residuals (percent versus residual), residual versus quantile (scatterplot), and box plot of the residuals.

The “lsmeans” on the data scale for each of the five drugs, the placebo, and the control treatment are shown under the “Mean” column with their respective “Standard error” in Table 7.7. Each of the five drugs appears to have a significant effect compared to the placebo and the control. Papaverine (Papv) is the most effective drug. The placebo and the control treatment have statistically similar means. The relatively large variation in the placebo group suggests that some patients responded negatively to the placebo compared to the control, whereas others responded positively.

Table 7.7 Means and standard errors on the model scale (“Estimate” column) and the data scale (“Mean” column) for the average duration time of the itch

Figure 7.2 shows that the drug papaverine significantly reduced the itching time, followed by the drugs aminophylline and morphine, whereas the efficacies of the drugs pentobarbital and tripelennamine were highly similar to each other in eliminating itching.

Fig. 7.2 Average time taken to eliminate itching. Bar graph with error bars of average time versus treatment (amino, morp, papv, pento, placebo, sindroga, tripel); the placebo bar is the highest and the papv bar is the lowest.

7.2.3 Factorial Design: Insect Survival Time

This experiment studied the effectiveness of four different insecticides (Insec1, Insec2, Insec3, and Insec4) at three concentration levels (low, medium, and high) on the survival time (in hours) of a particular species of beetle (Appendix 1: Data: Beetles). Crossing both factors (insecticide × dose) yields a total of 12 combinations (treatments). The objective of this study was to assess the effects of insecticide, dose, and their interaction on beetle survival time. Because individual insects respond differently to a given stimulus, their intrinsic characteristics must be considered as a source of variation in the experiment. The 48 available beetles were randomly divided into 4 equal groups (blocks), each containing the 12 treatment combinations. That is, four beetles were randomly assigned to each treatment.

The sources of variation and degrees of freedom for this experiment are shown in the following analysis of variance table (Table 7.8).

Table 7.8 Sources of variation and degrees of freedom

The components of the gamma-response GLMM are as follows:

$$ {\displaystyle \begin{array}{c}\mathrm{Distributions}:{y}_{ijk}\mid {r}_k\sim \mathrm{Gamma}\left({\mu}_{ijk},\phi \right);i=1,\cdots, 4;j=1,2,3;k=1,\cdots, 4.\\ {}{r}_k\sim N\left(0,{\upsigma}_{\mathrm{block}}^2\right)\end{array}} $$
$$ \mathrm{Linear}\ \mathrm{predictor}:{\eta}_{ij k}=\mu +{r}_k+{\alpha}_i+{\beta}_j+{\left(\alpha \beta \right)}_{ij} $$
$$ \mathrm{Link}\ \mathrm{function}:\log \left({\mu}_{ijk}\right)={\eta}_{ijk} $$

The following GLIMMIX program fits a GLMM with a gamma response.

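proc glimmix nobound method=laplace;
  class dose insecticide insect;
  model time = dose|insecticide / dist=gamma link=log;  /* main effects and interaction */
  random insect;                                        /* block random effect */
  lsmeans dose|insecticide / lines ilink;               /* ilink: means on the data scale */
run;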

Part of the Statistical Analysis Software (SAS) output is shown in Table 7.9. The conditional model's Pearson's chi-square/DF = 0.04 indicates that the gamma distribution adequately models the data. The estimated variance component for blocks and the scale parameter, given by the “residual” value, are shown in part (b) \( \left({\hat{\sigma}}_{\mathrm{block}}^2=-0.00173\ \mathrm{and}\ \hat{\phi}=0.04155,\mathrm{respectively}\right) \).

Table 7.9 Results of the analysis of variance

The analysis of variance in part (c) of Table 7.9 indicates that both insecticide and dose have a significant effect (P = 0.0001) on beetle survival time, that is, they differ in effectiveness (toxicity). The interaction between both factors is not significant at the 0.05 level, although it is close to significance (P = 0.0868). The “lsmeans” values on the data scale for dose \( {\hat{\mu}}_{.j.} \) (part (a)) and insecticide \( {\hat{\mu}}_{i..} \) (part (b)), with their respective standard errors, are listed under the columns titled “Mean” and “Standard error mean” of Table 7.10, respectively.

Table 7.10 Means and standard errors on the model scale (“Estimate”) and the data scale (“Mean”) for the factor dose and type of insecticide

The combination of levels of both factors affected the average survival time of the beetles (Table 7.11). For insecticides 1 and 3 at a high dose, the survival time was lower with average times of 2.1 ± 0.209 and 2.35 ± 0.334 hours, respectively. In general, low values of survival times were observed for insecticides 1 and 3 compared to insecticides 2 and 4.

Table 7.11 Means and standard errors on the model scale and the data scale for the interaction between dose and type of insecticide

7.2.4 A Split Plot with a Factorial Structure on a Large Plot in a Completely Randomized Design (CRD)

Four samples were obtained from each of two batches (Reps) of unprocessed gum from Acacia sp. trees, giving eight samples in total. Within each batch, the four samples were randomly assigned to combinations of two factors with two levels each. The first factor refers to whether the gum was demineralized or not, and the second factor refers to whether the gum was pasteurized or not. An emulsion made from each gum sample was divided into three smaller parts, which were randomly assigned to the levels of a third factor, pH, which was adjusted to 2.5, 4.5, or 5.5 using citric acid (Appendix 1: Data: Gum Breakdown Times).

This is a split-plot design in which the gum samples are the whole plots, arranged within batches. The combined levels of demineralization and pasteurization of the gum are the whole-plot factors. The split plots are the smaller parts, each with a specific pH, which is the only split-plot factor. The response measured (y) was the time to break, i.e., the time (in hours) until the emulsion failed. The sources of variation and degrees of freedom for this experiment are shown in Table 7.12.

Table 7.12 Sources of variation and degrees of freedom

The components of the GLMM with a Gamma response are as follows:

$$ {\displaystyle \begin{array}{c}\mathrm{Distributions}:{y}_{ijkl}\mid r{\left(\alpha \beta \right)}_{ijl}\sim \mathrm{Gamma}\left({\mu}_{ijkl},\phi \right);i=1,2;j=1,2;k=1,2,3;l=1,2.\\ {}r{\left(\alpha \beta \right)}_{ijl}\sim N\left(0,{\sigma}_{r\left(\alpha \beta \right)}^2\right)\end{array}} $$
$$ \mathrm{Linear}\ \mathrm{predictor}:{\eta}_{ijkl}=\mu +{\alpha}_i+{\beta}_j+{\left(\alpha \beta \right)}_{ij}+r{\left(\alpha \beta \right)}_{ijl}+{\gamma}_k+{\left(\alpha \gamma \right)}_{ik}+{\left(\beta \gamma \right)}_{jk}+{\left(\alpha \beta \gamma \right)}_{ijk} $$
$$ \mathrm{Link}\ \mathrm{function}:\log \left({\mu}_{ijkl}\right)={\eta}_{ijkl} $$

where αi, βj, and γk are the fixed effects due to the factors demineralization, pasteurization, and pH, respectively; the effects (αβ)ij, (αγ)ik, (βγ)jk, and (αβγ)ijk are the two- and three-way interactions of the factors under study; and r(αβ)ijl are the random effects of batch within the demineralization × pasteurization combination (the whole-plot error), assuming that \( r{\left(\alpha \beta \right)}_{ijl}\sim N\left(0,{\sigma}_{r\left(\alpha \beta \right)}^2\right) \).

The GLIMMIX commands for fitting this GLMM are as follows:

proc glimmix nobound method=laplace;
  class Batch Demineralization Pasteurization pH;
  model y = Demineralization|Pasteurization|pH / dist=gamma link=log;
  random Batch(Demineralization*Pasteurization);      /* whole-plot error */
  lsmeans Demineralization|Pasteurization|pH / lines ilink;
run;

The relevant results from the SAS output are shown in Table 7.13. The conditional model's Pearson's chi-square/DF \( \left(\frac{\chi^2}{DF}=0.01\right) \) shows no evidence of overdispersion under the gamma distribution. The variance component due to batch within demineralization × pasteurization \( {\hat{\sigma}}_{r\left(\alpha \beta \right)}^2 \) and the scale parameter \( \hat{\phi} \) are shown in part (b).

Table 7.13 Results of the analysis of variance

The hypothesis tests for type III fixed effects are presented in part (c) of Table 7.13, which shows significant effects of demineralization, pasteurization, and pH, as well as of the demineralization × pasteurization interaction, on the emulsion breaking time of the gum. The interactions demineralization × pH (P = 0.0676) and demineralization × pasteurization × pH (P = 0.0535) are close to significance. The emulsion breaking time is strongly affected by no demineralization (demineralization = 1) and no pasteurization (pasteurization = 1) of the gum and, to a lesser extent, by the pH to which the gum was adjusted (Table 7.14).

Table 7.14 Means and standard errors of the main effects on the model scale (Estimate) and the data scale (Mean)

Analyzing the simple effects of the factors, we can observe that when the gum has not been pasteurized (B = 1), the average emulsion breaking time is very similar in demineralized and non-demineralized gum at the three pH levels. However, when the gum has been pasteurized, demineralization has a significant impact on the emulsion breaking time; for example, for gum that is pasteurized but not demineralized (A1B2), the emulsion breaking time is much lower than for gum that has been both demineralized and pasteurized (A2B2) at all three pH levels. Finally, a demineralized, pasteurized gum at pH = 4.5 gives the emulsion with the highest breaking stability (Table 7.15).

Table 7.15 Means and standard errors of the simple effects on the model scale (Estimate) and the data scale (Mean)

7.3 Survival Analysis

When research focuses on the time of occurrence of a specific event, we usually refer to survival times, and the statistical analysis of these times, as mentioned above, is known as survival analysis. A characteristic feature of survival times is the presence of censored times, that is, observations on individuals whose actual survival time is not known.

For a set of survival times (including censored ones) from a sample of individuals, it is possible to estimate the proportion of the population that will survive a given time interval under the same circumstances. The methods used to make this estimate are based on the proposal of Kaplan and Meier (1958). Combined with different statistical tests (log rank, Breslow, Tarone–Ware, etc.), this method allows the comparison of the survival of two or more groups of individuals who differ with respect to certain factors.

Survival analysis focuses its interest on a group or several groups of individuals for whom an event is defined, which occurs after a time interval. To determine the time of interest, there are three requirements: an initial time, a scale to measure the passage of time (minutes, hours, days, etc.), and clarity about what is meant by the event of interest.

Survival of an individual is conceptually the probability of being alive at a given time t after a defined starting point, such as diagnosis, initiation of treatment, or complete remission. In clinical studies, survival times often refer to the time until death, development of a particular symptom, or relapse after complete remission of a disease. Failure is defined as death, relapse, or the occurrence of a new disease. In many survival analyses, when the end of the observation period previously set by the investigator is reached, there are individuals for whom the event has not occurred, and we do not know when it will occur. Therefore, their actual survival time is unknown, and only the survival time up to the end of the study is known. Such survival times are called censored times. In some cases, individuals also leave the study before the end of the analysis period for reasons unrelated to the research, e.g., death from other causes; these times are also censored. Censored data contribute valuable information and, therefore, should not be omitted from the analysis.
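To illustrate how censored times still contribute information, the following Python sketch implements the product-limit estimate of Kaplan and Meier (1958) in its simplest form. It is only an illustration and is not part of the SAS analyses; the function name and the five survival times are hypothetical. A censored individual counts in the risk set up to its censoring time but never counts as an event.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : observed times (event time or censoring time)
    events : 1 if the event was observed, 0 if the time is censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival, curve = 1.0, []
    for t in sorted(deaths):
        # Individuals still under observation just before time t,
        # including those censored at t or later.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical survival times (hours); event = 0 marks a censored time.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
km = kaplan_meier(times, events)
for t, s in km:
    print(f"t = {t}  S(t) = {s:.3f}")
```

In this toy example the individual censored at time 3 is part of the risk set at times 2 and 3, so discarding it would change the estimated survival at both times.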

The pharmaceutical and food industries are legally required to label the shelf life of their product on the packaging. For pharmaceuticals, the requirements for how to determine shelf life are highly regulated. However, the regulatory standards do not specifically define shelf life. Instead, the definition is implicit through the estimation procedure. The interest is in the situation where multiple batches are used to determine a shelf life of a product that applies to all future batches. Consequently, both shelf life and label life are of great importance because of the variability within and between batches. Product development must be very well thought out before a company can have confidence in shelf life estimates. The company must be able to reliably produce a homogeneous product from batch to batch of ingredients, as physical and chemical factors impact the ability of bacteria to grow, such as pH, water activity, and uniformity of the mix (moisture distribution, salt, preservative or food acid) and, consequently, the shelf life of the product. Therefore, products should be inspected at appropriate times and samples should be tested for critical stability of physical and chemical characteristics. These tests also provide an opportunity to begin microbiological testing for spoilage organisms. Testing should continue beyond the intended shelf life unless the product fails earlier. Testing should lead to an understanding of target levels and ranges of ingredients for evaluation of the critical physical and chemical characteristics of the product over the intended shelf life.

Survival analysis is the name for a collection of statistical techniques used to describe and quantify the time at which the event of interest occurs. The term “survival time” specifies the amount of time until the event occurs. Situations in which survival analyses have been used in epidemiology include:

(a) Survival of insects after having received an insecticide.

(b) The time taken by cows or ewes to conceive after calving.

(c) The time taken for a farm to experience its first case of an exotic disease.

7.3.1 Concepts and Definitions

To clearly understand and interpret a rate of change calculated from event data, a more detailed description is needed. The definition of a rate of change begins with the mathematical description of a pattern that changes over time, represented by the function S(t). A rate is created by dividing the change in the function, from S(t) to S(t + Δt), by the corresponding change in time, from t to t + Δt, producing the rate of change

$$ \mathrm{rate}\ \mathrm{of}\ \mathrm{change}=\frac{\mathrm{change}\ \mathrm{in}\ S(t)}{\mathrm{change}\ \mathrm{in}\ \mathrm{time}}=\frac{S(t)-S\left(t+\Delta t\right)}{\left(t+\Delta t\right)-t}=\frac{S(t)-S\left(t+\Delta t\right)}{\Delta t} $$

Rates of change with respect to time apply to a variety of situations, but one specific function, traditionally denoted by S(t), is fundamental to the analysis of survival data. This is called the survival function and is defined as the probability of surviving beyond a specific point in time (denoted by t). That is,

$$ {\displaystyle \begin{array}{c}S(t)=P\left(\mathrm{surviving}\ \mathrm{from}\ \mathrm{time}=0\ \mathrm{to}\ \mathrm{time}=t\right)\\ {}=P\left(\mathrm{survival}\ \mathrm{in}\ \mathrm{the}\ \mathrm{interval}\ \left[0,t\right]\right)\end{array}} $$

This is equivalent to

$$ S(t)=P\left(\mathrm{surviving}\ \mathrm{beyond}\ \mathrm{time}\ t\right)=P\left(T\ge t\right)=1-F(t) $$

where F(t) is the cumulative distribution function of the continuous survival time T, with F(t) = P(T ≤ t). Another important concept in survival analysis is the hazard function h(t). The hazard function of T is defined as

$$ h(t)=\underset{\Delta t\to 0}{\lim}\left\{\frac{P\left(t\le T<t+\Delta t|T\ge t\right)}{\Delta t}\right\} $$

which can be written as

$$ h(t)=\underset{\Delta t\to 0}{\lim}\left\{\frac{F\left(t+\Delta t\right)-F(t)}{\Delta t}\right\}\times \frac{1}{P\left(T\ge t\right)} $$
$$ h(t)=\frac{f(t)}{S(t)} $$

where f(t) is the probability density function. Any distribution defined on t ∈ [0, ∞) can serve as a survival distribution. Because f(t) = −∂S(t)/∂t, the hazard can also be written as

$$ h(t)=-\frac{\partial }{\partial t}\left\{\log S(t)\right\}. $$

It then follows that

$$ S(t)=\exp \left\{-H(t)\right\} $$

where H(t) is the cumulative hazard function

$$ H(t)=\int_0^th(u) du $$

Another useful relationship is

$$ H(t)=-\log S(t). $$
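The step linking the hazard to these two relationships can be written out explicitly. The short derivation below is a sketch that assumes S(0) = 1, that is, every individual is alive (event-free) at time zero; it integrates the hazard, written as the negative derivative of log S, from 0 to t:

$$ H(t)=\int_0^th(u) du=-\int_0^t\frac{\partial }{\partial u}\left\{\log S(u)\right\} du=-\left[\log S(t)-\log S(0)\right]=-\log S(t), $$

so that exponentiating both sides gives \( S(t)=\exp \left\{-H(t)\right\} \).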

For the simplest model, the exponential model with h(t) = λ (λ is a constant), the survival function is given by

$$ S(t)=\exp \left\{-\int_0^th(u) du\right\}=\exp \left\{-\int_0^t\lambda du\right\}={e}^{-\lambda t} $$

with the probability density function given by

$$ f(t)=-\frac{\partial }{\partial t}S(t)=\lambda {e}^{-\lambda t}. $$

Thus, the survival function, hazard function, and cumulative hazard function for the exponential model are given by:

$$ \mathrm{Survival}\ \mathrm{function}:S(t)={e}^{-\lambda t} $$
$$ \mathrm{Hazard}\ \mathrm{function}:h(t)=\frac{f(t)}{S(t)}=\frac{\lambda {e}^{-\lambda t}}{e^{-\lambda t}}=\lambda $$
$$ \mathrm{Cumulative}\ \mathrm{hazard}\ \mathrm{function}:H(t)=\int_0^th(u) du=\int_0^t\lambda du=\lambda t. $$
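These relationships are easy to verify numerically. The Python sketch below is only an illustration and is not part of the SAS analyses; the function names and the value of λ are hypothetical. It codes the exponential model and checks that f(t)/S(t) is constant, that −log S(t) equals λt, and that a finite-difference approximation of −∂ log S(t)/∂t recovers the hazard.

```python
import math

def survival(t, lam):
    """S(t) = exp(-lam * t)."""
    return math.exp(-lam * t)

def density(t, lam):
    """f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)

def hazard(t, lam):
    """h(t) = f(t) / S(t)."""
    return density(t, lam) / survival(t, lam)

def cumulative_hazard(t, lam):
    """H(t) = -log S(t)."""
    return -math.log(survival(t, lam))

lam = 0.25   # hypothetical constant hazard
for t in (0.5, 2.0, 10.0):
    assert math.isclose(hazard(t, lam), lam)                 # hazard is constant
    assert math.isclose(cumulative_hazard(t, lam), lam * t)  # H(t) = lam * t

# Central finite difference for h(t) = -d/dt log S(t) at t = 2.
t, dt = 2.0, 1e-6
h_numeric = -(math.log(survival(t + dt, lam)) - math.log(survival(t - dt, lam))) / (2 * dt)
assert math.isclose(h_numeric, lam, rel_tol=1e-6)
```

The constant hazard is what distinguishes the exponential model: the risk of the event does not depend on how long the individual has already survived.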

7.3.2 CRD: Aedes aegypti

The objective of this experiment was to test the vulnerability of Aedes aegypti mosquitoes to four different fungal treatments. A bioassay was conducted to determine the survival time of each of the mosquitoes. Three-day-old mosquitoes were maintained after hatching in 45-cm rearing cages with access to water but not food. The mosquitoes were kept in rearing cages with water and fed warm pig blood (37 °C) through a natural membrane (sausage casing) approximately every 3 days and allowed to oviposit freely during the waiting period. Ten mosquitoes were placed in each chamber, and each chamber received one of the four treatments or the control. Here, we present part of the data from a bioassay with four replicates. The complete data from this trial can be found in Appendix 1 (Data: Aedes aegypti).

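| Treatment | Rep | Y |
|-----------|-----|----|
| C | 1 | 8 |
| C | 1 | 11 |
| C | 4 | 20 |
| Mam | 1 | 2 |
| Mam | 1 | 2 |
| MaS | 1 | 3 |
| MaS | 1 | 3 |
| MaC | 1 | 2 |
| MaC | 1 | 2 |
| MaC | 1 | 2 |
| Ma1 | 1 | 2 |
| Ma1 | 1 | 2 |
| Ma1 | 4 | 11 |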

The components of this GLMM are as follows:

$$ {\displaystyle \begin{array}{c}\mathrm{Distributions}:{y}_{ij}\mid {\mathrm{rep}}_j\sim \mathrm{Gamma}\left({\mu}_{ij},\phi \right)\\ {}{\mathrm{rep}}_j\sim N\left(0,{\sigma}_{\mathrm{rep}}^2\right)\end{array}} $$
$$ \mathrm{Linear}\ \mathrm{predictor}:{\eta}_{ij}=\eta +{\tau}_i+{\mathrm{rep}}_j $$
$$ \mathrm{Link}\ \mathrm{function}:{\eta}_{ij}=\log \left({\mu}_{ij}\right) $$

where η is the intercept, τi is the treatment effect, and repj is the random effect due to the mosquito chamber assuming \( {\mathrm{rep}}_j\sim N\left(0,{\sigma}_{\mathrm{rep}}^2\right). \)

The following GLIMMIX commands fit a GLMM with a gamma response:

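proc glimmix data=mosquitos method=laplace;
  class bio trt rep;
  model y = trt / dist=gamma link=log;  /* gamma response, log link */
  random rep;                           /* mosquito chamber random effect */
  lsmeans trt / lines ilink;            /* ilink: means on the data scale */
run;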

Part of the output is shown in Table 7.16. The fit statistic in part (a), Pearson's chi-square/DF = 0.18, indicates that there is no overdispersion in the fit of the data. The analysis of variance (type III tests of fixed effects) indicates that there is a highly significant effect (P = 0.0001) of the fungal treatments on the mean mosquito survival time.

Table 7.16 Results of the analysis of variance

The relevant “lsmeans” information in Table 7.17 is in the columns labeled “Estimate” and “Mean,” which contain the estimates on the model scale and the data scale, respectively. The average survival time in each of the treatments is represented by \( {\hat{\mu}}_i\ \left(\pm \mathrm{standard}\ \mathrm{error}\right) \).

Table 7.17 Means and standard errors of the main effects on the model scale (Estimate) and the data scale (Mean)

The estimated hazard function for each treatment is \( {\hat{\lambda}}_i=1/{\hat{\mu}}_i. \) For example, for treatment Ma1, the estimated hazard function is \( {\hat{\lambda}}_{Ma1}=1/3.4223=0.2922. \) We can calculate these values manually from the Mean column, or we can automate the process by adding the command “ods output lsmeans = mu” to the GLIMMIX program above. Once we have saved the treatment means, we can ask SAS to compute the estimated hazard function for the treatments. The commands are as follows:

data hazard;
  set mu;
  hazard = 1 / mu;   /* mu: data-scale mean saved by ods output lsmeans=mu */
run;

proc print data=hazard;
run;

The results are listed below in Table 7.18. The hazard column contains the estimated hazard functions for each treatment \( {\hat{h}}_i(t)={\hat{\lambda}}_i \).

Table 7.18 Means and standard errors of the main effects on the model scale (Estimate) and the data scale (Mean) and the hazard function \( {\hat{\lambda}}_i \)

From the values \( {\hat{\lambda}}_i \), we can calculate the estimated survival function \( {\hat{S}}_i(t)={e}^{-{\hat{\lambda}}_it} \) for each of the treatments. Figure 7.3 shows the probability of survival over time obtained with \( {\hat{S}}_i(t) \) for each of the proposed treatments and the control. Clearly, the treatments MaS, Ma1, MaC, and Mam showed a greater efficacy than the control in the biological control of these mosquitoes.
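As a worked instance of this calculation, the Python sketch below takes the data-scale mean for Ma1 quoted in the text (3.4223) and computes the hazard, a few survival probabilities, and the median survival time under the exponential form used here. It is only an illustration and is not part of the SAS analyses; the function name and the evaluation times are hypothetical, and the times are in the same units as the response Y.

```python
import math

mean_ma1 = 3.4223               # data-scale mean for Ma1 quoted in the text
hazard_ma1 = 1.0 / mean_ma1     # estimated hazard, about 0.2922

def survival_probability(t, lam):
    """Exponential survival function S(t) = exp(-lam * t)."""
    return math.exp(-lam * t)

# Median survival time: the t at which S(t) = 0.5, i.e. log(2) / lam.
median_survival = math.log(2.0) / hazard_ma1

# Hypothetical evaluation times.
for t in (1, 2, 5, 10):
    print(f"t = {t:>2}  S(t) = {survival_probability(t, hazard_ma1):.4f}")
print(f"median survival time = {median_survival:.3f}")
```

Repeating the same calculation with the mean of each treatment gives the curves compared in the survival plot.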

Fig. 7.3 Estimated survival probability for each treatment. Line graph of survival probability versus time for Ma1, MaC, MaS, Mam, and Control; all five curves decrease, with Control the highest and Ma1 the lowest.

7.3.3 RCBD: Aedes aegypti

Similar to the previous example, this experiment consisted of testing the vulnerability of Aedes aegypti mosquitoes to four different fungal treatments. For this, two bioassays were conducted to determine the survival time of each of the mosquitoes. Three-day-old mosquitoes were maintained after hatching in 45-cm rearing cages with access to water but not food. Mosquitoes were maintained in rearing cages with water and were fed warm pig blood (37 °C) through a natural membrane (sausage casing) approximately every 3 days. They were allowed to freely oviposit during the waiting period. Ten mosquitoes were placed in each chamber, and each chamber received one of the four treatments or the control. The data can be found in Appendix 1 (Data: Aedes aegypti).

The components of this GLMM are as follows:

$$ {\displaystyle \begin{array}{c}\mathrm{Distributions}:{y}_{ijk}\mid {\mathrm{bio}}_j,\mathrm{rep}{\left(\mathrm{bio}\right)}_{k(j)}\sim \mathrm{Gamma}\left({\mu}_{ijk},\phi \right)\\ {}{\mathrm{bio}}_j\sim N\left(0,{\sigma}_{\mathrm{bio}}^2\right),\mathrm{rep}{\left(\mathrm{bio}\right)}_{k(j)}\sim N\left(0,{\sigma}_{\mathrm{rep}\left(\mathrm{bio}\right)}^2\right)\end{array}} $$
$$ \mathrm{Linear}\ \mathrm{predictor}:{\eta}_{ijk}=\eta +{\tau}_i+{\mathrm{bio}}_j+\mathrm{rep}{\left(\mathrm{bio}\right)}_{k(j)} $$

where \( \eta \) is the intercept, \( {\tau}_i \) is the effect of treatment i, and \( {\mathrm{bio}}_j \) and \( \mathrm{rep}{\left(\mathrm{bio}\right)}_{k(j)} \) are the random effects of the bioassay and of the mosquito chamber within the bioassay, respectively, assuming \( {\mathrm{bio}}_j\sim N\left(0,{\sigma}_{\mathrm{bio}}^2\right) \) and \( \mathrm{rep}{\left(\mathrm{bio}\right)}_{k(j)}\sim N\left(0,{\sigma}_{\mathrm{rep}\left(\mathrm{bio}\right)}^2\right). \)

$$ \mathrm{Link}\ \mathrm{function}:{\eta}_{ijk}=\log \left({\mu}_{ijk}\right) $$
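Because the link is the logarithm, the data-scale quantities follow from back-transforming the linear predictor. As a sketch, assume that the treatment-level predictor \( {\hat{\eta}}_i=\hat{\eta}+{\hat{\tau}}_i \) is evaluated with the random effects set to zero, and that the constant-hazard relation \( \lambda =1/\mu \) from the previous example applies. Then:

$$ {\hat{\mu}}_i={e}^{{\hat{\eta}}_i},\kern1em {\hat{\lambda}}_i=\frac{1}{{\hat{\mu}}_i}={e}^{-{\hat{\eta}}_i},\kern1em {S}_i(t)={e}^{-{\hat{\lambda}}_it}=\exp \left(-t\,{e}^{-{\hat{\eta}}_i}\right) $$

Under these assumptions, a larger estimate on the model scale corresponds to a longer mean survival time, a smaller hazard, and a survival curve that decays more slowly.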

The following GLIMMIX program fits a block GLMM with a gamma response.

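proc glimmix method=laplace nobound;
  class bio trt ind rep;
  model y = trt / dist=gamma;
  random bio rep(bio);        /* bioassay and chamber within bioassay */
  ods output lsmeans=mu;      /* save the treatment means */
  lsmeans trt / lines ilink;  /* model-scale and data-scale means */
run; quit;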

The results are shown below. Fit statistics and variance components are listed in Table 7.19. In part (a), Pearson's chi-square/DF = 0.34, and in part (b), the estimated variance components due to blocks (bioassays), replicates within blocks, and experimental error are \( {\hat{\sigma}}_{\mathrm{bio}}^2=0.1859,{\hat{\sigma}}_{\mathrm{rep}\left(\mathrm{bio}\right)}^2=0.02562,\mathrm{and}\ {\hat{\sigma}}^2=0.2822 \), respectively. The type III tests of fixed effects (part (c)) indicate a highly significant effect of treatment on mean survival time (P = 0.0001).

Table 7.19 Results of the analysis of variance

Tables 7.20 and 7.21 show the estimates on the model scale and the data scale, that is, the linear predictors \( \left({\hat{\eta}}_i\right) \) and the means \( \left({\hat{\mu}}_i\right) \) with their respective standard errors, together with the estimated hazard function. The results indicate that the MaC treatment has a greater lethal effect on A. aegypti mosquitoes than the control.

Table 7.20 Means and standard errors of the main effects on the model scale (Estimate) and the data scale (Mean)
Table 7.21 Means and standard errors of the main effects on the model scale (Estimate), the data scale (Mean), and the hazard function \( {\hat{\lambda}}_i \)

Figure 7.4 shows the estimated survival curves for the treatments tested. These curves were obtained with \( {S}_i(t)={e}^{-{\hat{\lambda}}_it} \).

Fig. 7.4 Estimated survival probability for each treatment
[Figure: line graph of survival probability versus time for Ma1, MaC, MaS, Mam, and Control. All five curves decrease over time; the Control curve is the highest and the MaC curve is the lowest.]
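One way to summarize curves of this kind numerically is the time at which the survival probability falls to a given level p. Solving \( {e}^{-{\hat{\lambda}}_it}=p \) gives \( t=-\log (p)/{\hat{\lambda}}_i=-{\hat{\mu}}_i\log (p) \). The Python sketch below (a hedged illustration, not part of the SAS analysis) computes the median survival time, p = 0.5. The function name and the two mean values are hypothetical and are not taken from Tables 7.20 and 7.21.

```python
import math


def time_to_survival_level(mean_time, level=0.5):
    """Time t at which S(t) = exp(-t / mean_time) equals `level`."""
    if mean_time <= 0:
        raise ValueError("mean survival time must be positive")
    if not 0 < level <= 1:
        raise ValueError("level must lie in (0, 1]")
    return -mean_time * math.log(level)


# Hypothetical data-scale means, for illustration only.
illustrative_means = {"trt_A": 3.0, "trt_B": 6.0}

median_times = {
    name: time_to_survival_level(mean)
    for name, mean in illustrative_means.items()
}

for name, t50 in median_times.items():
    print(f"{name}: median survival time = {t50:.3f}")
```

Under the constant-hazard assumption, the median survival time is proportional to the mean, so ranking the treatments by median gives the same order as ranking them by the Mean column.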

7.4 Exercises

Exercise 7.4.1

This experiment studied the time to incapacitation of animals exposed to the combustion products of eight types of aircraft interior materials (M1–M9) and the yields, in milligrams per gram, of seven combustion gases (CO, HCN, H2S, HCl, HBr, NO2, SO2) (Spurgeon 1978). For each material (column “Material”), the recorded incapacitation time is given in the column “Time in minutes”, and the third column gives the value of 1000/Time. The data are shown below (Table 7.22):

Table 7.22 Time of incapacity of the animal when exposed to different combustion gases

(a) Write down a statistical model of this experiment.
(b) List all the components of the GLMM in (a).
(c) Write down the null and alternative hypotheses associated with this experiment.
(d) Construct an ANOVA table indicating the sources of variation and degrees of freedom.
(e) Analyze the incapacitation time of the animals when exposed to the gases from the different types of materials.
(f) Comment on the results obtained.

Exercise 7.4.2

Cockroaches are responsible for 80% of infestations in spaces used by humans. They live in association with humans and can contaminate food with their feces and secretions, which has both medical and economic implications. Different insecticides, mainly synthetic, have been formulated, and in some cases their use has led to the development of resistance in cockroaches. This exercise studies the survival time in days (y) of this insect when exposed to two fungi that are promising for its biological control, plus an already known control. The data are shown below (Table 7.23):

Table 7.23 Results of the cockroach biological control experiment

(a) Write down a statistical model of this experiment.
(b) List all components of the GLMM from (a).
(c) Write down the null and alternative hypotheses associated with this experiment.
(d) Analyze the survival time of the insect when infected with the different types of fungi.
(e) Comment on the results obtained.

Exercise 7.4.3

Consider a study on the effect of analgesic treatments (Trt) in elderly patients with neuralgia. Two test treatments (A and B) and a placebo (P) are compared. The response variable is whether the patient reported pain or not (yes = 1, no = 0). The investigators recorded the age (E) and sex (S) of 60 patients and the time (T) until the pain disappeared after starting the treatment. The data are presented in Table 7.24 below.

Table 7.24 Results with neuralgia patients (Trt = Treatment, S = Sex, E = Age, T = Time, D = Pain with yes = 1 and no = 0)

(a) List all components of the GLMM for this exercise.
(b) Write down the null and alternative hypotheses associated with this experiment.
(c) Construct an ANOVA table indicating the sources of variation and degrees of freedom.
(d) Analyze the average time during which the patient experiences pain after starting the treatment. Are there any significant differences?
(e) Comment on the results obtained.

Exercise 7.4.4

Refer to the previous exercise and perform an analysis of covariance.

(a) Write down the linear predictor for this experiment.
(b) Analyze the average time during which the patient experiences pain after starting the treatment using an analysis of covariance. Are there any significant differences?
(c) Comment on the results obtained. Do your results differ from those obtained in the previous exercise?