Generalized Linear Mixed Models for Repeated Measurements

Salinas Ruíz, Josafhat; Montesinos López, Osval Antonio; Hernández Ramírez, Gabriela; Crossa Hiriart, Jose

doi:10.1007/978-3-031-32800-8_9

Abstract

Repeated measures data, also known as longitudinal data, are those derived from experiments in which observations are made on the same experimental units at various planned times. These experiments can be of the regression or analysis of variance (ANOVA) type, can contain two or more treatments, and are set up using familiar designs, such as a completely randomized design (CRD), a randomized complete block design (RCBD), or randomized incomplete blocks if blocking is appropriate, or row and column designs such as Latin squares. Repeated measures designs are widely used in the biological sciences and are fairly well understood for normally distributed data but less so for binary, ordinal, count data, and so on. Nevertheless, recent developments in statistical computing methodology and software have greatly increased the number of tools available for analyzing categorical data.

9.1 Introduction

Repeated measures data, also known as longitudinal data, are those derived from experiments in which observations are made on the same experimental units at various planned times. These experiments can be of the regression or analysis of variance (ANOVA) type, can contain two or more treatments, and are set up using familiar designs, such as a completely randomized design (CRD), a randomized complete block design (RCBD), or randomized incomplete blocks if blocking is appropriate, or row and column designs such as Latin squares. Repeated measures designs are widely used in the biological sciences and are fairly well understood for normally distributed data but less so for binary, ordinal, count data, and so on. Nevertheless, recent developments in statistical computing methodology and software have greatly increased the number of tools available for analyzing categorical data.

A generalized linear mixed model (GLMM) is one of the most useful and sophisticated structures in modern statistics, as it allows complex structures to be incorporated into the framework of a general linear model. Fitting such models has been the subject of much research over the last three decades. GLMMs for repeated measures combine generalized linear model (GLM) theory (e.g., a binomial, multinomial, or Poisson response variable) with linear mixed effects models.
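As a point of reference, the combination described above is often written in the following generic form. The notation is an assumption made here for illustration (y for the responses, b for the random effects, X and Z for the fixed- and random-effects design matrices, g for the link function) and is not taken from the chapter itself.

```latex
\begin{aligned}
\mathbf{y} \mid \mathbf{b} &\sim \text{a GLM response distribution (e.g., binomial or Poisson)},\\
g\bigl(\mathrm{E}[\mathbf{y} \mid \mathbf{b}]\bigr) &= \boldsymbol{\eta} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{b},\\
\mathbf{b} &\sim N(\mathbf{0}, \mathbf{G}).
\end{aligned}
```

The first two lines carry the GLM part (response distribution and link), while the random effects b with covariance matrix G carry the mixed-model part.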

Experimentation is sometimes poorly understood, since researchers may believe that it involves only manipulating the levels of independent variables and observing the subsequent responses in dependent variables. Independent variables whose levels are determined or set by the experimenter are said to have fixed effects. Random effects are also very common; here, the levels of the effect are assumed to be randomly selected from an infinite population of possible levels. Many variables of interest in research are not fully amenable to experimental manipulation but can nevertheless be studied by considering them to have random effects. For example, the genetic composition of individuals of a species cannot be manipulated experimentally, but it is of great interest to geneticists aiming to assess the genetic contribution to individual variation in specific behaviors.

A GLMM with repeated measures is a generalization of the standard linear model, and this generalization is due to (1) the presence of more than one response variable that can be binary, ordinal, count, and so on and (2) the nonconstant correlation and/or variability exhibited by the data. The linear mixed model, therefore, gives you the flexibility to model not only the means of your data (as in the standard linear model) but also their variances and covariances. Usually, a normal distribution is assumed for the random effects. Since normally distributed data can be modeled entirely in terms of their means and variances/covariances, the two sets of parameters in a linear mixed model specify the full probability distribution of the data. The parameters of the mean structure are known as fixed effects parameters, which can be qualitative (as in traditional analysis of variance) or quantitative (as in standard regression). The parameters of the variance–covariance structure are known as covariance parameters, and they are what distinguish a linear mixed model from the standard linear model. Covariance parameters arise quite frequently in applications, typically in one of two scenarios:
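To make the role of covariance parameters concrete, the following minimal sketch builds the marginal covariance matrix of three repeated measurements on one experimental unit under a normal linear mixed model with a random intercept. The function name `marginal_covariance` and the variance values are hypothetical, chosen only for illustration; the construction V = Z G Z' + R is the standard one for linear mixed models.

```python
import numpy as np

def marginal_covariance(n_times, var_unit, var_resid):
    """Marginal covariance V = Z G Z' + R for one experimental unit.

    Assumes a single random intercept per unit: Z is a column of ones,
    G = [var_unit], and R = var_resid * I (independent residuals).
    """
    Z = np.ones((n_times, 1))          # random-intercept design matrix
    G = np.array([[var_unit]])         # variance of the random intercept
    R = var_resid * np.eye(n_times)    # residual covariance
    return Z @ G @ Z.T + R

# Hypothetical covariance parameters: unit variance 2.0, residual variance 1.0
V = marginal_covariance(n_times=3, var_unit=2.0, var_resid=1.0)
print(V)

# Correlation between any two measurements on the same unit
print(V[0, 1] / V[0, 0])
```

Under these assumptions the shared random intercept alone makes every pair of measurements on the same unit equally correlated, which is one simple way the covariance parameters shape the distribution of the data.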

(a) Experimental units on which data are measured can be grouped into clusters, and data from a common cluster are correlated.
(b) Repeated measurements of the same experimental unit are taken, and these repeated measurements correlate or show some variability.

The first scenario can be generalized to include a set of clusters nested within one another. For example, if students are the experimental unit, they can be grouped into classes (clusters), which, in turn, can be grouped into schools. Each level of this hierarchy may present an additional source of variability and correlation. The second scenario occurs in longitudinal studies, in which repeated measurements of the same experimental unit over time are taken. Alternatively, these repeated measures could be spatial or multivariate.
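The nested example above can be sketched numerically. Assuming independent random effects for school, class within school, and student (residual), with hypothetical variance components, the correlation between two students depends on how many levels of the hierarchy they share. The function name and the numbers below are illustrative assumptions, not values from the chapter.

```python
def intraclass_correlations(var_school, var_class, var_student):
    """Correlations implied by nested random effects (school > class > student).

    Assumes independent, additive random effects at each level.
    Returns (same class, same school but different class).
    """
    total = var_school + var_class + var_student
    same_class = (var_school + var_class) / total   # share school and class effects
    same_school = var_school / total                # share only the school effect
    return same_class, same_school

# Hypothetical variance components
same_class, same_school = intraclass_correlations(
    var_school=1.0, var_class=2.0, var_student=5.0
)
print(same_class, same_school)
```

Students in the same class share two sources of variability, so under these assumptions they are more strongly correlated than students who share only a school.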

9.2 Example of Turf Quality

The proportional odds model, introduced by McCullagh (1980), extends the generalized linear model to ordinal responses. Recall that the proportional odds model is a special case of a GLM with a cumulative link function, in which the probability of an observation falling into a given category or below is modeled. With a logit link and only two categories (a binary response), the proportional odds model reduces to standard logistic regression. As with any other type of response variable, repeated measurements are common in agronomic research. They result in clustered data structures with correlations between repeated observations on the same experimental unit, which must be taken into account in the analysis.
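The cumulative logit idea can be sketched as follows. This is a minimal illustration that assumes the parameterization logit P(Y ≤ c) = θ_c − η, with increasing thresholds θ_c and a linear predictor η; sign conventions differ between software packages, and the function name and numeric values are hypothetical.

```python
import numpy as np

def cumulative_logit_probs(thresholds, eta):
    """Category probabilities under a cumulative logit (proportional odds) model.

    Assumes logit P(Y <= c) = thresholds[c] - eta, with thresholds increasing.
    """
    thresholds = np.asarray(thresholds, dtype=float)
    cum = 1.0 / (1.0 + np.exp(-(thresholds - eta)))  # P(Y <= c) for c = 1..C-1
    cum = np.concatenate(([0.0], cum, [1.0]))        # add P(Y <= 0) = 0 and P(Y <= C) = 1
    return np.diff(cum)                              # P(Y = c) as successive differences

# Three ordered categories (e.g., low, medium, excellent); values are hypothetical
probs = cumulative_logit_probs(thresholds=[-1.0, 1.0], eta=0.5)
print(probs, probs.sum())

# With a single threshold (two categories) the model is ordinary logistic regression
binary = cumulative_logit_probs(thresholds=[0.0], eta=0.5)
print(binary)
```

With one threshold at zero, the probability of the upper category equals the inverse logit of η, which is the reduction to logistic regression mentioned above.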

The data were obtained from an experiment studying the turf quality of five grass varieties. Each variety was sown independently in 17 or 18 plots. The plots (experimental units) were evaluated in May, July, and September of the growing season, and turf quality was classified on an ordinal scale into three categories: low quality, medium quality, and excellent quality, as shown in Table 9.1.

Table 9.1 Turf quality of five grass varieties (low, Med = medium, Excel = Excellent, Sept = September)

Bio	Day	Conc	Rep	Y	Bio	Day	Conc	Rep	Y
1	1	0	3	5.263	2	1	0	2	0.0016
1	1	0	4	5.263	2	1	0	3	14.285
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
1	2	0	2	1.935	2	2	500	2	31.506
1	2	0	3	4.516	2	2	500	3	42.465
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
1	3	0	2	1.234	2	3	500	3	35.042
1	3	0	3	3.703	2	3	500	4	24.786
1	4	0	3	4.672	2	4	500	2	23.123
1	4	500	1	19.626	2	4	500	3	27.927
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
1	5	0	3	4.065	2	5	500	1	13.253
1	5	0	4	4.065	2	5	500	2	21.285
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
1	6	1000	3	15.862	2	6	2000	1	31.197
1	6	1000	4	18.62	2	6	2000	2	29.173
1	6	2000	1	32.413	2	6	2000	3	29.848
1	6	2000	2	29.655	2	6	2000	4	30.522
1	6	2000	3	31.724
1	6	2000	4	35.172

Data: Feeding line experiment
Tray	Feeding	Run	Proportion	Tray	Feeding	Run	Proportion
1	H	1	0.18217	1	H	3	0.06818
2	H	1	0.15493	2	H	3	0.05874
3	H	1	0.15906	3	H	3	0.05757
4	H	1	0.15869	4	H	3	0.10349
5	H	1	0.14891	5	H	3	0.08564
6	H	1	0.17654	6	H	3	0.09359
7	H	1	0.12915	7	H	3	0.09706
8	H	1	0.12895	8	H	3	0.13188
9	H	1	0.16688	9	H	3	0.18477
10	H	1	0.11965	10	H	3	0.10966
11	H	1	0.21719	11	H	3	0.18069
12	H	1	0.20797	12	H	3	0.18182
1	L	1	0.70601	1	L	3	0.75524
2	L	1	0.68817	2	L	3	0.77249
3	L	1	0.68317	3	L	3	.
4	L	1	0.77805	4	L	3	0.84204
5	L	1	0.76692	5	L	3	0.81572
6	L	1	0.79127	6	L	3	0.79161
7	L	1	0.73653	7	L	3	0.81234
8	L	1	0.74939	8	L	3	0.81795
9	L	1	0.78773	9	L	3	0.8225
10	L	1	0.7381	10	L	3	0.79384
11	L	1	0.88486	11	L	3	0.8135
12	L	1	0.90401	12	L	3	0.83965
1	H	2	0.07547	1	H	4	0.07105
2	H	2	0.05801	2	H	4	0.05511
3	H	2	0.0565	3	H	4	0.05217
4	H	2	0.09579	4	H	4	0.10567
5	H	2	0.10954	5	H	4	0.0755
6	H	2	0.12154	6	H	4	0.0853
7	H	2	0.1144	7	H	4	0.09363
8	H	2	0.13728	8	H	4	0.11154
9	H	2	0.15012	9	H	4	0.16264
10	H	2	0.12113	10	H	4	0.09215
11	H	2	0.17633	11	H	4	0.1834
12	H	2	0.16408	12	H	4	0.21016
1	L	2	0.78318	1	L	4	0.76556
2	L	2	0.78418	2	L	4	0.78307
3	L	2	0.78589	3	L	4	0.76486

Data: Nitrous oxide emission
A	T	t	F	pH	HE	TE	tx	tm	Hx	Hm	A	T	t	F	pH	HE	TE	tx	tm	Hx	Hm
3	1	1	108	8	85	20	17	16	69	68	2	1	1	4.13	6.9	87	20	19	19	67	66
2	2	1	15	4.7	86	20	17	16	69	68	3	2	1	82.4	7	87	22	19	19	67	66
1	3	1	33	5.5	85	20	17	16	69	68	4	3	1	51.4	6.4	87	20	19	19	67	66
4	4	1	23	6.4	84	21	17	16	69	68	1	4	1	704	95	20	19	19	67	66	170
3	1	2	−58	8	85	23	34	34	54	51	3	2	2	14	7	87	22	27	27	68	67
2	2	2	−45	4.7	86	24	34	34	54	51	4	3	2	130	6.4	87	24	27	27	68	67
1	3	2	−97	5.5	85	29	34	34	54	51	1	4	2	537	95	24	27	27	68	67	170
4	4	2	185	6.4	84	28	34	34	54	51	3	2	3	1.02	7	87	25	28	27	63	57
3	1	3	47	8	85	32	35	35	38	30	4	3	3	41.9	6.4	87	24	28	27	63	57
2	2	3	26	4.7	86	32	35	35	38	30	1	4	3	824	95	24	28	27	63	57	170
1	3	3	41	5.5	85	34	35	35	38	30	3	2	4	53.9	7	87	20	22	22	61	60
4	4	3	19	6.4	84	34	35	35	38	30	4	3	4	9.88	6.4	87	20	22	22	61	60
3	1	4	311	8	85	28	30	30	40	38	1	4	4	745	95	20	22	22	61	60	170
2	2	4	−29	4.7	86	27	30	30	40	38	3	2	5	92.7	7	87	20	22	22	65	62
1	3	4	37	5.5	85	28	30	30	40	38	4	3	5	187	6.4	87	20	22	22	65	62
4	4	4	204	6.4	84	27	30	30	40	38	1	4	5	591	95	20	22	22	65	62	170
3	1	5	6.8	8	85	22	27	27	51	43	3	2	1	7	6.6	85	20	20	18	80	74
2	2	5	−18	4.7	86	22	27	27	51	43	4	3	1	8.39	6.8	87	20	20	18	80	74
1	3	5	68	5.5	85	22	27	27	51	43	1	4	1	1367	86	20	20	18	80	74	170
4	4	5	91	6.4	84	22	27	27	51	43	3	2	5	−49	6.6	85	19	18	18	89	89
3	1	1	135	4.6	84	20	18	18	60	59	4	3	5	−91	6.8	87	19	18	18	89	89
2	2	1	1.4	4.3	85	20	18	18	60	59	1	4	5	711	86	19	18	18	89	89	170
1	3	1	55	6.5	85	20	18	18	60	59	3	2	1	108	7.1	86	20	17	16	87	87
4	4	1	18	6.1	85	21	18	18	60	59	4	3	1	15	6.8	87	20	17	16	87	87
3	1	5	5	4.6	84	22	24	24	61	59	1	4	1	621	86	20	17	16	87	87	170
2	2	5	12	4.3	85	22	24	24	61	59	3	2	5	6.19	7.1	86	26	24	24	65	64
1	3	5	121	6.5	85	22	24	24	61	59	4	3	5	1.36	6.8	87	24	24	24	65	64
4	4	5	51	6.1	85	23	24	24	61	59	1	4	5	656	86	25	24	24	65	64	170
3	1	1	21	4.3	85	19	18	17	61	58	3	2	1	18.2	7.3	86	20	18	17	77	76
2	2	1	87	4.6	86	19	18	17	61	58	4	3	1	55.9	7	87	20	18	17	77	76
1	3	1	21	6.5	85	19	18	17	61	58	1	4	1	731	90	19	18	17	77	76	170
4	4	1	28	6.1	85	19	18	17	61	58	3	2	5	−77	7.3	86	26	25	25	65	63
3	1	5	31	4.3	85	23	25	25	57	55	4	3	5	33.6	7	87	25	25	25	65	63
2	2	5	101	4.6	86	23	25	25	57	55	1	4	5	880	90	26	25	25	65	63	163
1	3	5	−37	6.5	85	23	25	25	57	55	1	2	1	29.4	7.5	88	20	20	21	60	59
4	4	5	136	6.1	85	23	25	25	57	55	2	3	1	49.3	7.1	88	20	20	21	60	59
3	1	1	26	4.4	84	19	19	19	61	60	3	4	1	69.7	6.7	85	20	20	21	60	59
2	2	1	16	4.8	85	19	19	19	61	60	4	1	2	36.1	6.8	90	24	42	29	51	20
1	3	1	92	5.5	87	19	19	19	61	60	1	2	2	−100	7.5	88	24	42	29	51	20

Data: Percentage inhibition (Bio = bioassay, Con = concentration, Rep = repetition, Por = percentage inhibition)
Bio	Day	Con	Rep	Por	Bio	Day	Con	Rep	Por
1	1	0	3	5.2632	1	6	2000	4	35.1724
1	1	0	4	5.2632	2	1	0	2	0.0016
1	1	500	1	15.7895	2	1	0	3	14.2857
1	1	500	2	26.3158	2	1	500	1	42.8571
1	1	500	3	15.7895	2	1	500	2	42.8571
1	1	500	4	15.7895	2	1	500	3	42.8571
1	1	1000	1	36.8421	2	1	500	4	42.8571
1	1	1000	2	36.8421	2	1	1000	1	7.1429
1	1	1000	3	36.8421	2	1	1000	2	42.8571
1	1	1000	4	36.8421	2	1	1000	3	42.8571
1	1	2000	1	15.7895	2	1	1000	4	42.8571
1	1	2000	2	36.8421	2	2	0	1	1.3699
1	1	2000	3	36.8421	2	2	0	2	1.3699
1	1	2000	4	36.8421	2	2	0	4	1.3699
1	2	0	2	1.9355	2	2	500	1	34.2466
1	2	0	3	4.5161	2	2	500	2	31.5068
1	2	0	4	1.9355	2	2	500	3	42.4658
1	2	500	1	43.2258	2	2	500	4	36.9863
1	2	500	2	48.3871	2	2	1000	1	34.2466
1	2	500	3	40.6452	2	2	1000	2	47.9452
1	2	500	4	40.6452	2	2	1000	3	45.2055
1	2	1000	1	35.4839	2	2	1000	4	45.2055
1	2	1000	2	45.8065	2	2	2000	1	47.9452
1	2	1000	3	43.2258	2	2	2000	2	53.4247
1	2	1000	4	32.9032	2	2	2000	3	50.6849
1	2	2000	1	58.7097	2	2	2000	4	56.1644
1	2	2000	2	53.5484	2	3	0	1	4.2735
1	2	2000	3	53.5484	2	3	0	4	14.5299
1	2	2000	4	58.7097	2	3	500	1	28.2051
1	3	0	2	1.2346	2	3	500	2	28.2051
1	3	0	3	3.7037	2	3	500	3	35.0427
1	3	500	1	25.9259	2	3	500	4	24.7863
1	3	500	2	23.4568	2	3	1000	1	24.7863
1	3	500	3	23.4568	2	3	1000	2	35.0427
1	3	500	4	24.6914	2	3	1000	3	24.7863
1	3	1000	1	30.8642	2	3	1000	4	26.4957
1	3	1000	2	32.0988	2	3	2000	1	40.1709
1	3	1000	3	28.3951	2	3	2000	2	38.4615
1	3	1000	4	25.9259	2	3	2000	3	47.0085
