
Orthogonality, Orthogonal Decomposition, and Their Role in Modern Experimental Design


Abstract

In Chap. 2, we saw how to investigate whether or not one factor influences some dependent variable. Our approach was to partition the total sum of squares (TSS), the variability in the original data, into two components – the sum of squares between columns (SSBc), attributable to the factor under study, and the sum of squares within a column (SSW), the variability not explained by the factor under study, and instead explained by “everything else.” Finally, these quantities were combined with the appropriate degrees of freedom in order to assess statistical significance. We were able to accept or reject the null hypothesis that all column means are equal (or, correspondingly, reject or accept that the factor under study has an impact on the response). In Chap. 4, we discussed multiple-comparison techniques for asking more detailed questions about the factor under study; for example, if not all column means are equal, how do they differ? We now present a more sophisticated, flexible, and potent way to analyze (or “decompose”) the impact of a factor on the response, not limited to pairwise comparisons.
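For reference, the Chap. 2 decomposition and test statistic can be restated compactly in this chapter's notation (a summary only; C denotes the number of columns and N the total number of observations, the latter symbol introduced here for the summary):

\( \mathrm{TSS} = \mathrm{SSB_c} + \mathrm{SSW} \), with \( F_{calc} = \dfrac{\mathrm{SSB_c}/(C-1)}{\mathrm{SSW}/(N-C)} \)

compared against an F distribution with (C – 1) and (N – C) degrees of freedom.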


Notes

  1.

    For a situation with C levels of a factor, we have referred to (C – 1) orthogonal questions (components) into which the SSBc can be partitioned; in this example, with eight levels, we refer to seven orthogonal questions. In fact, we do not need to specify a full seven questions – we can specify any number up to seven – the remaining variability is simply labeled as “other differences.” We note this again later in the chapter.

  2.

    A fictitious name. However, the example is real, with only a few changes in levels of factors, results, and locations, in order not to reveal the identity of the company.

  3.

    Where the context makes it clear, we replace \( {\overline{Y}}_{\cdot j} \) with the less cumbersome \( Y_j \).

  4.

    Again, the same exception applies as in footnote 1: the matrix may be specifically chosen to have fewer than (C – 1) rows, along with a catchall row describing “other differences.”

  5.

    Those who studied (and remember) their high-school physics may recognize the similarity of orthogonal decomposition to the resolution of vectors – a vector being loosely defined as a quantity with a magnitude and a direction – into components in the “X, Y, and Z directions.” Typical examples include force and velocity. V (the vector, let’s say) is decomposed along three unit vectors, one each in the X, Y, and Z directions, respectively. Unit vectors have a magnitude of one, and these three are mutually perpendicular; in this instance, perpendicular and orthogonal are synonyms. The result of the vector decomposition is a magnitude in the X direction, a magnitude in the Y direction, and a magnitude in the Z direction, such that the sum of squares of these magnitudes is equal to the square of the magnitude of V. The orthonormal rows of the coefficient matrix are, in fact, orthogonal unit vectors; the calculation of the \( Z_i \)’s is the resolution of the effect under study into components along these unit vectors (a short display of this resolution follows these notes).

  6.

    If two events, A and B, are independent, knowledge of the occurrence of one of the events sheds no light on the occurrence of the other event. If A and B are independent, P(A and B) = P(A) ⋅ P(B).

  7.

    There would be some major advantages to designing this experiment so that each of the 4 (or 16) people evaluates each of the four portfolios. However, “people” would then be a second factor, and we have not yet covered designs with two factors. If each person evaluated each of the four portfolios, the design would be called a “repeated-measures” or “within-subject” design.

  8.

    In essence, one can prove that each component sum of squares follows a chi-square distribution with one degree of freedom and is independent of the error sum of squares. After all, an F distribution is derived as the ratio of two independent chi-square random variables, each divided by its respective degrees of freedom.

  9.

    A placebo is an inactive substance used for control. Placebos are frequently used in studies of drugs to account for the tendency of patients to perceive an improvement in symptoms merely due to the ingestion of “medicine.” Double-blind studies, in which the identity of the real drug and the placebo is hidden from both the patient and the experimenter, guard against the unintentional giving of cues that otherwise might undermine the integrity of the experiment. Such studies have revealed, for example, that yohimbine hydrochloride, a drug long prescribed for temporary male impotence, performs no better than a placebo. (R. Berkow, Merck Manual of Medical Information, Rahway, N.J., Merck, 1997).
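As a short illustration of the vector analogy in note 5 (a sketch only; the unit-vector symbols \( \mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3 \) are introduced here for the illustration and are not part of the chapter’s notation):

\( \mathbf{V} = Z_1\mathbf{u}_1 + Z_2\mathbf{u}_2 + Z_3\mathbf{u}_3 \), where \( Z_i = \mathbf{V}\cdot \mathbf{u}_i \) and \( {\left\Vert \mathbf{V}\right\Vert}^2 = Z_1^2 + Z_2^2 + Z_3^2 \).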


Appendix

Example 5.7 Drug Comparison Using SPSS

We use the drug-comparison example to illustrate the use of orthogonal contrasts in SPSS. Data from Table 5.5 are repeated in Table 5.15 for convenience.

Table 5.15 Column means for drug-comparison study

Running the ANOVA using SPSS yields the same basic result as we have seen previously, shown in Table 5.16. Once again, we conclude that our result is significant (p < .01), so not all “treatments” are the same with respect to efficacy.

Table 5.16 One-way ANOVA table for drug-comparison study in SPSS

We now set out to see how they differ. As before, we break down the differences into three components: (a) “real” drugs versus placebo; (b) aspirin 1 versus aspirin 2; and (c) aspirin drugs versus non-aspirin. To enter the contrast values in SPSS, we click on Contrasts... under Analyze > Compare Means > One-Way ANOVA and enter the three sets (rows) of coefficients in the orthonormal matrix we have built previously (click Previous or Next to change contrasts, and Continue once the three contrasts have been included), as shown in Fig. 5.2; the output in Table 5.17 verifies that these were the coefficients used.

Fig. 5.2 Steps for including contrasts in SPSS

Table 5.17 Orthonormal matrix in SPSS
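For reference, the three questions correspond to the following orthonormal coefficient rows, here written against the level names in the order Aspirin1, Aspirin2, NonA, Placebo (the same values and ordering used in the R example later in this appendix; the rounded decimals are \( 1/\sqrt{12}\approx 0.2887 \), \( 3/\sqrt{12}\approx 0.866 \), \( 1/\sqrt{2}\approx 0.7071 \), \( 1/\sqrt{6}\approx 0.4082 \), and \( 2/\sqrt{6}\approx 0.8165 \)):

(a) “real” drugs vs. placebo: (0.2887, 0.2887, 0.2887, –0.866)
(b) aspirin 1 vs. aspirin 2: (–0.7071, 0.7071, 0, 0)
(c) aspirin vs. non-aspirin: (–0.4082, –0.4082, 0.8165, 0)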

Unfortunately, SPSS does not provide the augmented ANOVA table in one fell swoop. What it does provide, in the notation of this chapter, are the values of the Z’s and of the \( t_{calc} \)’s, which are the respective square roots of the \( F_{calc} \)’s in the augmented ANOVA table, as well as the p-values. It does this under the banner of an analysis that uses the mean-square error in the original ANOVA table as the basis of the standard error for all contrasts (as opposed to using a variance estimate based only on the particular columns, weighted according to the contrast’s coefficients). The SPSS results are in Table 5.18. Note that the “Value of Contrast” column in Table 5.18 consists of the same set of values as the Z’s in Table 5.8, and that the numbers in the “t” column are each the square root of the corresponding \( F_{calc} \) value in Table 5.9, the augmented ANOVA table (some numbers are off a tiny bit due to rounding).

Table 5.18 Contrast tests in SPSS
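The quantities reported in Table 5.18 can be summarized as follows (a sketch of the standard contrast computation; the symbols \( n_j \) for the sample size of column j and \( c_j \) for the contrast coefficients are introduced here for the summary, and MSE is the mean-square error from Table 5.16):

\( Z = \sum_{j=1}^{C} c_j\,{\overline{Y}}_j \), \( \qquad t_{calc} = Z\Big/\sqrt{\mathrm{MSE}\sum_{j=1}^{C} c_j^2/n_j} \)

With orthonormal coefficients (\( \sum_j c_j^2 = 1 \)) and a common column size n, \( t_{calc}^2 = nZ^2/\mathrm{MSE} \), which is the corresponding \( F_{calc} \) in the augmented ANOVA table.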

Example 5.8 Drug Comparison Using R

We will use the same drug-comparison example to demonstrate how the analysis is done in R. After importing the data, we have to check the order in which the levels of the independent variable (V1) are organized.

> drug <- read.csv(file.path("/Users/documents", "ex5.8.csv"),
+   header=F)
> levels(drug$V1)
[1] "Aspirin1" "Aspirin2" "NonA"     "Placebo"
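Note that in R 4.0 and later, read.csv() no longer converts character columns to factors by default; in that case V1 should be converted explicitly so that levels() and the contrast assignment below behave as shown:

> drug$V1 <- factor(drug$V1)   # ensure V1 is a factor (levels sorted alphabetically)
> levels(drug$V1)
[1] "Aspirin1" "Aspirin2" "NonA"     "Placebo"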

Notice that the levels have been organized alphabetically; this has to be taken into account when setting up the contrast matrix. In the command below, the coefficients are entered row by row (byrow=T), using the contrast values in the orthonormal matrix we discussed previously. Note that the order of the values has changed to reflect the order of the levels in R.

> matrix <- matrix(c(0.2887, 0.2887, 0.2887, -0.866,
+   -0.7071, 0.7071, 0, 0,
+   -0.4082, -0.4082, 0.8165, 0),
+   nrow=3, ncol=4, byrow=T)
> matrix

 

        [,1]    [,2]   [,3]   [,4]
[1,]  0.2887  0.2887 0.2887 -0.866
[2,] -0.7071  0.7071 0.0000  0.000
[3,] -0.4082 -0.4082 0.8165  0.000

R gives an error message if we try to use this matrix in further analysis, because R expects contrast values to be organized in columns rather than rows. For this reason, we have to transpose the contrast matrix, as follows:

> matrix_t <- t(matrix)
> matrix_t

 

        [,1]    [,2]    [,3]
[1,]  0.2887 -0.7071 -0.4082
[2,]  0.2887  0.7071 -0.4082
[3,]  0.2887  0.0000  0.8165
[4,] -0.8660  0.0000  0.0000

Alternatively, it is possible to create the transposed matrix directly, which would eliminate one step in the programming – we just have to remember to set the correct numbers of rows and columns and use byrow=F; a sketch of this alternative follows.
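For instance (a sketch using the same coefficients, now filled column by column so that no transposition is needed):

> matrix_t <- matrix(c(0.2887, 0.2887, 0.2887, -0.866,
+   -0.7071, 0.7071, 0, 0,
+   -0.4082, -0.4082, 0.8165, 0),
+   nrow=4, ncol=3, byrow=F)

Next, we set the contrast: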

> contrasts(drug$V1) <- matrix_t
> drug.aov <- aov(V2~V1, data=drug)

# This command is used to verify that the contrast matrix has been correctly assigned to the levels.

> drug.aov$contrasts
$V1

 

            [,1]    [,2]    [,3]
Aspirin1  0.2887 -0.7071 -0.4082
Aspirin2  0.2887  0.7071 -0.4082
NonA      0.2887  0.0000  0.8165
Placebo  -0.8660  0.0000  0.0000

Setting the contrasts this way also allows us to obtain the augmented ANOVA table, which can be viewed using the summary.aov() function. Note that in this command we specify how the SSBc is split, addressing our three questions regarding the differences among the treatments.

> summary.aov(drug.aov, split=list(V1=list("P vs. P'"=1,
+   "A1 vs. A2"=2, "A vs. Non-A"=3)))

 

                  Df Sum Sq Mean Sq F value   Pr(>F)
V1                 3 112.00   37.33   7.467 0.000802 ***
  V1: P vs. P'     1  42.67   42.67   8.533 0.006820 **
  V1: A1 vs. A2    1   4.00    4.00   0.800 0.378718
  V1: A vs. Non-A  1  65.33   65.33  13.067 0.001168 **
Residuals         28 140.00    5.00
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Copyright information

© 2018 Springer International Publishing AG


Cite this chapter

Berger, P.D., Maurer, R.E., Celli, G.B. (2018). Orthogonality, Orthogonal Decomposition, and Their Role in Modern Experimental Design. In: Experimental Design. Springer, Cham. https://doi.org/10.1007/978-3-319-64583-4_5
