Multiple Linear Regression

Abstract

In the previous chapter, we discussed situations where we had only one independent variable (X) and evaluated its relationship with a dependent variable (Y). This chapter goes beyond that and deals with the analysis of situations where we have more than one X (predictor) variable, using a technique called multiple regression. As in simple regression, the objective here is to specify mathematical models that describe the relationship between Y and more than one X and that can be used to predict the outcome at given values of the predictors. As we did in Chap. 14, we focus on linear models.

Notes

  1. In fact, we would have a plane or hyperplane, since we have multiple dimensions. We will use the term line in this text for simplicity.

  2. For those of you who know what this means, you would need to invert a matrix by hand!

  3. If we accept the null hypothesis, we would typically abandon formal statistical analysis, since we have accepted that “the X’s as a group (or, the X’s collectively) do not provide us with predictive value about Y”; in which case, what more can be said?

  4. At this point we would drop X2 from consideration and repeat the regression without the X2 data. However, leaving it in the model/equation for the moment allows several salient points about the methodology to be made more effectively over the next several pages. We explicitly discuss this issue later.

  5. We are making an analogy to R. That is, imagine that a “unit of information” is equivalent to .01 of the R. There are 100 units of information about Y, labeled 1–100. Obviously, if an X, or group of X’s, provides all 100 units of information, it would be equivalent to having an R of 1.

  6. In SPSS and JMP, we can enter a column of data as, for example, M and F for the two sexes. However, we advise the reader not to do so, for the richness of the output is greater when we convert the letters to 0’s and 1’s (a brief R sketch of this recoding appears below).
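     As a minimal illustration of that recoding in R, suppose a column was entered as letters; the variable name sex and the choice M = 1, F = 0 below are ours, purely for illustration:

     > sex <- c("M", "F", "F", "M", "F")     # hypothetical column entered as letters
     > sex01 <- ifelse(sex == "M", 1, 0)     # convert to a 0/1 dummy variable
     > sex01
     [1] 1 0 0 1 0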

  7. We would, in general, not be pleased to have 12 X’s and n = (only) 15. This is true even though all 12 X’s are extremely unlikely to enter the stepwise regression. There is too much opportunity to “capitalize on chance” and find variables showing up as significant when they really are not. This possibility is a criticism of the stepwise regression technique and is discussed further in “Improving the User Experience through Practical Data Analytics,” by Fritz and Berger, Morgan Kaufmann, page 259.

  8. JMP and SPSS include some options for “directions” or “methods” when performing stepwise regression. Forward is equivalent to stepwise, except that once a variable is included, it cannot be removed. Remove is stepwise in reverse; that is, the initial equation contains all the variables and each step removes the least significant one (not available in JMP). Backward is similar to remove, although we cannot reintroduce a variable once it is removed from the equation. JMP also has mixed, a procedure that alternates between forward and backward. The authors recommend stepwise and, while preferring it, are not strongly against remove. We are not certain why anyone would prefer either forward or backward, since these two processes remove the “guarantee” that all non-significant variables (usually at p = .10) are deleted from the model/equation. (Base R offers analogous “directions” through its step() function; see the sketch below.)
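     For reference, base R’s step() function supports direction = "forward", "backward", or "both", although it selects terms by AIC rather than by p-value. A minimal sketch, assuming a data frame mydata (a hypothetical name) whose columns are y and the candidate X’s:

     > null_model <- lm(y ~ 1, data = mydata)    # intercept-only starting model
     > full_model <- lm(y ~ ., data = mydata)    # model with all candidate predictors
     > # "both" is the closest analogue to classic stepwise selection
     > step(null_model, scope = formula(full_model), direction = "both")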

  9. While standardized coefficients provide an indication of the relative importance of the variables in a stepwise regression, this is not necessarily the case in a “regular” multiple regression. This is because there can be a large amount of multicollinearity in a regular multiple regression, while this element is eliminated to a very large degree in the stepwise process. (A sketch of how standardized coefficients can be obtained in R follows.)
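     lm() does not report standardized coefficients directly. One minimal way to obtain them, assuming a data frame mydata with numeric columns y, x1, and x2 (hypothetical names used only for illustration), is to z-score the variables before fitting:

     > mydata_std <- as.data.frame(scale(mydata))   # standardize every column to mean 0, sd 1
     > lm(y ~ x1 + x2, data = mydata_std)           # fitted coefficients are now standardized (betas)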


Appendix

Example 15.8 Faculty Ratings using R

To analyze the faculty ratings example, we can import the data as we have done previously or create the vectors directly in R.

> x1 <- c(1, 4, 4, 2, 4, 4, 4, 5, 4, 3, 4, 4, 3, 4, 3)
> x2 <- c(4, 4, 3, 3, 4, 4, 4, 5, 3, 3, 3, 3, 3, 3, 4)
> x3 <- c(4, 3, 4, 4, 4, 3, 5, 5, 4, 4, 4, 4, 3, 3, 4)
> x4 <- c(4, 4, 4, 4, 4, 3, 4, 4, 3, 3, 3, 3, 3, 3, 2)
> x12 <- c(3, 2, 1, 2, 3, 2, 3, 2, 2, 3, 1, 2, 2, 2, 1)
> y <- c(4, 4, 3, 3, 4, 4, 4, 5, 4, 3, 4, 3, 3, 3, 4)
> rating <- data.frame(x1, x2, x3, x4, …, x12, y)

First, let’s see how we perform a multiple regression analysis. The functions used are the ones we already know:

> rating_model <- lm(y~x1+x2+x3+x4+…+x12, data=rating)
> summary(rating_model)

Call:
lm(formula = y~x1+x2+x3+x4+…+x12, data=rating)

Residuals:
       1        2        3        4        5        6
 0.01552 -0.10636 -0.01592 -0.04003  0.14890 -0.02140
       7        8        9       10       11       12
-0.04565  0.01751 -0.06493  0.05061  0.21131 -0.22315
      13       14       15
 0.02642  0.07319 -0.02603

Coefficients:
            Estimate Std. error t value Pr(>|t|)
(Intercept) -0.40784    0.84199  -0.484    0.676
x1           0.26856    0.19360   1.387    0.300
x2           0.01166    0.31473   0.037    0.974
x3           0.31028    0.21674   1.432    0.289
x4           0.02993    0.43669   0.069    0.952
x5          -0.17622    0.16670  -1.057    0.401
x6           0.20136    0.42008   0.479    0.679
x7           0.05440    0.14016   0.388    0.735
x8           0.09736    0.24867   0.392    0.733
x9           0.17106    0.14630   1.169    0.363
x10          0.27376    0.19890   1.376    0.303
x11          0.10341    0.32860   0.315    0.783
x12          0.00783    0.38118   0.021    0.985

Residual standard error: 0.2705 on 2 degrees of freedom
Multiple R-squared: 0.9726,    Adjusted R-squared: 0.8079
F-statistic: 5.906 on 12 and 2 DF,  p-value: 0.1538

Our model is obtained as follows:

> rating_model

Call:
lm(formula = y~x1+x2+x3+x4+…+x12, data=rating)

Coefficients:
(Intercept)           x1           x2           x3           x4           x5
   -0.40784      0.26856      0.01166      0.31028      0.02993     -0.17622
         x6           x7           x8           x9          x10          x11
    0.20136      0.05440      0.09736      0.17106      0.27376      0.10341
        x12
    0.00783

There are different ways a stepwise regression can be performed in R. Here, we demonstrate a semi-automated procedure that uses the p-value as the selection criterion. Unlike other software, R requires us to select which variable will be included or excluded at each step. First, we create a model that contains only the intercept (called “1” by R) and none of the independent variables:

> rating_none <- lm(y~1, data=rating)

Then, using the add1() or drop1() functions, we can add or remove single terms from the model. This is done as follows:

> add1(rating_none, formula(rating_model), test="F")
Single term additions

Model:
y ~ 1
       Df Sum of Sq    RSS     AIC F value    Pr(>F)
<none>              5.3333 -13.511
x1      1    0.5178 4.8155 -13.043  1.3978  0.258258
x2      1    3.7984 1.5349 -30.194 32.1717 7.643e-05 ***
x3      1    0.9496 4.3837 -14.452  2.8161  0.117186
x4      1    0.1786 5.1548 -12.022  0.4503  0.513918
x5      1    0.2976 5.0357 -12.372  0.7683  0.396645
x6      1    2.7083 2.6250 -22.145 13.4127  0.002869 **
x7      1    0.1190 5.2143 -11.850  0.2968  0.595116
x8      1    2.8161 2.5172 -22.773 14.5434  0.002151 **
x9      1    0.3592 4.9741 -12.557  0.9388  0.350278
x10     1    2.9207 2.4126 -23.410 15.7378  0.001609 **
x11     1    3.9062 1.4271 -31.286 35.5839 4.705e-05 ***
x12     1    0.0160 5.3173 -11.556  0.0392  0.846154
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Next, we select the variable with the smallest p-value (in this case, X11) and add it to the model that so far contains no independent variables:

> rating_best <- lm(y~1+x11, data=rating)
> add1(rating_best, formula(rating_model), test="F")
Single term additions

Model:
y ~ 1 + x11
       Df Sum of Sq     RSS     AIC F value  Pr(>F)
<none>              1.42708 -31.286
x1      1   0.15052 1.27656 -30.958  1.4149 0.25724
x2      1   0.47429 0.95279 -35.346  5.9735 0.03093 *
x3      1   0.22005 1.20703 -31.798  2.1877 0.16488
x4      1   0.10665 1.32043 -30.451  0.9693 0.34430
x5      1   0.00125 1.42584 -29.299  0.0105 0.92013
x6      1   0.02708 1.40000 -29.574  0.2321 0.63861
x7      1   0.11905 1.30804 -30.593  1.0922 0.31659
x8      1   0.68192 0.74517 -39.033 10.9814 0.00618 **
x9      1   0.04419 1.38289 -29.758  0.3835 0.54732
x10     1   0.05887 1.36821 -29.918  0.5164 0.48616
x12     1   0.00453 1.42256 -29.334  0.0382 0.84834
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

We repeat this process until none of the remaining variables is significant:

> rating_best <- lm(y~1+x11+x8, data=rating)
> add1(rating_best, formula(rating_model), test="F")
Single term additions

Model:
y ~ 1 + x11 + x8
       Df Sum of Sq     RSS     AIC F value Pr(>F)
<none>              0.74517 -39.033
x1      1  0.011724 0.73344 -37.271  0.1758 0.6831
x2      1  0.156982 0.58818 -40.581  2.9358 0.1146
x3      1  0.072753 0.67241 -38.574  1.1902 0.2986
x4      1  0.024748 0.72042 -37.540  0.3779 0.5512
x5      1  0.012667 0.73250 -37.290  0.1902 0.6712
x6      1  0.020492 0.72468 -37.451  0.3110 0.5882
x7      1  0.001921 0.74325 -37.072  0.0284 0.8691
x9      1  0.007752 0.73742 -37.190  0.1156 0.7402
x10     1  0.049515 0.69565 -38.064  0.7830 0.3952
x12     1  0.009649 0.73552 -37.228  0.1443 0.7113

Since all the other variables are non-significant, we terminate the selection process and, using X8 and X11, fit our final model:

> rating_final <- lm(y~x8+x11, data=rating)
> rating_final

Call:
lm(formula = y~x8+x11, data=rating)

Coefficients:
(Intercept)           x8          x11
     0.9209       0.3392       0.6011
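With the final model stored in rating_final, predictions at chosen values of the predictors can be obtained with predict(). The values x8 = 4 and x11 = 3 below are arbitrary, used only to illustrate the call:

> predict(rating_final, newdata = data.frame(x8 = 4, x11 = 3))
> # using the rounded coefficients above, roughly 0.9209 + 0.3392(4) + 0.6011(3) ≈ 4.08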

© 2018 Springer International Publishing AG

Berger, P.D., Maurer, R.E., Celli, G.B. (2018). Multiple Linear Regression. In: Experimental Design. Springer, Cham. https://doi.org/10.1007/978-3-319-64583-4_15
