Multicollinearity and Multi-regression Analysis for Main Drivers of Cyanobacterial Harmful Algal Bloom (CHAB) in the Lake Torment, Nova Scotia, Canada

Environmental Modeling & Assessment

Abstract

Many parameters are involved in the bloom patterns of cyanobacteria in Lake Torment, including chemical components such as TP, PO4-P, NO3-N, NH4-N, and iron. However, if the parameters involved in the CHAB patterns are multicollinear, that is, if two or more predictor variables are highly correlated with one another through a quasi-linear dependence, we often cannot isolate or control the individual predictors (drivers) affecting CHAB appearance. We therefore evaluate the multicollinearity of the key parameters that contribute significantly to our mathematical framework for CHAB prediction and, on that basis, propose multi-regression models to predict the CHAB patterns.
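A minimal Python sketch of why multicollinearity obscures individual drivers is given here. It fits the same ordinary least-squares model to two synthetic data sets, one with independent predictors and one where the second predictor is almost a copy of the first, and compares the standard errors of the fitted coefficients. The data, the coefficients, and the helper name `ols_with_se` are illustrative assumptions and are not taken from the Lake Torment measurements or from the authors' models.

```python
import numpy as np

def ols_with_se(X, y):
    """Fit y = b0 + X @ b by least squares; return coefficients and their standard errors."""
    A = np.column_stack([np.ones(len(y)), X])        # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = A.shape[0] - A.shape[1]                     # residual degrees of freedom
    sigma2 = resid @ resid / dof                      # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)             # coefficient covariance matrix
    return coef, np.sqrt(np.diag(cov))

# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
noise = rng.normal(size=n)
x2_indep = rng.normal(size=n)                 # unrelated to x1
x2_coll = x1 + 0.05 * rng.normal(size=n)      # nearly collinear with x1

_, se_indep = ols_with_se(np.column_stack([x1, x2_indep]), 1 + 2 * x1 + 3 * x2_indep + noise)
_, se_coll = ols_with_se(np.column_stack([x1, x2_coll]), 1 + 2 * x1 + 3 * x2_coll + noise)

print("std. error of the x1 coefficient, independent predictors:", se_indep[1])
print("std. error of the x1 coefficient, collinear predictors:  ", se_coll[1])
```

In the collinear case the standard error of the `x1` coefficient is much larger, so the separate effect of each predictor is poorly determined even though the model as a whole may still predict well.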

Availability of Data and Materials

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Acknowledgements

TNQ is thankful for the NSERC Discovery Grant RGPIN 03906. The authors would like to thank Lake Torment residents, especially Alexander’s family, for their support.

Funding

NSERC Discovery Grant RGPIN 03906.

Author information

Contributions

Conceptualization: Tri Nguyen-Quang and Kateryna Hushchyna; methodology: all; software: Tri Nguyen-Quang, Qurat Ul An Sabir, and Kayla Mclellan; validation: Qurat Ul An Sabir and Kateryna Hushchyna; formal analysis: Tri Nguyen-Quang, Qurat Ul An Sabir, and Kateryna Hushchyna; investigation: Kayla Mclellan and Kateryna Hushchyna; resources: Tri Nguyen-Quang, Qurat Ul An Sabir, and Kateryna Hushchyna; data curation: Kateryna Hushchyna and Kayla Mclellan; writing, original draft preparation: Kateryna Hushchyna and Tri Nguyen-Quang; writing, review, and editing: Tri Nguyen-Quang and Qurat Ul An Sabir; visualization: Kateryna Hushchyna and Qurat Ul An Sabir; supervision: Tri Nguyen-Quang; project administration: Tri Nguyen-Quang and Qurat Ul An Sabir.

Corresponding author

Correspondence to Tri Nguyen-Quang.

Ethics declarations

Ethics Approval

Not applicable.

Consent to Participate

All authors gave consent to participate in the study and to have their data published in this article.

Consent to Publication

All authors have given consent to publication.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Annex 1: Validation of the Regression Models

Real data compared with validation data.

| Date (location) | Real PC | Real Chl-a | Model 1 (PC) | Model 2 (Chl-a) | Relative error for PC (%) | Relative error for Chl-a (%) |
|---|---|---|---|---|---|---|
| July 19, 2020 (T8) | 1 | 0.644 | 0.985 | 0.606 | 1.475 | 6.271 |
| July 21, 2020 (T8) | 1 | 0.655 | 1.003 | 0.601 | 0.264 | 8.943 |
| July 22, 2020 (T8) | 1 | 0.647 | 0.995 | 0.595 | 0.494 | 8.721 |
| July 25, 2020 (Toutlet) | 1 | 0.656 | 1.008 | 0.593 | 0.808 | 10.645 |
| Oct 1, 2020 (T8) | 1 | 0.656 | 1.008 | 0.593 | 0.808 | 10.645 |
| Oct 2, 2020 (T8) | 0.016 | 0.003 | 0.017 | 0.019 | 7.719 | 5.111 |
| Nov 8, 2020 (T8) | 0.99 | 0.675 | 1.027 | 0.6 | 3.625 | 12.588 |
| Nov 8, 2020 (T4) | 1 | 0.647 | 0.995 | 0.595 | 0.494 | 8.721 |
| Nov 8, 2020 (T8) | 1 | 0.647 | 0.995 | 0.595 | 0.494 | 8.721 |
| Nov 8, 2020 (T4) | 1 | 0.647 | 0.996 | 0.593 | 0.374 | 9.158 |
| Nov 8, 2020 (Toutlet) | 1 | 0.647 | 0.995 | 0.596 | 0.519 | 8.63 |
| Nov 8, 2020 (T9) | 1 | 0.659 | 1.011 | 0.594 | 1.121 | 10.867 |
| Nov 8, 2020 (T8) | 1 | 0.648 | 0.991 | 0.605 | 0.89 | 7.108 |
| Nov 8, 2020 (T9) | 1 | 0.648 | 0.997 | 0.593 | 0.264 | 9.296 |
| Aug 2, 2021 (T8) | 1 | 0.648 | 0.997 | 0.594 | 0.335 | 9.033 |
| Aug 6, 2021 (T8) | 0.033 | 0.028 | 0.308 | 0.025 | 8.178 | 9.919 |
| Aug 7, 2021 (T8) | 1 | 0.648 | 0.991 | 0.605 | 0.863 | 7.141 |
| Aug 10, 2021 (T8) | 0.033 | 0.659 | 0.033 | 0.594 | 1.797 | 10.86 |
| Aug 13, 2021 (T8) | 0.248 | 0.174 | 0.246 | 0.161 | 0.519 | 7.9 |
| Aug 17, 2021 (T8) | 0.248 | 0.174 | 0.246 | 0.161 | 0.519 | 7.9 |
| Sept 12, 2021 (T8) | 0.248 | 0.174 | 0.246 | 0.161 | 0.519 | 7.9 |
| Sept 15, 2021 (T8) | 0.248 | 0.174 | 0.246 | 0.161 | 0.519 | 7.9 |
| Sept 15, 2021 (Toutlet) | 0.248 | 0.174 | 0.247 | 0.161 | 0.202 | 8.118 |
| Oct 14, 2021 (T8) | 0.248 | 0.174 | 0.246 | 0.161 | 0.519 | 7.9 |
| Oct 14, 2021 (T9) | 0.253 | 0.499 | 0.248 | 0.501 | 2.173 | 0.446 |
| Oct 14, 2021 (Toutlet) | 1 | 0.605 | 0.934 | 0.605 | 7.106 | 0.047 |
| Oct 14, 2021 (T9) | 0.245 | 0.103 | 0.257 | 0.117 | 4.57 | 12.362 |
| Oct 14, 2021 (T9) | 0.833 | 0.499 | 0.795 | 0.488 | 4.783 | 2.308 |
| June 3, 2022 (T9) | 0.833 | 0.499 | 0.79 | 0.498 | 5.489 | 0.106 |
| June 9, 2022 (T9) 9am | 0.833 | 0.499 | 0.79 | 0.498 | 5.489 | 0.106 |
| June 15, 2022 (T5) | 1 | 0.605 | 0.939 | 0.594 | 6.516 | 1.829 |
| June 15, 2022 (T8) | 0.722 | 0.499 | 0.685 | 0.466 | 5.387 | 6.975 |
| June 15, 2022 (T4) | 0.627 | 0.499 | 0.687 | 0.543 | 8.772 | 8.13 |
| June 15, 2022 (T9) | 0.712 | 0.499 | 0.683 | 0.464 | 4.233 | 7.421 |
| June 15, 2022 (T9) | 0.627 | 0.499 | 0.685 | 0.543 | 8.536 | 8.13 |
| June 15, 2022 (T9) | 0.77 | 0.499 | 0.683 | 0.501 | 12.726 | 0.446 |
| June 15, 2022 (Tout) | 0.833 | 0.499 | 0.79 | 0.498 | 5.489 | 0.106 |
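The relative-error columns of Annex 1 can be reproduced approximately with a short calculation. The sketch below is an assumption about how those columns were obtained, since the formula is not stated in this excerpt: several rows (for example the Chl-a entry for July 19, 2020) are matched more closely when the absolute difference is divided by the model value than by the measured value, and the remaining discrepancies are of the size expected from three-decimal rounding. The function name `relative_error_pct` and its `relative_to` switch are hypothetical.

```python
def relative_error_pct(real, model, relative_to="model"):
    """Absolute difference between real and model values, as a percentage.

    relative_to="model" divides by the model value (this appears to match
    the tabulated figures); relative_to="real" divides by the measured value.
    """
    if relative_to not in ("model", "real"):
        raise ValueError("relative_to must be 'model' or 'real'")
    denom = model if relative_to == "model" else real
    return abs(real - model) / denom * 100.0

# Chl-a entry for July 19, 2020 (T8): real 0.644, Model 2 gives 0.606.
err_vs_model = relative_error_pct(0.644, 0.606, "model")
err_vs_real = relative_error_pct(0.644, 0.606, "real")

print(f"relative to model value: {err_vs_model:.3f} %")   # tabulated value is 6.271
print(f"relative to real value:  {err_vs_real:.3f} %")
```

Because the table entries are rounded, this check cannot confirm the convention with certainty; it only indicates which denominator is the more plausible one.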

Annex 2: VIF Calculation for Each Predictor Versus Equivalent Target

| Factor | VIF value for Chl-a |
|---|---|
| Color | 33.275 |
| TP | 30.391 |
| Cond | 5.506 |
| NH4_N | 3.890 |
| T | 3.849 |
| SiO2 | 3.081 |
| PC | 2.811 |
| NO3_N | 2.761 |
| Iron | 2.620 |
| PO4_P | 2.572 |
| DO | 2.565 |
| pH | 1.575 |

| Factor | VIF value for PC |
|---|---|
| TP | 40.531 |
| Color | 34.435 |
| Chl-a | 20.844 |
| Cond | 5.525 |
| NH4_N | 3.380 |
| T | 3.482 |
| SiO2 | 2.947 |
| NO3_N | 2.691 |
| Iron | 2.462 |
| PO4_P | 2.454 |
| DO | 2.359 |
| pH | 1.684 |

| Factor | VIF value for MCLR |
|---|---|
| Chl-a | 335.400 |
| TP | 285.691 |
| Color | 81.137 |
| PC | 52.52 |
| Cond | 30.228 |
| T | 16.279 |
| Iron | 13.575 |
| SiO2 | 9.534 |
| NH4_N | 9.189 |
| PO4_P | 7.925 |
| pH | 3.452 |
| NO3_N | 3.174 |
| DO | 2.848 |
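For readers who want to reproduce this kind of table, the variance inflation factor of predictor j is VIF_j = 1 / (1 − R_j²), where R_j² is the coefficient of determination obtained by regressing predictor j on all the other predictors. An equivalent shortcut is that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. The following minimal sketch uses that shortcut on synthetic data; the data and the function name `vif` are assumptions for illustration and do not reproduce the authors' computation or the Lake Torment values.

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of X (n_samples x n_predictors).

    Uses the identity VIF_j = [R^{-1}]_jj, with R the correlation matrix
    of the predictors.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic predictors (hypothetical): c is almost a copy of a, b is unrelated.
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
b = rng.normal(size=n)
c = a + 0.1 * rng.normal(size=n)

vif_values = vif(np.column_stack([a, b, c]))
for name, value in zip(["a", "b", "c"], vif_values):
    print(f"{name}: VIF = {value:.2f}")
```

A commonly quoted rule of thumb treats VIF values above about 10 as a sign of problematic collinearity; the threshold actually applied by the authors is not shown in this excerpt.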

Annex 3: LMG index (index for the variance importance)

A3.1. Relative importance metrics LMG for Chl-a

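| Predictor | LMG | Share (%), where given |
|---|---|---|
| PC | 0.156511076 | 15.6 |
| DO | 0.013332912 | |
| T | 0.013776778 | |
| pH | 0.004292141 | |
| PO4_P | 0.020448693 | |
| NH4_N | 0.011028286 | |
| NO3_N | 0.006099871 | |
| TP | 0.318641102 | 31.9 |
| Iron | 0.077737935 | 7.77 |
| SiO2 | 0.043251999 | |
| Cond | 0.027754583 | |
| Color | 0.299859759 | 29.99 |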

A3.2. Relative importance metrics LMG for PC

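| Predictor | LMG | Share (%), where given |
|---|---|---|
| Chl-a | 0.38744707 | 38.7 |
| DO | 0.03644179 | |
| T | 0.02508286 | |
| pH | 0.01117246 | |
| PO4_P | 0.04421082 | |
| NH4_N | 0.08106255 | 8.11 |
| NO3_N | 0.01160576 | |
| TP | 0.12809728 | 12.8 |
| Iron | 0.04916675 | |
| SiO2 | 0.03057672 | |
| Cond | 0.04421614 | |
| Color | 0.09706639 | 9.71 |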

A3.3. Relative importance metrics LMG for MCLR

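| Predictor | LMG | Share (%), where given |
|---|---|---|
| Chl-a | 0.169253034 | 16.9 |
| DO | 0.085182256 | 8.52 |
| T | 0.047486642 | |
| pH | 0.006301402 | |
| PO4_P | 0.049363006 | |
| NH4_N | 0.027271629 | |
| NO3_N | 0.032489614 | |
| TP | 0.192465775 | 19.2 |
| Iron | 0.061680842 | 6.17 |
| SiO2 | 0.047992685 | |
| Cond | 0.025199578 | |
| Color | 0.164947148 | 16.5 |
| PC | 0.031347563 | |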
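The LMG metric (named after Lindeman, Merenda, and Gold) assigns each predictor the increase in R² it produces when added to the model, averaged over every possible order in which the predictors can be entered. The brute-force sketch below illustrates the definition on a small synthetic example. The data and the function names `r_squared` and `lmg` are assumptions for illustration; this is not the software used by the authors, and enumerating all orderings is only practical for a handful of predictors because their number grows factorially.

```python
import itertools
import numpy as np

def r_squared(X, y, cols):
    """R^2 of the least-squares fit of y on the selected columns of X (with intercept)."""
    if not cols:
        return 0.0
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

def lmg(X, y):
    """Average sequential R^2 contribution of each predictor over all entry orders."""
    p = X.shape[1]
    shares = np.zeros(p)
    orders = list(itertools.permutations(range(p)))
    for order in orders:
        entered, previous = [], 0.0
        for k in order:
            entered.append(k)
            current = r_squared(X, y, entered)
            shares[k] += current - previous    # gain in R^2 from adding predictor k
            previous = current
    return shares / len(orders)

# Synthetic example (hypothetical): x3 is correlated with x1.
rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 0.8 * x1 + 0.6 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + 1.0 * x2 + 0.5 * x3 + rng.normal(size=n)

lmg_shares = lmg(X, y)
full_r2 = r_squared(X, y, [0, 1, 2])
print("LMG shares:", lmg_shares)
print("sum of shares:", lmg_shares.sum(), "full-model R^2:", full_r2)
```

By construction the shares are non-negative and add up to the R² of the full model, which is what makes the decomposition usable even when the predictors are correlated.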

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hushchyna, K., Sabir, Q.U.A., Mclellan, K. et al. Multicollinearity and Multi-regression Analysis for Main Drivers of Cyanobacterial Harmful Algal Bloom (CHAB) in the Lake Torment, Nova Scotia, Canada. Environ Model Assess 28, 1011–1022 (2023). https://doi.org/10.1007/s10666-023-09907-z

  • DOI: https://doi.org/10.1007/s10666-023-09907-z