Abstract
Recently there has been renewed interest in assessing the predictive accuracy of existing parametric models of creep properties, with the recently developed Wilshire methodology being largely responsible for this revival. Without exception, these studies have used multiple linear regression analysis (MLRA) to estimate the unknown parameters of the models, but this technique is ill suited to data sets whose predictor variables are all highly correlated (a situation termed multicollinearity). Unfortunately, because all existing long-term creep data sets incorporate accelerated tests, multicollinearity will be an issue: when temperature is held high, stress is always set low, yielding a negative correlation between the two. This article quantifies the severity of this problem in terms of its effect on predictive accuracy and suggests a neat solution in the form of partial least squares analysis (PLSA). When applied to 1Cr–1Mo–0.25V steel, it was found that under MLRA nearly all the predictor variables in various parametric models appeared to be statistically insignificant, despite these variables accounting for over 90% of the variation in log times to failure. More importantly, the same linear relationship appeared to exist between the first PLS component and the log time to failure at both short and long times to failure, and this enabled more accurate extrapolations of the time to failure than when the models were estimated using MLRA.
Appendix
For the sake of generality, suppose there is a sample of size n from which to estimate a linear relationship between Y and the explanatory variables \( X_1, X_2, \ldots, X_m \). In the context of the creep data set used in this article, Y would be the log of the minimum creep rate, the log time to failure or some other transformed creep property, \( X_1 \) would be the stress, \( X_2 \) would be 1/RT and the other X's would be transformations and/or combinations of these two explanatory variables.
For i = 1,…, n, the ith datum in the sample is denoted by \( \{x_1(i), \ldots, x_m(i), y(i)\} \). Also, the vectors of observed values of Y and \( X_j \) are denoted by \( \mathbf{y} \) and \( \mathbf{x}_j \), so \( \mathbf{y} = \{y(1), \ldots, y(n)\}' \) and, for j = 1,…, m, \( \mathbf{x}_j = \{x_j(1), \ldots, x_j(n)\}' \). Denote their sample means by \( \bar{y} = \Sigma_i y(i)/n \) and \( \bar{x}_j = \Sigma_i x_j(i)/n \). To simplify notation, Y and the \( X_j \) are centred to give variables \( U_1 \) and \( V_{1j} \), where \( U_1 = Y - \bar{y} \) and, for j = 1,…, m, \( V_{1j} = X_j - \bar{x}_j \). The sample means of \( U_1 \) and the \( V_{1j} \) are 0, and their data values are denoted by \( \mathbf{u}_1 = \mathbf{y} - \bar{y}\cdot\mathbf{1} \) and \( \mathbf{v}_{1j} = \mathbf{x}_j - \bar{x}_j\cdot\mathbf{1} \), where \( \mathbf{1} \) is the n-dimensional unit vector \( \{1, \ldots, 1\}' \). It is also possible to standardise Y and the \( X_j \) to give variables \( U_1^{*} \) and \( V_{1j}^{*} \), where \( U_1^{*} = U_1/S_Y \) and \( V_{1j}^{*} = V_{1j}/S_{X_j} \), and where \( S_Y \) is the standard deviation of Y and \( S_{X_j} \) the standard deviation of \( X_j \).
The correlation matrix for all the explanatory variables is then given by \( R = \frac{1}{n-1}\mathbf{V}_1^{*\prime}\mathbf{V}_1^{*} \), where \( \mathbf{V}_1^{*} \) is the n × m matrix whose jth column is \( \mathbf{v}_{1j}^{*} \). The principal components are obtained from the spectral decomposition \( R = H\Delta H' \), where \( \Delta = \mathrm{diag}\{\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m\} \) contains the eigenvalues and \( H = (\mathbf{h}_1, \ldots, \mathbf{h}_m) \) the corresponding eigenvectors. Essentially these eigenvectors contain the loadings shown in Eq. 11a of the main text. The m principal components are then given by

\( P_j = h_{1j}V_{11}^{*} + h_{2j}V_{12}^{*} + \cdots + h_{mj}V_{1m}^{*}, \quad j = 1, \ldots, m \)
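The spectral decomposition above can be sketched numerically. This is a minimal illustration, not part of the article; the function name `principal_components` is hypothetical, and the eigendecomposition of R is done with NumPy's `eigh` (valid here because R is symmetric).

```python
import numpy as np

def principal_components(X):
    """Principal components from the correlation matrix R = V*'V*/(n-1).

    X is an (n, m) array of explanatory variables. Columns are centred
    and scaled to unit variance (the V_1j* of the text) before R is
    formed. Returns eigenvalues (descending), loadings H and scores P.
    """
    n = X.shape[0]
    V = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardised V_1j*
    R = V.T @ V / (n - 1)                              # correlation matrix
    eigvals, H = np.linalg.eigh(R)                     # spectral decomposition R = H Δ H'
    order = np.argsort(eigvals)[::-1]                  # sort eigenvalues descending
    return eigvals[order], H[:, order], V @ H[:, order]
```

By construction the score columns are mutually uncorrelated, with sample variances equal to the eigenvalues, and the eigenvalues sum to m (the trace of R).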
The partial least squares components can be derived in a similar way using singular value decompositions of \( \mathbf{V}_1 \) (within this approach, PLS stands for Projections to Latent Structures). Alternatively, the components can be determined sequentially, and it is this approach that is best summarised by the name partial least squares. The first component, \( T_1 \), is intended to be useful for predicting \( U_1 \) and is constructed as a linear combination of the \( V_{1j} \)'s. During its construction, sample correlations between the \( V_{1j} \)'s are ignored. To obtain \( T_1 \), \( U_1 \) is first regressed against \( V_{11} \), then against \( V_{12} \), and so on for each \( V_{1j} \) in turn. Sample means are 0, so for j = 1,…, m the resulting least squares regression equations are

\( U_1 = b_{1j}V_{1j} + \eta_1, \quad b_{1j} = \mathbf{v}_{1j}'\mathbf{u}_1/(\mathbf{v}_{1j}'\mathbf{v}_{1j}) \)     (16)

where \( \eta_1 \) is a random error term. Given values of the \( V_{1j} \) for a further item, each of the m equations in Eq. 16 provides an estimate of \( U_1 \). To reconcile these estimates, whilst ignoring interrelationships between the \( V_{1j} \), a simple average, \( \Sigma_j b_{1j}V_{1j}/m \), or, more generally, a weighted average can be used

\( T_1 = \Sigma_j w_{1j}b_{1j}V_{1j} \)     (17)
In the true spirit of PLS these weights will be inversely proportional to the variances of the \( b_{1j} \)'s, namely \( w_{1j} = (n-1)\mathrm{var}(V_{1j}) \), where \( \mathrm{var}(V_{1j}) \) stands for the variance of \( V_{1j} \). An obvious alternative weighting policy is to set each \( w_{1j} \) equal to 1/m, so that each predictor of \( U_1 \) is given equal weight. This seems a natural choice and is also in the spirit of PLS, which aims to spread the load amongst the X variables in making predictions. The latter weighting scheme is used in this research article.
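The construction of the first component under the equal-weight scheme can be sketched as follows. This is an illustrative implementation, not the article's own code; the function name `first_pls_component` is hypothetical.

```python
import numpy as np

def first_pls_component(X, y):
    """First PLS component T_1 with equal weights w_1j = 1/m.

    y is regressed separately on each centred column V_1j (Eq. 16),
    and the m univariate predictions b_1j * V_1j are averaged (Eq. 17
    with w_1j = 1/m). Returns the component scores t_1 and the b_1j.
    """
    u1 = y - y.mean()                     # centred response U_1
    V1 = X - X.mean(axis=0)               # centred predictors V_1j
    m = V1.shape[1]
    b1 = V1.T @ u1 / (V1 ** 2).sum(axis=0)  # b_1j = v_1j'u_1 / (v_1j'v_1j)
    t1 = V1 @ b1 / m                         # T_1 = Σ_j b_1j V_1j / m
    return t1, b1
```

With a single predictor (m = 1) this reduces to the ordinary least squares fit of the centred response, which is a useful sanity check.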
The procedure extends iteratively in a natural way to give components \( T_2, \ldots, T_p \), where each component is determined from the residuals of regressions on the preceding component, with residual variability in Y being related to residual information in the X's. Specifically, suppose that \( T_i \) (i ≥ 1) has just been constructed from variables \( U_i \) and \( V_{ij} \) (j = 1,…, m), and let \( T_i \), \( U_i \) and the \( V_{ij} \) have sample values \( \mathbf{t}_i \), \( \mathbf{u}_i \) and \( \mathbf{v}_{ij} \). From their construction, it will easily be seen that their sample means are all 0. To obtain \( T_{i+1} \), first the \( V_{(i+1)j} \)'s and \( U_{i+1} \) are determined. For j = 1,…, m, \( V_{ij} \) is regressed against \( T_i \), giving \( \mathbf{t}_i'\mathbf{v}_{ij}/(\mathbf{t}_i'\mathbf{t}_i) \) as the regression coefficient, and \( V_{(i+1)j} \) is defined by

\( V_{(i+1)j} = V_{ij} - \{\mathbf{t}_i'\mathbf{v}_{ij}/(\mathbf{t}_i'\mathbf{t}_i)\}T_i \)
Its sample values, \( \mathbf{v}_{(i+1)j} \), are the residuals from the regression. Similarly, \( U_{i+1} \) is defined by \( U_{i+1} = U_i - \{\mathbf{t}_i'\mathbf{u}_i/(\mathbf{t}_i'\mathbf{t}_i)\}T_i \), and its sample values, \( \mathbf{u}_{i+1} \), are the residuals from the regression of \( U_i \) on \( T_i \).
The "residual variability" in Y is \( U_{i+1} \) and the "residual information" in \( X_j \) is \( V_{(i+1)j} \), so the next stage is to regress \( U_{i+1} \) against each \( V_{(i+1)j} \) in turn. The jth regression yields \( b_{(i+1)j}V_{(i+1)j} \) as a predictor of \( U_{i+1} \), where

\( b_{(i+1)j} = \mathbf{v}_{(i+1)j}'\mathbf{u}_{i+1}/(\mathbf{v}_{(i+1)j}'\mathbf{v}_{(i+1)j}) \)
Forming a linear combination of these predictors, as in Eq. 17, gives the next component

\( T_{i+1} = \Sigma_j w_{(i+1)j}b_{(i+1)j}V_{(i+1)j} \)
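Putting the deflation steps together, the full sequential construction might be sketched as below (an illustrative version with equal weights 1/m, not the article's code; the function name `pls_components` is hypothetical).

```python
import numpy as np

def pls_components(X, y, p):
    """Sequential PLS components T_1, ..., T_p with equal weights 1/m.

    After each component is formed, both U and every V_j are deflated
    by regressing out T_i, so the next component is built from the
    residual information only. Returns the (n, p) score matrix.
    """
    u = y - y.mean()
    V = X - X.mean(axis=0)
    n, m = V.shape
    T = np.empty((n, p))
    for i in range(p):
        b = V.T @ u / (V ** 2).sum(axis=0)      # b_ij = v_ij'u_i / (v_ij'v_ij)
        t = V @ b / m                           # T_i = Σ_j b_ij V_ij / m
        T[:, i] = t
        tt = t @ t
        V = V - np.outer(t, (t @ V) / tt)       # V_(i+1)j: residuals of V_ij on T_i
        u = u - t * (t @ u) / tt                # U_(i+1): residuals of U_i on T_i
    return T
```

Because each deflated variable is orthogonal to the component just removed, the resulting components have zero sample correlation with one another, as the text states for the PLS regression equation.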
The PLS regression equation can then take various forms. It may be a simple linear regression of the form

\( Y = a_0 + a_1T_1 + a_2T_2 + \cdots + a_pT_p + \varepsilon \)

where each component \( T_k \) is a linear combination of the \( X_j \), the sample correlation for any pair of components is 0, and p < m. Alternatively, if scatter plots of Y against the \( T_i \) trace out curves rather than lines, non-linear regressions could be used, for example low-order polynomials in the \( T_i \).
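The final regression stage can be sketched as a least-squares fit of Y on the components; the polynomial option shown via the `degree` argument is one illustrative non-linear choice (the article leaves the exact functional form open, and the function name `fit_on_components` is hypothetical).

```python
import numpy as np

def fit_on_components(T, y, degree=1):
    """Least-squares regression of y on PLS components T.

    degree=1 gives the linear form Y = a_0 + Σ_k a_k T_k; degree=2
    additionally includes squared-component columns as a simple
    non-linear alternative. Returns coefficients and fitted values.
    """
    # Design matrix: intercept column plus T, T^2, ... up to `degree`
    Z = np.hstack([np.ones((len(y), 1))] + [T ** d for d in range(1, degree + 1)])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef, Z @ coef
```

When the components are mutually uncorrelated, as PLS guarantees, the coefficient estimates in the linear form do not suffer from the variance inflation that multicollinearity causes under MLRA.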
Finally, if the functional relationship is unclear, multi-layer perceptron neural networks can be used, where the \( T_i \) components form the inputs to the network and Y is the output.
Cite this article
Evans, M. A partial least squares solution to the problem of multicollinearity when predicting the high temperature properties of 1Cr–1Mo–0.25V steel using parametric models. J Mater Sci 47, 2712–2724 (2012). https://doi.org/10.1007/s10853-011-6097-0