A partial least squares solution to the problem of multicollinearity when predicting the high temperature properties of 1Cr–1Mo–0.25V steel using parametric models

Recently there has been renewed interest in assessing the predictive accuracy of existing parametric models of creep properties, with the recently developed Wilshire methodology being largely responsible for this revival. Without exception, these studies have used multiple linear regression analysis (MLRA) to estimate the unknown parameters of the models, but this technique is not suited to data sets in which the predictor variables are highly correlated (a situation termed multicollinearity). Unfortunately, because all existing long-term creep data sets incorporate accelerated tests, multicollinearity will be an issue: when temperature is set high, stress is always set low, yielding a negative correlation between the two. This article quantifies the severity of this potential problem in terms of its effect on predictive accuracy and suggests a neat solution in the form of partial least squares analysis (PLSA). When applied to 1Cr–1Mo–0.25V steel, it was found that, under MLRA, nearly all the predictor variables in various parametric models appeared to be statistically insignificant, despite these variables accounting for over 90% of the variation in log times to failure. More importantly, the same linear relationship appeared to exist between the first PLS component and the log time to failure at both short and long times to failure, and this enabled more accurate extrapolations of the time to failure than when the models were estimated using MLRA.
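
The multicollinearity described above, and the idea of regressing log time to failure on a first PLS component, can be illustrated with a minimal sketch. It is not the article's procedure or data: the data are synthetic, the choice of 1/T and log stress as predictors is an assumption, and the function names `first_pls_component` and `variance_inflation_factors` are hypothetical. The component is computed with the standard first step of PLS1 (weights proportional to the covariance between each standardized predictor and the response).

```python
import numpy as np

def first_pls_component(X, y):
    """First PLS1 component: unit-norm weights, scores, and slope of y on scores."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize predictors
    yc = y - y.mean()                                  # center response
    w = Xs.T @ yc                                      # covariance-based weights
    w = w / np.linalg.norm(w)
    t = Xs @ w                                         # scores on first component
    slope = (t @ yc) / (t @ t)                         # least squares slope of y on t
    return w, t, slope

def variance_inflation_factors(X):
    """VIFs as the diagonal of the inverse predictor correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic accelerated-test design (assumed values, not measured data):
# high temperature is paired with low stress.
rng = np.random.default_rng(0)
n = 60
temperature = rng.uniform(773.0, 923.0, n)                        # K
log_stress = 9.0 - 0.004 * temperature + rng.normal(0, 0.02, n)   # ln(MPa)

# Hypothetical predictors and response for illustration only.
X = np.column_stack([1.0 / temperature, log_stress])
log_tf = -20.0 + 30000.0 * X[:, 0] - 4.0 * X[:, 1] + rng.normal(0, 0.1, n)

corr_T_stress = np.corrcoef(temperature, log_stress)[0, 1]
vif = variance_inflation_factors(X)

w, t, slope = first_pls_component(X, log_tf)
yc = log_tf - log_tf.mean()
r2 = 1.0 - np.sum((yc - slope * t) ** 2) / np.sum(yc ** 2)

print("corr(temperature, log stress):", round(corr_T_stress, 3))
print("variance inflation factors:", np.round(vif, 1))
print("R^2 of log time to failure on first PLS component:", round(r2, 3))
```

In this toy design the strongly negative temperature-stress correlation produces large variance inflation factors, which is what makes individual MLRA coefficients imprecise, while a single PLS component still captures most of the variation in the response.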


M. Evans

Journal of Materials Science, Vol. 47 (6), 2011, Pages 2712-2724. doi: 10.1007/s10853-011-6097-0