
An Improved Correction for Range Restricted Correlations Under Extreme, Monotonic Quadratic Nonlinearity and Heteroscedasticity

Published in Psychometrika.

Abstract

Standardized tests are frequently used for selection decisions, and the validation of test scores remains an important area of research. This paper builds upon prior literature about the effect of nonlinearity and heteroscedasticity on the accuracy of standard formulas for correcting correlations in restricted samples. Existing formulas for direct range restriction require three assumptions: (1) the criterion variable is missing at random; (2) a linear relationship between independent and dependent variables; and (3) constant error variance or homoscedasticity. The results in this paper demonstrate that the standard approach for correcting restricted correlations is severely biased in cases of extreme monotone quadratic nonlinearity and heteroscedasticity. This paper offers at least three significant contributions to the existing literature. First, a method from the econometrics literature is adapted to provide more accurate estimates of unrestricted correlations. Second, derivations establish bounds on the degree of bias attributed to quadratic functions under the assumption of a monotonic relationship between test scores and criterion measurements. New results are presented on the bias associated with using the standard range restriction correction formula, and the results show that the standard correction formula yields estimates of unrestricted correlations that deviate by as much as 0.2 for high to moderate selectivity. Third, Monte Carlo simulation results demonstrate that the new procedure for correcting restricted correlations provides more accurate estimates in the presence of quadratic and heteroscedastic test score and criterion relationships.


Fig. 1, Fig. 2, Fig. 3

References

  • Aguinis, H., & Whitehead, R. (1997). Sampling variance in the correlation coefficient under indirect range restriction: Implications for validity generalization. Journal of Applied Psychology, 82(4), 528–538.

  • Alexander, R. A., Alliger, G. M., & Hanges, P. J. (1984). Correcting for range restriction when the population variance is unknown. Applied Psychological Measurement, 8(4), 431–437.

  • Arneson, J. J., Sackett, P. R., & Beatty, A. S. (2011). Ability-performance relationships in education and employment settings: Critical tests of the more-is-better and the good-enough hypotheses. Psychological Science, 22(10), 1336–1342.

  • Berry, C. M., Sackett, P. R., & Sund, A. (2013). The role of range restriction and criterion contamination in assessing differential validity by race/ethnicity. Journal of Business and Psychology, 28, 345–359.

  • Brillinger, D. (1969). The calculation of cumulants via conditioning. Annals of the Institute of Statistical Mathematics, 21, 215–218.

  • Chan, W., & Chan, D. (2004). Bootstrap standard error and confidence intervals for the correlation corrected for range restriction: A simulation study. Psychological Methods, 9, 369–385.

  • Chernyshenko, O. S., & Ones, D. S. (1999). How selective are psychology graduate programs? The effect of the selection ratio on GRE score validity. Educational and Psychological Measurement, 59(6), 951–961.

  • Coward, W. M., & Sackett, P. R. (1990). Linearity of ability-performance relationships: A reconfirmation. Journal of Applied Psychology, 75(3), 297–300.

  • Cullen, M. J., Hardison, C. M., & Sackett, P. R. (2004). Using SAT-grade and ability-job performance relationships to test predictions derived from stereotype threat theory. Journal of Applied Psychology, 89(2), 220–230.

  • Culpepper, S. A. (2010). Studying individual differences in predictability with gamma regression and nonlinear multilevel models. Multivariate Behavioral Research, 45(1), 153–185.

  • Dobson, P. (1988). The correction of correlation coefficients for restriction of range when restriction results from the truncation of a normally distributed variable. British Journal of Mathematical and Statistical Psychology, 41(2), 227–234.

  • Fife, D. A., Mendoza, J. L., & Terry, R. (2013). Revisiting case IV: A reassessment of bias and standard errors of case IV under range restriction. British Journal of Mathematical and Statistical Psychology, 66(3), 521–542.

  • Greener, J., & Osburn, H. (1979). An empirical study of the accuracy of corrections for restriction in range due to explicit selection. Applied Psychological Measurement, 3, 31–41.

  • Greener, J., & Osburn, H. (1980). Accuracy of corrections for restriction in range due to explicit selection in heteroscedastic and nonlinear distributions. Educational and Psychological Measurement, 40, 337–346.

  • Gross, A. (1982). Relaxing the assumptions underlying corrections for restriction of range. Educational and Psychological Measurement, 42(3), 795–801.

  • Gross, A. (1990). A maximum likelihood approach to test validation with missing and censored dependent variables. Psychometrika, 55(3), 533–549.

  • Gross, A. (1997). Interval estimation of bivariate correlations with missing data on both variables: A Bayesian approach. Journal of Educational and Behavioral Statistics, 22(4), 407–424.

  • Gross, A. (2000). Bayesian interval estimation of multiple correlations with missing data: A Gibbs sampling approach. Multivariate Behavioral Research, 35(2), 201–227.

  • Gross, A., & Fleischman, L. (1983). Restriction of range corrections when both distribution and selection assumptions are violated. Applied Psychological Measurement, 7, 227–237.

  • Gross, A., & Fleischman, L. (1987). The correction for restriction of range and nonlinear regressions: An analytic study. Applied Psychological Measurement, 11, 211–217.

  • Gross, A., & Perry, P. (1983). Validating a selection test: A predictive probability approach. Psychometrika, 48(1), 113–127.

  • Gross, A., & Torres-Quevedo, R. (1995). Estimating correlations with missing data: A Bayesian approach. Psychometrika, 60(3), 341–354.

  • Hagglund, G., & Larsson, R. (2006). Estimation of the correlation coefficient based on selected data. Journal of Educational and Behavioral Statistics, 31, 377–411.

  • Harvey, A. (1976). Estimating regression models with multiplicative heteroscedasticity. Econometrica, 44, 461–465.

  • Held, J., & Foley, P. (1994). Explanations for accuracy of the general multivariate formulas in correcting for range restriction. Applied Psychological Measurement, 18(4), 355–367.

  • Holmes, D. (1990). The robustness of the usual correction for restriction in range due to explicit selection. Psychometrika, 55, 19–32.

  • Hsu, J. (1995). Sampling behaviour in estimating predictive validity in the context of selection and latent variable modelling: A Monte Carlo study. British Journal of Mathematical and Statistical Psychology, 48, 75–97.

  • Hunter, J., Schmidt, F., & Le, H. (2006). Implications of direct and indirect range restriction for meta-analysis methods and findings. Journal of Applied Psychology, 91, 594–612.

  • Kobrin, J., Sinharay, S., Haberman, S., & Chajewski, M. (2011). An investigation of the fit of linear regression models to data from an SAT validity study. Tech. Rep. 2011–3, College Board Research Report.

  • Lawley, D. (1944). A note on Karl Pearson’s selection formulæ. Proceedings of the Royal Society of Edinburgh Section A Mathematical and Physical Sciences, 62(1), 28–30.

  • Le, H., Oh, I., Robbins, S., Ilies, R., Holland, E., & Westrick, P. (2011). Too much of a good thing: Curvilinear relationships between personality traits and job performance. Journal of Applied Psychology, 96, 113–133.

  • Lee, R., & Foley, P. (1986). Is the validity of a test constant throughout the test score range? Journal of Applied Psychology, 71, 641–644.

  • Li, J., Chan, W., & Cui, Y. (2011). Bootstrap standard error and confidence intervals for the correlations corrected for indirect range restriction. British Journal of Mathematical and Statistical Psychology, 64(3), 367–387.

  • Li, J., Cui, Y., & Chan, W. (2013). Bootstrap confidence intervals for the mean correlation corrected for case IV range restriction: A more adequate procedure for meta-analysis. Journal of Applied Psychology, 98(1), 183–193.

  • Linn, R. (1983). Pearson selection formulas: Implications for studies of predictive bias and estimates of educational effects in selected samples. Journal of Educational Measurement, 20, 1–15.

  • Linn, R. (1984). Selection bias: Multiple meanings. Journal of Educational Measurement, 21, 33–47.

  • Little, R. J., & Rubin, D. B. (2014). Statistical analysis with missing data. Hoboken, NJ: John Wiley & Sons Inc.

  • Mendoza, J. (1993). Fisher transformations for correlations corrected for selection and missing data. Psychometrika, 58(4), 601–615.

  • Mendoza, J., Bard, D. E., Mumford, M., & Ang, S. C. (2004). Criterion-related validity in multiple-hurdle designs: Estimation and bias. Organizational Research Methods, 7(4), 418–441.

  • Mendoza, J., & Mumford, M. (1987). Corrections for attenuation and range restriction on the predictor. Journal of Educational Statistics, 12, 282–293.

  • Mendoza, J., Stafford, K. L., & Stauffer, J. M. (2000). Large-sample confidence intervals for validity and reliability coefficients. Psychological Methods, 5(3), 356–369.

  • Muthen, B., & Hsu, J. (1993). Selection and predictive validity with latent variable structures. British Journal of Mathematical and Statistical Psychology, 46, 255–271.

  • Muthen, B., Kaplan, D., & Hollis, M. (1987). On structural equation modeling with data that are not missing completely at random. Psychometrika, 52, 431–462.

  • Raju, N. S., & Brand, P. A. (2003). Determining the significance of correlations corrected for unreliability and range restriction. Applied Psychological Measurement, 27(1), 52–71.

  • Raju, N. S., Lezotte, D. V., Fearing, B. K., & Oshima, T. (2006). A note on correlations corrected for unreliability and range restriction. Applied Psychological Measurement, 30(2), 145–149.

  • Robie, C., & Ryan, A. (1999). Effect of nonlinearity and heteroscedasticity on the validity of conscientiousness in predicting overall job performance. International Journal of Selection and Assessment, 7, 157–169.

  • Sackett, P., & Yang, H. (2000). Correction for range restriction: An expanded typology. Journal of Applied Psychology, 85, 112–118.

  • Stauffer, J., & Mendoza, J. (2001). The proper sequence for correcting correlation coefficients for range restriction and unreliability. Psychometrika, 66, 63–68.

  • Yang, H., Sackett, P., & Nho, Y. (2004). Developing a procedure to correct for range restriction that involves both institutional selection and applicants’ rejection of job offers. Organizational Research Methods, 7, 442–455.


Author information


Correspondence to Steven Andrew Culpepper.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (txt 3 KB)

Appendix: Upper-Bound for \(\left| \rho -\rho _\mathrm{{c}}\right| \) in Educational and Employment Selection


This section derives an expression for the most extreme quadratic polynomial for selection decisions, which is used to establish an upper-bound for \(\left| \rho -\rho _\mathrm{{c}}\right| \) attributed to a quadratic relationship. This Appendix addresses three issues. The first subsection identifies upper-bound equations for quadratic polynomials that satisfy a positive monotonicity constraint on a fixed interval. The second subsection presents equations for deriving \(\rho _\mathrm{{c}}\) based upon information about the relationship between \(X\) and \(Y\) in \(\pi \), the moments of \(X\) in \(\pi \) and \(\pi _s\), and the nature of selection on \(X\). The third subsection complements the simulation study by examining the upper-bounds for \(\left| \rho -\rho _\mathrm{{c}}\right| \) in cases of extreme quadratic relationships when \(X\sim N\left( 0,1\right) \) and the distribution of \(Y|X\) is modeled as normal, but heteroscedastic.

1.1 Upper-Bound for Monotone Quadratic Functions

A foundational assumption employed in educational and psychological selection is that the relationship between \(X\) and \(Y\) should be monotonically increasing on a subinterval \(\Omega _\mathrm{{c}} = \left\{ X:X \in \left( -c,c\right) \right\} \) of the support for \(X\) (Arneson et al., 2011). In accordance with the available empirical research, let \(g\left( X\right) \) denote a monotonically increasing quadratic polynomial defined on \(\Omega _\mathrm{{c}}\) such that \(g\left( X\right) = \sum _{j=0}^2 \beta _j X^j\) and \(g^\prime \left( X\right) > 0\) for all \(X \in \Omega _\mathrm{{c}}\).

The variance of slopes for a linear function across the support of \(X\) is zero, so one measure for the degree of deviation from linearity is the variance in the slope of \(g\left( X\right) \) on the support of \(X\). Let the moments of the distribution for \(X\) be denoted as \(\mu _j = E\left( X^j\right) \). The variance of the slopes for the quadratic polynomial \(g\left( X\right) \) is

$$\begin{aligned} V\left[ g^\prime \left( X\right) \right] = V\left[ \beta _1 + 2 \beta _2 X \right] = 4 \beta _2^2 \left( \mu _2 -\mu _1^2\right) . \end{aligned}$$
(8)

Equation 8 implies that the variance in slopes (and the deviation from linearity) increases as \(\beta _2\) increases in absolute value. Larger values of \(\beta _2\) imply a stronger quadratic relationship and the following theorem establishes necessary bounds on \(\beta _2\) and \(V\left[ g^\prime \left( X\right) \right] \) to ensure that a quadratic polynomial monotonically increases on \(\Omega _\mathrm{{c}}\).
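Equation 8 can be verified numerically. The following sketch (my own Python, with illustrative coefficient values that are not taken from the paper) compares the Monte Carlo variance of the slopes \(g^\prime \left( X\right) \) against \(4 \beta _2^2 \left( \mu _2 -\mu _1^2\right) \):

```python
import numpy as np

# Illustrative (hypothetical) coefficients for g(X) = b0 + b1*X + b2*X^2
beta1, beta2 = 1.0, 0.15
rng = np.random.default_rng(1)
x = rng.normal(size=1_000_000)        # X ~ N(0, 1), so mu2 - mu1^2 = 1

slopes = beta1 + 2 * beta2 * x        # g'(X) at each sampled X
mc_var = slopes.var()                 # Monte Carlo variance of the slopes
eq8 = 4 * beta2**2 * (np.mean(x**2) - np.mean(x)**2)   # Equation 8
```

With \(\mu _2 - \mu _1^2 = 1\), both quantities are close to \(4 \beta _2^2 = 0.09\).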

Theorem 1

If \(g\left( X\right) \) satisfies positive monotonicity on \(\Omega _\mathrm{{c}}\), the coefficient for the quadratic term \(\beta _2\) is bounded as

$$\begin{aligned} -\frac{\beta _1}{2c}\le \beta _2 \le \frac{\beta _1}{2c}. \end{aligned}$$
(9)

Proof

For quadratic polynomials, the slope relating \(X\) to \(Y\) changes linearly with \(X\). That is, \(g^\prime \left( X\right) = \beta _1 + 2 \beta _2 X\). Let \(c_0 \ge c\) be an arbitrary value, so that \(c_0 \not \in \Omega _\mathrm{{c}}\). A quadratic polynomial is monotone on the subinterval \(\left( -c_0,c_0\right) \) if \(g^\prime \left( X\right) =0\) at \(X=-c_0\) or at \(X=c_0\), which occurs when

$$\begin{aligned} \beta _2 = \pm \frac{\beta _1}{2 c_0}. \end{aligned}$$
(10)

Clearly, the magnitude of \(\beta _2\) declines as \(c_0\) increases, so the largest admissible value of \(\left| \beta _2\right| \) is achieved when \(c_0 = c\) at the boundary of \(\Omega _\mathrm{{c}}\), which implies that \(V\left[ g^\prime \left( X\right) \right] \le \beta _1^2 \left( \mu _2-\mu _1^2\right) /c^2\). \(\square \)
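The bound in Theorem 1 is easy to check on a grid (a minimal Python sketch; the grid resolution and the choice \(\beta _1 = 1\), \(c = 3\) are mine): at \(\beta _2 = \pm \beta _1/\left( 2c\right) \) the slope stays nonnegative on \(\Omega _\mathrm{{c}}\), while any coefficient past the bound makes \(g^\prime \left( X\right) \) change sign inside the interval.

```python
import numpy as np

beta1, c = 1.0, 3.0
grid = np.linspace(-c, c, 2001)            # Omega_c = (-3, 3)

def slope(beta2):
    return beta1 + 2 * beta2 * grid        # g'(X) for the quadratic g

at_bound = slope(-beta1 / (2 * c))         # beta2 exactly at the Theorem 1 bound
past_bound = slope(-1.1 * beta1 / (2 * c)) # 10% beyond the bound
```

`at_bound` touches zero only at the endpoint \(X = c\); `past_bound` dips negative, so monotonicity fails.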

1.2 Deriving \(\rho _\mathrm{{c}}\) for Nonlinear and Homoscedastic Relationships

Holmes (1990) presented a framework for assessing the effect that nonlinearity and heteroscedasticity have on corrections for range restriction. In particular, the probability density for \(X\) and \(Y\) in the unrestricted population \(\pi \) is

$$\begin{aligned} p\left( X,Y\right) = p\left( Y|X\right) p\left( X\right) . \end{aligned}$$
(11)

As noted by Holmes, selection on test scores only alters the marginal density of \(X\). Consequently, the joint density for the predictor and criterion in the restricted population \(\pi _s\) is

$$\begin{aligned} p_s\left( X,Y\right) = p\left( Y|X\right) p_s\left( X\right) , \end{aligned}$$
(12)

where \(s\) is included in the subscript to denote that \(p_s\left( X\right) \) is the density of \(X\) in \(\pi _s\).

Let \(E\) and \(V\) denote the expected value and variance in the unrestricted population and \(E_s\) and \(V_s\) the corresponding moments in the restricted population. Holmes (1990) showed how the Laws of Total Variance and Covariance (Brillinger, 1969) can be used to represent the covariance between \(X\) and \(Y\) and the criterion variance in the restricted population (i.e., \(\sigma _{xys}\) and \(\sigma _{ys}^2\)),

$$\begin{aligned} \sigma _{xys}&= E_s\left[ X E\left( Y|X\right) \right] - E_s\left( X\right) E_s\left( E\left( Y|X\right) \right) \end{aligned}$$
(13)
$$\begin{aligned} \sigma _{ys}^2&= V_s\left[ E\left( Y|X\right) \right] + E_s\left[ V\left( Y|X\right) \right] . \end{aligned}$$
(14)

Note that Equations 13 and 14 are used to compute the restricted correlation \(\rho _s\). That is, \(\rho _s\) is

$$\begin{aligned} \rho _s = \frac{\sigma _{xys}}{\sigma _{xs} \sigma _{ys}}, \end{aligned}$$
(15)

which is used with Equation 1.
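Equations 13 through 15 can be illustrated with a small simulation (a Python sketch of my own, with illustrative parameter values that are not taken from the paper): after truncating at a cut score, the conditioning-based expressions reproduce the restricted covariance computed directly from the selected sample.

```python
import numpy as np

rng = np.random.default_rng(7)
b1, b2 = 1.0, 1.0 / 6.0          # monotone quadratic on (-3, 3)
g0, g1, su = 0.0, -0.2, 1.0      # illustrative heteroscedasticity parameters

g = lambda x: b1 * x + b2 * x**2             # E(Y|X)
h = lambda x: np.exp((g0 + g1 * x) / 2)      # error-scale function

x = rng.normal(size=2_000_000)
y = g(x) + h(x) * su * rng.normal(size=x.size)
xs, ys = x[x > 0.5], y[x > 0.5]  # direct selection at the cut score x* = 0.5

cov_mc = np.mean(xs * ys) - xs.mean() * ys.mean()            # sigma_xys directly
cov_eq13 = np.mean(xs * g(xs)) - xs.mean() * np.mean(g(xs))  # Equation 13
var_eq14 = np.var(g(xs)) + np.mean(h(xs)**2) * su**2         # Equation 14
rho_s = cov_eq13 / np.sqrt(np.var(xs) * var_eq14)            # Equation 15
```

Because selection operates only on \(X\), the conditional distribution \(p\left( Y|X\right) \) is untouched, which is why the restricted moments of \(g\left( X\right) \) and \(h\left( X\right) \) suffice.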

The derivations in this paper specify the following relationship between \(X\) and \(Y\) in \(\pi \):

$$\begin{aligned} Y = g\left( X\right) + h\left( X\right) u , \end{aligned}$$
(16)

where \(g\left( X\right) \) is a nonlinear function describing how \(E\left( Y|X\right) \) changes as a function of \(X\), \(u\) is a random error that is unrelated to \(X\) and has a mean of zero and variance of \(\sigma _u^2\), and \(h\left( X\right) \) is a function used to describe the pattern of nonconstant error variance as a function of \(X\). Note that \(V\left( Y|X\right) = \left( h\left( X\right) \right) ^2\sigma _u^2\). The Law of Total Variance implies that the amount of variance that \(g\left( X\right) \) accounts for in \(Y\) in \(\pi \) is

$$\begin{aligned} \rho ^2=\frac{V \left[ g\left( X\right) \right] }{V \left[ g\left( X\right) \right] + E\left[ \left( h\left( X\right) \right) ^2\right] \sigma _u^2 }. \end{aligned}$$
(17)

Recall that prior empirical research in educational and employment selection found evidence that the degree of \(g\left( X\right) \) does not exceed a quadratic polynomial, which implies that \(V \left[ g\left( X\right) \right] \) is

$$\begin{aligned} V \left[ g\left( X\right) \right]&=\beta _1^2\left( \mu _2 - \mu _1^2\right) +2\beta _1\beta _2\left( \mu _3-\mu _1 \mu _2\right) + \beta _2^2\left( \mu _4 - \mu _2^2\right) \nonumber \\&=\beta _1^2\left[ \mu _2 - \mu _1^2 +\frac{sgn\left( \beta _2\right) }{c} \left( \mu _3-\mu _1\mu _2\right) + \frac{1}{4c^2}\left( \mu _4 - \mu _2^2\right) \right] , \end{aligned}$$
(18)

where the last equality was obtained by substituting the upper-bound value for \(\beta _2\) and \(sgn\left( \beta _2\right) \) denotes the sign of \(\beta _2\) where \(\beta _2<0\) implies a concave function and \(\beta _2>0\) indicates a convex function.
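As a numerical check of Equation 18 (a Python sketch under the standard normal moments given later in this Appendix; \(\beta _1 = 1\) and \(c = 3\) are illustrative choices of mine), the closed form agrees with the Monte Carlo variance of \(g\left( X\right) \):

```python
import numpy as np

rng = np.random.default_rng(3)
b1, c = 1.0, 3.0
b2 = b1 / (2 * c)                           # upper-bound (convex) coefficient
mu1, mu2, mu3, mu4 = 0.0, 1.0, 0.0, 3.0     # standard normal moments

v_eq18 = b1**2 * (mu2 - mu1**2
                  + (np.sign(b2) / c) * (mu3 - mu1 * mu2)
                  + (mu4 - mu2**2) / (4 * c**2))   # Equation 18

x = rng.normal(size=1_000_000)
v_mc = np.var(b1 * x + b2 * x**2)           # Monte Carlo variance of g(X)
```

For \(X\sim N\left( 0,1\right) \) this reduces to \(\beta _1^2\left( 1 + 1/\left( 2c^2\right) \right) \).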

The proportion of variance accounted by \(g\left( X\right) \) in the restricted population is

$$\begin{aligned} \rho _s^2=\frac{V_s \left[ g\left( X\right) \right] }{V_s \left[ g\left( X\right) \right] + E_s\left[ \left( h\left( X\right) \right) ^2\right] \sigma _u^2 }. \end{aligned}$$
(19)

Similar to Equation 18, \(V_s \left[ g\left( X\right) \right] \) for the most extreme quadratic and monotone polynomial is defined as

$$\begin{aligned} V_s\left[ g\left( X\right) \right] =\beta _1^2\left[ \mu _{2s}-\mu _{1s}^2 +\frac{sgn\left( \beta _2\right) }{c}\left( \mu _{3s}-\mu _{1s}\mu _{2s}\right) + \frac{1}{4c^2}\left( \mu _{4s} - \mu _{2s}^2\right) \right] , \end{aligned}$$
(20)

where \(\mu _{js} = E_s\left( X^j\right) \) are moments in \(\pi _s\).

Let the restricted correlation for the upper-bound value of \(\beta _2\) for a given heteroscedastic pattern be denoted as \(\rho _s\left( \beta _2\right) \) and define the corrected correlation that uses \(\rho _s\left( \beta _2\right) \) with Equation 1 as \(\rho _\mathrm{{c}}\left( \beta _2\right) \). Note that \(\rho _\mathrm{{c}}\left( \beta _2\right) \) represents the corrected correlation when linearity and homoscedasticity are violated. Consequently, the bias of the corrected correlation attributed to satisfying the upper-bound quadratic relationship is

$$\begin{aligned} \rho -\rho _\mathrm{{c}} = \rho -\rho _\mathrm{{c}}\left( \beta _2\right) . \end{aligned}$$
(21)

1.3 Examination of Upper-Bound for \(\rho -\rho _\mathrm{{c}}\) for \(X\sim N\left( 0,1\right) \) Case

This subsection presents specific results on the bias of \(\rho _\mathrm{{c}}\) when the relationship between \(X\) and \(Y\) is nonlinear and heteroscedastic and \(X\sim N\left( 0,1\right) \). The discussion in this subsection proceeds by deriving an expression for \(\rho -\rho _\mathrm{{c}}\). The results in this section use the following definitions for the marginal and conditional distributions (Harvey, 1976),

$$\begin{aligned} p\left( X\right)&\equiv \phi \left( X;0,1\right) \end{aligned}$$
(22)
$$\begin{aligned} p\left( Y|X\right)&\equiv \phi \left( Y;g\left( X\right) ,\left( h\left( X\right) \right) ^2\sigma _u^2\right) , \end{aligned}$$
(23)

where \(\phi \) denotes the normal density. It is important to note that \(p\left( Y|X\right) \) in Equation 23 assumes that the errors around \(g\left( X\right) \) are normally distributed, but that the error variance differs across values of \(X\). This section considers the case where decision-makers use cut scores to select applicants. For instance, only applicants with \(X>x^*\) are selected, which implies that \(p_s\left( X\right) \) is a truncated normal distribution at \(x^*\). Furthermore, the following equation for \(h\left( X\right) \) is used:

$$\begin{aligned} h\left( X\right) = \exp \left[ \frac{\gamma _0 + \gamma _1 X}{2}\right] . \end{aligned}$$
(24)

Equation 24 is a nonnegative function and has accordingly been used in the literature to model nonconstant error variance that either increases or decreases with \(X\). Note that \(\gamma _1<0\) in Equation 24 is consistent with prior research (Held & Foley, 1994; Kobrin et al., 2011; Lee & Foley, 1986).

It is important to note several properties of the normal distribution. Namely, the first four moments of the standard normal distribution are defined as \(\mu _1 = 0\), \(\mu _2 = 1\), \(\mu _3 = 0\), and \(\mu _4 = 3\). The moments can be used to find the covariances among different powers of \(X\). For instance, \(\sigma _{x,x^2} = \mu _3 - \mu _1 \mu _2 = 0\) and \(\sigma _{x^2}^2=\mu _4 - \mu _2^2=2\).

Let \({\varvec{\Sigma }}\) denote the covariance matrix among the powers of \(X\) in \(\pi \) and \({\varvec{\beta }} = \left( \beta _0,\; \beta _1,\; \beta _2\right) ^\prime \) be the vector of coefficients, and note that \(V\left[ g\left( X\right) \right] = {\varvec{\beta }}^\prime {\varvec{\Sigma }} {\varvec{\beta }}\). An expression for \(E\left[ h\left( X\right) ^2\right] \) can be found using the moment generating function (MGF) for the normal distribution. More precisely, the MGF for a \(\phi \left( \mu , \sigma ^2\right) \) distribution in \(\pi \) is

$$\begin{aligned} {\mathcal {M}}\left( t\right) = E\left( e^{Xt}\right) = \exp \left( \mu t + \sigma ^2 t^2/2\right) , \end{aligned}$$
(25)

which implies that

$$\begin{aligned} E\left[ h\left( X\right) ^2\right] = e^{\gamma _0} E\left( e^{X \gamma _1}\right) = e^{\gamma _0}{\mathcal {M}}\left( t=\gamma _1\right) . \end{aligned}$$
(26)
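Equation 26 is straightforward to confirm by simulation (my own Python sketch; the values of \(\gamma _0\) and \(\gamma _1\) are illustrative, with \(\gamma _1 < 0\) chosen to match the direction reported in prior research):

```python
import numpy as np

rng = np.random.default_rng(11)
g0, g1 = 0.3, -0.4                 # illustrative heteroscedasticity parameters
x = rng.normal(size=1_000_000)

h2_mc = np.exp(g0 + g1 * x).mean() # E[h(X)^2] by Monte Carlo
h2_mgf = np.exp(g0 + g1**2 / 2)    # e^{gamma_0} M(t = gamma_1), Equation 26
```

For \(X\sim N\left( 0,1\right) \), \({\mathcal {M}}\left( \gamma _1\right) = e^{\gamma _1^2/2}\), so both quantities are near \(e^{0.38}\).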

For the case where \(X\) is standard normal, the aforementioned results imply that \(\rho ^2\) is defined as

$$\begin{aligned} \rho ^2=\frac{ {\varvec{\beta }}^\prime {\varvec{\Sigma }} {\varvec{\beta }} }{{\varvec{\beta }}^\prime {\varvec{\Sigma }} {\varvec{\beta }} + e^{\gamma _0}{\mathcal {M}}\left( t=\gamma _1\right) }. \end{aligned}$$
(27)

Recall that \(\rho _s^2\) is the validity coefficient in \(\pi _s\) that assumes a linear relationship between \(X\) and \(Y\). An expression for \(\rho _s^2\) is a function of the \(\sigma _{xys}\), \(\sigma _{xs}^2\) and \(\sigma _{ys}^2\), which are available from the MGF of the truncated normal distribution. The MGF for the normal distribution truncated at \(x^*\) is

$$\begin{aligned} {\mathcal {M}}_s\left( t\right) = E_s\left( e^{Xt}\right) =\exp \left( \mu t+\frac{\sigma ^2t^2}{2}\right) \frac{1-\Phi \left( \alpha -\sigma t\right) }{1 - \Phi \left( \alpha \right) }, \end{aligned}$$
(28)

where \(\alpha = \left( x^*-\mu \right) \sigma ^{-1}\). Accordingly, \(\mu _{js} = E_s\left( X^j\right) ={\mathcal {M}}_s^{(j)}\left( t=0\right) \) and \(E_s\left[ h\left( X\right) ^2\right] = e^{\gamma _0} {\mathcal {M}}_s \left( t=\gamma _1\right) \). For this example, Equation 13 implies that \(\sigma _{xys}\) is defined as

$$\begin{aligned} \sigma _{xys} = \beta _1 \left( \mu _{2s}-\mu _{1s}^2 \right) + \beta _2 \left( \mu _{3s}-\mu _{1s}\mu _{2s} \right) . \end{aligned}$$
(29)
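The truncated-normal MGF in Equation 28 can be checked against a truncated Monte Carlo sample (a Python sketch of mine; the cut score \(x^* = 0.5\) and evaluation point \(t = -0.4\) are illustrative). The restricted mean \(\mu _{1s}\) is also compared with the familiar inverse Mills ratio expression \(\mu + \sigma \phi \left( \alpha \right) /\left( 1-\Phi \left( \alpha \right) \right) \), which is the first derivative of \({\mathcal {M}}_s\) at \(t=0\).

```python
import math
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, xstar = 0.0, 1.0, 0.5
alpha = (xstar - mu) / sigma

def Phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))   # standard normal CDF

def mgf_s(t):
    # Equation 28: MGF of a normal distribution truncated from below at x*
    return (math.exp(mu * t + sigma**2 * t**2 / 2)
            * (1 - Phi(alpha - sigma * t)) / (1 - Phi(alpha)))

x = rng.normal(mu, sigma, size=2_000_000)
xs = x[x > xstar]                       # restricted sample

mgf_mc = np.exp(-0.4 * xs).mean()       # E_s[e^{tX}] at t = -0.4 by Monte Carlo
lam = math.exp(-alpha**2 / 2) / math.sqrt(2 * math.pi) / (1 - Phi(alpha))
mean_s = mu + sigma * lam               # mu_1s from the inverse Mills ratio
```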

Also, note that \(\sigma _{xs}^2=\mu _{2s}-\mu _{1s}^2\). Let \({\varvec{\Sigma }}_s\) denote the variance–covariance matrix among the powers of \(X\) in \(\pi _s\) and note that \(V_s\left[ g\left( X\right) \right] = {\varvec{\beta }}^\prime {\varvec{\Sigma }}_s {\varvec{\beta }}\). Equation 14 implies that \(\sigma _{ys}^2\) is defined as,

$$\begin{aligned} \sigma _{ys}^2 = {\varvec{\beta }}^\prime {\varvec{\Sigma }}_s {\varvec{\beta }} +e^{\gamma _0} {\mathcal {M}}_s\left( t=\gamma _1\right) . \end{aligned}$$
(30)

The squared correlation that assumes a linear relationship in \(\pi _s\) is

$$\begin{aligned} \rho _s^2=\frac{ \left[ \beta _1 \left( \mu _{2s}-\mu _{1s}^2 \right) + \beta _2 \left( \mu _{3s}-\mu _{1s}\mu _{2s} \right) \right] ^2}{\left( \mu _{2s}-\mu _{1s}^2\right) \left( {\varvec{\beta }}^\prime {\varvec{\Sigma }}_s {\varvec{\beta }} +e^{\gamma _0} {\mathcal {M}}_s\left( t=\gamma _1\right) \right) }. \end{aligned}$$
(31)

Note that \(c = 3\) was chosen for the boundaries in \(\Omega _\mathrm{{c}}\) given that a standard normal distribution for \(X\) was examined, which implies that the quadratic polynomials were defined as \(\beta _2 = \pm \beta _1/6\) depending upon whether the relationship is concave (i.e., \(\beta _2<0\)) or convex (i.e., \(\beta _2>0\)). Prior empirical research has estimated range-restriction-corrected validity coefficients in the range between 0.4 and 0.8 (Berry, Sackett, & Sund, 2013; Chernyshenko & Ones, 1999) and it is possible to find values of \(\beta _1\) that are consistent with prior research. In the case where \(X\sim N\left( 0,1\right) \), \(\beta _2 = \pm \beta _1/\left( 2 c\right) \), \(\gamma _0=0\), and \(\gamma _1=0\), \(\rho ^2 = \beta _1^2\left( \beta _1^2 + \frac{2c^2}{1+2c^2}\right) ^{-1}\), which implies that \(\beta _1 = \left( \frac{\rho ^2}{1-\rho ^2} \frac{2c^2}{1+2c^2}\right) ^{1/2}\). The model assumes the quadratic maximum/minimum occurs at the extremes of the test score distribution. Recall that 99.7 % of observations fall between \(\pm 3\) for the standard normal distribution, so \(c=3\) implies that \(\rho = 0.4\) when \(\beta _1 = 0.425\) and \(\rho =0.8\) when \(\beta _1 = 1.298\). Consequently, the simulation study examines the bias of \(\rho _\mathrm{{c}}\) for \(\beta _1 \in \left[ 0.4,1.4\right] \), which yields values of \(\rho \) that are representative of estimates observed by applied researchers.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Culpepper, S.A. An Improved Correction for Range Restricted Correlations Under Extreme, Monotonic Quadratic Nonlinearity and Heteroscedasticity. Psychometrika 81, 550–564 (2016). https://doi.org/10.1007/s11336-015-9466-9
