A Note on the Calculation and Interpretation of the Delta-p Statistic for Categorical Independent Variables

Research in Higher Education

Abstract

This methodological note illustrates why a commonly used calculation of the Delta-p statistic is inappropriate for categorical independent variables, and it provides users of logistic regression with a revised calculation that is more meaningful when studying differences in the predicted probability of an outcome between two or more groups. Although the extent to which this error has spread across the field cannot be fully documented, the potential for error is far-reaching: an increasing number of researchers use logistic regression to study categorical outcomes, and researchers are looking more closely at the Delta-p statistic as a means of communicating the results of logistic regression models to policy makers and administrators. Higher education scholars and institutional researchers should use caution when reporting the Delta-p statistic from prior studies and should adopt the revised calculation presented in this note when estimating logistic regression models with categorical independent variables.

Notes

  1. Given this definition, the Delta-p statistic is only one way to illustrate the discrete change in the predicted probability of an outcome. See Long (1997) for other examples.

  2. The equation presented here is in a slightly different, albeit algebraically equivalent, form than those found in Petersen (1985) and Cabrera (1994). Its form has been modified in order to more explicitly illustrate each step in the calculation of the Delta-p statistic.

  3. The magnitude of the difference between these two results is based on the shape of the logistic curve around the point at which the Delta-p statistic is assessed and the magnitude of the coefficient. That there is any similarity between the absolute values of the results from Examples 1 and 2 is due to the fact that the baseline probability (i.e., 0.494) is located along the region of the logistic curve that most closely approximates linearity and that the coefficient (i.e., 0.6834) is moderate in size. Under other circumstances, this difference between the absolute values of the results from Examples 1 and 2 could be appreciably larger. The findings in Table 1 illustrate this point using other published studies.

  4. The other 12 studies, listed in Appendix 1, provide insufficient information to determine whether the error in the calculation was present. Most of these 12 studies do not report parameter estimates, and a few do not report the baseline probability; both pieces of information are necessary for determining whether the Delta-p statistic is correctly calculated.

  5. Although these two approaches are conceptually equivalent, discrete change via simulation is based on the model estimate of \( {\bar{\text{y}}} \) whereas Eq. 2 is based on observed \( {\bar{\text{y}}} \). The two approaches may result in slightly different measures of discrete change.

  6. Researchers may also find this method useful for calculating and interpreting the Delta-p statistics for continuous independent variables. Although not addressed in this manuscript, a one-unit change in a continuous independent variable assessed at \( {\bar{\text{y}}} \) is also susceptible to problems in interpretation. Given the shape of the logistic curve, the change in probability given a one-unit increase from the mean of x will be the additive inverse of the change in probability given a one-unit decrease from the mean of x only when \( {\bar{\text{y}}} \) is 0.50. To alleviate problems in interpretation when \( {\bar{\text{y}}} \) is not at the center of the distribution, others (e.g., Kauffman 1996; Long 1997) have recommended computing the “centered discrete change” in the continuous variable, a process that is similar to the calculation proposed in this manuscript for categorical variables.

  7. Three of the nine studies (i.e., Millett 2003; Perna 2003, 2005) do not report the descriptive statistics for the study variables that are necessary for the calculation of the revised Delta-p statistic.
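
The asymmetry described in Notes 3 and 6 can be checked numerically. The following is a minimal Python sketch, not the article's own code: it assumes the conventional calculation adds the coefficient to the logit of the baseline probability (the form attributed to Petersen 1985), and it computes the centered discrete change as a half-unit move on each side of the baseline. The function names and the baselines 0.50 and 0.10 are hypothetical; only 0.494 and 0.6834 come from Note 3.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Probability implied by a log-odds value z."""
    return 1.0 / (1.0 + math.exp(-z))


def discrete_changes(p_bar, b):
    """Changes in predicted probability around baseline p_bar for coefficient b."""
    base = logit(p_bar)
    increase = inv_logit(base + b) - p_bar  # one unit above the baseline
    decrease = inv_logit(base - b) - p_bar  # one unit below the baseline
    # Centered discrete change: half a unit on each side of the baseline.
    centered = inv_logit(base + b / 2) - inv_logit(base - b / 2)
    return increase, decrease, centered


# 0.494 and 0.6834 are quoted in Note 3; 0.50 and 0.10 are hypothetical baselines.
for p_bar in (0.50, 0.494, 0.10):
    up, down, centered = discrete_changes(p_bar, 0.6834)
    print(f"p_bar={p_bar:.3f}  up={up:+.4f}  down={down:+.4f}  centered={centered:.4f}")
```

Under these assumptions the increase and decrease cancel exactly only at a baseline of 0.50; the gap between their magnitudes is small near 0.494 and grows as the baseline moves toward the tails of the logistic curve.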

References

  • Agresti, A. (2002). Categorical data analysis (2nd ed.). New Jersey: John Wiley and Sons, Inc.

  • Arbona, C., & Nora, A. (2007). The influence of academic and environmental factors on Hispanic college degree attainment. The Review of Higher Education, 30, 247–269.

  • Cabrera, A. F. (1994). Logistic regression analysis in higher education: An applied perspective. In J. C. Smart (Ed.), Higher education: Handbook of theory and research (Vol. X, pp. 225–256). New York: Agathon Press.

  • Cheng, S., & Long, J. S. (2000). XPost: Excel workbooks for the post-estimation interpretation of regression models for categorical dependent variables. Retrieved June 10, 2008 from http://www.indiana.edu/~jslsoc/web_spost/sp_xpost.htm

  • DesJardins, S. L. (2001). Assessing the effects of changing institutional aid policy. Research in Higher Education, 42, 653–678.

  • Hu, S., & Hossler, D. (2000). Willingness to pay and preference for private institutions. Research in Higher Education, 41, 685–701.

  • Hu, S., & St. John, E. P. (2001). Student persistence in a public higher education system: Understanding racial and ethnic differences. The Journal of Higher Education, 72, 265–286.

  • Kauffman, R. L. (1996). Comparing effects in dichotomous logistic regression: A variety of standardized coefficients. Social Science Quarterly, 77, 90–109.

  • Kim, D. (2004). The effect of financial aid on students’ college choice: Differences by racial groups. Research in Higher Education, 45, 43–70.

  • Kulis, S., Sicotte, D., & Collins, S. (2002). More than a pipeline problem: Labor supply constraints and gender stratification across academic science disciplines. Research in Higher Education, 43, 657–691.

  • Long, J. S. (1997). Regression models for categorical and limited dependent variables. Thousand Oaks, CA: Sage Publications, Inc.

  • Long, J. S., & Freese, J. (2003). Regression models for categorical dependent variables using Stata. College Station, TX: Stata Press.

  • McLendon, M. K., Heller, D. E., & Young, S. P. (2005). State postsecondary policy innovation: Politics, competition, and the interstate migration of policy ideas. The Journal of Higher Education, 76, 363–400.

  • Millett, C. M. (2003). How undergraduate loan debt affects application and enrollment in graduate or first professional school. The Journal of Higher Education, 74, 386–427.

  • Paulsen, M. B., & St. John, E. P. (2002). Social class and college costs: Examining the financial nexus between college choice and persistence. The Journal of Higher Education, 73, 189–236.

  • Peng, C. J., So, T. H., Stage, F. K., & St. John, E. P. (2002). The use and interpretation of logistic regression in higher education journals: 1988–1999. Research in Higher Education, 43, 259–293.

  • Perna, L. W. (2000). Differences in the decision to attend college among African Americans, Hispanics, and Whites. The Journal of Higher Education, 71, 117–141.

  • Perna, L. W. (2001a). The contributions of historically black colleges and universities to the preparation of African Americans for faculty careers. Research in Higher Education, 42, 267–294.

  • Perna, L. W. (2001b). Sex and race differences in faculty tenure and promotion. Research in Higher Education, 42, 541–567.

  • Perna, L. W. (2002). Sex differences in the supplemental earnings of college and university faculty. Research in Higher Education, 43, 31–58.

  • Perna, L. W. (2003). The status of women and minorities among community college faculty. Research in Higher Education, 44, 205–240.

  • Perna, L. W. (2005). The benefits of higher education: Sex, racial/ethnic, and socioeconomic group differences. The Review of Higher Education, 29, 23–52.

  • Petersen, T. (1985). A comment on presenting results from logit and probit models. American Sociological Review, 50, 130–131.

  • Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Advanced quantitative techniques in the social sciences series 1. Thousand Oaks, CA: Sage Publications.

  • Smith, J. S., Szelest, B. P., & Downey, J. P. (2004). Implementing outcomes assessment in an academic affairs support unit. Research in Higher Education, 45, 405–427.

  • St. John, E. P., Hu, S., Simmons, A., Carter, D. E., & Weber, J. (2004). What difference does a major make? The influence of college major field on persistence by African American and White students. Research in Higher Education, 45, 209–232.

  • St. John, E. P., Hu, S., Simmons, A., & Musoba, G. D. (2001). Aptitude vs. merit: What matters in persistence. The Review of Higher Education, 24, 131–152.

  • St. John, E. P., Musoba, G. D., & Simmons, A. (2003). Keeping the promise: The impact of Indiana’s twenty-first century scholars program. The Review of Higher Education, 27, 103–123.

  • St. John, E. P., Paulsen, M. B., & Carter, D. F. (2005). Diversity, college costs, and postsecondary opportunity: An examination of the financial nexus between college choice and persistence for African Americans and Whites. The Journal of Higher Education, 76, 545–569.

  • Titus, M. A. (2004). An examination of the influence of institutional context on student persistence at 4-year colleges and universities: A multilevel approach. Research in Higher Education, 45, 673–699.

  • Wolniak, G. C., & Engberg, M. E. (2007). The effects of high school feeder networks on college enrollment. The Review of Higher Education, 31, 27–53.

Acknowledgements

I would like to thank J. Scott Long, Thomas F. Nelson Laird, Stephen R. Porter, Robert K. Toutkoushian and two anonymous reviewers for commenting on previous drafts of this manuscript. Any errors are my own.

Author information

Correspondence to Ty M. Cruce.

Appendices

Appendix 1

See Table 3.

Table 3 Studies appearing in The Journal of Higher Education, Research in Higher Education, and The Review of Higher Education between 2000 and 2007 that cite Cabrera (1994) or Petersen (1985) as the source of their calculation of the Delta-p statistic

Appendix 2

Generalization of Eq. 2 for k − 1 dummy-coded independent variables:

$$ \text{L}_{\bar{\text{y}}} = \ln\left[ \bar{\text{y}} / \left( 1 - \bar{\text{y}} \right) \right] $$
$$ \text{L}_{0} = \text{L}_{\bar{\text{y}}} + \text{B}_{1}\left( 0 - \bar{\text{x}}_{1} \right) + \text{B}_{2}\left( 0 - \bar{\text{x}}_{2} \right) + \text{B}_{3}\left( 0 - \bar{\text{x}}_{3} \right) + \text{B}_{4}\left( 0 - \bar{\text{x}}_{4} \right) $$
$$ \text{L}_{1} = \text{L}_{\bar{\text{y}}} + \text{B}_{1}\left( 1 - \bar{\text{x}}_{1} \right) + \text{B}_{2}\left( 0 - \bar{\text{x}}_{2} \right) + \text{B}_{3}\left( 0 - \bar{\text{x}}_{3} \right) + \text{B}_{4}\left( 0 - \bar{\text{x}}_{4} \right) $$
$$ \text{L}_{2} = \text{L}_{\bar{\text{y}}} + \text{B}_{1}\left( 0 - \bar{\text{x}}_{1} \right) + \text{B}_{2}\left( 1 - \bar{\text{x}}_{2} \right) + \text{B}_{3}\left( 0 - \bar{\text{x}}_{3} \right) + \text{B}_{4}\left( 0 - \bar{\text{x}}_{4} \right) $$
$$ \text{L}_{3} = \text{L}_{\bar{\text{y}}} + \text{B}_{1}\left( 0 - \bar{\text{x}}_{1} \right) + \text{B}_{2}\left( 0 - \bar{\text{x}}_{2} \right) + \text{B}_{3}\left( 1 - \bar{\text{x}}_{3} \right) + \text{B}_{4}\left( 0 - \bar{\text{x}}_{4} \right) $$
$$ \text{L}_{4} = \text{L}_{\bar{\text{y}}} + \text{B}_{1}\left( 0 - \bar{\text{x}}_{1} \right) + \text{B}_{2}\left( 0 - \bar{\text{x}}_{2} \right) + \text{B}_{3}\left( 0 - \bar{\text{x}}_{3} \right) + \text{B}_{4}\left( 1 - \bar{\text{x}}_{4} \right) $$
$$ \text{P}_{0} = \exp\left( \text{L}_{0} \right) / \left[ 1 + \exp\left( \text{L}_{0} \right) \right] $$
$$ \text{P}_{1} = \exp\left( \text{L}_{1} \right) / \left[ 1 + \exp\left( \text{L}_{1} \right) \right] $$
$$ \text{P}_{2} = \exp\left( \text{L}_{2} \right) / \left[ 1 + \exp\left( \text{L}_{2} \right) \right] $$
$$ \text{P}_{3} = \exp\left( \text{L}_{3} \right) / \left[ 1 + \exp\left( \text{L}_{3} \right) \right] $$
$$ \text{P}_{4} = \exp\left( \text{L}_{4} \right) / \left[ 1 + \exp\left( \text{L}_{4} \right) \right] $$
$$ \text{Delta-p}_{1} = \text{P}_{1} - \text{P}_{0} $$
$$ \text{Delta-p}_{2} = \text{P}_{2} - \text{P}_{0} $$
$$ \text{Delta-p}_{3} = \text{P}_{3} - \text{P}_{0} $$
$$ \text{Delta-p}_{4} = \text{P}_{4} - \text{P}_{0} $$
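
The Appendix 2 equations translate directly into a short routine. The sketch below is an illustrative Python rendering under stated assumptions, not the author's implementation: the function name `revised_delta_p` is hypothetical, and the baseline, coefficients, and dummy-variable means in the usage lines are invented for demonstration rather than taken from any study.

```python
import math


def revised_delta_p(y_bar, coefs, x_bars):
    """Return P_0 and the Delta-p of each dummy variable relative to the reference group.

    y_bar  : observed mean of the binary outcome
    coefs  : logit coefficients B_1..B_{k-1} of the dummy variables
    x_bars : sample means of those dummy variables (group proportions)
    """
    def prob(l):
        # P = exp(L) / [1 + exp(L)], as in the appendix
        return math.exp(l) / (1.0 + math.exp(l))

    l_ybar = math.log(y_bar / (1.0 - y_bar))

    def logit_for(group):
        # group is None for the reference group (all dummies 0),
        # otherwise the index of the single dummy set to 1.
        return l_ybar + sum(
            b * ((1.0 if i == group else 0.0) - m)
            for i, (b, m) in enumerate(zip(coefs, x_bars))
        )

    p_0 = prob(logit_for(None))
    deltas = [prob(logit_for(j)) - p_0 for j in range(len(coefs))]
    return p_0, deltas


# Hypothetical inputs: four dummy variables (k = 5 groups).
p_0, deltas = revised_delta_p(
    y_bar=0.45,
    coefs=[0.60, -0.25, 0.10, 0.40],
    x_bars=[0.30, 0.20, 0.15, 0.10],
)
print(f"P_0 = {p_0:.4f}")
for j, d in enumerate(deltas, start=1):
    print(f"Delta-p_{j} = {d:+.4f}")
```

Because every Delta-p is taken against the same P_0, each value reads as the difference in predicted probability between one group and the reference group, with the remaining dummy variables held at 0.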

Cite this article

Cruce, T.M. A Note on the Calculation and Interpretation of the Delta-p Statistic for Categorical Independent Variables. Res High Educ 50, 608–622 (2009). https://doi.org/10.1007/s11162-009-9131-1
