A Note on the Calculation and Interpretation of the Delta-p Statistic for Categorical Independent Variables

Cruce, Ty M.

doi:10.1007/s11162-009-9131-1

Published: 07 April 2009

Volume 50, pages 608–622, (2009)

Research in Higher Education

Ty M. Cruce¹

Abstract

This methodological note illustrates how a commonly used calculation of the Delta-p statistic is inappropriate for categorical independent variables, and this note provides users of logistic regression with a revised calculation of the Delta-p statistic that is more meaningful when studying the differences in the predicted probability of an outcome between two or more groups. Although one cannot fully document the extent to which this error in the current calculation of the Delta-p statistic has spread across the field, the potential for error is far reaching as an increasing number of researchers use logistic regression to study categorical outcomes and as researchers look more closely at the Delta-p statistic as a means to communicate the results of logistic regression models to policy makers and administrators. It is recommended that higher education scholars and institutional researchers use caution when reporting the Delta-p statistic from prior studies and that they adopt the revised calculation of the Delta-p statistic presented in this methodological note when estimating logistic regression models with categorical independent variables.

Notes

Given this definition, the Delta-p statistic is only one way to illustrate the discrete change in the predicted probability of an outcome. See Long (1997) for other examples.
The equation presented here takes a slightly different, albeit algebraically equivalent, form from those found in Petersen (1985) and Cabrera (1994). Its form has been modified to illustrate each step in the calculation of the Delta-p statistic more explicitly.
The magnitude of the difference between these two results is based on the shape of the logistic curve around the point at which the Delta-p statistic is assessed and the magnitude of the coefficient. That there is any similarity between the absolute values of the results from Examples 1 and 2 is due to the fact that the baseline probability (i.e., 0.494) is located along the region of the logistic curve that most closely approximates linearity and that the coefficient (i.e., 0.6834) is moderate in size. Under other circumstances, this difference between the absolute values of the results from Examples 1 and 2 could be appreciably larger. The findings in Table 1 illustrate this point using other published studies.
The other 12 studies, listed in Appendix 1, provide insufficient information to determine whether the error in the calculation was present. Most of these 12 studies do not report parameter estimates, and a few do not report the baseline probability; both pieces of information are necessary for determining whether the Delta-p statistic was correctly calculated.
Although these two approaches are conceptually equivalent, discrete change via simulation is based on the model estimate of $ {\bar{\text{y}}} $ whereas Eq. 2 is based on observed $ {\bar{\text{y}}} $. The two approaches may result in slightly different measures of discrete change.
Researchers may also find this method useful for calculating and interpreting the Delta-p statistics for continuous independent variables. Although not addressed in this manuscript, a one-unit change in a continuous independent variable assessed at $ {\bar{\text{y}}} $ is also susceptible to problems in interpretation. Given the shape of the logistic curve, the change in probability given a one-unit increase from the mean of x will be the additive inverse of the change in probability given a one-unit decrease from the mean of x only when $ {\bar{\text{y}}} $ is 0.50. To alleviate problems in interpretation when $ {\bar{\text{y}}} $ is not at the center of the distribution, others (e.g., Kauffman 1996; Long 1997) have recommended computing the “centered discrete change” in the continuous variable, a process that is similar to the calculation proposed in this manuscript for categorical variables.
Three of the nine studies (i.e., Millett 2003; Perna 2003, 2005) do not report the descriptive statistics for the study variables that are necessary for the calculation of the revised Delta-p statistic.
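
The asymmetry described in the notes above can be illustrated with a short Python sketch. It assumes that the conventional calculation adds the coefficient to the logit of the baseline probability and subtracts the baseline from the resulting probability, as in the form attributed to Petersen (1985); it is not a reproduction of the article's own worked examples. The baseline 0.494 and the coefficient 0.6834 come from the notes, whereas the baseline 0.90 and all function names are hypothetical and chosen only for illustration. The `centered_change` helper approximates the "centered discrete change" by moving half a unit either side of the logit of the baseline, which treats that logit as the model's value at the mean of x.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability p, where 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(value: float) -> float:
    """Probability implied by a log-odds value."""
    return 1.0 / (1.0 + math.exp(-value))


def shift_in_probability(baseline: float, logit_shift: float) -> float:
    """Change in probability when the logit at `baseline` moves by `logit_shift`."""
    return inv_logit(logit(baseline) + logit_shift) - baseline


def centered_change(baseline: float, coef: float) -> float:
    """Change in probability from half a unit below to half a unit above
    the point whose logit equals logit(baseline) (an approximation)."""
    centre = logit(baseline)
    return inv_logit(centre + coef / 2.0) - inv_logit(centre - coef / 2.0)


coef = 0.6834  # coefficient quoted in the notes

# 0.494 is the baseline quoted in the notes; 0.90 is a hypothetical
# baseline far from the near-linear middle of the logistic curve.
for baseline in (0.494, 0.90):
    up = shift_in_probability(baseline, coef)     # one unit up
    down = shift_in_probability(baseline, -coef)  # one unit down
    mid = centered_change(baseline, coef)
    print(f"baseline={baseline:.3f}  +B: {up:+.4f}  -B: {down:+.4f}  centered: {mid:+.4f}")
```

Near a baseline of 0.50 the upward and downward changes are close in absolute value; at the hypothetical baseline of 0.90 they diverge noticeably, which is the pattern the notes describe.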

References

Agresti, A. (2002). Categorical data analysis (2nd ed.). New Jersey: John Wiley and Sons, Inc.
Arbona, C., & Nora, A. (2007). The influence of academic and environmental factors on Hispanic college degree attainment. The Review of Higher Education, 30, 247–269.
Cabrera, A. F. (1994). Logistic regression analysis in higher education: An applied perspective. In J. C. Smart (Ed.), Higher education: Handbook of theory and research (Vol. X, pp. 225–256). New York: Agathon Press.
Cheng, S., & Long, J. S. (2000). XPost: Excel workbooks for the post-estimation interpretation of regression models for categorical dependent variables. Retrieved June 10, 2008 from http://www.indiana.edu/~jslsoc/web_spost/sp_xpost.htm
DesJardins, S. L. (2001). Assessing the effects of changing institutional aid policy. Research in Higher Education, 42, 653–678.
Hu, S., & Hossler, D. (2000). Willingness to pay and preference for private institutions. Research in Higher Education, 41, 685–701.
Hu, S., & St. John, E. P. (2001). Student persistence in a public higher education system: Understanding racial and ethnic differences. The Journal of Higher Education, 72, 265–286.
Kauffman, R. L. (1996). Comparing effects in dichotomous logistic regression: A variety of standardized coefficients. Social Science Quarterly, 77, 90–109.
Kim, D. (2004). The effect of financial aid on students’ college choice: Differences by racial groups. Research in Higher Education, 45, 43–70.
Kulis, S., Sicotte, D., & Collins, S. (2002). More than a pipeline problem: Labor supply constraints and gender stratification across academic science disciplines. Research in Higher Education, 43, 657–691.
Long, J. S. (1997). Regression models for categorical and limited dependent variables. Thousand Oaks, CA: Sage Publications, Inc.
Long, J. S., & Freese, J. (2003). Regression models for categorical dependent variables using Stata. College Station, TX: Stata Press.
McLendon, M. K., Heller, D. E., & Young, S. P. (2005). State postsecondary policy innovation: Politics, competition, and the interstate migration of policy ideas. The Journal of Higher Education, 76, 363–400.
Millett, C. M. (2003). How undergraduate loan debt affects application and enrollment in graduate or first professional school. The Journal of Higher Education, 74, 386–427.
Paulsen, M. B., & St. John, E. P. (2002). Social class and college costs: Examining the financial nexus between college choice and persistence. The Journal of Higher Education, 73, 189–236.
Peng, C. J., So, T. H., Stage, F. K., & St. John, E. P. (2002). The use and interpretation of logistic regression in higher education journals: 1988–1999. Research in Higher Education, 43, 259–293.
Perna, L. W. (2000). Differences in the decision to attend college among African Americans, Hispanics, and Whites. The Journal of Higher Education, 71, 117–141.
Perna, L. W. (2001a). The contributions of historically black colleges and universities to the preparation of African Americans for faculty careers. Research in Higher Education, 42, 267–294.
Perna, L. W. (2001b). Sex and race differences in faculty tenure and promotion. Research in Higher Education, 42, 541–567.
Perna, L. W. (2002). Sex differences in the supplemental earnings of college and university faculty. Research in Higher Education, 43, 31–58.
Perna, L. W. (2003). The status of women and minorities among community college faculty. Research in Higher Education, 44, 205–240.
Perna, L. W. (2005). The benefits of higher education: Sex, racial/ethnic, and socioeconomic group differences. The Review of Higher Education, 29, 23–52.
Petersen, T. (1985). A comment on presenting results from logit and probit models. American Sociological Review, 50, 130–131.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Advanced quantitative techniques in the social sciences series 1. Thousand Oaks, CA: Sage Publications.
Smith, J. S., Szelest, B. P., & Downey, J. P. (2004). Implementing outcomes assessment in an academic affairs support unit. Research in Higher Education, 45, 405–427.
St. John, E. P., Hu, S., Simmons, A., Carter, D. E., & Weber, J. (2004). What difference does a major make? The influence of college major field on persistence by African American and White students. Research in Higher Education, 45, 209–232.
St. John, E. P., Hu, S., Simmons, A., & Musoba, G. D. (2001). Aptitude vs. merit: What matters in persistence. The Review of Higher Education, 24, 131–152.
St. John, E. P., Musoba, G. D., & Simmons, A. (2003). Keeping the promise: The impact of Indiana’s twenty-first century scholars program. The Review of Higher Education, 27, 103–123.
St. John, E. P., Paulsen, M. B., & Carter, D. F. (2005). Diversity, college costs, and postsecondary opportunity: An examination of the financial nexus between college choice and persistence for African Americans and Whites. The Journal of Higher Education, 76, 545–569.
Titus, M. A. (2004). An examination of the influence of institutional context on student persistence at 4-year colleges and universities: a multilevel approach. Research in Higher Education, 45, 673–699.
Wolniak, G. C., & Engberg, M. E. (2007). The effects of high school feeder networks on college enrollment. The Review of Higher Education, 31, 27–53.

Acknowledgements

I would like to thank J. Scott Long, Thomas F. Nelson Laird, Stephen R. Porter, Robert K. Toutkoushian and two anonymous reviewers for commenting on previous drafts of this manuscript. Any errors are my own.

Author information

Authors and Affiliations

University Planning, Institutional Research, and Accountability, Indiana University, Poplars Building 805, 400 East Seventh Street, Bloomington, IN, 47405, USA
Ty M. Cruce

Corresponding author

Correspondence to Ty M. Cruce.

Appendices

Appendix 1

See Table 3.

Table 3 Studies appearing in The Journal of Higher Education, Research in Higher Education, and The Review of Higher Education between 2000 and 2007 that cite Cabrera (1994) or Petersen (1985) as the source of their calculation of the Delta-p statistic

Appendix 2

Generalization of Eq. 2 for k − 1 dummy-coded independent variables (shown here for k = 5 categories, i.e., four dummy variables and one reference group):

$$ {\text{L}}_{\bar{\text{y}}} = \ln\left[ \frac{\bar{\text{y}}}{1 - \bar{\text{y}}} \right] $$

$$ {\text{L}}_{0} = {\text{L}}_{{{\bar{\text{y}}}}} + {\text{B}}_{1} \left( {0 - {\bar{\text{x}}}_{1} } \right) + {\text{B}}_{2} \left( {0 - {\bar{\text{x}}}_{2} } \right) + {\text{B}}_{3} \left( {0 - {\bar{\text{x}}}_{3} } \right) + {\text{B}}_{4} \left( {0 - {\bar{\text{x}}}_{4} } \right) $$

$$ {\text{L}}_{1} = {\text{L}}_{{{\bar{\text{y}}}}} + {\text{B}}_{1} \left( {1 - {\bar{\text{x}}}_{1} } \right) + {\text{B}}_{2} \left( {0 - {\bar{\text{x}}}_{2} } \right) + {\text{B}}_{3} \left( {0 - {\bar{\text{x}}}_{3} } \right) + {\text{B}}_{4} \left( {0 - {\bar{\text{x}}}_{4} } \right) $$

$$ {\text{L}}_{2} = {\text{L}}_{{{\bar{\text{y}}}}} + {\text{B}}_{1} \left( {0 - {\bar{\text{x}}}_{1} } \right) + {\text{B}}_{2} \left( {1 - {\bar{\text{x}}}_{2} } \right) + {\text{B}}_{3} \left( {0 - {\bar{\text{x}}}_{3} } \right) + {\text{B}}_{4} \left( {0 - {\bar{\text{x}}}_{4} } \right) $$

$$ {\text{L}}_{3} = {\text{L}}_{{{\bar{\text{y}}}}} + {\text{B}}_{1} \left( {0 - {\bar{\text{x}}}_{1} } \right) + {\text{B}}_{2} \left( {0 - {\bar{\text{x}}}_{2} } \right) + {\text{B}}_{3} \left( {1 - {\bar{\text{x}}}_{3} } \right) + {\text{B}}_{4} \left( {0 - {\bar{\text{x}}}_{4} } \right) $$

$$ {\text{L}}_{4} = {\text{L}}_{{{\bar{\text{y}}}}} + {\text{B}}_{1} \left( {0 - {\bar{\text{x}}}_{1} } \right) + {\text{B}}_{2} \left( {0 - {\bar{\text{x}}}_{2} } \right) + {\text{B}}_{3} \left( {0 - {\bar{\text{x}}}_{3} } \right) + {\text{B}}_{4} \left( {1 - {\bar{\text{x}}}_{4} } \right) $$

$$ {\text{P}}_{0} = \frac{\exp\left( {\text{L}}_{0} \right)}{1 + \exp\left( {\text{L}}_{0} \right)} $$

$$ {\text{P}}_{1} = \frac{\exp\left( {\text{L}}_{1} \right)}{1 + \exp\left( {\text{L}}_{1} \right)} $$

$$ {\text{P}}_{2} = \frac{\exp\left( {\text{L}}_{2} \right)}{1 + \exp\left( {\text{L}}_{2} \right)} $$

$$ {\text{P}}_{3} = \frac{\exp\left( {\text{L}}_{3} \right)}{1 + \exp\left( {\text{L}}_{3} \right)} $$

$$ {\text{P}}_{4} = \frac{\exp\left( {\text{L}}_{4} \right)}{1 + \exp\left( {\text{L}}_{4} \right)} $$

$$ {\text{Delta-p}}_{ 1} = {\text{P}}_{ 1} - {\text{P}}_{0} $$

$$ {\text{Delta-p}}_{2} = {\text{P}}_{2} - {\text{P}}_{0} $$

$$ {\text{Delta-p}}_{3} = {\text{P}}_{3} - {\text{P}}_{0} $$

$$ {\text{Delta-p}}_{4} = {\text{P}}_{4} - {\text{P}}_{0} $$
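
The Appendix 2 equations can be sketched in Python for any number of dummy variables. This is a minimal illustration, not code from the article: the function name `revised_delta_p`, its argument names, and every numeric input in the usage example are hypothetical. The sketch relies on the fact that each L_j in the equations above differs from L_0 only in the term for dummy j, so that L_j = L_0 + B_j.

```python
import math
from typing import List, Sequence


def inv_logit(value: float) -> float:
    """Probability implied by a log-odds value: exp(L) / [1 + exp(L)]."""
    return 1.0 / (1.0 + math.exp(-value))


def revised_delta_p(
    y_bar: float,
    coefs: Sequence[float],
    x_bars: Sequence[float],
) -> List[float]:
    """Delta-p for each of k - 1 dummy variables relative to the reference group.

    y_bar  : observed mean of the outcome (baseline probability)
    coefs  : logistic regression coefficients B_1 ... B_{k-1} for the dummies
    x_bars : sample means of the same dummies, in the same order
    """
    if len(coefs) != len(x_bars):
        raise ValueError("coefs and x_bars must have the same length")

    # L_ybar = ln[y_bar / (1 - y_bar)]
    l_ybar = math.log(y_bar / (1.0 - y_bar))

    # L_0: every dummy set to 0, i.e. the reference group
    l_0 = l_ybar + sum(b * (0.0 - x) for b, x in zip(coefs, x_bars))
    p_0 = inv_logit(l_0)

    # L_j sets dummy j to 1 and the others to 0, so L_j = L_0 + B_j
    return [inv_logit(l_0 + b) - p_0 for b in coefs]


# Hypothetical inputs for a five-category variable (four dummies)
deltas = revised_delta_p(
    y_bar=0.40,
    coefs=[0.5, -0.3, 0.8, 0.1],
    x_bars=[0.20, 0.30, 0.10, 0.15],
)
for j, d in enumerate(deltas, start=1):
    print(f"Delta-p_{j} = {d:+.4f}")
```

Each returned value compares one group with the reference group directly, so its sign always matches the sign of the corresponding coefficient.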

About this article

Cite this article

Cruce, T.M. A Note on the Calculation and Interpretation of the Delta-p Statistic for Categorical Independent Variables. Res High Educ 50, 608–622 (2009). https://doi.org/10.1007/s11162-009-9131-1

Received: 19 February 2008
Accepted: 13 August 2008
Published: 07 April 2009
Issue Date: September 2009
DOI: https://doi.org/10.1007/s11162-009-9131-1
