Abstract
We examine the effect of sample design on estimation and inference for disparate treatment in binary logistic models used in fair lending assessments. Our Monte Carlo experiments show how sample design affects the efficiency (in terms of mean squared error) of estimating the disparate treatment parameter and the power of a test of this parameter's statistical significance. The sample design involves two decision levels: first, the degree of stratification of the loan applicants (Level I Decision), and second, given a Level I Decision, how to allocate the sample across strata (Level II Decision). We examine four Level I stratification strategies: no stratification (simple random sampling), exogenous stratification of loan cases by race, endogenous stratification of cases by loan outcome (denied or approved), and stratification both exogenously by race and endogenously by outcome. We then consider five Level II methods: proportional, balanced, and three designs based on applied studies. Our results strongly support stratifying by both race and loan outcome, coupled with a balanced sample design, when interest is in estimating, or testing the statistical significance of, the disparate treatment parameter.
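To make the two decision levels concrete, the following minimal Python sketch allocates a fixed sample across the four strata formed by race and loan outcome under a proportional and a balanced Level II design. The population counts, the sample size, and the function names are hypothetical illustrations, not values or code from the paper. The sketch also assumes that "balanced" means an equal number of draws per stratum, which the abstract does not define.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate n draws in proportion to stratum population sizes.

    Uses largest-remainder rounding so the allocations sum to n.
    """
    total = sum(stratum_sizes.values())
    raw = {k: n * size / total for k, size in stratum_sizes.items()}
    alloc = {k: int(r) for k, r in raw.items()}
    leftover = n - sum(alloc.values())
    # Hand any remaining draws to the strata with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda k: raw[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


def balanced_allocation(stratum_sizes, n):
    """Allocate (as nearly as possible) the same number of draws to each stratum.

    Assumption: 'balanced' is read here as equal allocation across strata.
    Allocations are capped at the stratum size, so the total can fall
    below n when a stratum is too small.
    """
    base, extra = divmod(n, len(stratum_sizes))
    alloc = {}
    for i, (k, size) in enumerate(stratum_sizes.items()):
        alloc[k] = min(size, base + (1 if i < extra else 0))
    return alloc


# Hypothetical applicant population, stratified by race and loan outcome
# (Level I: exogenous by race, endogenous by outcome).
population = {
    ("minority", "denied"): 300,
    ("minority", "approved"): 700,
    ("white", "denied"): 900,
    ("white", "approved"): 8100,
}

prop = proportional_allocation(population, 400)
bal = balanced_allocation(population, 400)

for stratum in population:
    print(stratum, "proportional:", prop[stratum], "balanced:", bal[stratum])
```

With these illustrative counts, the proportional design draws few cases from the small minority-denied stratum, while the balanced design samples every stratum equally. The experiments described above compare the consequences of that trade-off.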
Cite this article
Clarke, J.A., Courchane, M.J. Implications of Stratified Sampling for Fair Lending Binary Logit Models. J Real Estate Finan Econ 30, 5–31 (2005). https://doi.org/10.1007/s11146-004-4829-5
DOI: https://doi.org/10.1007/s11146-004-4829-5