Measuring Residential Segregation With the ACS: How the Margin of Error Affects the Dissimilarity Index


The American Community Survey (ACS) provides valuable, timely population estimates but with increased levels of sampling error. Although the margin of error is included with aggregate estimates, it has not been incorporated into segregation indexes. With the increasing levels of diversity in small and large places throughout the United States comes a need to accurately track and study changes in racial and ethnic segregation between censuses. The 2005–2009 ACS is used to calculate three dissimilarity indexes (D) for all core-based statistical areas (CBSAs) in the United States. We introduce a simulation method for computing segregation indexes and examine them with particular regard to the size of the CBSAs. Additionally, a subset of CBSAs is used to explore how ACS indexes differ from those computed using the 2000 and 2010 censuses. Findings suggest that the precision and accuracy of D from the ACS are influenced by a number of factors, including the number of tracts and minority population size. For smaller areas, point estimates systematically overstate actual levels of segregation, and large confidence intervals lead to limited statistical power.



  1. These are further adjusted by using a finite population correction factor for larger ACS samples, such as the 2005–2009 data (U.S. Census Bureau 2009a).

  2. This process applies to 2009 and later ACS data.

  3. 95 % CIs are used instead of standard errors because they can be modified (per Census Bureau instructions) to account for 0 as the lower bound of population counts (U.S. Census Bureau n.d.a). The 95 % interval was chosen over the Census Bureau’s standard 90 % interval because it is most commonly used in academic research.

  4. The adjusted standard errors are computed using the following formula: \( SE(Y)=sdf\times \sqrt{5Y\left(1-\frac{Y}{N}\right)} \), where Y is the estimate, N is the size of the publication area, and sdf is a variable survey design factor (U.S. Census Bureau 2002).

  5. For the remainder of the article, non-Hispanic whites are referred to simply as “whites,” non-Hispanic blacks as “blacks,” and non-Hispanic Asians as “Asians.”

  6. The ordering of the groups in D has no effect: in this instance, for example, \( {}_{w}D_{b}={}_{b}D_{w} \).

  7. See Denton et al. (n.d.) for more information on New York CBSAs.

  8. Measures of covariance are not provided with the ACS data. As a result, both methods assume that no relationship exists between the population counts within tracts.

  9. This is in accordance with census documentation (U.S. Census Bureau n.d.a), despite the fact that population counts are nonnormal (see Farley and Johnson 1985).

  10. The code and data analysis for this article were generated with SAS software. (Copyright 2015, SAS Institute Inc. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc., Cary, NC, USA.)

  11. Similar plots were examined for measures of the dispersion of \( {}_{w}D_{b} \), and nearly identical conclusions were reached.

  12. A larger number of trials could make results more precise; more or fewer trials may be desired when using other indices and data sources.

  13. Results for all CBSAs are available at http://

  14. This area has 721 tracts along with an estimated 2.3 million total population, 2.1 million whites, 1.8 million blacks, 35,000 Asians, and 26,000 Hispanics according to the 2005–2009 ACS.

  15. The log of the absolute value of the difference is used in the models. The independent variables are either logged or modeled with linear and squared terms, as appropriate.

  16. Regression coefficients are available upon request.

  17. Other variables, such as size of the white population, the number of tracts without whites, and the simulated point estimate of D, have small and inconsistent effects (results not shown).

  18. Definitional changes for tracts and CBSAs may also lead to small changes in D across the time period.
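
Notes 3 and 4 can be illustrated with a short sketch. The Python below is not the authors' SAS code: the function names, the design factor, and the example counts are hypothetical. It applies the note 4 formula as written and builds a 95 % interval whose lower bound is floored at 0, assuming the usual normal critical value of 1.96.

```python
import math


def adjusted_se(y, n, sdf):
    """Note 4 formula: SE(Y) = sdf * sqrt(5 * Y * (1 - Y / N)).

    y   -- estimated count
    n   -- size of the publication area
    sdf -- survey design factor (the value used below is made up)
    """
    if n <= 0 or not 0 <= y <= n:
        raise ValueError("need n > 0 and 0 <= y <= n")
    return sdf * math.sqrt(5.0 * y * (1.0 - y / n))


def ci95(estimate, se, z=1.96):
    """95 % interval; the lower bound is floored at 0 because a
    population count cannot be negative (note 3)."""
    lower = max(0.0, estimate - z * se)
    upper = estimate + z * se
    return lower, upper


# Hypothetical tract: 120 group members in an area of 4,000 people.
se = adjusted_se(y=120, n=4000, sdf=1.2)
lower, upper = ci95(120, se)

# A small estimate with the same SE shows the floor taking effect.
small_lower, small_upper = ci95(30, se)
```

Flooring only the lower bound makes the interval asymmetric for small counts, which is where the margin of error is largest relative to the estimate.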


An early version of this article was presented at the 2013 meeting of the Population Association of America. The authors thank Ruby Wang, Hui-Shien Tsao, and Jin-Wook Lee for providing research assistance. We also thank Richard Alba, Glenn Deane, Samantha Friedman, Timothy Gage, and Maria Krysan for their helpful comments and suggestions. The Center for Social and Demographic Analysis of the University at Albany provided technical and administrative support for this research through a grant from the National Institute of Child Health and Human Development (R24-HD044943).

Author information

Corresponding author

Correspondence to Jeffrey Napierala.

Appendix: Sample SAS Code for Simulating D

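
As a rough illustration of what simulating D involves, the following Python sketch may help. It is not the authors' SAS code, and several choices in it are assumptions: tract counts are drawn independently (note 8) from a normal distribution (note 9) centered on the ACS estimate, the standard error is taken as the published 90 % margin of error divided by 1.645, negative draws are set to 0, and the interval comes from simple percentiles of the simulated values. The function names, the default number of trials, and the example tracts are hypothetical.

```python
import random


def dissimilarity(group_a, group_b):
    """D = 0.5 * sum over tracts of |a_i / A - b_i / B|."""
    total_a, total_b = sum(group_a), sum(group_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("both groups need a nonzero total")
    return 0.5 * sum(abs(a / total_a - b / total_b)
                     for a, b in zip(group_a, group_b))


def draw_count(estimate, moe, rng):
    """One simulated tract count; the 90 % MOE is converted to an SE,
    and negative draws are floored at 0 (an assumption of this sketch)."""
    se = moe / 1.645
    return max(0.0, rng.gauss(estimate, se))


def simulate_d(tracts, trials=1000, seed=0):
    """tracts: list of (est_a, moe_a, est_b, moe_b), one tuple per tract."""
    rng = random.Random(seed)
    sims = []
    for _ in range(trials):
        a = [draw_count(ea, ma, rng) for ea, ma, _, _ in tracts]
        b = [draw_count(eb, mb, rng) for _, _, eb, mb in tracts]
        if sum(a) == 0 or sum(b) == 0:
            continue  # D is undefined when a simulated group total is 0
        sims.append(dissimilarity(a, b))
    if not sims:
        raise ValueError("no valid trials")
    sims.sort()
    last = len(sims) - 1
    return {
        "point": dissimilarity([t[0] for t in tracts], [t[2] for t in tracts]),
        "mean": sum(sims) / len(sims),
        "lower": sims[int(0.025 * last)],
        "upper": sims[int(0.975 * last)],
    }


# Hypothetical three-tract area: (white est., MOE, black est., MOE).
example = [(900, 150, 40, 35), (700, 120, 300, 110), (200, 90, 650, 140)]
result = simulate_d(example, trials=500, seed=1)
```

Comparing `result["point"]` with `result["mean"]` and the width of the `lower` to `upper` interval gives a sense of how much the margins of error move D for a given set of tracts.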

