Abstract
In this chapter I present analyses that document various aspects of the empirical relationships among the segregation indices examined in this study. I document both situations where the indices consistently agree and also situations where they often disagree. I then offer observations on what may be learned from considering these two situations. In addition, I use portions of the chapter to review several practical issues researchers may want to consider when using the indices in empirical studies.
I start by reviewing results from a large, comprehensive database of index scores for White-Minority segregation comparisons. More specifically, the database contains segregation scores for White-Black, White-Latino, and White-Asian comparisons for 960 core-based statistical areas (CBSAs). CBSAs are constructed from counties. I applied the 2010 definitions to data from 1990, 2000, and 2010 to obtain index scores using constant area boundaries at these three points in time. The full data set includes index scores computed using data for three different spatial units – census blocks, census block groups, and census tracts. I focus primarily on the scores computed using block-level data because block groups and census tracts are too large to use for assessing segregation in smaller CBSAs.
Massey and Denton (1988:299) note that multiple options for areal units can be conceptually defensible. Citing prior research by Duncan and Duncan (1955b) and Taeuber and Taeuber (1965) as well as drawing on their own experiences, Massey and Denton also note that, while index scores consistently run higher when segregation is calculated using smaller areal units, block-based and tract-based index scores tended to correlate closely in the studies they considered. This suggests that findings regarding patterns in cross-city variation in segregation and trends over time in segregation will tend to be similar whether using scores computed from tracts, block groups, or blocks. However, an important qualification must be noted on this point: these findings are based on studies using a relatively small number (N ≈ 60) of large metropolitan areas, and they do not hold in broader data sets. Thus, I obtain findings similar to those reported in these earlier studies when I restrict the analysis here to include only the largest metropolitan areas. However, I find that the choice of spatial unit is much more consequential when I use the full data set, which includes hundreds of smaller metropolitan CBSAs and micropolitan CBSAs.
The reason the choice of spatial unit matters more in broader samples is simple: tracts are too large to reveal segregation patterns in smaller CBSAs. Indeed, the number of tracts in micropolitan CBSAs is often very small – sometimes falling to single digits. As a result, tracts are not viable units for assessing segregation in smaller communities; tracts consistently yield low scores even when closer inspection of residential patterns reveals that segregation is clear and pronounced. In contrast, census blocks can reliably detect segregation patterns in all CBSAs regardless of size. The difference between index scores based on tracts and index scores based on blocks is consistently much larger in small and medium-sized CBSAs. Accordingly, I use scores based on block data in analyses involving the full range of metropolitan and micropolitan CBSAs. When I use scores based on tract or block group data, I restrict analysis to include only large metropolitan CBSAs.
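The effect of areal unit size can be illustrated with a minimal Python sketch. It assumes the standard formula for the dissimilarity index (D) and uses a purely hypothetical set of four blocks grouped into two tracts; the function names and counts are illustrative and do not come from the analysis data set.

```python
def dissimilarity(white, minority):
    """Standard dissimilarity index D computed from area counts."""
    total_w = sum(white)
    total_m = sum(minority)
    return 0.5 * sum(abs(w / total_w - m / total_m)
                     for w, m in zip(white, minority))


def aggregate(counts, unit_ids):
    """Sum small-area counts into larger units identified by unit_ids."""
    totals = {}
    for count, unit in zip(counts, unit_ids):
        totals[unit] = totals.get(unit, 0) + count
    return [totals[unit] for unit in sorted(totals)]


# Hypothetical counts: each block is homogeneous, but every tract
# contains one all-White block and one all-minority block.
block_white = [100, 0, 100, 0]
block_minority = [0, 100, 0, 100]
tract_of_block = [1, 1, 2, 2]

d_block = dissimilarity(block_white, block_minority)
d_tract = dissimilarity(aggregate(block_white, tract_of_block),
                        aggregate(block_minority, tract_of_block))
print(d_block, d_tract)
```

In this deliberately extreme case, complete segregation at the block level disappears entirely once the blocks are pooled into tracts. Real CBSAs will show less dramatic differences, but the direction of the effect is the same.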
Index scores for my full CBSA analysis data set are based on block-level group population counts obtained from Summary File 1 in 2000 and 2010 and from the PL94 (voter redistricting) File for 1990. The data for Whites, Blacks, and Asians do not include Latinos and the data for Latinos include persons of all races. The analyses reported here are based on 4,319 White-Minority comparisons for CBSAs where both groups in the segregation comparison have overall population counts of at least 1,500. In all there are 1,718 White-Black comparisons, 1,754 White-Latino comparisons, and 847 White-Asian comparisons.
Table 6.1 provides descriptive statistics summarizing the distributions of index scores for G, D, R, H, and S obtained for each of the three White-Minority comparisons. Several patterns stand out in the results. One is that scores for G and D consistently run higher than scores for R, H, and S. This is evident when comparing values at the mean and also at the five quantile values examined. A related pattern is that scores for R, H, and S are relatively similar at the median and above (i.e., at P_{50}, P_{75}, and P_{90}), but scores for H and especially S are noticeably lower below the median (i.e., at P_{25} and especially at P_{10}). The analyses reported in the previous chapter provide a basis for understanding both of these patterns. S typically generates smaller group differences on contact with Whites because S registers the original untransformed pairwise contact scores (p). In contrast, G, D, R, and H subject the original or “raw” contact scores (p) to a nonlinear rescaling that consistently serves to exaggerate group differences in contact with Whites when the original “raw-score” contact differences are small (i.e., when average values of p are relatively high for both groups) and S is likely to take a low value. As noted in the previous chapter, the nonlinearity in the y-p scaling function is more dramatic for G and D. This causes their scores to run consistently somewhat higher than those of the other indices. One practical implication of these findings is that one should keep these inherent “scale” differences in index values in mind when making comparisons across different indices. For example, as a rule of thumb, I suggest the three- and four-category schemes for characterizing levels of segregation in broad categories in Fig. 6.1.
Regarding group comparisons, all five indices suggest that White-Black segregation is consistently higher than both White-Latino segregation and White-Asian segregation. Index scores are higher for the White-Black comparison at the mean and at every quantile listed in the table. Interestingly, the absolute and relative differences in how scores vary across group comparisons are smallest for G, which has the highest scores on average, and they are largest for S, which generally takes much lower scores. The magnitude of the differences across group comparisons for D, R, and H falls in between the larger differences seen for S and the smaller differences seen for G. When comparing median values, the maximum difference across group comparisons is 10.8 points for G, 12.4 points for D, 15.3 points for R, 19.0 points for H, and 33.2 points for S.^{Footnote 1}
The indices tell a less consistent story regarding the comparison of White-Latino segregation and White-Asian segregation. G indicates the two are roughly similar but with White-Asian segregation being slightly higher. D and R clearly indicate that White-Asian segregation is higher. H indicates the two comparisons are similar but with White-Latino segregation being slightly higher. In contrast, S indicates that White-Latino segregation is considerably higher than White-Asian segregation. At both the mean and the median, S for the White-Latino comparison is higher by at least 10 points than S for the White-Asian comparison, and the mean and median of S for the White-Latino comparison are at least double the corresponding values of S for the White-Asian comparison.
Close inspection of the underlying distributions of residential outcomes reveals general patterns similar to those seen in the example for Houston, Texas discussed earlier. Specifically, S is higher for the White-Black comparison because White-Black segregation routinely involves high levels of group separation and neighborhood polarization, and S is lower for the White-Asian comparison because White-Asian segregation almost never involves even moderate levels of group separation and neighborhood polarization. White-Latino segregation stands in between; it routinely involves moderate levels of group separation and polarization and occasionally involves high levels. The level of White pairwise contact with Whites across CBSAs is very high in all three of these White-Minority comparisons; for example, at the median it is 94.4 % for White-Black comparisons, 94.5 % for White-Latino comparisons, and 97.9 % for White-Asian comparisons. Thus, the difference in S across the different White-Minority comparisons arises primarily due to differences in the levels of pairwise contact Blacks, Latinos, and Asians have with Whites. For Blacks the median for pairwise contact with Whites across CBSAs is 46.6 %, for Latinos it is 68.0 %, and for Asians it is 84.4 %. The “flip” side of these values – that is, average pairwise same-group contact for the minority group – tells a similar story. It averages 15.6 % for Asians, 33.0 % for Latinos, and 53.4 % for Blacks.
Taken together, these results reveal that residential separation from Whites is low for Asians, moderate for Latinos, and high for Blacks. Recall that, for separation and polarization to be high, both groups in the comparison must reside in neighborhoods where their group predominates (i.e., both must have high levels of pairwise same-group contact). This is why, in sharp contrast to overall or pairwise isolation, separation and polarization are independent of city racial composition. Under even distribution, an imbalanced racial mix for the city will cause the larger group to experience a high level of same-group contact, but it also will cause the smaller group to experience a low level of same-group contact. So, regardless of city ethnic composition, segregating forces must be operating for both groups to have high levels of same-group contact. The results just reviewed indicate that Whites consistently have high levels of (pairwise) same-group contact. This is not simply due to city racial composition. If it were merely a function of racial composition, Blacks, Latinos, and Asians also would experience high levels of contact with Whites when same-group contact is high for Whites. But the reality is that same-group contact for both groups is above the level expected under even distribution.
Other indicators (not reported in the table) further confirm that White-Black segregation routinely involves substantial group residential separation and neighborhood polarization while White-Asian segregation almost never does and the pattern for White-Latino segregation falls in between. One such indicator is whether at least half of the population in both groups in the comparison lives in a neighborhood where their group constitutes at least 60 % of the population. This outcome can never occur under even distribution under any city racial composition. So when it is observed, it is a clear sign that segregation dynamics have produced group separation and neighborhood polarization. This result is seen in 44.5 % of White-Black comparisons, 11.8 % of White-Latino comparisons, and only 1.5 % of White-Asian comparisons. Thus, clear separation and polarization is rare for White-Asian segregation and uncommon for White-Latino segregation but common for White-Black segregation.
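The polarization indicator just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, and area shares are computed on the pairwise (two-group) population, which the text implies but does not state explicitly for this indicator.

```python
def both_groups_polarized(white, minority, threshold=0.60):
    """Return True if at least half of EACH group lives in an area where
    its own group is at least `threshold` of the pairwise population.

    Assumption: shares are based on the two groups only (w + m).
    """
    total_w = sum(white)
    total_m = sum(minority)
    white_in_own_areas = 0
    minority_in_own_areas = 0
    for w, m in zip(white, minority):
        pairwise_total = w + m
        if pairwise_total == 0:
            continue  # skip areas with neither group present
        if w / pairwise_total >= threshold:
            white_in_own_areas += w
        if m / pairwise_total >= threshold:
            minority_in_own_areas += m
    return (white_in_own_areas >= 0.5 * total_w
            and minority_in_own_areas >= 0.5 * total_m)


# Hypothetical area counts for a polarized city and an evenly distributed city.
polarized = both_groups_polarized([90, 80, 10, 5], [10, 20, 90, 95])
even = both_groups_polarized([80, 80, 80], [20, 20, 20])
print(polarized, even)
```

Under even distribution every area has the same group shares, so at most one group can reach the 60 % threshold anywhere; this is why the indicator cannot be satisfied without segregation.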
6.1 When Do Indices Agree? When Can They Disagree?
Table 6.2 presents simple and squared correlations among the scores of the indices for White-Minority segregation comparisons for CBSAs in 1990, 2000, and 2010 previously reported in Table 6.1. Squared correlations are reported above the diagonal and are in bold typeface. Simple linear correlations are reported below the diagonal. As noted earlier, the full analysis data set includes a total of 4,319 White-Minority segregation comparisons where the minority population was 1,500 or more. Due to this large sample size, all of the correlations reported in the table are statistically significant at conventional levels and so statistical significance is not specifically noted in the table. As a last preliminary comment, note that the table includes correlations for scores for the symmetric version of the Atkinson index (A_{[0.5]}) as an added point of comparison.^{Footnote 2}
The results in the table document several interesting findings. One is that scores for indices that are related to the segregation curve – namely, G, A, D, and R – correlate very closely.^{Footnote 3} The associations among G, A, and D are particularly high. The lowest simple linear correlation among them is 0.984 and the lowest squared correlation is 0.967. Correlations of R with A and D also are very high. The correlation of R with G appears to be lower with a squared correlation of 0.936 but closer inspection reveals that G and R have a very close relationship that is mildly nonlinear. This is not surprising as R has an exact nonlinear relationship with A, specifically \( \mathrm{A}=2\mathrm{R}-{\mathrm{R}}^2 \), which in turn has a close linear relationship with G.
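The exact relationship between R and A can be checked numerically. The sketch below assumes the standard computing formulas for the Hutchens square root index and the symmetric Atkinson index, both built from the sum of \( \sqrt{(w_i/W)(m_i/M)} \) over areas; the function names and the area counts are hypothetical.

```python
import math


def sqrt_overlap(white, minority):
    """Sum over areas of sqrt((w_i / W) * (m_i / M))."""
    total_w = sum(white)
    total_m = sum(minority)
    return sum(math.sqrt((w / total_w) * (m / total_m))
               for w, m in zip(white, minority))


def hutchens_r(white, minority):
    """Hutchens square root index: R = 1 - overlap."""
    return 1.0 - sqrt_overlap(white, minority)


def atkinson_half(white, minority):
    """Symmetric Atkinson index A(0.5): A = 1 - overlap ** 2."""
    return 1.0 - sqrt_overlap(white, minority) ** 2


# Hypothetical area counts.
white = [90, 80, 10, 5]
minority = [10, 20, 90, 95]

r = hutchens_r(white, minority)
a = atkinson_half(white, minority)
print(r, a, 2 * r - r ** 2)
```

Because A = 1 - (1 - R)^2 under these formulas, expanding the square gives A = 2R - R^2, so A always equals or exceeds R on the 0-1 scale.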
Figure 6.2 provides graphical depictions of the associations among indices reported in Table 6.2. The scatterplots make it clear that relationships among these four indices – G, A, D, and R – are exceedingly close, even closer than the high correlations suggest if one takes account of the mild nonlinearities in several of the relationships. Indeed, in any pair combination, the multiple squared correlation for predicting the values of any one index based on the value of one of the other indices plus either its square or its square root (depending on the index combination) exceeds 0.969 in all cases. These close associations reflect the fact that G, A, D, and R all assess segregation outcomes consistent with the principle of segregation curve dominance. As noted earlier, this means that all of these indices are geared to registering group differences in rank order standing on pairwise contact with Whites (p).
The results reported in Table 6.2 also document a second important finding: the correlations involving H and S are lower than the correlations observed among G, A, D, and R. Unlike G, A, D, and R, H and S are not related to the segregation curve. It is perhaps not surprising then that H and S are more strongly associated with each other (squared correlation of 0.818) than with the other indices. The H-S scatterplot in Fig. 6.2 documents that the correspondence between H and S is close at high values but is weaker when one of the indices takes a lower value. This accounts for why the correlation between H and S is not as high as those seen among G, D, A, and R. Generally, but not always, scores for H run higher than scores for S. This tendency is more pronounced when S is in the low-to-moderate range (e.g., below 40). The squared correlations of H with G, A, D, and R are not as high as the squared correlation of H with S, but they are moderately strong and run from a low of 0.628 to a high of 0.798. The squared correlations of S with G, A, D, and R are much lower across the board. They run from a low of 0.217 to a high of only 0.303.
Figure 6.3 presents selected scatterplots from Fig. 6.2 to highlight particular results. It shows that the correspondence of H with G, D, and R is relatively close at high and low values of H, but it is looser in the mid-ranges of H. In the case of the relationship of R with H, values of R rarely fall more than a few points below values of H; less than ten percent of cases are lower by more than five points and none are lower by 10 points. However, in the low-to-middle ranges of H (i.e., 25–50), the values of R often are substantially higher than the values of H, with R exceeding H by more than 10 points in over a quarter of cases. In the case of the relationship of D with H, values of D always are well above values of H and again it is evident that the D-H discrepancies are largest in the low-to-middle range of H (i.e., 25–50). A similar pattern is seen in the relationship of G with H. Values of G always are well above values of H and the G-H discrepancies tend to be largest in the lower middle range of H (i.e., 20–40).
S has a close correspondence with G, D, and R only when values of S are high-to-very high. When values of S are not high, the relationships between S and these three indices are weak and inconsistent. The reason for this is that values of G, D, and R can and frequently do vary over wide ranges when S is at low-to-moderate values. To be sure, G, D, and R can and sometimes do agree with S and take low-to-moderate values when S takes low-to-moderate values. But G, D, and R also can and often do take high values when the value of S is low.
It is instructive to consider the comparison of S with D. Scores of D are never lower than scores of S, but the amount by which D exceeds S can and does vary dramatically across comparisons. For example, when S is in the range of 15–25, the interdecile range for the difference between D and S is 27.5 points with more than 10 % of scores for D falling below 47 and more than 10 % exceeding 73. Similarly, when S is in the range of 35–45, the interdecile range for the D-S difference is 22.6 points with more than 10 % of scores for D below 56 and more than 10 % above 78. The patterns for S compared with G are similar. Scores for G are never below scores for D and thus run considerably higher than scores for S. But the amount by which G exceeds S varies greatly. For example, when S is in the range of 15–25, the interdecile range for the difference between G and S is 25.3 points with more than 10 % of scores for G falling below 64 and more than 10 % falling above 88. Similarly, when S is in the range of 35–45, the interdecile range for the G-S difference is 19.0 points with more than 10 % of scores of G falling below 73 and more than 10 % exceeding 91.
The pattern for S compared with R is similar to those just described for D and G but with one difference: scores for R occasionally are lower than scores for S. This is not typical and, when it occurs, R is lower than S only by a small amount. The more important finding is that the values of R, like values of D and G, can vary greatly at a given level of S. For example, when S is in the range of 15–25, the interdecile range for the R-S difference is 31.3 points with more than 10 % of scores for R below 23 and more than 10 % above 53. The same variability in scores for R is seen when S is in the range of 35–45. In this situation, the interdecile range for the R-S difference is 29.9 points and it is not uncommon to observe scores of R ranging from at or below 29 to at or above 59.
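For readers who wish to compute comparable summaries on their own data, the following minimal Python sketch shows one way to obtain the interdecile range (P_{90} minus P_{10}) of the gap between two indices within a band of S. The function name is hypothetical, the quantile interpolation rule is an assumption (the text does not specify one), and the score pairs are synthetic values chosen only to make the arithmetic easy to follow; they are not results from this study.

```python
import statistics


def interdecile_range_of_gap(pairs, s_low, s_high):
    """Interdecile range (P90 - P10) of (other index - S) for cases
    whose S score falls in the band [s_low, s_high].

    `pairs` is a list of (s_score, other_score) tuples on a 0-100 scale.
    """
    gaps = [other - s for s, other in pairs if s_low <= s <= s_high]
    # quantiles(..., n=10) returns the nine cut points P10, P20, ..., P90.
    deciles = statistics.quantiles(gaps, n=10, method="inclusive")
    return deciles[8] - deciles[0]


# Synthetic illustration: eleven cases with S = 20 and D running from 20 to 30,
# plus one case outside the 15-25 band that should be ignored.
synthetic_pairs = [(20, 20 + k) for k in range(11)] + [(50, 90)]
idr = interdecile_range_of_gap(synthetic_pairs, 15, 25)
print(idr)
```

A wide interdecile range within a narrow band of S indicates that the second index is registering something S does not, which is the pattern described above for D, G, and R.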
Summing up, indices that are closely associated with the segregation curve – namely, G, A, D, and R – correlate at high levels with each other, but less so with H and much less so with S, two measures not linked to the segregation curve. These findings depart dramatically from previous findings of high correlations among all indices of uneven distribution. For example, Duncan and Duncan’s (1955a) landmark methodological study reported that D, G, and S were correlated at high levels and suggested the correlations were so high that there was little practical benefit to gain from considering measures beyond D which had advantages in ease of calculation and interpretation. More recently, the valuable and influential methodological study by Massey and Denton (1988) similarly reported very high levels of correlation among G, A_{(0.50)}, D, H, and S with the lowest correlation among the indices being 0.89 (for the correlation between G and S).
Why are the correlations reported in these previous studies so high when the correlations of G, A, D, and R with H and S reported here are moderate-to-weak? The answer traces to basic differences in research design across the studies. Specifically, the difference in findings traces to differences in the samples of cities considered and to differences in the spatial units used when computing segregation scores. Regarding the differences in the samples of cities, the studies by Duncan and Duncan (1955a) and Massey and Denton (1988) both were based on 60 cities consisting primarily of the largest metropolitan areas in the country. Duncan and Duncan examined cities for which tract data had been tabulated in the 1940 census and the sample was primarily, but not exclusively, composed of the largest metropolitan areas in the country. Massey and Denton developed their analysis sample by first taking the 50 largest metropolitan areas and then including an additional 10 metropolitan areas with large Latino populations. Regarding spatial units, both studies used tract-level data when computing segregation scores. While this is a common practice, it is not well suited for assessing segregation for smaller groups or for assessing segregation in smaller communities. These two aspects of the samples used in the landmark studies by Duncan and Duncan (1955a) and Massey and Denton (1988) tend to minimize differences between measures that emerge in the much broader sample used here. To be clear, the results reported in these earlier studies are not incorrect. But the results reported in these studies do not generalize beyond large metropolitan areas.
I provide evidence to support this conclusion with several analyses. To begin, I replicated the analysis reported in Table 6.2 using a subset sample of 58 CBSAs that corresponds as closely as possible to the cities used in Massey and Denton’s (1988) study.^{Footnote 4} I found that the correlations among indices obtained using this subsample were consistently higher, often by substantial amounts, and were never significantly lower in comparison to correlations using the broader sample. For example, the correlation of D and S using scores computed from block data was 0.5205 in the broader sample and 0.6433 in the Massey and Denton subsample. I then examined correlations using index scores computed from tract-level data instead of block-level data. The correlations among indices increased by substantial amounts and closely matched the correlations reported in Massey and Denton (1988). For example, the correlation of scores for D and S based on tract-level data in the subsample of cases corresponding to the Massey and Denton sample was 0.9248 and replicates the value of 0.92 reported in Massey and Denton.
These analyses establish that the associations among segregation indices are markedly lower when study designs draw on a broader sample of cities and assess segregation using block data instead of tract data. For example, when computing scores using tract data the squared correlation between D and S is 0.8552 (\( \mathrm{r}=0.9248 \)) for the Massey and Denton subsample of CBSAs. It drops to 0.5895 (\( \mathrm{r}=0.7678 \)) when using the broader sample, still with tract data. Both values are much higher than the squared correlation of 0.2709 (\( \mathrm{r}=0.5205 \)) observed for the broader sample of CBSAs using scores computed from block-level data.
Table 6.3 explores the issue in more detail by reporting the correlation and squared correlation of D and S using subsets of segregation comparisons grouped by the size of the populations in the segregation comparisons (a close correlate of city population size). Correlations are reported separately for index scores based on tract, block group, and block data. Several patterns are clear.

Correlations are consistently stronger for scores computed using tract data and weaker for scores computed using block data.

Correlations are stronger for comparisons for CBSAs with populations of 500,000 or more and even stronger for CBSAs with populations of 1,000,000 or more. This pattern holds generally for scores computed using tract, block group, and block data.
These results support the general conclusion that correlations between indices are consistently weaker when using broader, more heterogeneous samples of cities and when using index scores computed for blocks instead of tracts. As a final check, I replicated these results using alternative versions of index scores that corrected for index bias (discussed in Chaps. 14 and 15), a potential concern when using index scores computed from block-level data. The relevant results were fundamentally similar and strengthen the conclusion I offer here.
I now answer the questions posed in the heading for this section of the chapter, “When do different indices agree?” and “When can they disagree?” The previous discussion provides a preliminary answer. Indices are more likely to agree in studies that focus on large metropolitan areas and compute index scores using tract-level data. Conversely, indices are more likely to disagree in studies that use broader samples and/or compute index scores with block-level data. But why is this so? Two findings provide clues. One is that cities in the Massey and Denton sample have higher levels of relative minority presence and the other is that correlations among indices are consistently higher when the relative size of the minority population is larger. Among the CBSA segregation comparisons that meet the criterion of having at least 1,500 in population for the minority group, relative minority presence is consistently higher in the subset of CBSAs in the Massey and Denton subsample and this is true for all three White-Minority comparisons considered.
This is consequential because correlations among indices are higher when pairwise minority group proportions are moderate-to-high.^{Footnote 5} Evidence for this is presented in Fig. 6.4 and in Table 6.4. Table 6.4 is organized in three panels. The top panel gives correlations among index scores computed from block-level data for the subset of White-Minority segregation comparisons where the two groups in the comparison are similar in relative size; specifically, these are the subset of 510 segregation comparisons where the pairwise proportion for the smaller group in the comparison is in the range of 0.30–0.50. The key finding documented here is simple and compelling: the correlations among all of the indices are extremely high. The weakest relationship observed is between G and R with a simple linear correlation of 0.9697 and a squared correlation of 0.9403. Figure 6.4 presents the scatterplots for these same relationships. It documents that the relationships are even stronger than the simple linear correlations suggest as the lower correlations involve relationships that are very close but mildly nonlinear. When the nonlinearities are taken into account, all relationships are near exact. For example, the G-R combination has the lowest squared linear correlation (0.9403) but regressing G on R and the square root of R yields a multiple R-square statistic of 0.9859.
The middle panel of Table 6.4 presents results for White-Minority segregation comparisons where the pairwise proportion for the smaller group in the comparison is in the range of 0.10–0.30. The key finding documented here is that, while the correlations are generally lower, they all remain very high. Thus, the lowest squared correlation is 0.8660 for the D-S relationship and nine of the fifteen correlations exceed 0.95.
The bottom panel of Table 6.4 reports correlations among index scores for the subset of White-Minority segregation comparisons where the minority group is small in relative size. Specifically, it reports correlations for cases where the pairwise proportion for the smaller group in the comparison is under 0.10. Two findings warrant mention. First, the correlations among G, D, A, and R – the four measures related to the segregation curve – remain high; the lowest squared correlation is 0.9432 for the G-R combination. Second, and more importantly, the squared correlations involving H and S – the two measures not related to the segregation curve – drop off considerably, especially correlations involving S. The squared correlation of 0.8370 between H and S is fairly high. But the squared correlations of H with G, D, A, and R fall in a substantially lower range of 0.7056–0.7543 and the squared correlations of S with these measures fall in a much lower range of 0.3132–0.3756.
I highlight the most important points of the above discussion as follows.

Scores for all popular segregation indices consistently agree and correlate closely with one another when the two groups in the comparison are similar in size.

Scores for popular segregation indices that are closely related to the segregation curve – G, D, A, and R – consistently agree and correlate closely with one another regardless of relative group size.

Scores for popular segregation indices not related to the segregation curve – H and S – correlate closely with each other even when relative group size is imbalanced (i.e., when the pairwise proportion for the smaller group is under 0.10).

Scores for H and S correlate closely with scores for G, D, A, and R when relative group size is relatively balanced (i.e., when the pairwise proportion for the smaller group is \( \ge 0.10 \)). But the correlations fall off substantially, especially those involving S, when the pairwise proportion for the minority group is low (i.e., below 0.10).
6.2 Why Does Relative Group Size Matter?
The difference of means framework provides a basis for gaining insight into these findings. In this framework segregation index scores are obtained as differences of group means on segregation-relevant residential outcomes (y) that are scored from area proportion White (p) via index-specific scaling functions \( \mathrm{y}=\mathrm{f}\left(\mathrm{p}\right) \). It is obvious that scores for different indices will correlate more closely when the index-specific scaling functions \( \mathrm{y}=\mathrm{f}\left(\mathrm{p}\right) \) for the indices involved are similar. Conversely, correlations among scores will be lower when the scaling functions involved differ. The graphs in Fig. 5.1 introduced earlier documented how the scaling functions vary across indices. In the case of S, the scaling function is linear. The scaling functions for the other indices are nonlinear with nonlinearity being more pronounced for some indices than for others. Specifically, the graphs in Fig. 5.1 documented that the nonlinearity is least pronounced for H and progressively more pronounced for R, D, and G. This helps explain why scores for G, D, and R consistently correlate closely. It also helps explain why scores for S correlate more closely with scores for H than with scores for G, D, and R.
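The difference of means logic can be made concrete with a short Python sketch. It scores each area on y = f(p) and takes the White mean minus the minority mean. For S the scoring is y = p, as stated above. For D the sketch assumes the step scoring y = 1 when p is at or above the city-wide pairwise proportion White (P) and y = 0 otherwise; this specific form is an assumption of the illustration, offered because it reproduces the conventional formula for D. Function names and counts are hypothetical.

```python
def group_mean(y, counts):
    """Population-weighted mean of area scores y for one group."""
    return sum(yi * c for yi, c in zip(y, counts)) / sum(counts)


def difference_of_means(white, minority, scale):
    """Index score as White mean minus minority mean of y = scale(p, P)."""
    city_p = sum(white) / (sum(white) + sum(minority))
    # Assumes every area has at least one resident from the two groups.
    p = [w / (w + m) for w, m in zip(white, minority)]
    y = [scale(pi, city_p) for pi in p]
    return group_mean(y, white) - group_mean(y, minority)


def scale_s(p, city_p):
    return p  # linear scoring: y = p


def scale_d(p, city_p):
    return 1.0 if p >= city_p else 0.0  # assumed step scoring for D


def dissimilarity_classic(white, minority):
    """Conventional computing formula for D, used here as a cross-check."""
    total_w, total_m = sum(white), sum(minority)
    return 0.5 * sum(abs(w / total_w - m / total_m)
                     for w, m in zip(white, minority))


# Hypothetical area counts.
white = [90, 80, 10, 5]
minority = [10, 20, 90, 95]

s_score = difference_of_means(white, minority, scale_s)
d_score = difference_of_means(white, minority, scale_d)
print(s_score, d_score, dissimilarity_classic(white, minority))
```

Swapping in a different scaling function changes the index without changing the rest of the calculation, which is why similarity between scaling functions translates directly into similarity between index scores.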
The scaling function for S is invariant across variation in relative group size; y is always a simple, one-to-one linear function of p. Significantly, the scaling functions for all of the other indices vary systematically with relative group size. Specifically, the “amplitude” of the nonlinearity in the scoring function is most pronounced when relative group size is highly imbalanced and it is least pronounced when relative group size is equal (i.e., 50/50). Figures 6.5 and 6.6 document this for the Theil index (H) and the Hutchens square root index (R) by plotting the scaling function \( \mathrm{y}=\mathrm{f}\left(\mathrm{p}\right) \) with values of relative group size set variously at 0.01, 0.05, 0.20, 0.50, 0.80, 0.95, and 0.99. The variation in nonlinearity is particularly easy to summarize for these two functions because they are smooth and continuous. Nonlinearities in the scaling functions for G and D behave in a similar manner, but are more complicated visually because the functions involve monotonic but irregular step functions.
Figures 6.5 and 6.6 show that the nonlinear function’s departure from linearity is mildest when groups in the segregation comparison are similar in size and it grows increasingly more pronounced as groups become more unequal in size; the same holds for G and D. Since the scaling function for S is always linear, this explains why scores for S correlate more closely with scores for the other indices when groups are equal in size and less closely, sometimes markedly so, when the two groups in the comparison are unequal in size. In general, the difference between any two index-specific scaling functions is least pronounced when groups are equal in size and it grows larger as groups become more unequal in size. This accounts for why index scores generally correlate more closely when groups are equal in size and correlate less closely when groups are unequal in size.
The potential discrepancies between scores for different indices follow a very clear pattern. At one end of the spectrum are indices like G and D, which register residential outcome scores (y) based on scaling functions that involve more pronounced nonlinearities (as seen in Fig. 5.1). At the other end of the spectrum are indices like H and S, which register residential outcome scores (y) based on scaling functions that involve only mild nonlinearity (H) or simple linear scaling (S). Under all conditions scores for G and D consistently run higher than scores for H and S. But there are big differences in how this plays out depending on the group size comparison. When group size is relatively balanced (e.g., pairwise proportion for the smaller group is 0.15 or higher), scores for G and D will run higher than scores for H and S and will fall in a narrow range of variation at any particular level of H or S. In contrast, when group size is imbalanced (e.g., pairwise proportion for the smaller group is under 0.10), scores for G and D will run higher than scores for H and S but they may fall in a sizeable range of variation at any particular level of H or S.
This is documented in Fig. 6.7, which plots values of D against values of H and S for three sets of cases (see Note 6). The first panel of the figure depicts the D-H and D-S relationships for all CBSAs. The second panel depicts the same relationships for the subset of CBSAs where the pairwise proportion for the smaller group is 0.15 or higher. The third panel depicts the relationships for the subset of CBSAs where the pairwise proportion for the smaller group is below 0.10. Note that I exclude CBSAs with the very lowest values (i.e., values below 0.02) on pairwise group proportion from the third panel so it will be clear that the pattern observed in this panel is not determined by extreme cases. The scatterplots in the second panel document that, when the groups in the segregation comparison are relatively similar in size, D varies in a narrow range at any particular level of H or S. The scatterplots in the third panel document that, when the groups in the segregation comparison are somewhat unequal in size, D varies in a much larger range at any specific level of H and S. On the low end, the variation in D extends down to the levels seen in the second panel of the figure. On the high end, the variation in D is considerable and often ranges 25–35 points above scores on the low end. The first panel combines the CBSAs in the second and third panels and also includes CBSAs where the smaller group meets the group size requirement of 1,500 in population but has a pairwise proportion of less than 0.02. This amplifies the pattern seen in the third panel by extending the range of variation on both the high and low ends at any given level of H and S.
Figure 6.7 documents that popular indices of uneven distribution can and often do yield highly discrepant results. When this happens, a specific substantive interpretation applies. The pattern of segregation in these situations involves extensive group differences in displacement from parity but does not involve high levels of group residential separation and neighborhood polarization. The combination comes about because indices such as G, D, and R can respond with high scores when displacement from parity involves group differences in pairwise contact that are quantitatively small. Indices that register group separation and neighborhood polarization take low values in these situations because the two groups are living together, not apart, with most minority individuals living with Whites and few residing in predominantly minority residential areas (e.g., ghettos and barrios). It is important to be aware of this possibility for many reasons, not least because it affects the potential policy implications of eliminating uneven distribution. When group separation and area polarization are absent, majority-minority differences in residential outcomes will change little when uneven distribution is eliminated. When separation and polarization are present, the residential outcomes experienced by minority individuals can potentially change dramatically when uneven distribution is eliminated. I believe this is an important aspect of the correspondence, or lack of it, between different indices. Accordingly, I review the issue in more detail in Chaps. 7 and 8.
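A stylized pair of hypothetical cities illustrates how D and S can diverge in this way. The sketch below computes D from its conventional formula and S as the White-minus-minority difference in mean area proportion White. The two cities, their area counts, and the function name are invented for illustration and are not drawn from the CBSA data set.

```python
import numpy as np

def d_and_s(white, minority):
    """Return (D, S) for one city from per-area White and minority counts."""
    white = np.asarray(white, dtype=float)
    minority = np.asarray(minority, dtype=float)
    W, M = white.sum(), minority.sum()
    p = white / (white + minority)         # area proportion White (pairwise)
    D = 0.5 * np.sum(np.abs(white / W - minority / M))
    S = np.sum(white * p) / W - np.sum(minority * p) / M
    return D, S

# Balanced city: groups are 50/50; half the areas are 75% White, half 25% White.
balanced = d_and_s([750] * 50 + [250] * 50, [250] * 50 + [750] * 50)

# Imbalanced city: minority is 5% of the pairwise population; all minority
# residents live in areas that are 90% White, and the other areas are all White.
imbalanced = d_and_s([900] * 50 + [1000] * 50, [100] * 50 + [0] * 50)

print(f"balanced:   D = {balanced[0]:.2f}, S = {balanced[1]:.2f}")      # D = 0.50, S = 0.25
print(f"imbalanced: D = {imbalanced[0]:.2f}, S = {imbalanced[1]:.2f}")  # D = 0.53, S = 0.05
```

The two cities have nearly the same D, but in the imbalanced city the average minority resident lives in an area that is 90% White, so S is close to zero. Displacement from parity is extensive, yet the groups are not residentially separated.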
Notes
1. Comparisons at the means of the distributions yield similar patterns.
2. This is primarily to document that A is an exact function of H, which is less well known to sociologists.
3. Specifically, G, A, D, and R satisfy the principle of "segregation curve dominance," which means that when comparing two cases the index will indicate that segregation is lower for a case if its segregation curve is somewhere above and nowhere below the segregation curve for the other case.
4. Two cases in the Massey and Denton sample are not included in the subset of cases examined here. In the 2010 CBSA definitions used here, their areas of Paterson-Clifton and Jersey City are assigned to the New York-White Plains-Wayne CBSA Division.
5. More carefully, correlations are higher when the two groups are similar in size; that is, when P and Q are equal. The distinction is relevant in segregation comparisons where Whites are the smaller group; for example, White-Latino segregation in San Antonio and El Paso.
6. Results for G and R are not shown, but are similar. I highlight results for D because it is used more often in empirical studies.
References
Duncan, O. D., & Duncan, B. (1955a). A methodological analysis of segregation indices. American Sociological Review, 20, 210–217.
Duncan, O. D., & Duncan, B. (1955b). Residential distribution and occupational stratification. American Journal of Sociology, 60, 493–503.
Massey, D. S., & Denton, N. A. (1988). The dimensions of residential segregation. Social Forces, 67, 281–309.
Taeuber, K., & Taeuber, A. (1965). Negroes in cities: Racial segregation and neighborhood change. Chicago: Aldine Publishing Company.
Rights and permissions
This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2017 The Author(s)
About this chapter
Cite this chapter
Fossett, M. (2017). Empirical Relationships Among Indices. In: New Methods for Measuring and Analyzing Segregation. The Springer Series on Demographic Methods and Population Analysis, vol 42. Springer, Cham. https://doi.org/10.1007/978-3-319-41304-4_6
Publisher Name: Springer, Cham
Print ISBN: 9783319413020
Online ISBN: 9783319413044
eBook Packages: Social Sciences, Social Sciences (R0)