Introduction

The expression “birds of a feather flock together” suggests that people prefer being around similar others. One common way to measure this preference for similarity in an environment, or “aggregation,” is by examining people’s seating choice. Sitting next to a person expresses a liking towards that person and, therefore, choosing to sit next to some people but not others can reveal what traits we value (Holland et al., 2004). Indeed, research finds that people prefer to sit next to other people who are similar on a variety of traits including race, sex, and physical appearance (Batson, Flink, Schoenrade, Fultz, & Pych, 1986; Sriram, 2002). This preference to sit next to similar others leads to less contact between groups (e.g., races, sexes), which can promote further separation and prejudice (Campbell, Kruskal, & Wallace, 1966). Although studying aggregation is important, the current method for studying aggregation is difficult to implement, unable to accommodate many situations, and provides limited statistical information. This paper presents a new method for studying aggregation that addresses these limitations and allows researchers more opportunities to understand intergroup biases.

Seat choice and preference for similarity

Even without explicitly stating their attitudes, people often reveal their biases towards others with subtle non-verbal cues, such as how near or far they are sitting from them. A person, or group of people, may not admit on self-reports to liking people who are similar to them on some dimension (e.g., race), but their non-verbal behavior, such as whether they choose to sit next to them, could indicate a preference (Snyder, Kleck, Strenta, & Mentzer, 1979). A variety of past research has used seating patterns to understand the dynamics between groups and the diversity of a setting. Campbell, Kruskal, & Wallace (1966) found that students tend to sit next to students of the same race and sex. Further, when examining several different schools, how much a school’s White students preferred to sit next to others of the same race, on average, predicted students’ average level of positivity towards Black students within that school. This association between group-level seating preferences and group-level attitudes was found for both direct attitude survey measures and indirect measures of attitude such as projection tests and electrodermal response. More recently, seating aggregation has helped examine racial relations in areas with strong racial divisions such as South Africa (Koen & Durrheim, 2010) and Singapore (Sriram, 2002). By unobtrusively examining seating behavior, these studies showed how a population’s underlying intergroup biases manifest in daily life. In addition to revealing preferences for well known individual differences, studying seating aggregation also allows researchers to understand very subtle preferences people hold that are less immediately obvious, such as liking those whom they physically resemble (Mackinnon, Jordan, & Wilson, 2011). People with glasses are more likely to sit next to people who wear glasses and vice versa. 
Therefore, measuring aggregation is important because it allows researchers to understand how integrated a setting currently is, and study sensitive or subtle attitudes towards others that are often difficult to assess merely through self-reports.

Current measure of seating aggregation

The most widely used measure of seating preference is the “aggregation index” (Campbell, Kruskal, & Wallace, 1966). This procedure examines whether the number of dissimilar group pairings observed (e.g., a White person sitting next to a Black person) in a given location (e.g., a classroom) differs from what the number of dissimilar group pairings would be if people chose their seats randomly. Thus, the aggregation index is similar to a z-score, where an observed value is compared to some null criterion and this difference is divided by the variability. Large differences relative to the variability suggest that seating choices are unlikely to be random and instead reflect some systematic preference. It is important to note that, in the equation, a group can be defined in any way by the researcher as long as it is dichotomous (e.g., Black/White, male/female, Northerner/Southerner). Thus, a researcher can apply the aggregation index to the same setting multiple times by examining the seating patterns of different types of groups.

Aggregation index

The overall index value, I, is the difference between the observed and the expected number of adjacent seat pairings between members of two different groups, all divided by the standard deviation of the expected number of dissimilar adjacencies. Negative values of I indicate that people prefer sitting by similar others (i.e., aggregation), while positive values of I indicate that people favor sitting next to dissimilar others (i.e., segregation).

The following formula expresses the aggregation index:

$$ I=\frac{A-EA}{\sigma {}_A} $$
(1)

In Eq. 1, variable A represents the observed number of pairs of dissimilar group members who are adjacent. This variable is determined by counting how many pairs of row-wise adjacent seats contain members of different groups (e.g., how many times a White student is sitting next to a Black student).

Variable EA represents the expected number of dissimilar adjacencies the room would have if people chose their seats randomly. EA is calculated using a formula that takes into account the number of members from each group, number of contiguous rows of people (i.e., clusters of people who are alongside each other), and the total number of people in the room. The formula for expected adjacencies is

$$ EA=2\frac{M\left(N-M\right)}{N\left(N-1\right)}\left(N-K\right) $$
(2)

where N is the total number of people in a room, M is the number of people in the reference group (e.g., Black students), and K is the number of groups of row-wise contiguous people, including isolates. Another way to define K is that it is the number of uninterrupted chains of adjacent people in the room, as well as people sitting by themselves.
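As a small worked illustration of Eq. 2 (the numbers are hypothetical and are not taken from Campbell et al.), consider a single uninterrupted row of four people, two from each group, so that N = 4, M = 2, and K = 1:

$$ EA=2\frac{2\left(4-2\right)}{4\left(4-1\right)}\left(4-1\right)=2\cdot \frac{4}{12}\cdot 3=2 $$

This agrees with direct enumeration: the six possible orderings of two members of each group in the row contain 1, 3, 2, 2, 3, and 1 dissimilar adjacencies, which average to 2.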

Variable σA represents the standard deviation of the number of dissimilar adjacencies under randomness. It is derived from the assumption that seats were randomly chosen with regard to group status, but that the pattern of occupied seats was fixed.

The following is the formula for the variance of adjacencies under randomness; the standard deviation σA used in Eq. 1 is its square root:

$$ \begin{array}{l}{\sigma}_A^2=2\frac{M\left(N-M\right)}{N\left(N-1\right)}\left(2N-3K+K_1\right)+4\frac{M\left(M-1\right)\left(N-M\right)\left(N-M-1\right)}{N\left(N-1\right)\left(N-2\right)\left(N-3\right)}\\ {}\left[\left(N-K\right)\left(N-K-1\right)-2\left(N-2K+K_1\right)\right]-4\frac{M^2{\left(N-M\right)}^2}{N^2{\left(N-1\right)}^2}{\left(N-K\right)}^2\end{array} $$
(3)

The variables in the standard deviation formula are the same as those in the expected value formula, with the addition of K1, which is the number of people with no-one next to them (i.e., isolates).
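The closed-form quantities can be sketched in Python as follows. This is an illustrative reconstruction, not code from Campbell et al. or from the program described later. The function names and the convention that 0 marks an empty seat are assumptions made for the example. The sketch also assumes that the final term of the variance is the square of EA (i.e., that (N − K) is squared), because a variance is the expected square minus the squared expectation, and it takes the square root to obtain σA. The formulas require at least four people in the room.

```python
from math import sqrt

def run_lengths(chart):
    """Lengths of row-wise runs of occupied seats (0 = empty seat or aisle)."""
    lengths = []
    for row in chart:
        run = 0
        for cell in row:
            if cell != 0:
                run += 1
            elif run:
                lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths

def closed_form_index(chart, reference_group=1):
    """Return (A, EA, sigma_A, I) for side-by-side adjacency; needs N >= 4."""
    lengths = run_lengths(chart)
    n = sum(lengths)                              # N: total people
    k = len(lengths)                              # K: contiguous runs, incl. isolates
    k1 = sum(1 for length in lengths if length == 1)  # K1: isolates
    m = sum(cell == reference_group for row in chart for cell in row)  # M

    # A: observed dissimilar side-by-side pairs
    a = sum(1 for row in chart
            for left, right in zip(row, row[1:])
            if left != 0 and right != 0 and left != right)

    p = 2 * m * (n - m) / (n * (n - 1))           # P(an adjacent pair is dissimilar)
    ea = p * (n - k)                              # Eq. 2
    q = (4 * m * (m - 1) * (n - m) * (n - m - 1)
         / (n * (n - 1) * (n - 2) * (n - 3)))
    variance = (p * (2 * n - 3 * k + k1)
                + q * ((n - k) * (n - k - 1) - 2 * (n - 2 * k + k1))
                - ea ** 2)                        # assumed squared last term
    sd = sqrt(variance)                           # zero if every seating gives the same A
    return a, ea, sd, (a - ea) / sd               # Eq. 1
```

For a single row of four people with two from each group, the sketch returns EA = 2 and a variance of 2/3, which matches a direct enumeration of the six possible orderings.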

Limitations of current seating aggregation measure

The aggregation index proposed by Campbell et al. (1966) provided a useful tool for researchers studying intergroup relations, but many aspects of the method are problematic. One often mentioned issue with the Campbell et al. method is that the equations are difficult to implement. Many authors have commented on the complexity of the equations given, specifically the calculation of the variance term (McCauley, Plummer, Moskalenko, & Mordkoff, 2001; Schofield & Sagar, 1977). The equations are so complex that Campbell et al. issued a correction to the variance equation due to a typographical error they made in the original paper (American Sociological Association, 1967). This correction may have added to the confusion of future users of the technique, who, if not aware of the correction, may mistakenly use the wrong formula. Even with the correct formulas, the technique still requires researchers to visually inspect the seating chart and enter the relevant variables into the equations. To calculate the three components of the aggregation index, a researcher must manually count the total number of people, the individual group sizes, the number of isolates, and the number of row-wise contiguous groups. This visual inspection is subject to human error and is also time-consuming for larger seating charts. As the size of the room or number of seating charts increases, the potential to miscount one of the needed components also increases. Therefore, a major limitation of the current technique for calculating aggregation is the difficulty of implementing it.

Another weakness of the Campbell et al. formulas is that they are only able to address a narrow range of seating situations. Specifically, their equations restrict the definition of adjacency to only one direction. That is, the formulas given for the method only examine adjacencies along one spatial axis at a time. This limitation is due to the formulas requiring the number of groups of row-wise contiguous people (i.e., K), which cannot be defined when a researcher is interested in both row (i.e., side-by-side) and column (i.e., front-and-back) adjacencies. However, researchers may conceptualize being next to a person as not just sitting side-by-side, but also as sitting across from a person or occupying any space that touches the person. Looking at only one dimension does not capture their intended construct. In the past, researchers have needed to perform separate tests for each axis (Ramiah, Schmid, Hewstone, & Floe, 2014; Schofield & Sagar, 1977). Performing multiple comparisons, though, comes at the expense of increasing the type I error rate. For every dimension a researcher analyzes, he/she increases the probability that sampling variation will have produced an extreme result and he/she may erroneously conclude that the finding is systematic and able to be replicated (Simmons, Nelson, & Simonsohn, 2011). Also, conducting separate tests along multiple axes disregards information from the other axes by treating each dimension in isolation. Researchers may want to combine the adjacencies from all of the axes to have a more powerful test. For these situations where a researcher prefers to conduct a single test of aggregation along all axes, the Campbell et al. method offers no clear solution.

Another technical limitation of the Campbell et al. method is that it can only examine aggregation between two groups. However, groups in real life are sometimes more complex than binary categories, and take the form of ethnicities, classes, and religions, among others. Group differences may not even be nominal. Theories of aggregation suggest that preferences for similarity extend beyond discrete groups to variables that vary along a continuum. Many important variables that past research has studied in aggregation, such as age, attractiveness, and skin color, are naturally continuous. Dichotomizing physical similarity variables reduces power by discarding information that would otherwise be available in a continuous variable. Therefore, using a continuous similarity variable can potentially provide greater generalizability and statistical power. A more appropriate method of measuring seating aggregation should be able to take into account the continuous nature of the data.

An additional limitation of the current method is the lack of inferential statistics available to the researcher. With the Campbell et al. method, the only way to determine the probability of observing aggregation at least as extreme as what was observed, assuming no true underlying preference (i.e., a p-value), is by collecting data from multiple rooms and then running a statistical test such as a one-sample t-test, using each room as an observation. Thus, the current method requires a great deal of resources (e.g., time, participants, etc.) for researchers to know how reliable their estimates are. This restriction therefore favors research designs measuring many small rooms/locations. Situations involving large areas, such as lecture halls, and infrequent events, such as ceremonies, are therefore difficult to study. However, these situations may still be meaningful to the research question or theory. Further, despite not being able to resample the seating setting, the number of participants could still be large (e.g., a stadium), and therefore still provide more information than multiple observations on smaller settings (e.g., four classrooms of ten people). Therefore, the current method limits the amount of information that researchers can infer from a setting, and places pressure on researchers to use designs that involve repeated settings.

Despite offering insight to researchers interested in studying aggregation, the current aggregation technique has room for improvement. Solutions to this problem must address the limitations discussed by being (1) simpler to understand, (2) easier to implement, and (3) more flexible in the types of analyses they support. The present paper describes a technique based on bootstrapped simulation of seating charts that can estimate the parameters in the aggregation index more simply and intuitively, while also providing a greater variety of analyses than the original closed-form method allowed.

Proposed simulation method

Rather than using deterministic formulas to estimate the aggregation index parameters, this paper proposes using bootstrapped simulations to calculate the otherwise complex parameters of the aggregation index (for a more in-depth tutorial see: Efron & Tibshirani, 1994; Simonsohn, 2013). The method follows the intuition behind the aggregation index of Campbell et al., where I is the difference between the current aggregation and what would be expected by random seating, divided by the standard deviation. However, this method calculates the expected value of aggregation and the variability of aggregation by simulating people choosing seats randomly in the specified space. This method, which has been compiled into an executable program (SocialAggregation.exe; Fig. 1), iteratively simulates a room whose occupied seats were chosen at random to get values for expected adjacencies and variability of those adjacencies. Specifically, the program is given a row-by-column seating chart (as an Excel file, a comma-separated file, a tab-delimited file, or entered directly into the program), which is then represented as a two-dimensional matrix. In this matrix, all empty seats or spaces that no students occupy are set as null values, and all occupied seats are represented as an integer representing a specific group (e.g., 1=White, 2=Black). The program then calculates the number of dissimilar adjacencies by searching through the chart and counting the number of instances where two positive integers are next to each other and are not equal to each other. This search obtains the first value needed for the aggregation index: the observed number of dissimilar adjacencies.

Fig. 1

Screenshot of the program “SocialAggregation.” It is currently set to analyze the seating chart of Campbell, Kruskal, & Wallace (1966)

To calculate the second parameter—the average number of adjacencies that would be expected by chance, assuming the seat choices are fixed—the program un-assigns the people from their seats (by temporarily removing their values from the seating matrix) and then randomly assigns each person, without replacement, to a seat that was previously occupied. The program then counts the number of dissimilar adjacencies in this random seating and appends that value to a list. After numerous iterations, the program then takes the mean of that list, which is equivalent to the expected value of dissimilar adjacencies, assuming random seating. Further, the standard deviation of that list represents the last parameter of the index, which is the variability of dissimilar adjacencies under random seating conditions. With those three values, the program can compute the aggregation index.

Example of the proposed method

Creating a seating chart

To demonstrate how the program functions, we will use the school seating chart of White and Black students that was originally used in the Campbell et al. paper (Fig. 2). This seating chart shows eight row segments of seats, with four seats in each segment. Seats that are occupied have a square in that location. The color of the square represents the race of the student (White/Black). The first row segment on the top-left has two students (a White student in the far-left, a Black student next to them), and two empty seats to their right. This seating chart can also be represented as a matrix with size i × j, where i is equal to the number of rows, and j is equal to the number of columns (Fig. 3). Viewed as a whole, the seating chart of Campbell et al. has four rows of eight chairs each, with one aisle separating each row in the middle. Thus it can be represented as a matrix of size 4 × 9, with each race/group coded as a separate integer (e.g., White=1; Black=2), and all empty spaces (including empty chairs and barriers) coded as a “0” (Fig. 3). This matrix representation of a seating chart can be easily created by a researcher using a text/spreadsheet editor, and then read into the program (Fig. 4).
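As a rough illustration of the matrix representation, the sketch below parses comma-separated text into a list of integer rows. The function name `read_chart` and the chart values are hypothetical: the values only mimic the 4 × 9 layout with a central aisle and do not reproduce the actual Campbell et al. data or the program's own file reader.

```python
import csv
import io

def read_chart(text):
    """Parse comma-separated seating chart text into a list of integer rows."""
    rows = [[int(value) for value in record]
            for record in csv.reader(io.StringIO(text))
            if record]                      # skip blank lines
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("every row must have the same number of columns")
    return rows

# Hypothetical chart: 1 and 2 are group codes, 0 is an empty seat or the aisle.
# The fifth column (index 4) is the aisle, so it is 0 in every row.
EXAMPLE_CSV = """\
1,2,0,0,0,1,1,0,2
1,1,1,0,0,2,2,1,0
0,1,1,1,0,1,0,1,1
2,0,1,1,0,0,1,1,1
"""

example_chart = read_chart(EXAMPLE_CSV)
```

Because empty seats, aisles, and barriers all share the code 0, the same parser would apply to any rectangular grid of cells, whatever the physical layout of the room.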

Fig. 2

The original seating chart from Campbell, Kruskal, & Wallace (1966) in a matrix format. White squares represent a White student, Black squares represent a Black student, and underscores represent an unoccupied seat

Fig. 3

A matrix representation of the seating chart from Campbell, Kruskal, & Wallace (1966). Values of 1 represent a White student, values of 2 represent a Black student, and values of 0 represent an empty area, such as an aisle, or an unoccupied seat

Fig. 4

A sample spreadsheet and a sample comma-separated text file that recreate the seating chart from Campbell, Kruskal, & Wallace (1966). Values of 1 represent a White student, values of 2 represent a Black student, and values of 0 represent an empty area, such as an aisle, or an unoccupied seat

Because unoccupied space is irrelevant for the calculation of the aggregation index, any two-dimensional setting can be represented in the matrix format. The examples previously discussed involve a naturally rectangular environment (e.g., a classroom). However, as long as blank space and unoccupied chairs are represented as 0s in the chart, the environment can still be represented in a spreadsheet. Figure 5 shows an example of how circular seating arrangements or open spaces that do not have clearly defined seats can be translated into a spreadsheet. In irregular seating patterns, the dimensions of adjacency become especially important to consider. For circular seating arrangements, adjacencies should probably include a corner dimension to include those sitting where the circle bends. Also, for areas where seats are not clearly defined (e.g., a mall or a park), it is important to use spaces of equal size to represent a possible seating location.

Fig. 5

Sample spreadsheets for environments with irregular location/seating arrangements. Values of 1 represent a person from one group, while values of 2 represent a person from another group, and values of 0 represent an empty area. The panel on the left is a room with a circular seating arrangement. The panel on the right would be similar to an open mall or park where each cell represents a patch of land of the same square size (e.g., 1 m × 1 m)

Fig. 6

The mean absolute error of bootstrapped estimates, compared to the Campbell, Kruskal, & Wallace (1966) equation estimates, for the expected number of dissimilar seating adjacencies in a room. The x-axis shows the number of iterations used in the bootstrap routine. Lines represent the number of seats in a square room

Fig. 7

The mean absolute error of bootstrapped estimates, compared to the Campbell, Kruskal, & Wallace (1966) equation estimates, for the standard deviation of the expected number of dissimilar seating adjacencies in a room. The x-axis shows the number of iterations used in the bootstrap routine. Lines represent the number of seats in a square room

It is important to note that Campbell et al.’s method (and thus the proposed method) assumes that seat choice possibilities are fixed. That is, the method Campbell et al. introduced assumes that the places people chose to sit are the only spots available for others to choose to sit. Therefore, empty space, barriers, and unoccupied seats are all irrelevant to the calculation. It is important to keep in mind that the technique makes this assumption, but it does not seem to be particularly problematic for many researchers, as the aggregation index has shown predictive validity and convergent validity with other intergroup research, as discussed in the “Introduction” section. This assumption is also important because it allows all seating charts to be represented in the matrix form described in the above paragraph. Regardless of the room shape, obstructions, and chair placement, the matrix notation only requires researchers to specify where the people are currently sitting in relation to one another (i.e., are they next to each other or is there space or another person between them). Even irregularly shaped rooms can be represented because any area where a person is not currently sitting is represented by a 0. Whether or not a 0 is placed between two people is a decision left up to the researcher, who decides if the empty space is small/insignificant enough for those two people to be considered adjacent or not.

Calculating the observed number of dissimilar adjacencies

The program allows researchers to define adjacency along different dimensions. Researchers can specify that an adjacency is only when two people are sitting side-by-side (i.e., to the left or right of each other on a seating chart). The program can also have adjacency specified as front-and-back, or on the corners of a person. For this example, we will use the definition Campbell et al. used, which was side-by-side. When the matrix is loaded, and the adjacency is specified, the program can be run. When the program analyzes the data, it computes the three parameters in the aggregation index.

To compute the observed number of adjacencies, the program sets a variable that represents the starting number of adjacencies to 0. The program then looks at the cell in the i th row and j th column, starting at i=1 and j=1. The value of the integer, in this case “1”, is compared to the cells adjacent to it. If an adjacency is defined as side-by-side, then the program looks at the cell in the i th row and j+1st column. In this case, the adjacent cell has a value of “2.” The two values are compared, and if both cells are not 0, and the value of the second is not equal to the first, then the number of total dissimilar adjacencies for that room is incremented by 1. The process continues for the next column, until all seats in the row are analyzed. The program then moves to the next row and does the same calculation. After all rows have been examined, the number of adjacencies counted is stored. The comparative process is illustrated in pseudo-code (see Appendix 1).
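The counting loop just described can be sketched in Python as follows. This is an illustrative reconstruction rather than the program's source or the appendix pseudo-code. The function name and the small chart in the usage note are hypothetical, and indices start at 0 rather than 1, as is conventional in Python.

```python
def count_side_by_side(chart):
    """Count dissimilar side-by-side pairs in a seating matrix (0 = empty)."""
    count = 0                                  # starting number of adjacencies
    for i in range(len(chart)):                # each row
        for j in range(len(chart[i]) - 1):     # each cell that has a right-hand neighbor
            left, right = chart[i][j], chart[i][j + 1]
            # Both seats must be occupied and hold different group codes.
            if left != 0 and right != 0 and left != right:
                count += 1
    return count
```

For the hypothetical chart `[[1, 2, 0, 0], [1, 1, 2, 0], [2, 0, 1, 2]]`, each row contains exactly one dissimilar side-by-side pair, so the function returns 3.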

If researchers wish to define adjacencies as not only side-by-side but also as people sitting in front of and behind a person, the program can calculate these special cases in a similar way. Rather than looking only at the j+1st column in the same row, the search for adjacencies would also include the i+1st row in the same column. Therefore, a person sitting in the 1st row (i=1) and 4th column (j=4) would be counted as being adjacent to a person who was in the 2nd row and 4th column. Relatedly, dissimilar adjacencies can be calculated by looking at not only the next seat, but also two seats ahead, in case norms of personal space dictate that an extra seat should always be left empty between people sitting side-by-side. If adjacency is defined in this way, then, in addition to the normal adjacency calculation, the program can also look at the seat in the j+1st position to see if it is empty, and at the j+2nd position to see if a person is there and whether that person belongs to the same group (see Appendix 2). In prior research, the decision of how to define an “adjacent seat” has been left up to the individual researcher. Some researchers prefer to count only the seats immediately next to a person as their definition. This definition is consistent with how Campbell et al. originally presented the method. Other researchers prefer to count two people as adjacent if they are next to each other or if there is one empty seat in between them. As mentioned, one justification given for this procedure is social norms. That is, society dictates that a seat be left between people, even if they have a shared relation. Thus, a researcher who is interested in examining people’s preferences but counts only immediately neighboring seats will add imprecision to the measure by missing many instances relevant to the construct. Another justification given is that including people separated by an open seat reduces the number of people sitting alone (i.e., “isolates”). As seen in the Campbell et al. equations, rooms with greater numbers of isolates have a larger standard deviation, which makes it more difficult to detect an effect (i.e., reduces the power of the measure). There does not seem to be a clear way to assess the relative merits of each reason, and it is important for researchers to understand the benefits and the limitations in order to make the most informed decision.
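One way to express these alternative adjacency definitions is as a list of (row, column) offsets that are checked from every occupied seat. The sketch below is a hypothetical illustration, not the program's implementation or the appendix pseudo-code: the constant names, the `skip_one_empty` flag, and the choice to apply the empty-seat rule along whichever offsets are supplied are all assumptions.

```python
SIDE = [(0, 1)]               # neighbor to the right
FRONT_BACK = [(1, 0)]         # neighbor in the next row, same column
CORNERS = [(1, 1), (1, -1)]   # diagonal neighbors in the next row

def count_dissimilar(chart, offsets=SIDE, skip_one_empty=False):
    """Count dissimilar adjacent pairs under a chosen adjacency definition.

    Offsets point only right or down, so every pair is counted once.
    With skip_one_empty=True, two people separated by exactly one empty
    cell along an offset are also treated as adjacent.
    """
    n_rows, n_cols = len(chart), len(chart[0])
    count = 0
    for i in range(n_rows):
        for j in range(n_cols):
            here = chart[i][j]
            if here == 0:
                continue
            for di, dj in offsets:
                r, c = i + di, j + dj
                if not (0 <= r < n_rows and 0 <= c < n_cols):
                    continue
                other = chart[r][c]
                if other == 0 and skip_one_empty:
                    # Look one step further past the empty seat.
                    r, c = r + di, c + dj
                    if not (0 <= r < n_rows and 0 <= c < n_cols):
                        continue
                    other = chart[r][c]
                if other != 0 and other != here:
                    count += 1
    return count
```

Passing `SIDE + FRONT_BACK` combines both axes into a single count, and adding `CORNERS` covers every cell that touches a person. Whichever definition is chosen should also be used when counting adjacencies in the randomized seatings, so that the observed and expected values are comparable.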

Calculating the expected number of dissimilar adjacencies and its variance

After all of the observed number of adjacencies in the specified seating chart are counted, the program then calculates the number of adjacencies that would be expected by random seating. To assign random seating, the program creates an array of length N, with each entry corresponding to an individual person in the room, represented by their group’s assigned integer. For example, in our seating chart, there are 22 total people in the room. Of those 22, 16 are White and six are Black. Because our seating chart assigned the integer, “1”, to the White group, and “2” to the Black group, the program would create an array of length 22, with 16 1s and six 2s. The list would then be randomly shuffled. Each currently filled seat (i.e. a cell in the seating chart not set to 0) would be assigned the next entry in the shuffled list. Therefore, we would have a new seating chart with the people randomly placed in the seats that were previously occupied.

The program would then perform the same adjacency counting calculation previously described and append the number of adjacencies counted to a list of simulated adjacency counts. This process of randomly assigning people and counting adjacencies would repeat for a large number of iterations (e.g., 10,000). After the final iteration, the mean of that list would indicate the expected number of adjacencies if the students occupied the seats without preference for race/group status. The standard deviation of that list would represent the variability from random seating. Now, using the observed number of dissimilar adjacencies, and the simulated estimates for the expected number of adjacencies and the associated standard deviation, the program can compute the aggregation index using Eq. 1.
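The shuffle-and-count routine described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the SocialAggregation program itself: the function names and the `seed` parameter are hypothetical, adjacency is side-by-side only, and the population standard deviation is used because the text does not say which form the program applies (with thousands of iterations the two forms are nearly identical).

```python
import random
from statistics import mean, pstdev

def count_side_by_side(chart):
    """Count dissimilar side-by-side pairs (0 = empty)."""
    return sum(1 for row in chart
               for left, right in zip(row, row[1:])
               if left != 0 and right != 0 and left != right)

def simulate_index(chart, iterations=10_000, seed=None):
    """Return (A, EA, sigma_A, I) estimated by reshuffling occupied seats."""
    rng = random.Random(seed)
    # The occupied seats stay fixed; only who sits in them is shuffled.
    seats = [(i, j) for i, row in enumerate(chart)
             for j, cell in enumerate(row) if cell != 0]
    people = [chart[i][j] for i, j in seats]   # e.g., sixteen 1s and six 2s
    observed = count_side_by_side(chart)

    work = [row[:] for row in chart]           # scratch copy of the chart
    simulated = []
    for _ in range(iterations):
        rng.shuffle(people)
        for (i, j), group in zip(seats, people):
            work[i][j] = group
        simulated.append(count_side_by_side(work))

    expected = mean(simulated)                 # estimate of EA
    sd = pstdev(simulated)                     # estimate of sigma_A (zero if A never varies)
    return observed, expected, sd, (observed - expected) / sd
```

For a single hypothetical row `[[1, 1, 2, 2]]`, the estimates should settle near an expected value of 2 and a standard deviation of about 0.82, the values obtained by enumerating all six orderings of that row.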

Optimal number of iterations: A simulation study

When using an iterative estimation procedure, such as bootstrapping, it is important to determine the number of iterations needed to obtain both accurate and stable parameter estimates. The parameters estimated in the proposed method are the expected number of adjacencies for a room (EA), the standard deviation of expected number of adjacencies (σA) for the room, and from those parameters, the program then computes the aggregation index (I). Therefore, a simulation study was conducted for different room sizes and variations of number of iterations used by the bootstrap routine. Then the estimated parameter values (EA, σA, and I) for each simulation were compared to the closed-form equations. These comparisons provide the expected error for different settings and iteration values. It is important to note that because Campbell et al. only provide estimates for adjacencies along a single dimension, the current simulations can only address optimality for side-by-side situations.

Simulation program

A separate program was written for simulating rooms. This program generated a square (N × N) matrix, where the room size (N) was set to be either 5, 10, 15, or 20. Thus, we examined rooms containing between 25 and 400 seats. The choice of room sizes was arbitrary, but was intended to represent a realistic spectrum of rooms encountered in life. Note that room size is defined by the number of possible seats, and not by space. Therefore, a setting with large square footage, but with a few seats close to one another, is more similar to a smaller room size in these simulations than a room with little space, but with many separated seats. These rooms were populated with two separate groups of “people” in equal proportion (represented as 1s and 2s in the matrix). The occupancy of the room was set to be 50 % (e.g., if the room had 100 seats, it contained 50 people: 25 people from group 1 and 25 people from group 2). For odd-numbered room sizes, the number of people was rounded up to the nearest integer above 50 % of the number of seats. Each simulation randomly assigned each person to an empty seat.
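A room of this kind could be generated along the following lines. The sketch is hypothetical and is not the simulation program used in the study: the function name and `seed` parameter are invented for illustration, and when the number of people is odd the extra person is placed in group 2, a detail the text does not specify.

```python
import math
import random

def make_room(size, seed=None):
    """Build a size x size room at 50 % occupancy with two equal groups."""
    rng = random.Random(seed)
    n_seats = size * size
    n_people = math.ceil(n_seats / 2)          # rounded up for odd seat counts
    n_group1 = n_people // 2
    n_group2 = n_people - n_group1             # assumption: group 2 takes any extra person
    cells = [1] * n_group1 + [2] * n_group2 + [0] * (n_seats - n_people)
    rng.shuffle(cells)                         # random assignment of people to seats
    return [cells[r * size:(r + 1) * size] for r in range(size)]
```

Calling `make_room(10)` gives a 10 × 10 matrix containing 25 ones, 25 twos, and 50 zeros. `make_room(5)` seats 13 people in 25 seats.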

Once the room was constructed, the Campbell et al. method (coded as a separate program that measures the parameters in the equations) was used to obtain closed-form (i.e., exact) values for the expected number of left-right adjacencies (EA) as well as the standard deviation of the expected number of adjacencies (σA). The bootstrapped estimates were obtained by having the bootstrapping program count the number of dissimilar side-by-side adjacencies, and perform the bootstrap routine to estimate the EA and σA parameters for that room. The numbers of iterations used to calculate those estimates were 1,000, 5,000, 10,000, 50,000, 100,000, and 500,000. These numbers were chosen because prior informal simulations by the researcher suggested that 500,000 iterations were sufficient to obtain accuracy and reliability, and this value therefore served as an upper limit for the possible values.

Because people were randomly assigned to seats, the number of adjacencies may vary across rooms of the same type; therefore, each room type was re-simulated 1,000 times to minimize the standard error of the estimates. Although the resampling value can be any arbitrarily large number, 1,000 was chosen for computational practicality, as a value an order of magnitude larger would have required an extremely long run time (e.g., months or years). These simulations therefore provide a measure of how much error, on average, there is between the methods for varying room sizes and iteration values.

Measuring estimate accuracy

Following the simulations, the estimates between the two methods (e.g., formula-derived EA and bootstrapped EA) were subtracted from each other, and then the absolute value of that difference was computed. For each room size and iteration value, the average of this error was computed. This mean absolute error represents how much the simulations were off from the closed-form answer and serves as a measure of how imprecise the estimates obtained from the program are. All statistics are reported in Table 5, including the associated standard deviation of these values that show how much variability the parameter estimates have for each iteration and room size.
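The accuracy measure amounts to a mean of absolute differences, sketched below. The function name is hypothetical, the inputs are assumed to be paired lists holding one closed-form value and one bootstrapped estimate per simulated room, and the population form of the standard deviation is an assumption because the text does not specify which form was reported.

```python
from statistics import mean, pstdev

def mean_absolute_error(closed_form, bootstrapped):
    """Return (mean, standard deviation) of |closed form - bootstrap| per room."""
    if len(closed_form) != len(bootstrapped):
        raise ValueError("both lists need one value per simulated room")
    errors = [abs(c - b) for c, b in zip(closed_form, bootstrapped)]
    return mean(errors), pstdev(errors)
```

The same function applies to each of the three parameters (EA, σA, and I) within each combination of room size and iteration count.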

Results

The results of this simulation study are shown in Figs. 6, 7, and 8. For all simulations, the error rates are relatively low. As might be expected, the best parameter estimates are obtained with the largest iteration values and in the smallest room (i.e., a room with 25 seats). The least accurate estimates are obtained when only 1,000 iterations are used for a room size of 400 (see Tables 1, 2, and 3). However, when the number of iterations is 50,000 or greater, the estimates differ from the true value by only .002 on average and are therefore sufficient for reporting the statistic to a precision of two digits.

Fig. 8 The mean absolute error of bootstrapped estimates, compared to the Campbell, Kruskal, & Wallace (1966) equation estimates, for the aggregation index. The x-axis shows the number of iterations used in the bootstrap routine. Lines represent the number of seats in a square room

Fig. 9 A sample spreadsheet that examines students belonging to four distinct groups (e.g., White = 1, Black = 2, Latino = 3, and Asian = 4)

Fig. 10 A sample spreadsheet that uses a continuous measure of similarity (i.e., numbers from 1–10) for each student. The program has a special option that must be selected when the entries in the seating chart are a continuous-level variable (e.g., height, attractiveness, age)

Table 1 Mean absolute error rates for bootstrapped expected adjacency estimates
Table 2 Mean absolute error rates for bootstrapped standard deviation estimates
Table 3 Mean absolute error rates for bootstrapped aggregation index estimates

When examining room size, larger rooms tend to create more error in the estimates. This finding is not unexpected given the parameters in the original Campbell et al. equations, which suggest that the number of occupants and the number of non-contiguous blocks of people will increase the variability of adjacencies. However, when estimating the aggregation index (Fig. 8), room size does not seem to make as much of a difference in the error rates, and each iteration value produces roughly the same error rate for the aggregation index in larger rooms as it does in smaller rooms.

Another important property of the bootstrap method is the point at which additional iterations yield diminishing returns in accuracy. That is, a researcher may be interested in knowing the iteration value beyond which accuracy stops improving appreciably. One method to assess this question is scree analysis (Cattell, 1966), which examines variance/error plots for “elbows.” These elbows can often be seen visually, though there are also quantitative methods to suggest the appropriate value (Cng: Gorsuch & Nelson, 1981; mReg: Zoski & Jurs, 1993). From visual inspection, in all three plots the elbow occurs between 10,000 and 50,000 iterations. That is, after about 50,000 iterations, the estimates improve very slowly. This visual analysis was also confirmed by quantitative scree analysis methods: for all estimates, the mReg method suggests 50,000 iterations, while the Cng method suggests 10,000 iterations. Thus, this paper recommends using 50,000 iterations when conducting research, and at least 500,000 iterations for atypical situations not examined in this simulation study.

Comparison with results from previous studies

To assess the validity of the simulation method for estimating the aggregation index’s parameters, six published and unpublished seating charts from papers examining aggregation were analyzed with both the Campbell et al. method and the current simulation method. Equations 2 and 3 were used to calculate the parameter values of the aggregation index for the Campbell et al. method. To estimate the values using the simulation method, each seating chart was converted to the matrix format (in an Excel spreadsheet) using the procedures previously discussed. These seating matrices were entered into the program, and each chart’s parameters were computed from a simulation using 500,000 iterations of random seat assignment (to reduce the standard error as much as possible).

The first parameter calculated was the observed number of dissimilar adjacencies. For the Campbell et al. method, this parameter has to be counted by visual inspection, while the simulation method automates the counting. The comparison showed no differences between visually counting the number of observed adjacencies and having them counted with the SocialAggregation program, which suggests that the adjacency counting algorithm is functioning as expected. Table 4 compares the methods’ estimates for the expected number of adjacencies. The bootstrap method shows high convergence with the closed-form equations: the largest deviation between the simulated estimates and the closed-form solution was .0027. Similarly, Table 5 shows how the estimates of the standard deviation compare between methods. The bootstrap estimates were never more than approximately .0031 away from the estimates of the closed-form equations. This similarity suggests that the aggregation index parameter estimates from the simulation are comparable to those reported in previous investigations of seat preference. The parameter estimates from the two methods agree to two decimal places, and greater accuracy may potentially be achieved with a greater number of iterations.

Table 4 Comparison of calculated and simulated expected adjacency values

Advantages and extensions

Simultaneous analysis of adjacencies

By using simulations to estimate the parameters, this bootstrap method offers several improvements over the Campbell et al. method: it is more flexible in the types of analyses it supports, simpler to understand, and easier to use. The previous method could only compute an aggregation index for adjacencies along one dimension (e.g., left-right) at a time. If more dimensions were of interest (e.g., front and back), the researchers would have to conduct a separate test for each dimension. The proposed bootstrap method, however, allows users to examine different types of adjacencies simultaneously. The program counts the number of adjacencies of the selected types in the provided seating chart and computes the expected number of adjacencies under random seating, as well as the standard deviation of the random adjacencies, under that definition.
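One way to picture counting several adjacency types at once is sketched below. It is an illustration only: the function name and keyword arguments are hypothetical and do not reflect the program’s actual options, and 0 is assumed to mark an empty seat.

```python
import numpy as np

EMPTY = 0  # assumed code for an unoccupied seat


def count_dissimilar(chart, side_by_side=True, front_back=False):
    """Count dissimilar occupied pairs along the selected adjacency directions."""
    chart = np.asarray(chart)
    pairs = []
    if side_by_side:
        pairs.append((chart[:, :-1], chart[:, 1:]))   # left-right neighbors
    if front_back:
        pairs.append((chart[:-1, :], chart[1:, :]))   # front-back neighbors
    total = 0
    for a, b in pairs:
        total += int(np.sum((a != EMPTY) & (b != EMPTY) & (a != b)))
    return total


chart = [[1, 1, 2],
         [2, 0, 2]]
left_right_only = count_dissimilar(chart)
both_directions = count_dissimilar(chart, front_back=True)
```

Because the same counting function would be applied to every randomly reseated chart, the expected value and standard deviation automatically follow whichever definition of adjacency is selected.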

Table 5 Comparison of calculated and simulated standard deviation values

Multiple group comparisons

The program also allows for examining more than two groups. Because the program’s instructions are to count the number of times non-empty, dissimilar cells are next to each other, the researcher can specify more than two groups in the seating chart (by using a unique integer code for each group), and similarity preferences can still be calculated (Fig. 9). If there is high aggregation, and people tend to sit next to members of their own groups, then the program will count very few dissimilar adjacencies. However, if people are more open to sitting next to dissimilar group members, then many dissimilar adjacencies will be counted. The expected adjacencies and standard deviation of adjacencies under random seating can still be assessed using this computational method of counting and multiple simulations. Like the previous closed-form method, this method also allows an aggregation index to be computed for each separate group. If the researcher is interested in the amount of aggregation that White, Black, and Latin-American students each show, he/she can have the focal group coded as one integer (e.g., “1”) and the other groups coded as a separate common integer (e.g., “2”). This recoding can be done easily within a basic text editing or spreadsheet program using the Find-and-Replace function. For example, if White students are coded as 1, Latinos as 2, and Asians as 3, and a user wishes to examine aggregation of White students to themselves versus out-group members, the user would only have to Find-Replace 3 into 2. Or, if the user wanted to compare Latino aggregation to other Latinos versus out-group members, the user would only need to Find-Replace 3 into 1. Thus, separate indices can be obtained for each group, or, if the groups are left as distinct integers, an overall aggregation index can be provided for the entire room. Therefore, this method allows not only for the total aggregation in an environment to be measured, but also for measuring group-level aggregation.
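The Find-and-Replace recoding can also be done programmatically. The following sketch assumes the chart is an integer matrix with 0 for empty seats; the helper name is hypothetical. It collapses every non-focal group into a single out-group code so that a group-specific index can be computed.

```python
import numpy as np

EMPTY = 0  # assumed code for an unoccupied seat


def recode_focal_group(chart, focal_code):
    """Recode the focal group as 1 and all other occupants as 2; keep empties."""
    chart = np.asarray(chart)
    recoded = np.where(chart == focal_code, 1, 2)
    return np.where(chart == EMPTY, EMPTY, recoded)


# White = 1, Latino = 2, Asian = 3; examine Latino students versus everyone else.
chart = [[1, 2, 3],
         [0, 3, 1]]
latino_vs_others = recode_focal_group(chart, focal_code=2)
```

Leaving the chart unrecoded, with one integer per group, corresponds to the overall aggregation index for the room.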

Examining continuous variables

Another advantage of the proposed simulation method is that similarity can be defined as a continuous variable while still preserving the typical interpretation of the aggregation index. Rather than assigning people integers in the seating matrix to indicate group membership, researchers can instead enter each person’s numerical value on a continuous interval-level trait of interest (e.g., age, attractiveness, skin tone). If the researcher specifies to the program that the seating chart is coded with a continuous variable (Fig. 10), the program computes the aggregation index with a process that differs from, but closely parallels, the process for a nominal variable. The program examines each non-empty cell and checks whether any adjacent cells are also non-empty. If so, it takes the absolute deviation between each of those adjacent cells and the original cell and appends each value to a list.

When all of the cells have been examined, the program takes the average of the list to represent the average adjacent dissimilarity in the room. Then the program randomly assigns the people to previously occupied seats over multiple trials and computes the average amount of dissimilarity that would be expected by chance, as well as the standard deviation of the random dissimilarity. As before, the aggregation index is computed by taking the difference between the observed average dissimilarity and the expected average dissimilarity and dividing the result by the standard deviation (see Appendix 3 for code). It is important to mention that the aggregation index and associated formulas have been used for many years, and Campbell et al. provided proofs for the computation of the parameters. Thus, for categorical data, the accuracy of the bootstrapped method can be verified by comparing the bootstrapped results to the results from the equations. Because no closed-form equation exists for calculating aggregation on continuous data, further testing is needed to validate the results for this case. The full code for the calculation is posted in the Appendix, and can be reviewed by any researcher interested in using this experimental calculation.
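The continuous-variable procedure might look roughly like the sketch below. It is not the Appendix code. It assumes empty seats are stored as NaN (since 0 could be a legitimate trait value), considers only side-by-side adjacency, and visits each neighboring pair once; visiting each pair from both seats, as described above, yields the same average.

```python
import numpy as np


def mean_adjacent_dissimilarity(chart):
    """Average absolute difference between occupied left-right neighbors."""
    diffs = np.abs(chart[:, :-1] - chart[:, 1:])
    return diffs[~np.isnan(diffs)].mean()  # NaN marks pairs with an empty seat


def continuous_aggregation_index(chart, iterations=10_000, seed=0):
    """(observed - expected) / sd of mean dissimilarity under random reseating."""
    chart = np.asarray(chart, dtype=float)
    rng = np.random.default_rng(seed)
    occupied = ~np.isnan(chart)
    values = chart[occupied]
    observed = mean_adjacent_dissimilarity(chart)
    shuffled = chart.copy()
    sims = np.empty(iterations)
    for i in range(iterations):
        # Reassign people to the previously occupied seats only.
        shuffled[occupied] = rng.permutation(values)
        sims[i] = mean_adjacent_dissimilarity(shuffled)
    return (observed - sims.mean()) / sims.std(ddof=1)


chart = np.array([[1, 2, 9, 10],
                  [3, np.nan, 8, 7]])
index = continuous_aggregation_index(chart)
```

A negative index here indicates that neighbors are more alike on the trait than random seating would produce.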

Non-parametric inferential statistics

Because the program computes the amount of aggregation in the room under random seating conditions, researchers can know more precisely the probability of observing aggregation as extreme as the amount they found. With the prior Campbell et al. method, a researcher could only compute p-values when he/she had measured many seating charts and had multiple aggregation indices. With this method, the researcher only needs to observe one setting to know how often he/she would observe aggregation at least as extreme as the amount in the present chart if seating were chosen at random. For example, if a researcher observes 12 dissimilar adjacencies in his/her study, and out of 10,000 simulations with random seating only three have ≥12 dissimilar adjacencies, then there is approximately a 3/10,000 chance that the researcher would have observed that many dissimilar adjacencies if seating were chosen at random (p = .0003). Therefore, researchers can test directional hypotheses with the proposed bootstrap method.
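The p-value in this example is simply the proportion of simulated counts that are at least as extreme as the observed count. The sketch below reproduces the arithmetic with a constructed list of simulated counts; the function name and the `tail` argument are hypothetical, and the direction of the test is left to the researcher’s hypothesis.

```python
import numpy as np


def permutation_p_value(observed, simulated_counts, tail="upper"):
    """Proportion of simulated counts at least as extreme as the observed count."""
    simulated = np.asarray(simulated_counts)
    if tail == "upper":
        extreme = simulated >= observed
    elif tail == "lower":
        extreme = simulated <= observed
    else:
        raise ValueError("tail must be 'upper' or 'lower'")
    return float(extreme.mean())


# Constructed to match the worked example: 3 of 10,000 simulations reach 12.
simulated = np.array([12, 13, 14] + [5] * 9997)
p = permutation_p_value(12, simulated)
```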

In addition to p-values, the program also provides confidence intervals for the expected number of adjacencies. That is, researchers are able to understand more about the setting they are examining, and know how many adjacencies would be expected under random seating 95 % of the time, which gives the researcher information not attainable with the previous Campbell et al. method. To compute these confidence intervals, after each simulation of random seating, the number of adjacencies in that randomly seated room is counted and added to a list. Once all iterations have finished, the program computes bootstrapped confidence intervals from that list using the bias-corrected and accelerated bootstrap suggested by Efron (1987), which adjusts for both bias and skewness in the bootstrap distribution. This bias-corrected procedure tends to produce more accurate and narrower intervals than the simpler percentile method, which trims the lowest and highest α/2 proportion of entries from the sorted list.
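For comparison, the simpler percentile interval mentioned above can be sketched in a few lines. Note that this is the percentile method, not the bias-corrected and accelerated procedure the program uses, and the function name is hypothetical.

```python
import numpy as np


def percentile_interval(simulated_counts, confidence=0.95):
    """Percentile interval: cut alpha/2 of the entries from each tail."""
    alpha = 1.0 - confidence
    lower, upper = np.percentile(
        simulated_counts, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return float(lower), float(upper)


# Stand-in for the list of adjacency counts gathered across iterations.
low, high = percentile_interval(np.arange(1, 1001))
```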

This confidence interval provides the range of values, at the specified confidence level, for the expected number of dissimilar adjacencies. Therefore, this interval can be compared to the observed number of adjacencies to determine whether the observed count falls inside or outside the interval. The program also provides bias-corrected confidence intervals for the estimated aggregation index using a similar process, and therefore researchers can report the confidence interval for this effect size measure. With these confidence intervals, it is possible for researchers to conduct their own pre-study power analysis. Researchers who anticipate certain room sizes, total number of persons, group distributions, and isolates can submit hypothetical seating charts to the program and discover the range of dissimilar adjacencies that are probable (i.e., the confidence interval for EA), as well as the standard deviation of the expected adjacencies. With these estimates, researchers know how much aggregation they would need to observe to obtain a certain effect size, and can therefore plan studies accordingly and judge how feasible it is for those studies to find extreme levels of aggregation.

New measures of aggregation

Because of the bootstrapped nature of the method, it can provide more information about the room and the individuals than the Campbell et al. equations provide. One limitation of the closed-form equations is that they provide very little individual-level information. That is, the parameters EA, σA, and I reflect what happens at an aggregate level, but say nothing about the experience of an individual group member. With the bootstrap method, it is possible to compute different statistics about individual-level behavior and experiences.

One statistic this paper proposes is the proportion of dissimilar adjacencies for a person (p-DAP). This statistic is intended to provide a more easily interpretable measure of aggregation by describing the daily experience of a typical group member. Specifically, it describes what percent of people next to a person are members of a different group. This statistic thus offers an easy-to-describe picture of intergroup contact that can be communicated clearly to a non-technical audience. It can also be expressed as raw counts (e.g., for every eight people a White person is next to, roughly two of them are going to be Black), which research suggests is one of the most understandable ways to convey statistical information to the public (Gigerenzer, Gaissmaier, Kurz-Milcke, Schwartz, & Woloshin, 2007). To compute the statistic, during the initial counting phase (where the number of observed adjacencies is counted), each group receives its own empty dictionary whose entries are the group members. When a person is adjacent to a person from a different group, their dictionary value increments by 1. Thus, all people in a room are assigned a value representing the number of dissimilar people next to them. Further, each seat in the matrix has a value for how many adjacencies are possible (e.g., a person in the top-left corner has only one side-by-side adjacency possible, but two possible adjacencies if adjacency is defined as both side-by-side and front-and-back). To obtain an individual’s probability of having a dissimilar person next to them, each person’s dissimilar adjacency count is divided by the total possible adjacencies for that seat. For example, if a person is next to only one dissimilar person, and their seat has two total possible adjacencies, then the proportion of people next to them that are dissimilar is .50 or 50 %. In other words, 50 % of the people this person will encounter at their seat are from a different group.
These probabilities are computed for each person in the seating chart and then averaged within each group. This average indicates the probability that an individual of a certain group (e.g., White) will have a dissimilar group member next to them. It is important to note that each group receives its own p-DAP estimate. Further, with this statistic, it is possible to compute the expected proportion of dissimilar adjacencies of a group member for a given room. By computing the p-DAP statistics for each group during the bootstrapped randomization process, it is possible to show how conducive a room is to intergroup contact at an individual level. Thus, the p-DAP informs researchers of how much experience with other group members an individual has, and also how often these dissimilar encounters would take place by chance alone given the nature of the room and the proportions of group members. These statistics provide a richer picture of intergroup relations and structural barriers to contact, as they give an immediately interpretable description of what a group member will experience in a room as well as a measure of environmental encouragement that is directly comparable to other environments. While the prior Campbell et al. method does detail the expected dissimilar adjacencies of the environment, this value is difficult to compare across settings, and therefore may not be particularly helpful for researchers who want to understand how much an environment facilitates intergroup contact at the individual level.
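The observed p-DAP computation could be sketched as follows. This is an illustration under stated assumptions rather than the program’s code: 0 marks an empty seat, a seat’s “possible adjacencies” are taken to be the neighboring positions that exist within the grid, and the function and argument names are hypothetical.

```python
from collections import defaultdict

import numpy as np

EMPTY = 0  # assumed code for an unoccupied seat


def p_dap(chart, front_back=False):
    """Per-group mean proportion of a person's possible neighbors who are dissimilar."""
    chart = np.asarray(chart)
    rows, cols = chart.shape
    offsets = [(0, -1), (0, 1)]               # side-by-side neighbors
    if front_back:
        offsets += [(-1, 0), (1, 0)]          # front-and-back neighbors
    proportions = defaultdict(list)
    for r in range(rows):
        for c in range(cols):
            group = chart[r, c]
            if group == EMPTY:
                continue
            possible = dissimilar = 0
            for dr, dc in offsets:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    possible += 1             # this adjacency exists for the seat
                    neighbor = chart[nr, nc]
                    if neighbor != EMPTY and neighbor != group:
                        dissimilar += 1
            if possible:
                proportions[int(group)].append(dissimilar / possible)
    return {g: float(np.mean(p)) for g, p in proportions.items()}


chart = [[1, 2, 2],
         [1, 1, 0]]
observed_p_dap = p_dap(chart)
```

Running the same function on each randomly reseated chart and averaging the results would give the expected p-DAP for the room.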

The aggregation index, however, still serves as an important metric for researchers. This index, which shows the magnitude of a difference between an observed and null hypothesis value, standardized by the variability, provides an overall representation of group preferences. As a standardized difference between two mean values, it meets the requirements of many different definitions given for an effect size (Kazis, Anderson, & Meenan, 1989; Kelley & Preacher, 2012; NCES, 2002; Olejnik & Algina, 2003; Thompson, 2004). Further, this specific definition of effect size is analogous to the definition provided for Cohen’s d (1988), and therefore typical interpretations of effect size magnitude are applicable. Journals are placing an increasing emphasis on reporting effect sizes instead of null-hypothesis tests (Cumming, 2014), and therefore the index represents a preferred way of expressing results and communicating the extent of a finding. Further, this effect size is directly comparable to other effect sizes, which makes it especially useful for inclusion in meta-analyses (not only meta-analyses of seating studies but also meta-analyses of attitudes or in-group bias). The alternative statistic proposed, p-DAP, is probably better suited for researchers interested in the real-life implications of group-member contact and the pressures a particular environment places on group interactions. Thus, the aggregation index and p-DAP can be seen as complementary and not competing measures of aggregation.

Conclusion

The method originally proposed by Campbell et al. was an important contribution to the social sciences, and offered a closed-form method of analyzing aggregation. However, advances in computing speed offer the opportunity to overcome the method’s limitations. Using bootstrap simulations to estimate the various parameters of the aggregation index allows researchers to analyze more types of situations, learn more information about the index, and do this analysis more efficiently.

This new approach makes it possible to develop intuitive examinations of seating preferences that are more flexible, and allow for a greater variety of analyses, than is currently possible with the Campbell et al. closed-form equations. Further, this method still maintains a high degree of accuracy in parameter estimation, and converges with the previous method’s estimates. Therefore, the bootstrapped simulation method is recommended as an alternative for analyzing how much preference for similarity exists in a social setting.