Measuring and Modeling Repeat and Near-Repeat Burglary Effects
We develop a mathematical framework aimed at analyzing repeat and near-repeat effects in crime data. Parsing burglary data from Long Beach, CA according to different counting methods, we determine the probability distribution functions for the time interval τ between repeat offenses. We then compare these observed distributions to theoretically derived distributions in which the repeat effects are due solely to persistent risk heterogeneity. We find that risk heterogeneity alone cannot explain the observed distributions, while a form of event dependence (boosts) can. Using this information, we model repeat victimization as a series of random events, the likelihood of which changes each time an offense occurs. We are able to estimate typical time scales for repeat burglary events in Long Beach by fitting our data to this model. Computer simulations of this model using these observed parameters agree with the empirical data.
Keywords: Repeat victimization, Burglary, Event dependence
Repeat victimization has recently emerged as a central focus in criminology. Research has demonstrated that individuals who have been victims of personal or property crimes are more likely to be victimized again (Farrell and Pease 2001). In the case of residential burglary, which we shall focus on in this paper, repeat victimization is described in terms of exact-repeat and near-repeat events (Johnson et al. 2007). Exact-repeat events are defined as consecutive burglaries occurring at the same location, separated by a time interval of any duration; near-repeat burglary events are instead classified as taking place within a set spatial neighborhood of a focal burglary point.
Repeat burglary victimization may be due to a variety of reasons, including persistent spatial heterogeneity of risk and/or event dependence tied to the specific activities of burglars (Tseloni and Pease 2003, 2004; Johnson 2008). When considering a complex urban environment, risk heterogeneity implies that some houses (or neighborhoods) are at higher risk than others, and that this difference in risk persists throughout time. Some houses may be at higher risk because they are physically soft targets (e.g., easily forced doors or windows) or because the routine activities of inhabitants leave them much less secure than other homes. By contrast, event dependence suggests that some aspect of the burglar’s previous experience victimizing the house increases their preference to return. For example, a burglar may discover an abundance of items that could be targeted in a subsequent burglary, or they may simply prefer to return to a location where they know that their entry methods are guaranteed to work again, amongst other possible reasons (Farrell et al. 1995). In addition, it has been suggested that this elevated risk may spread to neighboring homes as well (Johnson et al. 1997; Townsley et al. 2003; Sagovsky and Johnson 2007), especially in areas where nearby homes are similar in layout and type of inhabitant. Notice, however, that in the case of event dependence, burglary risk is not persistent throughout time, but may change as the burglar’s preferences, skills, and exposure to other opportunities change (Farrell et al. 1995).
The concept of such biased repeat burglary carries strong implications for the dynamics of crime pattern formation and for the development of prevention and resource allocation strategies (Bowers et al. 1998; Farrell and Pease 1993, 2001). Models based on the event dependence hypothesis show that individual crimes can establish positive feedbacks and nucleate into crime hotspots (Eck et al. 2005; Johnson and Bowers 2004; Short et al. 2008). Effective control strategies would pinpoint these pivotal sites, using past crimes as indicators of future ones, breaking the feedback loops and thus surgically halting the further spread of crime (Farrell et al. 2007). Therefore, simple and accurate methods of testing for the presence of event dependent repeat effects are of great importance.
Repeat burglary effects are often observed via the distribution of victimization order within a population of homes, where the victimization order is here defined as the number of times a home is burgled within some fixed temporal window. This distribution is typically inconsistent with a Poisson distribution, which is what would be expected if all homes had the same, persistent risk of burglary. In order to see whether event dependence may be responsible for this inconsistency, one often focuses on the distribution of time intervals τ between successive events that occurred at the same location, a procedure that has been performed using burglary data from a variety of cities worldwide (Johnson et al. 2007). In general, it is observed that the distribution of time intervals between burglary events is a rapidly decaying function, with short time intervals much more likely to occur than longer ones. This observation has been taken as evidence for the existence of event dependence, and that a house will exhibit an increased risk of being burgled after being victimized once. However, there has not yet been a rigorous discussion as to why exactly these decaying time interval distributions support this hypothesis. In fact, as we will show in this paper, this observation alone does not necessarily support the event dependence hypothesis at all, and the method of counting the time intervals is of critical importance when interpreting the distribution of τ.
Throughout the remainder of this paper, we will be performing analyses on a dataset which includes the geographic location and day on which each reported residential burglary for the years 2000–2005 occurred in Long Beach, CA. We consider only those burglaries that occurred at single family homes, since we do not possess data that is detailed enough to pinpoint specific units within multi-family housing. Here, the term “single family homes” refers to stand-alone housing units (i.e., detached houses) with unique physical addresses as opposed to “multi-family housing”, which could be an apartment complex or condominium building where many separate units share the same physical address that belongs to the entire structure. In our analyses, we have ignored the influence of seasonality (Farrell and Pease 1994), specifically because the climate of Long Beach minimizes such fluctuations. However, all of the results and formulas can be modified to include seasonal effects in a straightforward manner. The dataset contains 9,042 events, and the distribution of victimization order across the homes is: 7,002 order one, 819 order two, 98 order three, 19 order four, 5 order five, and 1 order seven. According to the 2000 US census, there are between 70,000 and 80,000 occupied single family homes in Long Beach. Using this fact, and the house order distribution, we find that a simple Poisson distribution does not fit our Long Beach data well, indicating that something is indeed causing repeat victimization there.
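The comparison with a simple Poisson distribution can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes 70,000 homes (the lower end of the census range quoted above) and a single shared burglary rate, and the function name `poisson_order_counts` is hypothetical.

```python
import math

# Observed victimization-order counts quoted in the text (order -> number of homes).
observed = {1: 7002, 2: 819, 3: 98, 4: 19, 5: 5, 7: 1}

def poisson_order_counts(n_homes, n_events, max_order=7):
    """Expected number of homes of each order if every home shares one rate."""
    mean = n_events / n_homes  # expected burglaries per home over the whole window
    return {
        k: n_homes * math.exp(-mean) * mean**k / math.factorial(k)
        for k in range(1, max_order + 1)
    }

n_events = sum(k * v for k, v in observed.items())  # total events implied by the counts
expected = poisson_order_counts(70_000, n_events)   # 70,000 homes is an assumption
for k in sorted(expected):
    print(k, observed.get(k, 0), round(expected[k], 2))
```

Under these assumptions the homogeneous model overpredicts order-one homes and underpredicts every higher order, which is the mismatch the paragraph describes.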
The goal of this paper is to first describe a model in which this repeat victimization is due solely to risk heterogeneity; we refer to this as the random event hypothesis (REH) (see also Nelson 1980). We then show, following from some very reasonable assumptions, that the distribution of time intervals τ for exact-repeats in the REH is that of a sum of decaying exponentials, and that we should observe just this when using a moving-window counting method on our data (which will be described later). Using our burglary data, we illustrate that the observed distribution of τ is completely compatible with that predicted by the REH, with the parameters of the fit interpreted as measures of risk heterogeneity. This compatibility, however, is not sufficient to prove the validity of the REH since other possible mechanisms of burglary dynamics might be equally compatible with the observed results. In fact, the parameters of the fit lead to a predicted distribution of home orders that is wildly different from that observed, indicating that the REH is insufficient to explain exact-repeat effects in our data. We then introduce a different method of counting exact-repeat time intervals that allows us to unequivocally differentiate data sets generated via the REH from those in which burglary events are in fact related via event dependence, using only the time interval distribution. Applying this novel analysis method to our data set, we find that there is, in fact, event dependence in Long Beach. We present a simple mathematical model with a straightforward criminological interpretation that explains the observations under both counting methods, and which reproduces both the time interval distributions and the home order distributions well in simulation. Finally, we extend some of these results to the measurement of event dependence in the near-repeat effect, finding that it too is present in our data.
The Random Event Hypothesis
Therefore, if the REH is correct, the distribution of time intervals between exact-repeat events at a given home with rate constant λ should follow an exponential decay of the type shown in Eq. 6. Note that this distribution, which displays a much higher number of events at short time intervals than long, was derived without introducing any notion of correlation between burglary events. In fact, this distribution will only hold if the events are statistically independent, a notion that is completely contrary to the typical assumptions of the event dependence hypothesis.
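The equations this paragraph relies on are not reproduced in this text. As a hedged reminder, the standard result for a homogeneous Poisson process with rate λ is sketched below; whether its normalization matches the paper's Eq. 6 exactly (for example, a discrete-day version) is an assumption.

```latex
P(\text{no event in } [0,\tau]) = e^{-\lambda \tau},
\qquad
p(\tau)\,d\tau = \lambda e^{-\lambda \tau}\,d\tau,
\qquad
\langle \tau \rangle = \frac{1}{\lambda}.
```

The decay of p(τ) with τ follows from independence alone: a long interval requires many consecutive event-free days, each of which is survived with the same probability.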
Testing the REH
The Moving-Window Method
In order to test the distribution predicted by the REH, one must first develop a proper counting scheme for the time intervals τ between exact-repeat events. Ideally, one would watch each burgled house within the city of interest until it is burgled again, and simply mark the time to repeat. However, this is clearly infeasible, as many homes will not be burgled again during a reasonable observation period. In fact, for our Long Beach data set, out of the 7,944 unique locations burgled, only 942 of them were burgled more than once. If we were to only use the time intervals from these relatively few locations, we would likely introduce a bias into our count because we would be systematically discarding many time intervals which were at longer timescales and were, therefore, never observed.
To count properly, then, we use a method that we will call here the moving-window method. The basic idea behind this method is to first choose a time window of interest, τmax, and then to observe after each burglary event whether or not another event occurs at that same location within this time window; let us use as an example a τmax of 731 days for our Long Beach data set. If an event does indeed occur, the time interval τ between the initial and secondary event is noted. Of course, any event which occurs within the last τmax days for which we have data cannot be subsequently watched over the full τmax window, as some of the window would clearly lie outside of the dates for which we have data. Therefore, we do not perform our observation following these events. We call this final τmax period within our data the “buffer interval”, which corresponds to the years 2004 and 2005 in our example. The final output of the count consists of the number of events No for which an observation was performed (this is equal to the total number of events in our dataset minus the number of events that occur within the buffer interval) and a list of time intervals observed. Note that the number of time intervals recorded will in general not be No, since not every home that is observed will be subject to another burglary within our τmax window, as discussed above. Finally, we make a histogram of the observed τ, dividing the frequency for each histogram bin by No to arrive at a probability distribution that we can directly compare to Eq. 7. It is in this way that the homes burgled only once affect our count: they contribute no time intervals, but they do increase No and thereby influence the probabilities.
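The counting procedure just described can be sketched in code. This is an illustrative reading of the text, not the authors' implementation; the input layout (a dictionary from home identifier to burglary days) and the function names are assumptions, and the example data are invented.

```python
from collections import Counter

def moving_window_count(events_by_home, last_day, tau_max):
    """Return (N_o, intervals) for a moving-window count.

    events_by_home: dict mapping a home id to a list of burglary days (ints).
    last_day: final day covered by the data.
    tau_max: window length in days; events in the final tau_max days
             (the buffer interval) are not used as starting points.
    """
    n_observed = 0
    intervals = []
    for days in events_by_home.values():
        days = sorted(days)
        for i, day in enumerate(days):
            if day + tau_max > last_day:
                continue  # inside the buffer interval: a full window cannot be watched
            n_observed += 1
            # Only the next event at the same home can close this interval.
            if i + 1 < len(days) and days[i + 1] - day <= tau_max:
                intervals.append(days[i + 1] - day)
    return n_observed, intervals

def interval_probabilities(n_observed, intervals, bin_width=1):
    """Histogram of the intervals with each bin count divided by N_o."""
    bins = Counter(tau // bin_width for tau in intervals)
    return {b * bin_width: c / n_observed for b, c in sorted(bins.items())}

# Toy data (hypothetical): three homes, data ending on day 1000, window of 100 days.
toy = {"a": [10, 25, 900], "b": [40], "c": [300, 1000]}
n_o, taus = moving_window_count(toy, last_day=1000, tau_max=100)
print(n_o, taus, interval_probabilities(n_o, taus))
```

In the toy data, home "b" contributes no interval but still raises N_o, which is how once-burgled homes enter the probabilities.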
Along with the observed τ distribution, we have plotted in Fig. 1 (the solid line) a curve of the type shown in Eq. 7 with parameters chosen to give the best fit to our data. Using N = 3, we find the best fit to be w1 = 0.915, λ1 = 5.32 × 10−5, w2 = 0.066, λ2 = 2.45 × 10−3, w3 = 0.019, and λ3 = 8.41 × 10−2. The choice of N = 3 was made simply because this was the smallest value for which a good fit of the curve to our data could be found; both the N = 1 and N = 2 curves deviate too substantially from our data. For this choice, though, the REH curve fits our data rather well.
On the basis of this analysis, one might conclude that there is no event dependence effect in our data, since our observations of the distribution of τ are completely consistent with the REH, with the spatial heterogeneity of risk described by the wi and λi used in the fit. Specifically, 91.5% of the houses have a time-invariant risk of being burgled once every 51 years, 6.6% of the houses have a higher time-invariant risk of being burgled once every 1.1 years, and 1.9% of the houses have an even higher time-invariant risk of being burgled once every 11.8 days (see Eqs. 11 and 12 below). However, one can perform further analyses that cast doubt upon this conclusion. If the distribution of risk and the rates for each were indeed as given by our wi and λi above, the distribution of house order would be very different than what is observed: using 70,000 for the total number of single family homes in Long Beach, these parameters predict around 6,800 order one, 700 order two, 600 order three, 700 order four, 800 order five, 700 order six, 500 order seven, and many hundreds of homes with order greater than seven. It is clear from this subsequent analysis that spatial risk heterogeneity is not an accurate description of what is happening in Long Beach.
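The predicted house-order distribution can be checked with a short calculation. The sketch below assumes the heterogeneity is a persistent mixture of three Poisson rates with the fitted wi and λi, 70,000 homes as in the text, and an observation period of 2,192 days (the calendar length of 2000–2005, which is an assumption here); `reh_order_counts` is a hypothetical name.

```python
import math

# Fit parameters quoted in the text: (weight w_i, rate lambda_i in burglaries per day).
FIT = [(0.915, 5.32e-5), (0.066, 2.45e-3), (0.019, 8.41e-2)]

def reh_order_counts(n_homes, n_days, fit, orders):
    """Expected number of homes of each order for a persistent mixture of Poisson rates."""
    counts = {}
    for k in orders:
        total = 0.0
        for w, lam in fit:
            mu = lam * n_days  # expected number of burglaries at such a home
            # Work in log space so that large mu and k stay numerically stable.
            log_p = -mu + k * math.log(mu) - math.lgamma(k + 1)
            total += w * math.exp(log_p)
        counts[k] = n_homes * total
    return counts

# 70,000 homes is the figure used in the text; 2,192 days is an assumption.
predicted = reh_order_counts(70_000, 2192, FIT, range(1, 8))
for k, v in predicted.items():
    print(k, round(v))
```

With these inputs the counts for orders four through seven are each in the hundreds, against observed counts of 19, 5, 0, and 1, which is the discrepancy the paragraph points to.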
The Fixed-Window Method
Although the moving-window counting method is a valid approach, its corresponding null hypothesis curve as derived through the REH contains a large number of parameters, making it difficult to compare to observations in a meaningful way. In addition, as shown above, even if parameters can be chosen such that the REH curve fits the data very well, further calculations are needed to interpret these results. In order to more easily determine the validity of the REH, we develop a counting method for which the null hypothesis curve is completely parameter free and that can by itself definitively confirm or deny the REH; we term this the fixed-window method. We first remind the reader that each home appearing within our data set can be classified by the number of times it was burgled in total over the D days of data available; we refer to this as the order of the house. The probability of any given home with burglary rate λi being of order k is given by Eq. 2, replacing δt with D. Note, however, that Eq. 2 is independent of the particular times at which the home was burgled, so long as there were a total of k events. This means, for example, that for order one homes, each of the D days is equally likely to be the day on which the one event occurred, assuming that λi is persistent in time (i.e., seasonality is ignored and there is no event dependence). Similarly, for order two homes, each possible pair of days that can be made from our D day interval is equally likely to be the observed pair.
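For order two homes, the "equally likely pairs of days" argument leads to a parameter-free curve, which can be sketched as follows. The paper's Eq. 10 is not reproduced in this text, so the treatment of same-day pairs and the normalization below are assumptions: the two events are simply placed independently and uniformly on D days.

```python
def fixed_window_null(n_days):
    """P(tau) for two events placed independently and uniformly on n_days days."""
    probs = [1.0 / n_days]  # tau = 0: both events fall on the same day
    for tau in range(1, n_days):
        # (n_days - tau) unordered day pairs have gap tau; each can arise in two ways.
        probs.append(2.0 * (n_days - tau) / n_days**2)
    return probs

null = fixed_window_null(2192)  # 2,192 days is an assumed value of D
print(sum(null))                # should equal 1 up to rounding
print(null[1] / null[1096])     # short gaps are favored even without event dependence
```

The curve falls linearly in τ and contains no fitted quantities, so any excess of short intervals above it cannot be absorbed by adjusting parameters.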
Modeling Event Dependence
One may wonder why the time interval data seemed to agree with the REH when using the moving-window counting method, but clearly contradicts it when using the fixed-window method. The answer to this question is that the moving-window curve defined by Eq. 7 is not unique to the REH. An alternate model, based on the idea of event dependence, that would present the same moving-window curve can be described in the following way. Suppose that there exist some number N of different risk states that any given home can exhibit. Associated with each of these states i is a burglary rate λi and a weight wi; these wi form a probability distribution for the rates λi in a sense that shall be described shortly. Now, imagine that we have a large number of homes, each of which is initially assigned one of the rates λi, with the probability distribution for this assignment being the weights wi. For each home, then, time goes by until that home is subject to a burglary event; the time elapsed will clearly depend upon which of the rates λi was initially assigned to the home. As the critical step, after this (and each subsequent) burglary event, the home is again assigned a rate λi according to the probability distribution wi, thus leading to the possibility of a change in the Poisson rate at that home. This process repeats indefinitely, with a state assigned via the wi distribution after each burglary event. Data generated via this process will lead to a moving-window curve as given in Eq. 7, but will not yield the fixed-window curve derived in Eq. 10. This is because the Poisson rate for any given house is no longer, in general, persistent over all time (thus altering the null hypothesis curve for the fixed-window method), though it is persistent between any two events at a home (giving it the same null hypothesis curve as the REH for moving-window counts). 
Hence, this event dependence based model is a good candidate for the process that is in reality generating our empirically observed burglary events.
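The rate-reassignment process described above can be simulated directly. The following is a minimal sketch under stated assumptions, not the authors' simulation: it uses the fitted wi and λi, continuous time, a 2,192-day window, and an arbitrary number of homes, and it draws the initial state from the wi as the text describes (no burn-in period).

```python
import random

FIT = [(0.915, 5.32e-5), (0.066, 2.45e-3), (0.019, 8.41e-2)]  # (w_i, lambda_i per day)

def simulate_home(n_days, fit, rng):
    """Burglary times at one home whose rate is redrawn after every event."""
    weights = [w for w, _ in fit]
    rates = [lam for _, lam in fit]
    times = []
    t = 0.0
    lam = rng.choices(rates, weights=weights)[0]  # initial state
    while True:
        t += rng.expovariate(lam)  # waiting time at the current rate
        if t >= n_days:
            return times
        times.append(t)
        lam = rng.choices(rates, weights=weights)[0]  # reassign the state after the event

rng = random.Random(0)
orders = {}
for _ in range(20_000):  # number of simulated homes: arbitrary choice
    k = len(simulate_home(2192, FIT, rng))
    orders[k] = orders.get(k, 0) + 1
print(sorted(orders.items()))
```

Because the rate is constant between consecutive events, intervals collected from such data follow the same sum of exponentials as under the REH, while the rate at a given home is no longer persistent over the whole window.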
The preceding model also has a simple criminological interpretation. Suppose N = 2, meaning that a home can exhibit two states, which we can refer to as the baseline state (that state with the smaller λ1) and the excited state (that state with the larger λ2). After any burglary event, the house will, with probability w1, be assigned the baseline state, meaning that the offender who just burgled the home has no specific intention of returning there in the future (though he or another burglar of course may return simply by chance). Alternatively, the house will, with probability w2, be assigned the excited state, meaning that the offender, possibly for reasons such as those described in the introduction, plans on returning to this specific home again in a relatively short period of time. Any number of possible states N greater than two simply corresponds to more possible levels of excitement for returning offenders, who may range from only slightly interested to extremely interested in returning. Indeed, the good fit between Eq. 7 and our Long Beach data for N = 3 (see Fig. 1) suggests that in Long Beach, a house might be assigned one of three possible states after an event: the baseline state or one of two excited states.
For Long Beach, this predicts that at any given time, 99.842% of homes are in the baseline state, 0.156% are in the first excited state, and the remaining 0.002% are in the second excited state.
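The equations behind these percentages are not reproduced in this text. One plausible reading, offered only as an assumption, is that the long-run fraction of homes in state i is proportional to wi/λi: the chance of entering the state multiplied by the mean time spent there before the next event.

```python
FIT = [(0.915, 5.32e-5), (0.066, 2.45e-3), (0.019, 8.41e-2)]  # (w_i, lambda_i per day)

def occupancy_fractions(fit):
    """Long-run fraction of homes in each state, taken as proportional to w_i / lambda_i."""
    dwell = [w / lam for w, lam in fit]  # entry probability times mean time per visit
    total = sum(dwell)
    return [d / total for d in dwell]

fractions = occupancy_fractions(FIT)
for i, f in enumerate(fractions, start=1):
    print(f"state {i}: {100 * f:.4f}%")
```

Under this reading the first two fractions agree with the quoted values to three decimal places, while the third comes out slightly below the quoted 0.002%; the difference may simply reflect rounding so that the percentages sum to 100.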
A near-repeat event occurs whenever two “nearby” houses are burgled within some period of time. As with exact-repeats, we can measure the time interval that lies between each event in a near-repeat pair, but in this case we must also make note of the physical separation of the two homes. This procedure allows us to examine separately the time interval histogram for near-repeat pairs that lie at varying physical distances from each other. It has been noted in previous studies that those near-repeat events that are relatively close in space tend to occur more closely in time as well, like exact-repeats, whereas those that are far apart seem to exhibit no temporal correlation. These previous studies use Monte Carlo algorithms to find the likelihood of the observed patterns happening if there were no correlation between the spatial and temporal distributions (Johnson et al. 2007; Ratcliffe and Rengert 2008), determining that this is highly unlikely. In this section, we instead test explicitly for near-repeat event dependence by extending our fixed-window counting method used above for exact-repeats to the case of near-repeat events in our Long Beach data.
To test for the presence of near-repeat event dependence, we first isolate in our data all order one homes. We then perform a fixed-window count on these events in a pair-wise fashion, measuring both the temporal separation and physical distance between the burglaries comprising each possible pair of events. Note that this is essentially the same procedure as was performed for the exact-repeats earlier, except that in that case the physical distances were all zero and we used order two homes rather than order one. The fact that order one homes are approximately equally likely to be burgled on any day of our fixed interval means that the time intervals for near-repeat events should be distributed exactly as in Eq. 10 if no correlation between the two burglaries making up a pair exists (i.e., if there is no event dependence). This is because, since each of the homes is equally likely to be burgled on any given day, each of the possible pairs of days making up a near-repeat event ought to be equally likely as well, which is the condition that leads directly to Eq. 10 (see Footnote 1).
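The pair-wise count over order one homes can be sketched as below. The data layout (planar coordinates in metres plus a day index) and the function name are assumptions made for illustration, and no adjustment of the counts at nonzero intervals is applied here.

```python
import math
from itertools import combinations

def near_repeat_pairs(events, min_dist, max_dist):
    """Time gaps (in days) for all event pairs separated by [min_dist, max_dist) metres.

    events: list of (x, y, day) tuples, one per order-one home; a planar
            coordinate system in metres is assumed.
    """
    gaps = []
    for (x1, y1, d1), (x2, y2, d2) in combinations(events, 2):
        dist = math.hypot(x1 - x2, y1 - y2)
        if min_dist <= dist < max_dist:
            gaps.append(abs(d1 - d2))
    return gaps

# Toy data (hypothetical): two homes 50 m apart and one about 4 km away.
toy_events = [(0.0, 0.0, 10), (50.0, 0.0, 12), (3950.0, 0.0, 700)]
print(near_repeat_pairs(toy_events, 0, 100))        # pairs in the nearest distance band
print(near_repeat_pairs(toy_events, 3900, 4000))    # pairs in a distant band
```

A histogram of each returned list, built separately for each distance band, is what would be compared against the fixed-window null curve. The loop visits every pair, so its cost grows with the square of the number of events.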
Discussion and Conclusions
Exact-repeat and near-repeat burglary patterns are often explained in terms of event dependence, in which burglars prefer targets with which they are familiar over targets that necessitate acquiring new information (Bernasco and Nieuwbeerta 2005). The daily routines of the residents may be known, as are the specific items that might be stolen, in a house that an offender has previously victimized. The risk of burglary is thought to spread to neighbors for many of the same reasons: even if the specific routines of residents, the goods to be stolen, or the general layouts of adjacent houses are not known exactly, they are likely to be similar. It is easy therefore to map a burglary strategy or set of cognitive scripts from one house to another (Wright and Decker 1994). Moreover, the expectation is that the risk of near-repeat burglary will spread more readily where housing types and the everyday routines of residents are spatially more homogeneous (Townsley et al. 2003; Johnson et al. 2007).
In this paper we have taken a new approach toward measuring and modeling event dependent repeat burglary effects, emphasizing the mathematical derivation of a null hypothesis model for the time interval between repeat burglaries. This model, termed the REH, assumes that burglaries at individual houses (or within some spatial neighborhood) are statistically independent events occurring at characteristic rates, and that persistent spatial heterogeneity of rates is solely responsible for any repeat victimization present. Equation 7 gives the probability of observing different time intervals between burglaries under the REH. We then empirically tested the REH using two different methods for counting burglaries. The moving-window method involves defining a temporal buffer equal to the maximum waiting time of interest, τmax, between burglaries. The buffer ensures that we will count repeats only for houses burgled on days that can be subsequently observed for the full τmax period. Using this counting method, the REH predicts that the probability distribution for the observed time intervals between exact-repeat burglaries decreases exponentially. Evidence from moving-window counting of residential burglaries in Long Beach, California is consistent with this formulation. This result is significant in suggesting that a decay-like distribution of waiting times between burglaries, measured using a moving-window counting procedure, is not necessarily indicative of a correlation between burglary events nor a preference for offenders to return to sites previously victimized (i.e., event dependence). However, further calculations based upon the moving-window count indicate that, though the time interval observations are consistent with the REH null hypothesis curve, the heterogeneity parameters derived from this count cannot explain the correct distribution of repeats within our data.
Because the moving-window counting method has a large number of parameters and does not seem to definitively test the REH, we explored an alternative method based on counting burglaries within a fixed time window. The resulting model gives the probability distribution of time intervals between events for houses burgled two or more times (see Eq. 10) within a set period of time. The fixed-window counting procedure also predicts a decreasing probability of observing longer time intervals between burglaries, but in the case of houses victimized twice the relationship is linear with no free parameters (Eq. 9). Long Beach exact-repeat burglaries counted using the fixed-window method are inconsistent with the REH. Repeat burglaries appear to be much more likely to occur in a short time interval after a first event than is expected given the REH, and it is clear from this counting method that burglary rates cannot be persistent in time. This result is significant in that it supports the view that there are temporal correlations between burglary events and that offenders do preferentially return to previously victimized homes within a short period of time after an initial burglary.
Surprisingly, perhaps, different procedures for counting exact-repeat burglary events lead to different and seemingly contradictory test outcomes. However, it is possible to interpret Eq. 7 in a way such that both counting procedures yield results that are consistent with event dependent repeat burglary. Specifically, rather than referring to a persistent environmental heterogeneity in burglary risk, Eq. 7 may be taken as a description of individual houses stochastically transitioning between states of varying burglary risk following each burglary event. The probability of adopting state i after any event is given by wi and the corresponding burglary rate for that state is λi. Using parameters fit to our Long Beach data, for example, a house in state i = 1, with a baseline burglary rate of λ1 = 5.32 × 10−5 burglaries per day may, after an event, adopt an excited state i = 2 with probability w2 = 0.066 and burglary rate λ2 = 2.45 × 10−3 burglaries per day. We expect a repeat burglary to occur at this house on average within 408 days, which is 46 times faster than the expected interval between burglaries at houses that remain in the state i = 1. Similarly, we expect a house to adopt the even more excited state i = 3 with probability w3 = 0.019 and a characteristic burglary rate of λ3 = 8.41 × 10−2 burglaries per day. In this case, the expected time interval to a repeat burglary at the house is approximately 12 days, roughly 1,580 times faster than a house in the baseline state, and about 34 times faster than a house in the intermediate excited state. We have not specified the exact causes behind transition between different states, though it is reasonable to infer that excited states (greater risk of repeat burglary) in some way correspond to burglars’ preferences to return to previously victimized houses, as discussed previously.
Overall, the models and results presented here confirm that event dependence in the exact-repeat burglary effect exists and can be quantitatively distinguished from the REH using appropriate counting procedures. A similar conclusion is drawn when examining near-repeat burglary events, which are significantly different from a corresponding null hypothesis at short distances between events (e.g., 0–100 m), but are indistinguishable from the null hypothesis at greater distances (e.g., 3.9–4.0 km).
Most importantly, the general rate transition model described by Eq. 7 and our approach to testing for the presence of this behavior using a fixed-window counting procedure (Eq. 10) are potentially applicable to other types of repeat victimization. For example, we might hypothesize that the individuals in a given population may each be at risk of being a victim of a violent crime and, subsequent to being victimized for a first time, adopt according to some probability distribution one of several possible states, each with its own characteristic victimization rate. Some individuals will fall into a baseline group after victimization, while others might be more at risk after being victimized (Lauritsen and Quinet 1995). Similarly, Ratcliffe and Rengert (2008) analyze shootings in Philadelphia and find an elevated risk of near-repeat shootings occurring within 2 weeks and within one city block of previous incidents. Our model would suggest that the near-repeat effect observed by Ratcliffe and Rengert stems from the area around a previous shooting event transitioning to an excited state characterized by a higher rate of shooting events.
The results here reinforce the view that repeat and near-repeat victimization may play a role in the nucleation of crime patterns in space and time and, as a consequence, may be an appropriate basis for designing crime prevention strategies. However, we also note that there are a number of challenges yet to meet in designing optimized responses to repeat crimes. In particular, our results from analyses of exact-repeat burglaries in Long Beach suggest that at any given time only about 0.002% of houses exhibit the highest excited state (Eqs. 11 and 12) with an expected time to a repeat event of approximately 12 days. This corresponds to only about 1 single family residence from a total of approximately 70,000 units. The challenge is to determine which house(s) belong to this very small set, which would allow preferential targeting of resources at these locations.
Footnote 1: Actually, there is a slight subtlety involved in this step that we should explain. To be precise, each pair of days d1 and d2 in which d1 ≠ d2 is equally likely, and all of those pairs in which d1 = d2 are half as likely as that. This is due to the fact that each non-identical d1, d2 pair can be made in two ways: d1 at house 1 and d2 at house 2, or vice versa. In practice, we sidestep this issue by simply dividing the counted number of near-repeats by two if the time interval is greater than zero, so that our null hypothesis curve for near-repeats counted this way is continuous and precisely that of Eq. 10.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- Bowers KJ, Hirschfield A, Johnson SD (1998) Victimization revisited: a case study of non-residential repeat burglary on Merseyside. Br J Criminol 38:429–452
- Eck JE, Chainey S, Cameron JG, Leitner M, Wilson RE (2005) Mapping crime: understanding hot spots. National Institute of Justice/NCJRS, Washington, DC
- Farrell G, Pease K (1993) Once bitten, twice bitten: repeat victimization and its implications for crime prevention. Crime Prevention Unit Paper 46. Home Office, London
- Farrell G, Pease K (1994) Crime seasonality: domestic disputes and residential burglary in Merseyside 1988–90. Br J Criminol 34:487–498
- Farrell G, Pease K (eds) (2001) Repeat victimization. Criminal Justice Press, Monsey
- Farrell G, Phillips C, Pease K (1995) Like taking candy: why does repeat victimization occur? Br J Criminol 35:384–399
- Farrell G, Bowers KJ, Johnson SD, Townsley M (eds) (2007) Imagination for crime prevention. Crime prevention studies, vol 21. Criminal Justice Press, Monsey
- Feller W (1968) An introduction to probability theory and its applications, 3rd edn., vol 1. Wiley, New York
- Johnson SD, Bowers KJ, Hirschfield AFG (1997) New insights into the spatial and temporal distribution of repeat victimization. Br J Criminol 37:224–241
- Townsley M, Homel R, Chaseling J (2003) Infectious burglaries. A test of the near repeat hypothesis. Br J Criminol 43:615–633
- Wright RT, Decker SH (1994) Burglars on the job: streetlife and residential break-ins. Northeastern University Press, Boston