Searching for clusters of targets under stochastic resetting

We consider diffusion under stochastic resetting to the origin in one dimension and compute the mean time to find both of two targets placed either side of the origin. A surprising result is that increasing the distance between two targets can decrease the overall search time. We compute the optimal arrangement of two targets in limiting cases. We generalise to obtain recursive expressions for the mean time to find all of multiple targets. We discuss the relevance to real-world problems of locating multiple targets such as proteins locating clusters of DNA lesions.


Introduction
It has long been appreciated that search processes in biology are sped up by the use of search strategies which include long-range as well as short-range moves [1]. For example, in order that vital biological functions occur, proteins must locate binding sites on DNA rapidly in order to trigger various transcription processes. In fact they locate sites up to 1000 times faster than expected for a diffusion-controlled process [2]. The first theories that attempted to account for these fast search times proposed facilitated diffusion as the search mechanism [3][4][5]. This mechanism reduces the dimensionality of the search by splitting it into 1D and 3D components: the proteins search in 1D along a strand of DNA to which they are loosely bound and then dissociate, diffuse in 3D and re-associate at another non-specific site, which may be far along the strand, to continue the 1D search.
More recently, search processes have been of interest within the statistical physics community and different classes of search strategies have been identified, see e.g. [6][7][8][9]. Different specific search problems may have different protocols, but they share the desire for an optimal search strategy. In an intermittent search (see [10] for a review) there are local steps, which effect searching, interspersed with long-distance relocations. Such a combination of moves is believed to be beneficial in a wide range of biological behaviours, from animal foraging to target search of proteins on DNA molecules [11][12][13].
e-mail: m.evans@ed.ac.uk
A simple model for searching with short-range and long-range moves is diffusion with stochastic resetting [14]. Here short-range foraging is modelled by diffusion of the searcher and long-range relocations by instantaneous 'reset events'. In the simplest framework the resetting of the searcher is to a fixed resetting location but a distribution of resetting sites may also be considered [15]. It has been shown that the mean time for the searcher to locate a fixed target is significantly improved by the inclusion of resetting. Indeed, for a purely diffusive process the mean time to locate a target diverges whereas it is rendered finite by the introduction of a resetting process. Moreover, by tuning the resetting rate, one may optimise the mean time to locate a target. The resetting paradigm has been explored in a number of other contexts including restarting of stochastic algorithms [16] and complex chemical reactions [17]; see [18] for a recent review. Recently, experiments in optical traps have measured the optimal mean time for a diffusing Brownian particle to reach a target under resetting [19,20].
Diffusion with stochastic resetting has the appealing feature that many properties may be analysed exactly. Mostly, this has been carried out in one spatial dimension [14] but one can easily extend some results to higher spatial dimensions [21]. Typically, one focusses on the mean time for a searcher to find a target, at which point the searcher is absorbed. Some works have considered multiple targets [22,23,25,26,27]. However, in some applications it may happen that there are a number of targets, distributed in some manner, and the goal is to locate all of the targets. A real-world context is that of proteins searching for DNA lesions [28]. DNA lesions need to be located by proteins quickly in order to be fixed. If there are many DNA lesions and proteins are successfully finding just one of them, then the bulk of the tissue will remain damaged, so a successful search must try to locate them all. Moreover, if the tissue has been exposed to a large dose of ionising radiation which creates the DNA lesions, the lesions may exist in clusters [29][30][31]. In order to address the problem of locating each and every one of a cluster of targets, the logical first step is to consider a searcher looking for two targets, rather than just one.
In this paper, inspired by the problem of locating all of a cluster of targets, we consider the problem of diffusion with resetting with a single searcher and the mean time to locate multiple targets in one dimension. We find exact expressions for the mean time to multiple absorption. An interesting emergent effect is that the presence of multiple targets, combined with resetting after locating each of them, can lead to a reduction in the mean time to find a single target. This counterintuitive result illustrates that co-operative effects may arise from having a cluster of targets to find.
The paper is organised as follows. In section 2 we define the model, beginning with two targets on either side of a resetting site, and set up the formalism for the computation of the mean time to double absorption. In section 3 we present results for the Laplace transform of the survival probability and the mean time to double absorption. In section 4 we generalise these results to an arbitrary number of targets and present a recursive formula for the mean time to multiple absorptions. In section 5 we conclude with some comments on how the model may be improved with respect to the real-world problem of modelling proteins locating DNA lesions.

Model definition: One Searcher, Two Targets
We consider a diffusive particle (searcher) with diffusion constant $D$ moving on a one-dimensional lattice. With rate $r$ the searcher is reset instantaneously to the origin. We consider one target at $x_l < 0$ and the other at $x_r > 0$. When the searcher touches a target for the first time, the target is absorbed and the searcher is instantaneously restarted at the origin. It is important to note that once either $x_r$ or $x_l$ is located, it is no longer an absorbing target, i.e. if the searcher touches the target again, the searcher is not restarted. The search is completed when both targets have been located.
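The rules above can be sketched as a small Monte Carlo simulation. This is only an illustrative sketch, not the method used in the paper: it assumes a continuous-space Euler discretisation with time step `dt`, and the function names, the step size and the number of runs are all hypothetical choices. The finite time step introduces a small upward bias in the estimated time.

```python
import math
import random


def simulate_double_absorption(x_l, x_r, r, D, dt=1e-3, rng=None):
    """Sample one time to find both targets (illustrative Euler scheme)."""
    if rng is None:
        rng = random.Random()
    sigma = math.sqrt(2.0 * D * dt)  # standard deviation of one diffusive step
    p_reset = r * dt                 # reset probability per step (assumes r*dt << 1)
    left_found = right_found = False
    x, t = 0.0, 0.0
    while not (left_found and right_found):
        t += dt
        if rng.random() < p_reset:
            x = 0.0                  # stochastic reset to the origin
            continue
        x += rng.gauss(0.0, sigma)
        if not right_found and x >= x_r:
            right_found = True       # target located: it stops being absorbing
            x = 0.0                  # searcher restarts at the origin
        elif not left_found and x <= x_l:
            left_found = True
            x = 0.0
    return t


def estimate_mtda(x_l, x_r, r, D, n_runs=400, seed=0):
    """Average the sampled double-absorption times over n_runs trajectories."""
    rng = random.Random(seed)
    total = sum(simulate_double_absorption(x_l, x_r, r, D, rng=rng)
                for _ in range(n_runs))
    return total / n_runs


mtda_estimate = estimate_mtda(x_l=-1.0, x_r=1.0, r=2.0, D=1.0)
print(f"estimated MTDA: {mtda_estimate:.2f}")
```

With these parameters the estimate should land in the neighbourhood of the exact mean time discussed in section 3, up to sampling noise and discretisation bias.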
We note that a related problem of diffusion with resetting on a one-dimensional domain with absorbing boundaries has been considered in [23,24] where the statistics of the time to be absorbed by either boundary were considered.
Our goal is to find the mean time to find both targets (mean time to double absorption, MTDA). For usual first-passage problems there are a variety of approaches to calculating survival probabilities and mean first-passage times: the forward Fokker-Planck equation, the backward Fokker-Planck equation and renewal equations (see e.g. [18,32,33]). Here we find it most convenient to use the first approach.
There are three relevant survival probability densities to consider:
$q(x,t)$: the probability density of finding the searcher at position $x$ at time $t$ with it not having touched either $x_r$ or $x_l$.
$q_r(x,t)$: the probability density for the searcher to be at position $x$ at time $t$ and it having touched the target at $x_r$ but not $x_l$.
$q_l(x,t)$: the probability density for the searcher to be at position $x$ at time $t$ and it having touched the target at $x_l$ but not $x_r$.
The survival probability density, $q(x,t)$, satisfies the forward master equation, in which the spatial variable $x$ is the position after time $t$:
$$\frac{\partial q(x,t)}{\partial t} = -r\,q(x,t) + D\,\frac{\partial^2 q(x,t)}{\partial x^2} + r\,\delta(x)\int_{x_l}^{x_r} q(x',t)\,dx', \tag{1}$$
with boundary conditions
$$q(x_l,t) = q(x_r,t) = 0. \tag{2}$$
The initial condition is
$$q(x,0) = \delta(x), \tag{3}$$
since the searcher begins at the origin at $t = 0$. The first term on the right hand side of (1) represents a loss of probability at $x$ due to resetting and the third term on the right hand side indicates a gain of probability at the origin due to resetting from all $x$ within the domain. The boundary conditions (2) correspond to absorption of the searcher when it touches $x_r$ or $x_l$. One can write down analogous but more involved equations for $q_r(x,t)$ and $q_l(x,t)$, but as we shall see we do not need to explicitly compute these quantities.
The rate at which the search is completed when the second of the targets is found (with one target already having been found) is the sum of the rate of locating $x_l$ once $x_r$ has already been found and the rate of locating $x_r$ once $x_l$ has already been found. These two rates are given by the diffusive currents from $q_r$ at $x_l$ and from $q_l$ at $x_r$, respectively. Thus, the rate at which the second of the targets is located, and the search completed, can be written as
$$F(t) = D\left.\frac{\partial q_r(x,t)}{\partial x}\right|_{x=x_l} - D\left.\frac{\partial q_l(x,t)}{\partial x}\right|_{x=x_r}. \tag{4}$$
Now, the rate of locating the final target may also be written as the negative rate of change of the total survival probability,
$$F(t) = -\frac{dQ_{\rm tot}(t)}{dt}, \tag{5}$$
where
$$Q_{\rm tot}(t) = \int_{x_l}^{\infty} q_r(x,t)\,dx + \int_{-\infty}^{x_r} q_l(x,t)\,dx + \int_{x_l}^{x_r} q(x,t)\,dx \tag{6}$$
is the total probability that the searcher has not yet touched both targets. Note that this survival probability contains three terms: the probability of having touched $x_r$ but not $x_l$; the probability of having touched $x_l$ but not $x_r$; and the probability of having touched neither target. The limits on the integrals on the right hand side of (6) are distinct because for each survival probability density the allowed $x$ domain is different, e.g. for $q_r(x,t)$, $x$ can vary from $x_l$ to $\infty$ since once $x_r$ has been touched it is no longer absorbing.
In order to obtain the MTDA, the standard approach for mean first-passage time calculations would be to average the time to double absorption over the rate of absorption, $F(t)$, to obtain
$$T_2 = \int_0^\infty t\,F(t)\,dt = \int_0^\infty Q_{\rm tot}(t)\,dt \tag{7}$$
after integrating by parts, assuming the survival probabilities decay faster than $1/t$. We use the notation $T_2$ to emphasize that it is the mean time to find both targets. Defining the Laplace transforms of the survival probabilities as
$$\tilde{q}(x,s) = \int_0^\infty e^{-st}\,q(x,t)\,dt, \qquad \tilde{q}_r(x,s) = \int_0^\infty e^{-st}\,q_r(x,t)\,dt, \qquad \tilde{q}_l(x,s) = \int_0^\infty e^{-st}\,q_l(x,t)\,dt, \tag{8}$$
equation (7) can be written
$$T_2 = \int_{x_l}^{\infty} \tilde{q}_r(x,0)\,dx + \int_{-\infty}^{x_r} \tilde{q}_l(x,0)\,dx + \int_{x_l}^{x_r} \tilde{q}(x,0)\,dx.$$
The standard approach would therefore be to compute the Laplace transforms (8), through which the MTDA will ultimately be found.
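The integration by parts behind (7) can be spelled out as follows. This is a sketch using only the relation between the completion rate $F(t)$ and the total survival probability $Q_{\rm tot}(t)$ stated in the text, together with $Q_{\rm tot}(0) = 1$; the boundary term vanishes under the stated assumption that $Q_{\rm tot}$ decays faster than $1/t$.

```latex
\begin{align}
T_2 &= \int_0^\infty t\,F(t)\,dt
     = -\int_0^\infty t\,\frac{dQ_{\rm tot}(t)}{dt}\,dt \\
    &= -\Big[\,t\,Q_{\rm tot}(t)\Big]_0^\infty + \int_0^\infty Q_{\rm tot}(t)\,dt
     = \int_0^\infty Q_{\rm tot}(t)\,dt .
\end{align}
```

Since $\int_0^\infty Q_{\rm tot}(t)\,dt$ is the Laplace transform of $Q_{\rm tot}$ evaluated at $s = 0$, only the $s = 0$ transforms of the three densities are needed.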
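To shorten the calculation, we will take advantage of the known result for $T_1(X_r)$, the mean time to absorption for diffusion under resetting to the origin with a single target at $X_r$ [14],
$$T_1(X_r) = \frac{e^{\alpha_0 X_r} - 1}{r}, \tag{12}$$
where $\alpha_0 = \sqrt{r/D}$. For our case the MTDA, $T_2(x_l, x_r)$, can be written in terms of the probabilities $P_r$, $P_l$ to find the right, left target first, the mean times $T_r$, $T_l$ to find the corresponding target, conditioned on that target being found first, and the mean time to find a single target, once the other target has been eliminated:
$$T_2(x_l,x_r) = P_r\left[T_r + T_1(|x_l|)\right] + P_l\left[T_l + T_1(x_r)\right] = \tau(x_l,x_r) + P_r\,T_1(|x_l|) + P_l\,T_1(x_r), \tag{14}$$
where we have defined
$$\tau(x_l,x_r) = P_r\,T_r + P_l\,T_l, \tag{15}$$
which is the mean first-passage time to either of the targets at $x_r$, $x_l$. We note that $P_r$, $P_l$ are referred to in the literature as splitting probabilities and have been studied in [23,34]. The relation (15) was used in [23]. Thus (14) reads that the mean time to double absorption is the mean time to the first absorption plus the average of the mean time for the second absorption, weighted according to the probabilities of which absorption occurs first.
The quantities $\tau$, $P_r$, $P_l$ appearing in (14) only require the knowledge of $\tilde{q}(x,0)$. To see this note that
$$\tau = \int_0^\infty dt \int_{x_l}^{x_r} q(x,t)\,dx = \int_{x_l}^{x_r} \tilde{q}(x,0)\,dx, \tag{16}$$
and $P_l$ is given by the integral of the absorption rate at target $x_l$,
$$P_l = D\int_0^\infty \left.\frac{\partial q(x,t)}{\partial x}\right|_{x=x_l} dt = D\left.\frac{\partial \tilde{q}(x,0)}{\partial x}\right|_{x=x_l}. \tag{17}$$
Similarly,
$$P_r = -D\left.\frac{\partial \tilde{q}(x,0)}{\partial x}\right|_{x=x_r}. \tag{18}$$
Our task is therefore to compute $\tilde{q}(x,0)$. For completeness we compute in the appendix the full Laplace transform $\tilde{q}(x,s)$.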

Laplace transform of survival probability $\tilde{q}(x,s)$
The Laplace transform of equation (1) is
$$s\,\tilde{q}(x,s) - \delta(x) = -r\,\tilde{q}(x,s) + D\,\frac{\partial^2 \tilde{q}(x,s)}{\partial x^2} + r\,\delta(x)\int_{x_l}^{x_r} \tilde{q}(x',s)\,dx'. \tag{19}$$
Rearranging yields
$$D\,\frac{\partial^2 \tilde{q}(x,s)}{\partial x^2} - (r+s)\,\tilde{q}(x,s) = -E\,\delta(x), \tag{20}$$
where
$$E = 1 + r\int_{x_l}^{x_r} \tilde{q}(x',s)\,dx'. \tag{21}$$
In the appendix we give details of the solution of (20,21) for $\tilde{q}(x,s)$. In the following we use the result for $\tilde{q}(x,0)$, equations (22,23).

Mean Time to Double Absorption, $T_2$
We now put everything together to obtain an expression for the mean time to double absorption (MTDA) using (14). First we compute $\tau$, the mean time to absorb the first target, from the formula (16) using expressions (22,23) to obtain
$$\tau = \frac{E - 1}{r} = \frac{1}{r}\left[\frac{e^{\alpha_0 x_r} + e^{\alpha_0 x_l}}{e^{\alpha_0 (x_r + x_l)} + 1} - 1\right], \tag{24}$$
where $E$ is evaluated at $s = 0$ (see (A.14)). This expression is in agreement with equation (15) of [23]. One can check that the limits $x_r \to \infty$ and $x_l \to -\infty$ recover (12). We also require the derivatives of $\tilde{q}(x,0)$ at the two targets, which through (17) and (18) yield the splitting probabilities $P_l$ and $P_r$. These expressions are in agreement with equations (31,32) of [23]. Using (14) we then obtain, after simplification, the expression (30) for $T_2$, which is the main result of this section. One can check that as $x_r \to 0$ one obtains
$$T_2 \to \frac{e^{\alpha_0 |x_l|} - 1}{r},$$
which recovers the mean time to absorption under stochastic resetting for a single target at $x_l$ (12). The reason is that as $x_r \to 0$ (or $x_l \to 0$) one target is immediately located and eliminated by the searcher starting from the origin. Then the searcher effectively starts again from the origin at $t = 0$ to search for a single target. We have also checked (30) by the longer route of computing $\tilde{q}_r(x,0)$, $\tilde{q}_l(x,0)$ and using formulas (7), (6), and found perfect agreement.
Plots of equation (30) with $T_2$ plotted against both $\alpha_0$ and $r$ (with parameter values $D = 1$, $x_l = -1$ and $x_r = 1$) exhibit the expected features (Figure 1):
(i) The MTDA tends to infinity as the resetting rate increases, since the searcher does not have sufficient time to diffuse far enough to find a target between resets and therefore will never find either target.
(ii) The MTDA tends to infinity as the resetting rate tends to zero. This is because, after having found the first target (inevitable given that the searcher is sandwiched between the two targets and there is no resetting), the system reduces to the well-studied system of one diffusive particle searching in 1D for one fixed target without resetting [14]. In this system, the mean time to absorption diverges.
(iii) There is a turning point at which the MTDA is minimised (see Fig. 1b). The value of $r$ at this minimum is the optimal value of $r$; for this set of parameters the optimal $r$ is 2.03, for which the MTDA is 2.15.
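The decomposition (14) can be evaluated numerically with a few lines of code. The sketch below rests on assumptions: it uses $\tau = (E-1)/r$ with $E$ at $s = 0$ rewritten as $\cosh[\alpha_0(x_r+|x_l|)/2]/\cosh[\alpha_0(x_r-|x_l|)/2]$, and the splitting probability $P_r = \sinh(\alpha_0|x_l|)/[\sinh(\alpha_0 x_r)+\sinh(\alpha_0|x_l|)]$. These hyperbolic forms come from solving the $s = 0$ problem directly and may differ in appearance from the paper's own expressions; the function names are hypothetical.

```python
import math


def t1(X, r, D):
    """Mean time to find a single target at distance X, equation (12)."""
    alpha0 = math.sqrt(r / D)
    return (math.exp(alpha0 * X) - 1.0) / r


def first_absorption(x_l, x_r, r, D):
    """Return (tau, P_l, P_r) for targets at x_l < 0 < x_r (assumed closed forms)."""
    alpha0 = math.sqrt(r / D)
    a, b = alpha0 * x_r, alpha0 * abs(x_l)
    # E at s = 0, written in hyperbolic form
    E = math.cosh(0.5 * (a + b)) / math.cosh(0.5 * (a - b))
    tau = (E - 1.0) / r
    p_r = math.sinh(b) / (math.sinh(a) + math.sinh(b))
    return tau, 1.0 - p_r, p_r


def t2(x_l, x_r, r, D):
    """Mean time to double absorption via the decomposition (14)."""
    tau, p_l, p_r = first_absorption(x_l, x_r, r, D)
    return tau + p_r * t1(abs(x_l), r, D) + p_l * t1(x_r, r, D)


print(round(t2(-1.0, 1.0, 2.03, 1.0), 2))  # 2.15, the value quoted in the text
```

Under these assumptions the quoted minimum is reproduced, and shrinking `x_r` towards zero returns the single-target result `t1(abs(x_l), r, D)`.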

Dependence of MTDA on r
One can ask what is the optimal resetting rate, i.e. the value of $r$ that minimises (30). For the case of a single target at $X_r$ it was found that there is a unique minimum value, which is most neatly expressed in terms of the variable $\gamma = X_r \alpha_0$, which is the ratio of the distance to the target over the typical length diffused between resets. For the two-target case considered here, the problem is more complicated since there are two lengths $x_r$, $|x_l|$. The simplest case to consider is $|x_l| = x_r$, where again we can express the minimisation in terms of a single variable $\gamma = \alpha_0 x_r$. Equating the derivative of (30) with respect to $r$ to zero yields a transcendental equation for $\gamma$. The unique non-zero solution is $\gamma = 1.42433$, to be compared to $\gamma = 1.59362$ for the case of a single target at $x_r$ [14]. For the case $D = 1$, $x_r = 1$ this equates to an optimal resetting rate $r = 2.02872$, which is in agreement with Fig. 1b. The fact that the optimal value of $\gamma$ (and consequently of $r$) is lower for double absorption than for a single target is easy to understand. In the case of two equidistant targets it would be optimal to have zero resetting rate up until the first target is found and then adopt the optimal resetting rate for a single target search. This suggests that a lower constant resetting rate is optimal overall.
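The two optimal values of $\gamma$ can be checked with a short root-finding sketch. It assumes that for $|x_l| = x_r$ the MTDA reduces to $(\cosh\gamma + e^{\gamma} - 2)/r$ with $r = D\gamma^2/x_r^2$, which follows from (14) with $P_r = P_l = 1/2$; the stationarity condition coded below is derived from that assumed form and need not be written the same way as the paper's transcendental equation. The helper names are hypothetical.

```python
import math


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def d_two_targets(g):
    # stationarity of (cosh g + e^g - 2) / g^2 (symmetric two-target case)
    return g * (math.sinh(g) + math.exp(g)) - 2.0 * (math.cosh(g) + math.exp(g) - 2.0)


def d_single_target(g):
    # stationarity of (e^g - 1) / g^2 (single target)
    return g * math.exp(g) - 2.0 * (math.exp(g) - 1.0)


D, x_r = 1.0, 1.0
gamma_two = bisect(d_two_targets, 0.5, 3.0)
gamma_one = bisect(d_single_target, 0.5, 3.0)
r_opt = D * (gamma_two / x_r) ** 2

print(f"gamma (two targets) = {gamma_two:.5f}")
print(f"gamma (one target)  = {gamma_one:.5f}")
print(f"optimal r           = {r_opt:.4f}")
```

The bracket [0.5, 3] is chosen by hand so that each function changes sign once inside it.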

Dependence of MTDA on $x_r$
We now turn to the dependence of the MTDA on the positions of the targets. Plotting $T_2$ against $x_r$ (again with arbitrary parameters $D = 1$, $x_l = -1$ and $r = 1$) produces a surprising result (Figure 2). Whilst the plot exhibits the expected feature of a divergent MTDA as $x_r$ tends to infinity, the position of the turning point is unexpected. With the target at $x_r$ able to exist at any point $x_r > 0$ and the target at $x_l$ fixed at $x_l = -1$, naively one might assume that the MTDA would be minimised when $x_r = 0$. In this scenario, $x_r$ would be found immediately, instantly resetting the system, followed by a search for $x_l$ (essentially, this scenario is only a search for one target, $x_l$). However, placing $x_r$ further from $x_l = -1$ actually reduces the MTDA. There exists a kind of cooperative effect in which the existence of a second target makes the search for the first target quicker.
The reason for this lies in the fact that the searcher is reset to the origin once it finds a target. Therefore having a target at $x_r > 0$ actually cuts off some trajectories that are moving away from $x_l$. However, placing $x_r$ too far from the origin negates the cooperative effect. If $x_r$ becomes too large, the time taken to find $x_r$ increases such that it cancels out the benefit from the cooperative effect. Thus there exists an optimum value for $x_r$.
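The cooperative effect can be seen directly by scanning the position of the right target. The sketch below evaluates the decomposition (14) using assumed closed forms for $\tau$ and the splitting probabilities (hyperbolic expressions obtained by solving the $s = 0$ problem, not copied from the paper), with the parameters $D = 1$, $x_l = -1$, $r = 1$ quoted in the text. The grid resolution and the finite-difference step are arbitrary illustrative choices.

```python
import math


def t1(X, r, D):
    """Single-target mean time, equation (12)."""
    return (math.exp(math.sqrt(r / D) * X) - 1.0) / r


def t2(x_l, x_r, r, D):
    """MTDA from (14), using assumed closed forms for tau and P_r."""
    alpha0 = math.sqrt(r / D)
    a, b = alpha0 * x_r, alpha0 * abs(x_l)
    tau = (math.cosh(0.5 * (a + b)) / math.cosh(0.5 * (a - b)) - 1.0) / r
    p_r = math.sinh(b) / (math.sinh(a) + math.sinh(b))
    return tau + p_r * t1(abs(x_l), r, D) + (1.0 - p_r) * t1(x_r, r, D)


D, r, x_l = 1.0, 1.0, -1.0
baseline = t1(abs(x_l), r, D)            # search for the left target alone

grid = [0.01 * k for k in range(1, 301)]  # x_r from 0.01 to 3.00
values = [t2(x_l, x, r, D) for x in grid]
x_star = grid[values.index(min(values))]  # grid estimate of the optimal x_r

# one-sided finite-difference slope of T_2 at x_r -> 0
slope_at_zero = (t2(x_l, 1e-4, r, D) - baseline) / 1e-4

print(f"single-target time  = {baseline:.4f}")
print(f"minimum MTDA        = {min(values):.4f} at x_r = {x_star:.2f}")
print(f"slope at x_r -> 0   = {slope_at_zero:.3f}")
```

Under these assumptions the minimum MTDA lies below the single-target time, and the initial slope is negative, consistent with the statement that moving the right target away from the origin at first shortens the search.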
To check that the cooperative effect between targets always occurs, one can compute the derivative of $T_2$ with respect to $x_r$ at $x_r = 0$, which is negative. Thus, there is a decrease in $T_2$, on increasing $x_r$ from zero, for all parameter values. We now consider the value of $x_r > 0$ that minimises the MTDA given a fixed $x_l < 0$. Setting the derivative of (30) with respect to $x_r$ to zero yields a quartic equation for $\eta \equiv e^{\alpha_0 x_r}$. The form of this equation implies that for $\eta > 1$ (which corresponds to $x_r > 0$) there is always a unique solution. Thus, there is a unique optimal value $x_r^*$ of $x_r$. We can obtain the behaviour of $x_r^*$ in the two limiting cases where $\alpha_0 |x_l|$ is either large or small. The quantity $\alpha_0 |x_l|$ is the ratio of the distance to the left target from the resetting site to the typical distance diffused between resets [14].
For $\alpha_0 |x_l| \gg 1$ the optimal position is $x_r^* \simeq |x_l|/3$, and for $\alpha_0 |x_l| \ll 1$ it is $x_r^* \simeq 0.4142\ldots\,|x_l|$. In each case one can also evaluate the probability of locating the right target first. It is interesting to note that in both limiting cases the optimal distance of the right target is less than the distance to the left target, by simple factors of $1/3$ for $\alpha_0 |x_l| \gg 1$ and $0.4142\ldots$ for $\alpha_0 |x_l| \ll 1$. Also, the probability of locating the right target first is greater than one half in both cases. Thus in its optimal position the right target effectively cuts off errant trajectories.
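A sketch of how the two limits can arise is given below. It is a reconstruction under assumptions rather than the paper's own derivation: it uses the shorthand $a = \alpha_0 x_r$, $b = \alpha_0 |x_l|$ (introduced here for illustration) and expands the decomposition (14) with $r\tau = (e^{a}-1)(e^{b}-1)/(e^{a}+e^{b})$ and $P_r = \sinh b/(\sinh a + \sinh b)$, which are assumed closed forms obtained from the $s = 0$ solution.

```latex
\begin{align}
% small-distance limit, a, b << 1: tau is second order, P_r ~ b/(a+b)
r\,T_2 &\simeq \frac{a^2 + b^2}{a + b}
  \;\Rightarrow\; a^2 + 2ab - b^2 = 0
  \;\Rightarrow\; a^* = (\sqrt{2}-1)\,b \approx 0.4142\,b,
  \qquad P_r \simeq \frac{b}{a^*+b} = \frac{1}{\sqrt{2}} ; \\[4pt]
% large-distance limit, b >> a >> 1
r\,T_2 &\simeq e^{b} - 2 + e^{-a} + e^{2a-b}
  \;\Rightarrow\; e^{3a^*} = \tfrac{1}{2}\,e^{b}
  \;\Rightarrow\; a^* = \frac{b - \ln 2}{3} \simeq \frac{b}{3},
  \qquad P_r \simeq 1 - e^{a^*-b} .
\end{align}
```

In both limits $P_r > 1/2$, in line with the statement that the optimally placed right target is more likely to be found first.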

Generalisation to multiple targets
Here we outline how the results may be generalised to an arbitrary number of targets on the real line. So far we have considered two targets either side of the resetting position, the origin. The result for two targets on the same side, say at positions $x_{r_1} > 0$ and $x_{r_2} > x_{r_1}$, is simply
$$T_2(x_{r_1}, x_{r_2}) = T_1(x_{r_1}) + T_1(x_{r_2}), \tag{42}$$
i.e. it is the mean time to locate the nearest target plus the mean time to find the furthest target. For the case of two targets to the right of the origin, $x_{r_1} > 0$ and $x_{r_2} > x_{r_1}$, and one to the left, $x_l < 0$, using the same logic as for (14), the mean time to find all three targets is
$$T_3(x_l, x_{r_1}, x_{r_2}) = \tau(x_l, x_{r_1}) + P_r(x_l, x_{r_1})\,T_2(x_l, x_{r_2}) + P_l(x_l, x_{r_1})\,T_2(x_{r_1}, x_{r_2}). \tag{43}$$
Equation (43) states that the mean time to triple absorption is equal to the mean time to the first absorption plus the average of the mean time for the remaining double absorption, weighted according to the probabilities of which absorption occurs first.
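In this way, one can recursively write down expressions for the mean time to absorb any number of targets. Specifically, for $m$ targets to the left of the origin and $n$ targets to the right the expression reads
$$T_{m+n}(x_{l_m}, \ldots, x_{l_1}, x_{r_1}, \ldots, x_{r_n}) = \tau(x_{l_1}, x_{r_1}) + P_{r_1}(x_{l_1}, x_{r_1})\,T_{m+n-1}(x_{l_m}, \ldots, x_{l_1}, x_{r_2}, \ldots, x_{r_n}) + P_{l_1}(x_{l_1}, x_{r_1})\,T_{m+n-1}(x_{l_m}, \ldots, x_{l_2}, x_{r_1}, \ldots, x_{r_n}), \tag{44}$$
where $P_{r_1}(x_{l_1}, x_{r_1})$ is now the probability of first finding the target on the right of the pair of closest targets on either side of the origin, and $P_{l_1}(x_{l_1}, x_{r_1})$ is the corresponding probability for the target on the left.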
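The recursion lends itself to a direct implementation. The sketch below is illustrative only: targets are passed as tuples of distances from the origin ordered nearest first (a convention chosen here, not taken from the paper), the closed forms for $\tau$ and the splitting probabilities are assumed hyperbolic expressions from the $s = 0$ solution, and the one-sided base case uses the same-side result that the targets are found in order of distance.

```python
import math
from functools import lru_cache


def t1(X, r, D):
    """Single-target mean time, equation (12)."""
    return (math.exp(math.sqrt(r / D) * X) - 1.0) / r


def first_absorption(d_left, d_right, r, D):
    """Return (tau, P_l, P_r) for the nearest pair, given as distances (assumed forms)."""
    alpha0 = math.sqrt(r / D)
    a, b = alpha0 * d_right, alpha0 * d_left
    tau = (math.cosh(0.5 * (a + b)) / math.cosh(0.5 * (a - b)) - 1.0) / r
    p_r = math.sinh(b) / (math.sinh(a) + math.sinh(b))
    return tau, 1.0 - p_r, p_r


@lru_cache(maxsize=None)
def mean_time_all(lefts, rights, r, D):
    """Mean time to find every target; lefts/rights are tuples of distances, nearest first."""
    if not lefts:   # only one side remains: targets are found in order of distance
        return sum(t1(x, r, D) for x in rights)
    if not rights:
        return sum(t1(x, r, D) for x in lefts)
    tau, p_l, p_r = first_absorption(lefts[0], rights[0], r, D)
    return (tau
            + p_r * mean_time_all(lefts, rights[1:], r, D)   # nearest right target found first
            + p_l * mean_time_all(lefts[1:], rights, r, D))  # nearest left target found first


# one target on the left at distance 1, two on the right at distances 0.5 and 1.5
print(mean_time_all((1.0,), (0.5, 1.5), 1.0, 1.0))
```

Memoisation via `lru_cache` keeps the number of distinct sub-problems at most $(m+1)(n+1)$, since each is labelled by how many targets remain on each side.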

Conclusion
In this paper we have studied mean times for a diffusive searcher, under resetting to a fixed position, to locate all of multiple targets. We have presented the exact expression (30) for two targets in one dimension and shown how to generalise to an arbitrary number of targets on the real line (44).
A perhaps surprising result is that the presence of multiple targets can actually reduce the time to find a single target. In the case of two targets we have obtained expressions for the optimal position of the second target which minimises the time to locate both targets. Although seemingly counterintuitive, the effect is a result of the searcher being returned to the origin once a target has been located.
The study was motivated by the problem of healing of lesions on DNA. For this purpose the model we consider is necessarily a crude simplification where each event of dissociation/re-association from the DNA is considered a reset of the system and the resetting instantaneously occurs at a single resetting site. In reality there are many possible processes for the translocation of proteins between sites, e.g. jumping, hopping, intersegment transfer and sliding are commonly discussed [5]. With the emergence of single-molecule methods, it has become possible to observe the motion of proteins at an individual molecule level [35], which may inform modelling. An obvious improvement to be made to the resetting dynamics is to more faithfully model a distribution of binding sites (for resetting to) and to include a delay in the reset process. A cluster of target sites may also have some specific structure. Moreover, the mechanism via which DNA is physically repaired is, itself, not a straightforward process. There exist multiple types of DNA repair [36] such as nucleotide excision repair, mismatch repair and homologous recombination. It is also possible that the proteins that search for, and are able to sense, DNA lesions are involved in the repair process as well as the search process. This would mean that, after having located a target site, rather than being released back into the system to search for another target, a protein may stay bound to the target site [37]. Finally, whether searching for a binding site for gene transcription or searching for a DNA lesion in order to trigger the repair process, there are typically multiple searchers looking for multiple targets.
Thus, there is a plethora of future modifications that can be made to the basic model we study. Encouragingly, some studies of diffusion with resetting have begun to explore such details, for example, including a resetting distribution [15], including dynamics in the absorption process [38], including multiple searchers in the analysis [8], and adding a delay to the reset process [17,39,40,41,42].
Author Contribution Statement: GRC designed and carried out research and drafted sections of the paper. MRE designed and carried out research and wrote the paper.

$$\frac{1 - e^{2\alpha x_l}}{e^{2\alpha x_r} - e^{2\alpha x_l}} \tag{A.12}$$
The self-consistent solution for $E$ is found using (21):
$$E = \frac{(r+s)\left(e^{\alpha x_r} + e^{\alpha x_l}\right)}{r\left(e^{\alpha (x_r + x_l)} + 1\right) + s\left(e^{\alpha x_r} + e^{\alpha x_l}\right)}. \tag{A.13}$$
Case $s = 0$
In the case $s = 0$, which we will consider from now on, the expression for $E$ simplifies to
$$E = \frac{e^{\alpha_0 x_r} + e^{\alpha_0 x_l}}{e^{\alpha_0 (x_r + x_l)} + 1}, \tag{A.14}$$
where $\alpha_0 = \sqrt{r/D}$.