1 Introduction

Coherent flow structures are widely observed in two-dimensional turbulent fluid motions. Isolated regions of concentrated vorticity emerge, for instance, as a result of stirring or solid body interactions, and they persist for rather long time scales while moving with the ambient background flow and interacting with nearby vortex regions. Vortex regions that are sufficiently far from each other can be considered as rigid disc-like objects, whose interactions reduce approximately to the interactions of their centers.

The idea of studying the motion of vortices in two-dimensional fluid flows by means of an idealized point vortex system dates back to the second half of the nineteenth century and the work of Helmholtz [20]. Arguing formally, Helmholtz derived an ODE model for interacting point vortices in inviscid incompressible fluids. Kirchhoff later documented the Hamiltonian structure of this system [22], and, as a result, the ODE model is nowadays commonly referred to as the Helmholtz–Kirchhoff point vortex system.

The Helmholtz–Kirchhoff system is a reasonable description of the actual vortex dynamics of isolated and highly concentrated vortex regions. In these situations, if the fluid kinematic viscosity is small, it seems quite natural to expect that the vortices will exhibit a behavior similar to the Helmholtz–Kirchhoff idealization. Likewise, it is reasonable to expect that this description will become invalid in the event of collisions and merging of vortices or if vortex regions dissolve in the ambient background flow as a result of viscous friction. The goal of the present paper is to derive estimates that describe to what extent the point vortex system captures the motion of vortex regions in incompressible viscous flows.

For inviscid fluids, the first rigorous work in this direction is due to Marchioro and Pulvirenti [26], who proved that bounded vortex patch solutions to the Euler equation remain close to the point vortices during the evolution. Their strategy exploits the regularity of the velocity field sufficiently far from the vortex patches and the symmetry properties of the two-dimensional Biot–Savart kernel. The original work was gradually improved in many subsequent articles, see for example [3, 25, 27]. Other methods have been employed to study the connection of the point vortex system with the Euler equation. We mention the approach by Turkington [30], which relies on the conservation of the energy. This has the advantage that it works also in situations where the Biot–Savart operator does not possess certain symmetry features, for example in the setting of three-dimensional axisymmetric fluids without swirl, where it has been successfully applied [2]. It does not, however, seem suited to studying several interacting vortices; in this case a combination with the Marchioro–Pulvirenti method has been used [4, 5]. Another technique effectively implemented in this context is the so-called gluing method [13, 14]. In these works, solutions with highly concentrated vorticities containing precise information on the vortex cores are constructed in the two-dimensional case, and helical vortex filaments are built in the three-dimensional setting.

In the context of viscous fluids, the connection with the Helmholtz–Kirchhoff point vortices has been shown for the Navier–Stokes equation [9, 23, 24], where the authors consider bounded vorticities initially sharply concentrated in separated regions and prove that they converge in the inviscid limit to the point vortices. A more singular initial datum was considered by Gallay [17], namely a collection of several point vortices. He proved that the corresponding Navier–Stokes vorticity concentrates, in terms of a weighted \(L^2\) norm, near a combination of Lamb–Oseen vortices centered around a viscous regularization of the point vortex system (see also the nice review paper [18]). A Lamb–Oseen vortex (see expression (12)) is the solution corresponding to an initial vorticity concentrated on a single point, and it can in some ways be regarded as the fundamental solution to the two-dimensional Navier–Stokes vorticity equation. Gallay gives an estimate on the rate of concentration, proving that it is proportional to \(\nu t\), where \(\nu \) is the viscosity and t is the time. As a consequence of this result, the convergence of the Navier–Stokes solution to the Helmholtz–Kirchhoff point vortex system for \(\nu \rightarrow 0\) is also established.

It would be desirable to establish a relation between solutions to the Navier–Stokes equations and the (viscous) point vortex dynamics that has both features: First, it provides stability estimates in the sense of the Marchioro–Pulvirenti strategy showing that solutions that start close to point vortices remain close to vortices. Second, it describes the accurate shape of vortices that are actually spreading as a result of viscosity.

In the present work, we take a tiny step towards this goal by addressing the first property and by improving in several aspects the aforementioned works [23, 24] by Marchioro on the Navier–Stokes equations: As a measure for vortex concentration, we work with Wasserstein distances. Weak notions of concentration are suitable in the viscous setting due to the fact that viscosity instantaneously spreads the vorticity over the full space. However, also in the inviscid case, weak notions of concentration are favorable as they allow for more general vortex configurations like elongated vortex regions or even long tentacles, as can be observed in chaotic or turbulent flows. We have considered concentration in the Wasserstein sense in our earlier works on the Euler equation [7, 8]. In the context of three-dimensional Euler filaments, geometric flat norms, which are related to Wasserstein distances in 2D, were considered [21].

Furthermore, we establish estimates that hold uniformly in the viscosity constant, independently from the scale of concentration. This way our results apply also to initial data that are given by a collection of point vortices as studied by Gallay [17], and our estimates describe the sharp rates of vortex spreading due to viscosity.

We finally consider configurations whose initial vorticity is allowed to be unbounded but in \(L^p\), for some \(p>2\), with almost no assumption on the magnitude of the \(L^p\) norm. This same condition was studied [8] in the Euler setting and extends the corresponding requirement on the \(L^\infty \) norm that was considered in [9, 24].

Organization of the paper. In Sect. 2 we introduce precisely the mathematical setting and state the main results. In Sect. 3 we present the proofs.

2 Mathematical setting

We study the dynamics of vortices in a viscous incompressible fluid in the two-dimensional plane \(\mathbb {R}^2\). This motion can be modelled via the Navier–Stokes equations in vorticity form. These read

$$\begin{aligned} \partial _t \omega + u\cdot \nabla \omega = \nu \Delta \omega \quad \text {in }(0,+\infty )\times \mathbb {R}^2, \end{aligned}$$
(1)

where \(u:(0,+\infty )\times \mathbb {R}^2\rightarrow \mathbb {R}^2\) is the velocity field and \(\omega :(0,+\infty )\times \mathbb {R}^2\rightarrow \mathbb {R}\) is the scalar vorticity field that measures the tendency of the fluid to rotate. The incompressibility condition is translated into the mathematical constraint that the velocity field has zero divergence,

$$\begin{aligned} \nabla \cdot u=0 \quad \text {in }(0,+\infty )\times \mathbb {R}^2. \end{aligned}$$
(2)

The vorticity field can be computed from the velocity as the rotation \(\omega = \partial _1 u_2 - \partial _2 u_1\), and vice versa the velocity u can be recovered from \(\omega \) via the Biot–Savart law

$$\begin{aligned} u(t,x) = K*\omega (t,x) = \int _{{\mathbb {R}}^2} K(x-y)\,\omega (t,y)\,dy, \end{aligned}$$

where the convolution is understood in space. Here, the function K is the Biot–Savart kernel

$$\begin{aligned} K(z) = \frac{1}{2\pi }\frac{z^\perp }{|z|^2}, \end{aligned}$$

and \(z^\perp \) denotes the counter-clockwise rotation of the vector z by 90 degrees. Finally, the constant \(\nu >0\) is the kinematic viscosity of the fluid.

We consider a compactly supported initial datum \({\bar{\omega }}\in L^p(\mathbb {R}^2)\) for some \(p>2\). It is well-known (see [18] and references therein) that there exists a unique solution \(\omega \in C\big ( (0,+\infty ); L^1\cap L^\infty ({\mathbb {R}}^2) \big )\) to Eq. (1). Such a solution has been shown to be smooth in space and time [1], and its Lebesgue norms are non-increasing in time, that is

$$\begin{aligned} \Vert \omega (t)\Vert _{L^1} \le \Vert {\bar{\omega }}\Vert _{L^1}, \quad \Vert \omega (t)\Vert _{L^p} \le \Vert {\bar{\omega }}\Vert _{L^p}. \end{aligned}$$

We study the situation where the initial vorticity is split into N components of definite sign

$$\begin{aligned} {\bar{\omega }}=\sum _{i=1}^{N}{\bar{\omega }}_i, \end{aligned}$$

where for every \(i\in \{1,\ldots ,N\}\) it holds either \({\bar{\omega }}_i \ge 0\) or \({{\bar{\omega }}}_i\le 0\). The motion of the vorticity components obeys the advection-diffusion equation

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t \omega _i + u\cdot \nabla \omega _i = \nu \Delta \omega _i , \\ \omega _i(0,\cdot ) = {{\bar{\omega }}}_i(\cdot ) . \end{array}\right. } \end{aligned}$$
(3)

Hence their sign is preserved over time and, because of the uniqueness of the solutions to linear advection-diffusion equations, the vorticity \(\omega \) remains expressed as the sum of its components,

$$\begin{aligned} \omega (t,\cdot ) = \sum _{i=1}^{N}\omega _i(t,\cdot ). \end{aligned}$$

We observe that, thanks to the divergence-free condition (2) and to the absence of domain boundaries, the intensity, that is the space integral of each vortex component, is constant in time,

$$\begin{aligned} a_i = \int _{\mathbb {R}^2}\omega _i(t,x)\,dx = \int _{\mathbb {R}^2}{\bar{\omega }}_i\,dx. \end{aligned}$$

Depending on the sign of the vorticity components, we will either have that \(a_i>0\) or \(a_i<0\), so that \(\omega _i/a_i\) is a well-defined probability distribution.

We suppose that the vortex components are sharply concentrated around N distinct points \({\bar{Y}}_1,\ldots ,{\bar{Y}}_N\in \mathbb {R}^2\) in the sense that

$$\begin{aligned} W_2\left( \frac{{\bar{\omega }}_i}{a_i},\,\delta _{{\bar{Y}}_i} \right) \le \varepsilon , \end{aligned}$$
(4)

for all \(i=1,\ldots ,N\). The parameter \(\varepsilon >0\) is the concentration scale, that will be assumed to be as small as needed. Here \(W_2\) is the 2-Wasserstein distance, which, in the case that one of the measures considered is atomic, is just the square root of the variance,

$$\begin{aligned} W_2\left( \frac{{\bar{\omega }}_i}{a_i},\,\delta _{{\bar{Y}}_i} \right) ^2 = \frac{1}{a_i} \int _{\mathbb {R}^2} |x-{\bar{Y}}_i|^2\,{\bar{\omega }}_i(x)\,dx, \end{aligned}$$
(5)

and measures the average “size” of the vortex region. We refer to Villani’s book [31] for more information on Wasserstein distances. To understand the role of \(\varepsilon \), notice that condition (4) is satisfied for example if \({\bar{\omega }}_i\) is supported in a ball of radius \(\varepsilon \) centred on \({\bar{Y}}_i\). An easy computation shows that the Wasserstein distance is minimized when the atomic measure is located on the center of vorticity

$$\begin{aligned} {\bar{X}}_i = \frac{1}{a_i} \int _{\mathbb {R}^2}x{\bar{\omega }}_i(x)\,dx, \end{aligned}$$

that is

$$\begin{aligned} W_2\left( \frac{{\bar{\omega }}_i}{a_i},\,\delta _{{\bar{X}}_i} \right) \le W_2\left( \frac{{\bar{\omega }}_i}{a_i},\,\delta _{{\bar{Y}}_i} \right) . \end{aligned}$$
(6)
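As an illustration of (5) and (6), the following minimal Python sketch evaluates the Wasserstein distance to a Dirac mass for a discretized vortex component. The point cloud, the weights and the function names are hypothetical choices made for this example only; they stand in for a quadrature of \({\bar{\omega }}_i\) and are not taken from the paper.

```python
import math

def w2_to_dirac(points, weights, center):
    """W_2 between the normalized weighted point cloud and a Dirac mass at `center`.

    This is the discrete analogue of formula (5): the square root of the
    weighted second moment around `center`.
    """
    total = sum(weights)
    second_moment = sum(
        w * ((x - center[0]) ** 2 + (y - center[1]) ** 2)
        for (x, y), w in zip(points, weights)
    )
    return math.sqrt(second_moment / total)

def center_of_vorticity(points, weights):
    """Discrete analogue of the center of vorticity (weighted mean position)."""
    total = sum(weights)
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    return (cx, cy)

# Hypothetical sample data: four quadrature points with positive weights.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (-0.1, -0.1)]
weights = [1.0, 2.0, 0.5, 1.5]

x_bar = center_of_vorticity(points, weights)
y_bar = (0.05, 0.05)  # an arbitrary competitor location
w2_center = w2_to_dirac(points, weights, x_bar)
w2_other = w2_to_dirac(points, weights, y_bar)
```

In this discrete setting the two values are related by \(W_2(\cdot ,\delta _{{\bar{Y}}_i})^2 = W_2(\cdot ,\delta _{{\bar{X}}_i})^2 + |{\bar{X}}_i-{\bar{Y}}_i|^2\), which is the computation behind (6).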

Since Wasserstein distances metrize the weak convergence in the sense of measures, see Theorem 7.12 in [31], assumption (4) implies that the rescaled vortex component \(\omega _i/a_i\) converges weakly as \(\varepsilon \) goes to zero to an atomic measure. We suppose that the intensities \(a_i\) are independent of \(\varepsilon \) and, accordingly, the \(L^p\) norm of the components must diverge as \(\varepsilon \rightarrow 0\). Here, we assume that

$$\begin{aligned} \Vert {\bar{\omega }}\Vert _{L^p} \le \varepsilon ^{-\gamma }, \end{aligned}$$
(7)

where \(\gamma \) is a fixed positive number. This is the same condition that we considered earlier in the inviscid setting [8], while similar requirements on the \(L^\infty \) norm were studied e.g. in [9, 23, 24] for the Navier–Stokes equations and in [6, 25] for the Euler equations. Notice that the two assumptions (4) and (7) enforce that \(\gamma \ge 2-2/p\) as a consequence of the interpolation \(1 \lesssim \Vert f\Vert _{L^p}^{p} W_2(f,\delta _X)^{2p-2}\), which holds true for any probability distribution f. We furthermore observe that, since the \(L^p\) norm of every component is non-increasing by Eq. (3), assumption (7) translates into a condition on each \(\omega _i\) for positive times,

$$\begin{aligned} \Vert \omega _i(t)\Vert _{L^p} \le \Vert {\bar{\omega }}_i\Vert _{L^p} \le \Vert {\bar{\omega }}\Vert _{L^p}. \end{aligned}$$
(8)
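To see that the lower bound \(\gamma \ge 2-2/p\) mentioned above is of the right order, one may consider, purely as an illustrative example, the uniform distribution \(f = (\pi r^2)^{-1}\mathbb {1}_{B_r(X)}\) on a disc of radius r. A direct computation gives

$$\begin{aligned} \Vert f\Vert _{L^p}^p = (\pi r^2)^{1-p}, \quad W_2(f,\delta _X)^2 = \frac{1}{\pi r^2}\int _0^r \rho ^2\, 2\pi \rho \,d\rho = \frac{r^2}{2}, \quad \Vert f\Vert _{L^p}^{p}\, W_2(f,\delta _X)^{2p-2} = (2\pi )^{1-p}. \end{aligned}$$

The product is thus independent of r, and for r comparable to \(\varepsilon \) the \(L^p\) norm scales like \(\varepsilon ^{-(2-2/p)}\), so that (7) can only hold for all small \(\varepsilon \) if \(\gamma \ge 2-2/p\).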

Finally, we make a last hypothesis on the initial configuration, namely we suppose that there is not much vorticity far away from the centers of the vortex components. More precisely, we assume that there exists a radius \(R\gg \varepsilon \) and a constant \(\beta \) such that

$$\begin{aligned} m_i(0,R) \le \varepsilon ^{\beta }, \end{aligned}$$
(9)

where \(m_i(t,R)\) is the vorticity portion of the i-th component at time t lying at least at distance R from its center \(X_i(t)\), that is,

$$\begin{aligned} m_i(t,R)=\frac{1}{a_i}\int _{B_R(X_i(t))^c} \omega _i(t,x)\,dx. \end{aligned}$$

We will assume that

$$\begin{aligned} R = R_0 (\varepsilon +\sqrt{\nu })^{\delta }, \end{aligned}$$
(10)

for some \(\delta \in [0,1/2)\) and some constant \(R_0\). Notice that the concentration assumption (4), the expression (5) and the bound (6) immediately imply that

$$\begin{aligned} m_i(0,R) \le \frac{1}{R^2}W_2\left( \frac{ \bar{\omega }_i}{a_i},\delta _{{{\bar{X}}}_i}\right) ^2 \le \frac{\varepsilon ^2}{R^2}. \end{aligned}$$

Hence, in view of (10), the weak assumption on the support (9) is stronger than the concentration assumption alone if \(\beta \ge 2(1-\delta )\). As we will choose \(\beta \ge 2\), this will be always true.

Our goal is to estimate to what extent the vortex components remain close to the Helmholtz–Kirchhoff point-vortex system starting from \({\bar{Y}}_1,\ldots ,{\bar{Y}}_N\), that is a collection of points that move according to the equations

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{d}{dt}Y_i(t) = \sum _{\underset{j \ne i}{j=1}}^{N} a_j K(Y_i(t)-Y_j(t)), \\ Y_i(0) = {\bar{Y}}_i. \end{array}\right. }\quad \forall i=1,\ldots ,N. \end{aligned}$$
(11)

We will assume that the actual point vortices are not colliding in some time interval \([0,T_c]\), and that their minimal distance

$$\begin{aligned} d = \min _{i\not = j}\inf _{t\in [0,T_c]} | Y_i(t)- Y_j(t)|, \end{aligned}$$

is large compared to R, say \(d \ge 12 R_0\).
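The point-vortex system (11) and the minimal distance d can be explored numerically. The following is a minimal sketch, assuming a classical fourth-order Runge–Kutta time stepping; the function names, the step size and the two-vortex test configuration are hypothetical choices for illustration and do not appear in the paper. The minimal distance is only sampled at the discrete time steps.

```python
import math

def biot_savart(z):
    """Biot-Savart kernel K(z) = z^perp / (2 pi |z|^2), with z^perp = (-z_2, z_1)."""
    r2 = z[0] ** 2 + z[1] ** 2
    return (-z[1] / (2 * math.pi * r2), z[0] / (2 * math.pi * r2))

def velocity(positions, intensities):
    """Right-hand side of system (11); the self-interaction j = i is excluded."""
    vel = []
    for i, yi in enumerate(positions):
        vx = vy = 0.0
        for j, yj in enumerate(positions):
            if j == i:
                continue
            k = biot_savart((yi[0] - yj[0], yi[1] - yj[1]))
            vx += intensities[j] * k[0]
            vy += intensities[j] * k[1]
        vel.append((vx, vy))
    return vel

def rk4_step(positions, intensities, dt):
    """One classical Runge-Kutta step (an assumed, generic ODE integrator)."""
    def shifted(base, slope, h):
        return [(b[0] + h * s[0], b[1] + h * s[1]) for b, s in zip(base, slope)]
    k1 = velocity(positions, intensities)
    k2 = velocity(shifted(positions, k1, dt / 2), intensities)
    k3 = velocity(shifted(positions, k2, dt / 2), intensities)
    k4 = velocity(shifted(positions, k3, dt), intensities)
    return [
        (p[0] + dt / 6 * (a[0] + 2 * b[0] + 2 * c[0] + e[0]),
         p[1] + dt / 6 * (a[1] + 2 * b[1] + 2 * c[1] + e[1]))
        for p, a, b, c, e in zip(positions, k1, k2, k3, k4)
    ]

def min_distance(positions):
    """Smallest pairwise distance between the vortices."""
    return min(
        math.dist(p, q)
        for i, p in enumerate(positions)
        for q in positions[i + 1:]
    )

def simulate(positions, intensities, t_final, n_steps):
    """Integrate up to t_final and track the minimal distance over the time grid."""
    dt = t_final / n_steps
    d = min_distance(positions)
    for _ in range(n_steps):
        positions = rk4_step(positions, intensities, dt)
        d = min(d, min_distance(positions))
    return positions, d

# Two vortices of equal intensity rotate around their midpoint at fixed separation.
final_positions, d_min = simulate(
    [(1.0, 0.0), (-1.0, 0.0)], [1.0, 1.0], t_final=5.0, n_steps=500
)
```

For this symmetric pair the exact solution is a rigid rotation with angular velocity \(1/(4\pi )\), so the separation stays equal to 2 and no collision occurs; in general, d has to be monitored along the trajectory.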

Our first result gives a componentwise estimate.

Theorem 1

Let \({\bar{Y}}_1,\ldots ,{\bar{Y}}_N \in {\mathbb {R}}^2\) be pairwise distinct points and consider their evolution \({Y}_1,\ldots ,{Y}_N \in {\mathbb {R}}^2\) through the point-vortex system (11). There exist \(\varepsilon _0\), \(\nu _0\in (0,1)\), \(\beta _0\in (1,\infty )\) and a time \(T\le T_c\) such that for any concentration parameter \(\varepsilon \in (0,\varepsilon _0)\), viscosity \(\nu \in (0,\nu _0)\) and decay parameter \(\beta \in (\beta _0,\infty )\) the following holds: Let \(\omega _i\) be an evolving vortex region (3) that is initially sharply concentrated around the point vortices (4), decaying away from the centers (9) and satisfying the bound (7). Then it holds for all \(i=1,\ldots ,N\) and \(t\in [0,T]\) that

$$\begin{aligned} W_2 \left( \frac{\omega _i(t)}{a_i},\,\delta _{Y_i(t)} \right) \le C e^{\Lambda t}\,(\varepsilon +\sqrt{\nu t}) \end{aligned}$$

where C and \(\Lambda \) are constants that are independent of \(\varepsilon \), \(\nu \) and R. Moreover, \(T\ge c\) for some constant c independent of \(\varepsilon \), \(\nu \) and R. If \(T_c=\infty \) and \(\delta >0\) in (10), there is the stronger estimate

$$\begin{aligned} T\ge c\log \frac{1}{\varepsilon +\sqrt{\nu }}. \end{aligned}$$

The estimate in Theorem 1 features the sharp rate of viscous spreading \(\sqrt{\nu t}\): In the context of the Navier–Stokes equations, the support of each initial component is instantaneously distributed all over \(\mathbb {R}^2\). This effect can be nicely observed when studying the evolution of the self-similar Lamb–Oseen vortex

$$\begin{aligned} \omega _{\text {LO}}(t,x) = \frac{1}{\nu t} \Gamma \left( \frac{x}{\sqrt{\nu t}}\right) ,\quad \Gamma (\xi ) = \frac{1}{4\pi } e^{-|\xi |^2/4}, \end{aligned}$$
(12)

which solves the Navier–Stokes equation with initial datum \(\delta _0\) (i.e. the Dirac measure concentrated on the origin). Evidently, the solution is positive for all positive times, and the vortex core grows in time at the rate \(\sqrt{\nu t}\). This growth law can be expressed in terms of the Wasserstein distance,

$$\begin{aligned} W_2(\omega _{\text {LO}}(t),\delta _0)\sim \sqrt{\nu t}, \end{aligned}$$

as can be verified via a straightforward scaling argument. Our result thus proves that the general “vortex components” described initially in (4) show a similar (optimal) growth. Moreover, in the limit \(\varepsilon \rightarrow 0\), our result describes the optimal spreading rates for the Navier–Stokes equations with atomic initial data.
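For the reader's convenience, the scaling argument can be made explicit. Using formula (5) with unit intensity, the change of variables \(x=\sqrt{\nu t}\,\xi \) and polar coordinates, one finds

$$\begin{aligned} W_2(\omega _{\text {LO}}(t),\delta _0)^2 = \int _{\mathbb {R}^2}|x|^2\,\omega _{\text {LO}}(t,x)\,dx = \nu t\int _{\mathbb {R}^2}|\xi |^2\,\Gamma (\xi )\,d\xi = \frac{\nu t}{2}\int _0^\infty \rho ^3 e^{-\rho ^2/4}\,d\rho = 4\nu t, \end{aligned}$$

so that \(W_2(\omega _{\text {LO}}(t),\delta _0) = 2\sqrt{\nu t}\) in this particular case.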

Theorem 1 can also be stated as an estimate on the full solution (rather than the single vortex components) in terms of the 1-Wasserstein distance \(W_1\). Thanks to the dual Kantorovich–Rubinstein representation

$$\begin{aligned} W_1(f,g) = \sup \left\{ \int _{\mathbb {R}^2}(f-g)\zeta \,dx:\;\Vert \nabla \zeta \Vert _{L^\infty }\le 1 \right\} , \end{aligned}$$

see for example Theorem 1.14 in [31], \(W_1\) is indeed well-defined merely if f and g have the same average or total mass, with no requirement on their sign.

Corollary 1

Under the assumptions of Theorem 1, it holds that

$$\begin{aligned} W_1 \left( \omega (t), \sum _{i=1}^{N}a_i \delta _{Y_i(t)} \right) \le C e^{\Lambda t} (\varepsilon +\sqrt{\nu t}) \end{aligned}$$

for all \(t\le T\).

This implies weak convergence in the sense of measures of the Navier–Stokes vorticity \(\omega (t)\) to the “point-vortex measure” \(\sum _{i=1}^{N}a_i\delta _{Y_i(t)}\) if we consider the simultaneous limits \(\varepsilon \rightarrow 0\), \(\nu \rightarrow 0\), with a convergence rate that, up to an exponential in time, remains of the order of \(\varepsilon \), as was assumed for the initial configuration in (4), plus a correction term \(\sqrt{\nu t}\) due to the expansion of the vortices caused by the viscosity.

We remark that the point-vortex system (11) can be regarded as a very weak solution to the Euler equation, as was observed by Schochet [28]. Hence, this result may also be interpreted as a stability estimate between the Navier–Stokes and Euler equations. Such a bound could be rigorously obtained by a simple application of the triangle inequality when combined with our earlier result [8], if suitable estimates on the convergence of viscous to inviscid solutions were available. These would be optimal presumably if an order of \(\sqrt{\nu t}\) on the \(W_1\) distance between the Navier–Stokes and Euler vorticities could be achieved, but, even in the Yudovich setting of integrable and bounded vorticities, this has not yet been established in general, see e.g. [10,11,12, 29]. It is interesting to note that, in the setting considered in this paper, one sees the expected rate \(\sqrt{\nu t}\) (at least for viscosities of the order of \(\varepsilon ^2/t\)) in the inviscid limit \(\nu \rightarrow 0\).

We can also express the estimate in terms of the vorticity centers

$$\begin{aligned} X_i(t) = \frac{1}{a_i} \int _{\mathbb {R}^2}x\omega _i(t,x)\,dx \end{aligned}$$

of the components and their velocities as follows.

Theorem 2

Under the assumptions of Theorem 1, it holds that

$$\begin{aligned} |X_i(t)-Y_i(t)| \le C e^{\Lambda t} \, (\varepsilon +\sqrt{\nu t}) \quad \text {and}\quad \left| \frac{dX_i(t)}{dt}-\frac{dY_i(t)}{dt} \right| \le C e^{\Lambda t} \, (\varepsilon +\sqrt{\nu t}) \end{aligned}$$

for all \(t\le T\) and \(i=1,\ldots ,N\).

We finally comment on our estimate on time T in Theorem 1. In view of the exponential growth term in our concentration estimates, the statements of our theorems become meaningless if \(T\gg \log \frac{1}{\varepsilon +\sqrt{\nu }}\). In this regard, the bound on T is satisfactory at least if the “outer vorticity” in (9) vanishes appropriately outside a decreasing ball around the vortex center, i.e., \(\delta >0\) in (10). The logarithmic bound ceases to hold if we fix a radius independently of \(\varepsilon \) and \(\nu \), and it is not clear to us how to overcome this restriction. Comparable logarithmic bounds on time were obtained earlier by Cetrone and Serafini [9] under stronger concentration assumptions. The exponential growth term results from a Gronwall argument. It is very likely that better estimates on time would require a completely different (and genuinely nonlinear) approach.

3 Proofs

We remark that we treat \(\varepsilon \) and \(\nu \) as two independent parameters and, since we are interested in what happens when both of them are not too large, in the computations we will always assume that \(\varepsilon <1\) and \(\nu <1\). Moreover, \(\varepsilon \) will always be (sufficiently) smaller than the other length parameters R and d, that is \(\varepsilon \ll R,\,d\), and we notice that R is chosen in (10) such that

$$\begin{aligned} R\le \frac{d}{12}. \end{aligned}$$
(13)

Before starting with the proofs, we introduce some notation. In the following, C will always denote a positive constant independent of t, \(\nu \) and \(\varepsilon \), possibly depending on other quantities, and whose precise value may change from line to line. When we write \(A \lesssim B\) we mean that \(A \le CB\) for some C, and when we write \(A \sim B\) we mean that \(A \lesssim B\) and \(B \lesssim A\). Hence for example we will often neglect the vortex intensities \(a_i\).

We denote the velocity generated by the i-th vorticity as \(u_i(t,x)=K*\omega _i(t,x)\), where the convolution is meant in space. Moreover, we define the “external field” \(F_i\) acting on \(\omega _i\) as the velocity field generated by all other vorticities, that is, \(F_i = \sum _{j\ne i}u_j\). In this way, the total velocity \(u=K*\omega \) satisfies

$$\begin{aligned} u=\sum _{j=1}^{N}u_j=u_i+F_i. \end{aligned}$$

The 2-Wasserstein distance of the i-th component from its center will be indicated by

$$\begin{aligned} W_i(t)=W_2 \left( \frac{\omega _i(t)}{a_i},\,\delta _{X_i(t)} \right) = \sqrt{\frac{1}{a_i}\int _{\mathbb {R}^2} |x-X_i(t)|^2 \omega _i(t,x)\,dx }. \end{aligned}$$

In our first step, we derive a bound on the growth of the Wasserstein distance.

3.1 Proof of the estimate on \(W_i\).

We fix \(i\in \{1,\ldots ,N\}\). It will be convenient to regularize the Biot–Savart kernel by cutting out the singularity at the origin. We do this with the help of a radial cut-off function \(\psi \in C^\infty _c({\mathbb {R}}^2)\) satisfying

$$\begin{aligned} 0\le \psi \le 1, \quad \psi =1 \text { on }B_{d/12}(0), \quad \psi =0 \text { out of } B_{d/6}(0), \quad |\nabla \psi | \lesssim \frac{1}{d}. \end{aligned}$$
(14)

We then split the external field acting on \(\omega _i\) into two parts, \(F_i=F_i^L+F_i^B\), with \(F_i^L\) being the regular part,

$$\begin{aligned} F_i^L(t,x)=\sum _{j\ne i} \int _{{\mathbb {R}}^2}[1-\psi (x-y)]\,K(x-y)\,\omega _j(t,y)\,dy \end{aligned}$$

and \(F_i^B\) the remainder

$$\begin{aligned} F_i^B(t,x)=\sum _{j\ne i} \int _{{\mathbb {R}}^2}\psi (x-y)\,K(x-y)\,\omega _j(t,y)\,dy. \end{aligned}$$

In the next lemma, we show that \(F_i^L\) is indeed a Lipschitz function while \(F_i^B\) is merely bounded.

Lemma 1

\(F_i^L\) is Lipschitz in space uniformly in time with Lipschitz constant \(\sim 1/d^2\), and \(F_i^B\) is bounded with

$$\begin{aligned} |F_i^B(t,x)| \lesssim {\left\{ \begin{array}{ll} d^{\theta }\sum _{j\ne i} m_j(t,d/6)^{\theta } \, \Vert \omega _j(t)\Vert _{L^p}^{1-\theta } &{} \quad \text {for }x\in B_{d/6}(X_i),\\ d^{\frac{p-2}{p}}\sum _{j\ne i} \Vert \omega _j(t)\Vert _{L^p} &{} \quad \text {for }x\in B_{d/6}(X_i)^c, \end{array}\right. } \end{aligned}$$

for any \(t\ge 0\) such that

$$\begin{aligned} \min _{i\not =j} |X_i(t)-X_j(t)|\ge \frac{d}{2}, \end{aligned}$$
(15)

where \(\theta = (p-2)/(3p-2)\).

Proof

The Lipschitz bound on \(F_i^L\) can be easily proven computing the gradient of the integrand,

$$\begin{aligned} \begin{aligned} |\nabla \big [ \, \big (1-\psi (z) \big ) \, K(z) \big ]|&\le |K(z)|\,|\nabla \psi (z)| + |1-\psi (z)|\,|\nabla K(z)| \lesssim \frac{1}{d^2}, \end{aligned} \end{aligned}$$

where we used the scaling of the Biot–Savart kernel in the form \(|K(z)|\lesssim 1/|z|\), \(|\nabla K(z)|\lesssim 1/|z|^2\), and the properties of the cut-off function (14).

We concentrate now on proving the boundedness of \(F_i^B\). First, if \(x\in B_{d/6}(X_i)\), we have for any \(y\in B_{d/6}(x)\) and \(j\ne i\) that \(|y-X_j| \ge |X_j-X_i|-|X_i-x|-|x-y| \ge d/6\) because \(|X_i-X_j|\ge d/2 \) by condition (15). Therefore, for any \({\tilde{\theta }}\in (0,1)\), it holds that

$$\begin{aligned} \begin{aligned} |F_i^B(x)|&\lesssim \sum _{j\ne i} \int _{B_{d/6}(x)\cap B_{d/6}(X_j)^c } \frac{1}{|x-y|}\,|\omega _j(y)|^{{\tilde{\theta }}}\,|\omega _j(y)|^{1-{\tilde{\theta }}}\,dy \\&\le \sum _{j\ne i} \left( \int _{B_{d/6}(x) } \frac{1}{|x-y|^q}\,dy \right) ^{\frac{1}{q}} \left( \int _{ B_{d/6}(X_j)^c } |\omega _j(y)| \,dy \right) ^{{\tilde{\theta }}} \left( \int _{\mathbb {R}^2} |\omega _j(y)|^{p} \,dy \right) ^{\frac{1-{\tilde{\theta }}}{p}}, \end{aligned}\nonumber \\ \end{aligned}$$
(16)

where we have used the (generalized) Hölder inequality with exponent q chosen so that \(1/q + {\tilde{\theta }} + (1-{\tilde{\theta }})/{p} = 1\). Requiring \(q<2\) ensures that the first integral in (16) is of the order \(d^{(2-q)/q}\). Solving for \({{\tilde{\theta }}}\) gives

$$\begin{aligned} {\tilde{\theta }} = \left( 1-\frac{1}{p}-\frac{1}{q} \right) \frac{p}{p-1}, \end{aligned}$$

and thus, the condition that \({\tilde{\theta }}\) is positive enforces \(q \in \left( p/(p-1),2 \right) \). Choosing q to be, say, the mean value between \(p/(p-1)\) and 2 yields \({\tilde{\theta }}=\theta \) as in the statement of the lemma.

Second, if \(x \notin B_{d/6}(X_i)\), we use the rougher (in fact, globally valid) bound

$$\begin{aligned} \begin{aligned} |F_i^B(x)|&\lesssim \sum _{j\ne i} \int _{B_{d/6}(x)} \frac{1}{|x-y|}\,|\omega _j(y)|\,dy \\&\lesssim \sum _{j\ne i}\left( \int _{B_{d/6}(0)} \frac{1}{|z|^{p'}}\,dz \right) ^{\frac{1}{p'}} \Vert \omega _j\Vert _{L^p} \sim d^{1-\frac{2}{p}} \sum _{j\ne i}\Vert \omega _j\Vert _{L^p}. \end{aligned} \end{aligned}$$

This concludes the proof. \(\square \)
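The exponent bookkeeping in the proof above can be double-checked with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the function name and the sample values of p are hypothetical choices for illustration.

```python
from fractions import Fraction

def hoelder_exponents(p):
    """Return (q, theta_tilde, d_exponent) for the choice of q made in the proof."""
    p = Fraction(p)
    q_lower = p / (p - 1)      # theta_tilde > 0 requires q > p/(p-1)
    q = (q_lower + 2) / 2      # mean value between p/(p-1) and 2
    # theta_tilde solves 1/q + theta + (1 - theta)/p = 1
    theta_tilde = (1 - 1 / p - 1 / q) * p / (p - 1)
    d_exponent = (2 - q) / q   # power of d coming from the kernel integral
    return q, theta_tilde, d_exponent

# Sample exponents p > 2 (arbitrary test values).
results = {p: hoelder_exponents(p) for p in (Fraction(5, 2), 3, 4, 10)}
```

For instance, \(p=3\) leads to \(q=7/4\) and \({\tilde{\theta }}=1/7=(p-2)/(3p-2)\), and the power \((2-q)/q\) of d coincides with \(\theta \), in agreement with the statement of Lemma 1.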

In principle, we could estimate the \(L^p\) norms of the vorticity components in the previous lemma with the help of the a priori estimate (8) and the assumption on the initial data (7). For large times, however, it is more convenient to make use of the smoothing properties of the diffusive part in (3). More precisely, we have the following estimate.

Lemma 2

For any \(i=1,\dots ,N\), \(q\in [1,\infty ]\) and any \(t>0\), it holds that

$$\begin{aligned} \Vert \omega _i(t)\Vert _{L^q} \lesssim \frac{1}{(\nu t)^{1-\frac{1}{q}}}\, \Vert {\bar{\omega }}_i\Vert _{L^1}. \end{aligned}$$
(17)

This estimate is known to be true for solutions to the Navier–Stokes equation (see, e.g., Theorem 4.3 in [19]), but due to possible cancellation effects, we cannot deduce its validity for individual vortex components. Instead, we provide a short proof of this estimate, in which we follow an argument given in [16]. For more general linear parabolic equations, a version of that bound already appears in [15].

Proof

It is enough to consider the statement for \(q=2^k\). The statement for general \(q<\infty \) follows by interpolation, while the statement for \(q=\infty \) comes from taking the limit \(k\rightarrow \infty \). The case \(q=1=2^0\) is trivially true since then \(\Vert \omega _i(t)\Vert _{L^1}=a_i\). We suppose that \(k\ge 1\) or, equivalently, \(q\ge 2\) from here on.

We consider \(E_q(t):= \Vert \omega _i(t)\Vert _{L^q}^q\) and observe that it satisfies the identity

$$\begin{aligned} -\frac{d}{dt} E_q(t) = \frac{4\nu (q-1)}{q}\Vert \nabla \omega _i^{q/2}\Vert _{L^2}^2\sim \nu \Vert \nabla \omega _i^{q/2}\Vert _{L^2}^2 \end{aligned}$$

under the evolution (3). Notice that we use here the fact that \(q\ge 2\), so that the dependency on q in this estimate can indeed be neglected. Making use of the 2D Nash inequality \(\Vert f\Vert _{L^2}^2\lesssim \Vert f\Vert _{L^1}\Vert \nabla f\Vert _{L^2}\), the latter turns into the estimate

$$\begin{aligned} \frac{d}{dt} E_q(t)^{-1} \gtrsim \nu E_{\frac{q}{2}}(t)^{-2}. \end{aligned}$$
(18)

It remains to apply an induction argument over k, and we suppose thus that the statement holds for q/2. Then (18) implies

$$\begin{aligned} \frac{d}{dt} E_q(t)^{-1} \gtrsim \nu (\nu t)^{q-2} \Vert \bar{\omega }_i\Vert _{L^1}^{-q}, \end{aligned}$$

and an integration in time gives

$$\begin{aligned} \Vert \omega _i(t)\Vert _{L^q}^{-q} = E_q(t)^{-1} \ge E_q(t)^{-1} - E_q(0)^{-1} \gtrsim (\nu t)^{q-1} \Vert {{\bar{\omega }}}_i\Vert _{L^1}^{-q}, \end{aligned}$$

from which we easily deduce the statement of the lemma. \(\square \)
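The decay rate in (17) can be checked on the Lamb–Oseen vortex (12), for which \(\Vert {\bar{\omega }}\Vert _{L^1}=1\) and the \(L^q\) norm is explicit, \(\Vert \omega _{\text {LO}}(t)\Vert _{L^q} = (4\pi \nu t)^{-(1-1/q)}q^{-1/q}\). The following sketch compares this with a simple midpoint quadrature in the radial variable; the function names, the quadrature and the parameter values are assumptions made for this illustration, and the example covers only this single explicit solution, not a general vortex component.

```python
import math

def lamb_oseen(rho, nu, t):
    """Radial profile of the Lamb-Oseen vortex (12) at distance rho from the origin."""
    return math.exp(-rho ** 2 / (4 * nu * t)) / (4 * math.pi * nu * t)

def lq_norm(nu, t, q, n=20000):
    """L^q norm via a midpoint rule in polar coordinates, truncated at 12*sqrt(nu*t)."""
    r_max = 12 * math.sqrt(nu * t)
    h = r_max / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * h
        total += lamb_oseen(rho, nu, t) ** q * 2 * math.pi * rho * h
    return total ** (1 / q)

nu = 1e-3  # hypothetical viscosity
q = 4      # hypothetical Lebesgue exponent

# Rescaled norms: (nu t)^(1 - 1/q) * ||omega_LO(t)||_{L^q} should not depend on t.
rescaled = [lq_norm(nu, t, q) * (nu * t) ** (1 - 1 / q) for t in (0.1, 1.0, 10.0)]
exact_constant = (4 * math.pi) ** (-(1 - 1 / q)) * q ** (-1 / q)
```

The entries of `rescaled` agree with `exact_constant` up to quadrature error, which reflects that the rate \((\nu t)^{-(1-1/q)}\) in (17) is attained by the Lamb–Oseen vortex.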

In the next lemma we use the properties of \(F_i^L\) and \(F_i^B\) that we just showed to deduce a bound on the evolution of the Wasserstein distance between \(\omega _i\) and \(X_i\).

Lemma 3

Let \(\theta \in (0,1)\) be given as in Lemma 1. For any \(t\ge 0\) with property (15), it holds that

$$\begin{aligned} \frac{d}{dt}W_i^2(t)\lesssim & {} d^{-2}W_i^2(t) + d^{\theta }\sum _{j\ne i} m_j(t,d/6)^{\theta } \, \Vert \omega _j(t)\Vert _{L^p}^{1-\theta }\, W_i(t) \nonumber \\{} & {} + d^{\frac{p-2}{p}} m_i(t,d/6)^{\frac{1}{2}} \, \sum _{j\ne i}\Vert \omega _j(t)\Vert _{L^p} \, W_i(t) + \nu . \end{aligned}$$
(19)

Proof

Since time is fixed, we will often omit it in this proof. We observe first that, thanks to the assumption that the initial vorticity components are compactly supported, we know that \(W_i(0)<\infty \). Because of the smoothness of \(\omega _i\), the Wasserstein distance remains finite and differentiable for positive times.

We consider the evolution of the squared Wasserstein distance, and compute, using the evolution (3) of the vorticity components, multiple integrations by parts and the definition of the vorticity centers \(X_i\),

$$\begin{aligned} \begin{aligned} \frac{d}{dt}W_i^2&= \frac{2}{a_i} \int _{{\mathbb {R}}^2} (x-X_i)\cdot u(x)\,\omega _i(x)\,dx + \frac{\nu }{a_i} \int _{{\mathbb {R}}^2} \Delta |x-X_i|^2 \,\omega _i(x)\,dx \\&\quad - \frac{2}{a_i} \int _{{\mathbb {R}}^2} (x-X_i)\, \omega _i(x)\,dx \cdot \frac{dX_i}{dt} \\&= \frac{2}{a_i} \iint _{{\mathbb {R}}^2\times \mathbb {R}^2} (x-X_i)\cdot K(x-y)\,\omega _i(x)\omega _i(y)\,dxdy \\&\quad + \frac{2}{a_i} \int _{{\mathbb {R}}^2} (x-X_i)\cdot F_i(x)\,\omega _i(x)\,dx + 4\nu . \end{aligned} \end{aligned}$$

The first integral on the right-hand side vanishes because K is odd and because \(z \cdot K(z) = 0\). Using the definition of \(X_i\) again and decomposing \(F_i=F_i^L+F_i^B\), we thus have

$$\begin{aligned} \frac{d}{dt}W_i^2(t)&= \frac{2}{a_i} \int _{{\mathbb {R}}^2} (x-X_i) \cdot \left( F_i^L(x)-F_i^L(X_i)\right) \omega _i(x)\,dx \\&\qquad + \frac{2}{a_i} \int _{{\mathbb {R}}^2} (x-X_i) \cdot F_i^B(x)\, \omega _i(x)\,dx + 4\nu . \end{aligned}$$

The first integral is easy to estimate by \(W_i^2/d^2\) because of the Lipschitz property of \(F_i^L\). On the second integral we use the different bounds obtained for \(F_i^B\). Splitting the integration domain into \(B_{d/6}(X_i)\) and its complement, we have

$$\begin{aligned}&\int _{{\mathbb {R}}^2} (x-X_i) \cdot F_i^B(x)\, \omega _i(x)\,dx \\&\quad \lesssim \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i))} \int _{{\mathbb {R}}^2} |x-X_i|\,|\omega _i(x)|\,dx \\&\qquad + \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i)^c)} \left( \int _{B_{d/6}(X_i)^c} |\omega _i(x)|\,dx \right) ^{\frac{1}{2}} \left( \int _{{\mathbb {R}}^2} |x-X_i|^2 \,|\omega _i(x)|\,dx \right) ^{\frac{1}{2}} \\&\quad \lesssim \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i))} \, W_i + \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i)^c)} \left( \int _{B_{d/6}(X_i)^c} |\omega _i(x)|\,dx \right) ^{\frac{1}{2}} \, W_i, \end{aligned}$$

where we used Jensen’s and Hölder’s inequalities. Observe that in the estimates above the assumption that the vortex components have definite sign was necessary to bound

$$\begin{aligned} \int _{{\mathbb {R}}^2} |x-X_i|\,|\omega _i(x)|\,dx \lesssim \left( \int _{{\mathbb {R}}^2} |x-X_i|^2 \,|\omega _i(x)|\,dx\right) ^\frac{1}{2} \lesssim W_i. \end{aligned}$$

It remains only to apply Lemma 1 and the proof is complete. \(\square \)
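For the reader's convenience, we spell out two elementary facts behind the identity for \(\frac{d}{dt}W_i^2\) used in the proof above. This is only a sketch: it assumes that \(a_i\) is the total mass of \(\omega _i\) and that \(W_i^2=\frac{1}{a_i}\int _{{\mathbb {R}}^2}|x-X_i|^2\omega _i\,dx\), which is how we read the definitions of the earlier sections. Symmetrizing the self-interaction integral in x and y and using that K is odd with \(z\cdot K(z)=0\), we find

$$\begin{aligned} \iint _{{\mathbb {R}}^2\times {\mathbb {R}}^2} (x-X_i)\cdot K(x-y)\,\omega _i(x)\omega _i(y)\,dxdy&= \frac{1}{2}\iint _{{\mathbb {R}}^2\times {\mathbb {R}}^2} \left[ (x-X_i)\cdot K(x-y) + (y-X_i)\cdot K(y-x)\right] \omega _i(x)\omega _i(y)\,dxdy \\&= \frac{1}{2}\iint _{{\mathbb {R}}^2\times {\mathbb {R}}^2} (x-y)\cdot K(x-y)\,\omega _i(x)\omega _i(y)\,dxdy = 0. \end{aligned}$$

Likewise, since \(\Delta |x-X_i|^2 = 4\) in two dimensions, the diffusion term reduces to

$$\begin{aligned} \frac{\nu }{a_i}\int _{{\mathbb {R}}^2} \Delta |x-X_i|^2\,\omega _i(x)\,dx = \frac{4\nu }{a_i}\int _{{\mathbb {R}}^2}\omega _i(x)\,dx = 4\nu . \end{aligned}$$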

Now we come to the objective of this subsection, that is a Gronwall estimate on the Wasserstein distance. Our argument will be the following. We start from the differential inequality (19), and for small times we observe that the \(L^p\) norm of the vorticity inherits the bound (7) from the initial datum. For larger times, the effect of the viscosity kicks in, and we use the decay estimate (17). In both cases, in order to balance (7) or (17), we need to make sure that the portion of vorticity far from the centers is small enough. For this reason, we have to ensure not only that the vorticity centers remain sufficiently far from each other as in (15), but also that most of each vorticity component remains concentrated around its center. We observe that, under the localization condition (4) and the definition of d in (13), we know that

$$\begin{aligned} \min _{i\not =j}|{\bar{X}}_i-{\bar{X}}_j|\ge \frac{3}{4}d , \end{aligned}$$
(20)

provided that \(\varepsilon \le d/8\), which we will suppose from here on. We select then \(T=T(\varepsilon ,\nu ,R)\) as the minimal time for which at least one of the following equalities holds true:

$$\begin{aligned} T = T_c \end{aligned}$$

or

$$\begin{aligned} \max _i m_i(T,d/6)=\varepsilon ^\alpha +(\nu T)^{\frac{\alpha }{2}} , \end{aligned}$$

or

$$\begin{aligned} \max _i |X_i(T) - Y_i(T)| =\frac{d}{4} , \end{aligned}$$

where \(\alpha =\alpha (p,\gamma )< \beta \) is a positive parameter to be fixed later, cf. (23). Here we observe that, thanks to (4), (20) and (9), and because both the vortex centers \(X_i\) and the outer vorticity \(m_i\) are continuous in time, the time T is always strictly positive, \(T>0\). We thus have

$$\begin{aligned} m_i(t,d/6)\le \varepsilon ^\alpha +(\nu t)^{\frac{\alpha }{2}} \quad \text {and}\quad |X_i(t) - Y_i(t)| \le \frac{d}{4} \end{aligned}$$
(21)

for all \(t\le \min \{ T,T_c\}\) and any i. Moreover, using the triangle inequality and (13), we also notice that the latter entails

$$\begin{aligned} |X_i(t)-X_j(t)|\ge \frac{d}{2} \end{aligned}$$
(22)

for all \(t\le \min \{ T,T_c\}\) and any \(i\not =j\). We derive the estimate on the growth of \(W_i\) for times smaller than T, while in Sect. 3.3 we will show that the time T can be chosen independently of \(\varepsilon \) and \(\nu \).
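The two separation bounds (20) and (22) can be checked in one line each. The following sketch assumes that (13) provides \(|Y_i(t)-Y_j(t)|\ge d\) for \(i\ne j\) (this is how (13) is used in the proof of Theorem 2 below) and that (4) yields \(|{\bar{X}}_i-{\bar{Y}}_i|\le \varepsilon \). By the triangle inequality,

$$\begin{aligned} |{\bar{X}}_i-{\bar{X}}_j|&\ge |{\bar{Y}}_i-{\bar{Y}}_j| - |{\bar{X}}_i-{\bar{Y}}_i| - |{\bar{X}}_j-{\bar{Y}}_j| \ge d-2\varepsilon \ge \frac{3}{4}d \quad \text {if } \varepsilon \le \frac{d}{8},\\ |X_i(t)-X_j(t)|&\ge |Y_i(t)-Y_j(t)| - |X_i(t)-Y_i(t)| - |X_j(t)-Y_j(t)| \ge d-\frac{d}{4}-\frac{d}{4} = \frac{d}{2}, \end{aligned}$$

where the second line uses the second bound in (21).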

Proposition 1

Suppose that \(\alpha \) satisfies

$$\begin{aligned} \alpha \ge \frac{1}{\theta } +\left( \frac{1}{\theta }-1\right) \gamma = \frac{3p-2}{p-2} + \frac{2p\gamma }{p-2}, \end{aligned}$$
(23)

where \(\theta \) was given as in Lemma 1. Then, for any \(i=1,\ldots ,N\) there holds

$$\begin{aligned} W_i^2(t) \lesssim e^{Ct/d^2} (\varepsilon ^2 + \nu t), \end{aligned}$$
(24)

for some positive constant \(C>0\) and any \(t\le T\).

Proof

Using conditions (21), (13) and assumption (7) together with (8) in estimate (19), we obtain, in the case \(\sqrt{\nu t}\le \varepsilon \),

$$\begin{aligned} \frac{d}{dt}W_i^2(t) \lesssim d^{-2}W_i^2(t) + [d^{\theta }\varepsilon ^{\theta \alpha - \gamma (1-\theta )} + d^{\frac{p-2}{p}}\varepsilon ^{\frac{\alpha }{2}-\gamma } ] W_i(t) + \nu . \end{aligned}$$

It is sufficient to assume that \(\alpha \) is large enough, so that all the exponents of \(\varepsilon \) are larger than or equal to 1. This condition is guaranteed by (23) and hence, since \(\varepsilon < 1\), \(R\lesssim 1\) and \(\theta <\frac{p-2}{p}\), we have

$$\begin{aligned} \frac{d}{dt}W_i^2(t) \lesssim d^{-2} W_i^2(t) + d^{\theta } \varepsilon W_i(t) + \nu \lesssim d^{-2} W_i^2(t) + d^{2\theta +2} \varepsilon ^2 +\nu , \end{aligned}$$
(25)

for these times. There exists thus a constant \(C>0\) such that

$$\begin{aligned} \frac{d}{dt}\left( e^{-Ct/d^2}W_i^2(t)\right) \le Ce^{-Ct/d^2}\left( d^{2\theta +2}\varepsilon ^2+\nu \right) , \end{aligned}$$

and integrating in time and using \(d \sim 1\) yields

$$\begin{aligned} W_i^2(t) \le e^{Ct/d^2} W_i^2(0) + \left( e^{Ct/d^2}-1\right) \left( d^{2\theta +4}\varepsilon ^2 +d^2 \nu \right) \lesssim e^{Ct/d^2} \left( \varepsilon ^2 +t \nu \right) ,\nonumber \\ \end{aligned}$$
(26)

for any \(t\le \varepsilon ^2/\nu \), where we applied assumption (4) and inequality (6).
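The two elementary steps behind (25) and (26) may be worth recording; the following is a sketch with the same constant C as above. The second inequality in (25) is Young's inequality applied to the product of \(W_i/d\) and \(d^{\theta +1}\varepsilon \),

$$\begin{aligned} d^{\theta }\varepsilon \,W_i = \frac{W_i}{d}\cdot d^{\theta +1}\varepsilon \le \frac{1}{2d^2}W_i^2 + \frac{1}{2}d^{2\theta +2}\varepsilon ^2 . \end{aligned}$$

For (26), integrating the differential inequality for \(e^{-Ct/d^2}W_i^2(t)\) over [0, t] gives

$$\begin{aligned} e^{-Ct/d^2}W_i^2(t) - W_i^2(0) \le \left( d^{2\theta +2}\varepsilon ^2+\nu \right) \int _0^t Ce^{-Cs/d^2}\,ds = d^2\left( 1-e^{-Ct/d^2}\right) \left( d^{2\theta +2}\varepsilon ^2+\nu \right) , \end{aligned}$$

and multiplying by \(e^{Ct/d^2}\) yields the first inequality in (26). The second one then follows from \(d\sim 1\) and the elementary bound \(e^{x}-1\le xe^{x}\) for \(x\ge 0\), which gives \(d^2\nu (e^{Ct/d^2}-1)\le Ct\nu \,e^{Ct/d^2}\).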

For larger times, say, \(1\ge \sqrt{\nu t}\ge \varepsilon \), together with conditions (21) and (13), we use estimate (17) in the differential inequality (19), and get

$$\begin{aligned} \frac{d}{dt} W_i^2 \lesssim d^{-2} W_i^2 + d^{\theta }(\nu t)^{\frac{\alpha \theta }{2} - (1-\theta )\left( 1-\frac{1}{p}\right) } W_i + d^{\frac{p-2}{p}}(\nu t)^{\frac{\alpha }{4} -\left( 1-\frac{1}{p}\right) } W_i +\nu . \end{aligned}$$

Using condition (23) on \(\alpha \) and the lower bound \(\gamma \ge 2-2/p\) ensures that all the exponents of \(\nu t\) are larger than 1/2. Then thanks to the facts that \(\nu t\le 1\) and \(d\sim 1\), we obtain the inequality

$$\begin{aligned} \frac{d}{dt}W_i^2(t) \lesssim d^{-2} W_i^2(t) + d^{\theta } \sqrt{\nu t} W_i(t) + \nu \lesssim d^{-2} W_i^2(t) + (d^{2\theta +2}t+1)\nu \end{aligned}$$
(27)

for these times. Similarly as above, we rewrite this differential inequality as

$$\begin{aligned} \frac{d}{dt}\left( e^{-Ct/d^2}W_i^2(t)\right) \le C e^{-Ct/d^2}(d^{2\theta +2}t+1)\nu , \end{aligned}$$

with a (possibly larger) constant \(C>0\). Integration in time over \(t\ge \varepsilon ^2/\nu \) and using \(d\sim 1 \) once more then yields

$$\begin{aligned} W_i^2(t) \le e^{C(t-\varepsilon ^2/\nu )/d^2} W_i^2(\varepsilon ^2/\nu ) +e^{C t/d^2} (\varepsilon ^2 +t\nu ) \lesssim e^{Ct/d^2} \left( \varepsilon ^2 +t\nu \right) , \end{aligned}$$

where we have used the previous bound (26). Combining both, we obtain our thesis.

Finally, if \(\nu t\ge 1\), which entails \(t\ge 1\), we argue similarly and obtain instead of (27) that

$$\begin{aligned} \frac{d}{dt}W_i^2(t) \lesssim d^{-2} W_i^2(t) + (d^{2\theta +2}t^{\sigma }+1)\nu \end{aligned}$$

for some \(\sigma \ge 1\). The desired estimate follows upon integration and noting that \(t^{\sigma } e^{C t/d^2} \lesssim t e^{{{\tilde{C}}} t/d^2}\) if \({{\tilde{C}}}/C\) is large enough. \(\square \)

From now on, we choose \(\alpha \) such that (23) holds. Moreover, in what follows, we will write \(\Lambda = C/d^2\) in the exponential growth factor.
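It may help to see explicitly why (23) is the right condition. The following check is a sketch that reads \(\frac{1}{\theta }=\frac{3p-2}{p-2}=3+\frac{4}{p-2}\) off the equality in (23) and assumes \(p>2\). For the short-time regime, the two exponents of \(\varepsilon \) in the proof of Proposition 1 satisfy

$$\begin{aligned} \theta \alpha -\gamma (1-\theta )\ge 1&\iff \alpha \ge \frac{1}{\theta }+\left( \frac{1}{\theta }-1\right) \gamma ,\\ \frac{\alpha }{2}-\gamma \ge 1&\iff \alpha \ge 2+2\gamma , \end{aligned}$$

and the second condition is implied by (23) because \(\frac{1}{\theta }\ge 2\) and \(\frac{1}{\theta }-1\ge 2\). For the regime \(\sqrt{\nu t}\ge \varepsilon \), inserting (23) and \(\alpha \ge 2+2\gamma \) gives

$$\begin{aligned} \frac{\alpha \theta }{2}-(1-\theta )\left( 1-\frac{1}{p}\right)&\ge \frac{1}{2}+(1-\theta )\left( \frac{\gamma }{2}-1+\frac{1}{p}\right) \ge \frac{1}{2},\\ \frac{\alpha }{4}-\left( 1-\frac{1}{p}\right)&\ge \frac{1}{2}+\frac{\gamma }{2}-1+\frac{1}{p}\ge \frac{1}{2}, \end{aligned}$$

where the last inequality in each line is exactly the lower bound \(\gamma \ge 2-2/p\).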

3.2 Derivation of the main estimates

So far, we have derived a bound on the spreading rate of the single components in terms of the second moment function \(W_i(t)\), that is, the Wasserstein distance between the vorticity components and their centers. However, we would like to express it in relation to the point-vortices solving (11). Hence, in our next step we attempt to estimate the distance \(|X_i-Y_i|\) between the vorticity centers and the point-vortices.

Proof of Theorem 2

The proof is a simple adaptation of an argument in the proof of Theorem 2.2 from [9]. Up to considering \(\omega _j/a_j\) instead of \(\omega _j\), we assume without loss of generality that \(\omega _j\) is nonnegative and that \(a_j=1\) for all j. Moreover, we often omit the time variable t. We compute

$$\begin{aligned} \begin{aligned} \frac{dX_i}{dt}-\frac{dY_i}{dt}&= \sum _{j\ne i} \left[ \int _{{\mathbb {R}}^2}\int _{{\mathbb {R}}^2} K(x-y)\,\omega _i(x)\,\omega _j(y)\,dy\,dx - K(Y_i-Y_j) \right] \\&= A_1 + A_2 + A_3 + A_4, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} A_1&= \int _{{\mathbb {R}}^2} F_i^L(x) \omega _i(x)\,dx - F_i^L(Y_i),\\ A_2&= \int _{{\mathbb {R}}^2} F_i^B(x) \omega _i(x)\,dx,\\ A_3&= \sum _{j\ne i} \int _{{\mathbb {R}}^2} [K(Y_i-y)-K(Y_i-Y_j)](1-\psi (Y_i-y)) \,\omega _j(y)\,dy,\\ A_4&= -\sum _{j\ne i} \int _{{\mathbb {R}}^2} K(Y_i-Y_j)\,\psi (Y_i-y)\,\omega _j(y)\,dy, \end{aligned}$$

with \(\psi \) the same cutoff function that has been defined in (14). Let us consider \(A_1\) first. Recalling Proposition 1,

$$\begin{aligned} A_1&= \int _{{\mathbb {R}}^2} [F_i^L(x)-F_i^L(Y_i)]\,\omega _i(x)\,dx \\&\lesssim \frac{1}{d^2} \int _{{\mathbb {R}}^2} |x-X_i|\,\omega _i(x)\,dx + \frac{1}{d^2}|X_i-Y_i| \lesssim e^{\Lambda t} (\varepsilon +\sqrt{\nu t}) + |X_i-Y_i|, \end{aligned}$$

because \(d\sim 1\). Concerning \(A_2\), splitting the integration domain in \(B_{d/6}(X_i)\) and its complement and using (13), by Lemma 1 we have

$$\begin{aligned} \begin{aligned} |A_2|&\lesssim \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i))} + \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i)^c)} \, m_i(t,d/6) \\&\lesssim \sum _{j\ne i}m_j(t,d/6)^{\theta }\Vert \omega _j\Vert _{L^p}^{1-\theta } + \sum _{j\ne i}\Vert \omega _j\Vert _{L^p}\,m_i(t,d/6). \end{aligned} \end{aligned}$$

We proceed similarly to the proof of Proposition 1. In the case that \(\sqrt{\nu t}<\varepsilon \), we use (8), (7), (21), as well as condition (23) on \(\alpha \) to obtain \(|A_2|\lesssim \varepsilon \) for these times. If \(\sqrt{\nu t}\ge \varepsilon \), we use estimates (17), (21) and condition (23) to obtain \(|A_2|\lesssim \sqrt{\nu t}\) also in this case. We consider now \(A_3\). Because of the cutoff function \(1-\psi \), we may restrict the integral to \(|y-Y_i|>d/12\); this ensures that \(|K(Y_i-y) - K(Y_i-Y_j)|\lesssim |Y_j-y|/d^2 \) on the domain of integration. Hence we can bound

$$\begin{aligned} |A_{3}|&\lesssim \frac{1}{d^2} \sum _{j\ne i} \int _{\mathbb {R}^2} |y-Y_j|\, \omega _j(y)\, dy \lesssim e^{\Lambda t} (\varepsilon +\sqrt{\nu t}) + \sum _{j\ne i} |X_j-Y_j| \end{aligned}$$

by Proposition 1. We turn now to \(A_4\). Since \(|Y_i-Y_j|\ge d\gtrsim 1\), the kernel \(K(Y_i-Y_j)\) is bounded by 1, and thus \(|A_4| \lesssim \sum _{j\ne i} \int _{B_{d/6}(Y_i)} \,\omega _j(y)\,dy\). We observe that on the domain of integration there holds

$$\begin{aligned} |y-X_j| \ge |Y_j-Y_i|-|Y_i-y|-|Y_j-X_j| \ge \frac{7}{12}d, \end{aligned}$$

again thanks to (21). Then \(|A_4| \lesssim \sum _{j\ne i} \int _{B_{d/6}(X_j)^c} \omega _j(y)\,dy = \sum _{j\ne i} m_j(t,d/6)\), and again we use condition (21) to bound \(m_j\) with \(\varepsilon \) or \(\sqrt{\nu t}\) respectively in the case \(\sqrt{\nu t}<\varepsilon \) and \(\sqrt{\nu t}\ge \varepsilon \), keeping in mind that \(\alpha >1\) by (23). In conclusion, all the estimates obtained yield

$$\begin{aligned} \left| \frac{d}{dt}|X_i-Y_i| \right| \le \left| \frac{dX_i}{dt}-\frac{dY_i}{dt} \right| \lesssim \sum _j |X_j-Y_j| + e^{\Lambda t}(\varepsilon +\sqrt{\nu t}) \end{aligned}$$

for all \(t\le T\). Summing over \(i=1,\ldots ,N\) and using a Gronwall argument, together with the initial condition that \(|{\bar{X}}_i-{\bar{Y}}_i|\le \varepsilon \), yields the result. \(\quad \square \)
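The Gronwall step at the end of the proof of Theorem 2 can be sketched as follows. The notation \(D(t)=\sum _i|X_i(t)-Y_i(t)|\) and the constant \(C_0\) (the implicit constant of the last differential inequality, including the factor N) are introduced here only for illustration. Summing over i and integrating, we get

$$\begin{aligned} D(t)&\le e^{C_0t}D(0) + C_0\int _0^t e^{C_0(t-s)}e^{\Lambda s}\left( \varepsilon +\sqrt{\nu s}\right) ds \\&\le N\varepsilon \,e^{C_0t} + C_0\,t\,e^{(C_0+\Lambda )t}\left( \varepsilon +\sqrt{\nu t}\right) , \end{aligned}$$

where we used \(D(0)\le N\varepsilon \) and \(e^{C_0(t-s)}e^{\Lambda s}\le e^{(C_0+\Lambda )t}\) for \(0\le s\le t\). The factor t can be absorbed into the exponential at the price of a larger rate; the precise exponent is the one stated in Theorem 2 and is not reproduced in this sketch.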

Proof of Theorem 1

This is an easy consequence of Proposition 1 and Theorem 2 thanks to the triangle inequality. \(\square \)

Proof of Corollary 1

From the properties of the Wasserstein distance \(W_1\) there follows

$$\begin{aligned} \begin{aligned} W_1\left( \omega (t), \sum _i a_i \delta _{Y_i(t)} \right)&\le \sum _i W_1(\omega _i(t), a_i\delta _{Y_i(t)}) \\&= \sum _i |a_i| \int _{{\mathbb {R}}^2} |x-Y_i(t)|\, \frac{\omega _i(t,x)}{a_i} \,dx. \end{aligned} \end{aligned}$$

The thesis is then a consequence of Jensen’s inequality and Theorem 1. \(\square \)

3.3 Lower bounds on T

In the following proofs we assume without loss of generality that \(T\le T_c\) because otherwise there is nothing left to prove. We distinguish two cases.

In the first case, we suppose that

$$\begin{aligned} |X_i(T)-Y_i(T)| = \frac{d}{4} \end{aligned}$$
(28)

for some i.

Lemma 4

Suppose that (28) holds. Then \(T\gtrsim \log \frac{1}{\varepsilon + \sqrt{\nu }}\).

Proof

This is an easy consequence of the first estimate in Theorem 2. Indeed, by virtue of (28), it holds that

$$\begin{aligned} d\lesssim |X_i(T)-Y_i(T)| \lesssim e^{2\Lambda T}(\varepsilon +\sqrt{\nu }), \end{aligned}$$

which yields the statement of the lemma upon taking the logarithm and choosing \(\varepsilon +{\sqrt{\nu }}\) sufficiently small. \(\square \)
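To make the smallness requirement in the proof of Lemma 4 explicit, let \(C_0\) denote the implicit constant in the upper bound above (a name introduced here only for this sketch). Then (28) gives

$$\begin{aligned} \frac{d}{4}\le C_0e^{2\Lambda T}(\varepsilon +\sqrt{\nu }) \quad \Longrightarrow \quad 2\Lambda T\ge \log \frac{1}{\varepsilon +\sqrt{\nu }} - \log \frac{4C_0}{d} \ge \frac{1}{2}\log \frac{1}{\varepsilon +\sqrt{\nu }}, \end{aligned}$$

where the last inequality holds as soon as \(\varepsilon +\sqrt{\nu }\le \left( \frac{d}{4C_0}\right) ^2\). Under this assumption we obtain \(T\ge \frac{1}{4\Lambda }\log \frac{1}{\varepsilon +\sqrt{\nu }}\).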

In the second case, we assume that for some \(i\in \{1,\ldots ,N\}\)

$$\begin{aligned} m_i(T,d/6) = \varepsilon ^\alpha +(\nu T)^{\frac{\alpha }{2}}. \end{aligned}$$
(29)

We endeavour to prove that T is bounded uniformly from below in \(\varepsilon \) and \(\nu \) also in this case. In order to do this, we derive an estimate on the outer vorticity portion \(m_i(T,d/6)\) by employing an iteration estimate similar to the one used in [8]. This in turn is based on the iterative procedure developed by Marchioro and co-workers, see for example [6, 24] but, differently from these works, we start from the assumption (9) that the initial vortex components are quickly decaying outside of balls of radius R, while [6, 24] make use to a great extent of their hypothesis that this radius is of the order of \(\varepsilon \).

To estimate \(m_i(T,d/6)\), we introduce the “smoothened outer vorticity portion” \(\mu _i(t,\rho ,r)\) defined in the following. We consider a smooth radially symmetric cut-off function \(\eta =\eta _{\rho ,r}\) such that

$$\begin{aligned} \eta (x)=1 \text { if } |x|\le \rho , \quad \eta (x)=0 \text { if } |x|>\rho +r, \end{aligned}$$

with

$$\begin{aligned} |\nabla \eta | \lesssim \frac{1}{r}, \quad |\nabla ^2\eta |\lesssim \frac{1}{r^2}. \end{aligned}$$
(30)

The i-th smoothened outer vorticity portion is defined as

$$\begin{aligned} \mu _i(t,\rho ,r) = \frac{1}{a_i} \int _{\mathbb {R}^2} \big ( 1-\eta _{\rho ,r}(x-X_i(t)) \big ) \omega _i(t,x)\,dx. \end{aligned}$$

It is easy to check that the functions \(m_i\) and \(\mu _i\) satisfy the relation

$$\begin{aligned} m_i(t,\rho +r) \le \mu _i(t,\rho ,r) \le m_i(t,\rho ). \end{aligned}$$
(31)

We show a differential inequality for \(\mu _i\) by making use of the fact that, thanks to Proposition 1,

$$\begin{aligned} m_i(t,\rho ) \le \frac{1}{\rho ^2} W_i(t)^2 \lesssim \frac{ e^{\Lambda t}}{\rho ^2}(\varepsilon ^2+\nu t) \end{aligned}$$
(32)

holds for all \(t\le T\).
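Both (31) and the first inequality in (32) are pointwise comparisons of weights. As a sketch, assume that \(0\le \eta \le 1\) and that \(m_i(t,\rho )=\frac{1}{a_i}\int _{B_{\rho }(X_i)^c}\omega _i\,dx\), which is how the outer vorticity portion is used above. Since \(\omega _i/a_i\ge 0\), relation (31) follows by integrating

$$\begin{aligned} \mathbb {1}_{\{|x-X_i|>\rho +r\}} \le 1-\eta _{\rho ,r}(x-X_i) \le \mathbb {1}_{\{|x-X_i|>\rho \}} \end{aligned}$$

against \(\omega _i/a_i\), while the first inequality in (32) is a Chebyshev-type bound,

$$\begin{aligned} m_i(t,\rho ) = \frac{1}{a_i}\int _{\{|x-X_i|>\rho \}}\omega _i(t,x)\,dx \le \frac{1}{a_i}\int _{{\mathbb {R}}^2}\frac{|x-X_i|^2}{\rho ^2}\,\omega _i(t,x)\,dx = \frac{1}{\rho ^2}W_i(t)^2 . \end{aligned}$$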

Lemma 5

Let \(i\in \{1,\ldots ,N\}\) be fixed. For any \(t\le T\), \(r>0\) and \(\rho \in (R,d/6-r)\) it holds that

$$\begin{aligned} \mu _i(t,\rho ,r) \le \mu _i(0,\rho ,r) + \kappa (\rho ,r) \int _{0}^{t} m_i(s,\rho )\,ds, \end{aligned}$$
(33)

where

$$\begin{aligned} \kappa (\rho ,r) \lesssim \frac{e^{2\Lambda T}}{\rho ^2} \left( \frac{\varepsilon +\sqrt{\nu }}{r} + \frac{\varepsilon ^2+\nu }{r^2}\right) +\frac{\rho }{r}. \end{aligned}$$
(34)

Proof

Up to dividing by \(a_i\), we can suppose without loss of generality that \(\omega _i\ge 0\) and \(a_i=1\). Moreover, we often neglect the time t and we write \(\mu _i(t,\rho ) = \mu _i(t,\rho ,r)\) for notational convenience.

Because \(\omega _i\) solves the advection-diffusion Eq. (3), and thanks to the fact that

$$\begin{aligned} \frac{d}{dt}X_i(t) = \frac{1}{a_i} \int _{{\mathbb {R}}^2} u(t,x)\omega _i(t,x)\,dx = \frac{1}{a_i} \int _{{\mathbb {R}}^2} F_i(t,x)\omega _i(t,x)\,dx, \end{aligned}$$
(35)

the second identity being true by the fact that the self-interaction term vanishes because of the property \(K(z)=-K(-z)\), we can compute

$$\begin{aligned} \frac{d}{dt}\mu _i(t,\rho )= & {} \left( \frac{d}{dt}X_i\right) \cdot \int _{\mathbb {R}^2} \nabla \eta (x-X_i)\,\omega _i(x)\,dx \\{} & {} - \int _{\mathbb {R}^2} (u_i(x)+F_i(x))\cdot \nabla \eta (x-X_i)\,\omega _i(x)\,dx \\{} & {} -\nu \int _{\mathbb {R}^2} \Delta \eta (x-X_i)\, \omega _i(x)\,dx \\= & {} -\int _{\mathbb {R}^2}\int _{\mathbb {R}^2} K(x-y)\cdot \nabla \eta (x-X_i)\, \omega _i(y)\,\omega _i(x)\,dy\,dx \\{} & {} + \int _{\mathbb {R}^2}\int _{\mathbb {R}^2} [F_i(y)-F_i(x)]\cdot \nabla \eta (x-X_i)\,\omega _i(y)\,\omega _i(x)\,dy\,dx \\{} & {} -\nu \int _{\mathbb {R}^2} \Delta \eta (x-X_i)\, \omega _i(x)\,dx \\= & {} I_1 + I_2 + I_3. \end{aligned}$$

Using once more \(K(z)=-K(-z)\) we may write

$$\begin{aligned} I_1 = \frac{1}{2} \int _{\mathbb {R}^2}\int _{\mathbb {R}^2} (\nabla \eta (y-X_i)-\nabla \eta (x-X_i)) \cdot K(x-y)\,\omega _i(x)\,\omega _i(y)\,dx\,dy. \end{aligned}$$

We notice that the integrand is non-zero only if either \(\rho<|x-X_i|<\rho +r\) or \(\rho<|y-X_i|<\rho +r\). We can therefore split the integration domain into the sets

$$\begin{aligned} A = \left( B_{\rho +r}(X_i)\setminus B_{\rho }(X_i)\right) \times B_{\frac{\rho }{2}}(X_i),\quad B = B_{\frac{\rho }{2}}(X_i) \times \left( B_{\rho +r}(X_i)\setminus B_{\rho }(X_i)\right) ,\\ C = ((B_{\rho +r}(X_i)\setminus B_{\rho }(X_i))\times B_{\frac{\rho }{2}}^c(X_i))\cup (B_{\frac{\rho }{2}}^c(X_i)\times (B_{\rho +r}(X_i)\setminus B_{\rho }(X_i))), \end{aligned}$$

and we denote by \(I_1^A\), \(I_1^B\) and \(I_1^C\) the contributions to \(I_1\) due to the sets A, B and C respectively. The terms \(I_1^A\) and \(I_1^B\) can be estimated in almost the same way. Using the radial symmetry of the cut-off function and orthogonality, we notice that

$$\begin{aligned} \nabla \eta (x-X_i)\cdot K(x-y)= & {} \eta '(|x-X_i|) \frac{x-X_i}{|x-X_i|}\cdot K(x-y)\\= & {} \eta '(|x-X_i|) \frac{y-X_i}{|x-X_i|}\cdot K(x-y), \end{aligned}$$

and thus, since on their domain of integration it holds that \(|x-y|\ge \rho /2\), using the scaling of the gradient of \(\eta \) from (30) and Proposition 1, we see that

$$\begin{aligned} |I_1^A|&= \left| \frac{1}{2}\iint _A |\eta '(|x-X_i|) | \frac{|y-X_i|}{|x-X_i|}| K(x-y)|\omega _i(x)\omega _i(y)\, dxdy\right| \\&\lesssim \frac{1}{r\rho ^2} m_i(t,\rho )W_i(t)\\&\lesssim \frac{e^{\Lambda t}}{\rho ^2} \frac{\varepsilon +\sqrt{\nu t}}{r} m_i(t,\rho ), \end{aligned}$$

and the estimate of \(I_1^B\) proceeds analogously. To estimate \(I_1^C\) we use the Lipschitz condition (30) and estimate (32) to obtain

$$\begin{aligned} |I_1^C|&\lesssim \frac{1}{r^2} \iint _C |x-y||K(x-y)| \omega _i(x)\omega _i(y)\, dxdy \\&\lesssim \frac{1}{r^2} m_i\left( t,\frac{\rho }{2}\right) m_i(t,\rho ) \\&\lesssim \frac{e^{\Lambda t}}{\rho ^2}\frac{\varepsilon ^2+\nu t}{r^2}\,m_i(t,\rho ). \end{aligned}$$
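For completeness, we sketch the pointwise bounds behind the estimates of \(I_1^A\) and \(I_1^C\). We assume here the standard size estimate \(|K(z)|\lesssim 1/|z|\) for the two-dimensional Biot–Savart kernel, together with \(z\cdot K(z)=0\). The orthogonality step reads

$$\begin{aligned} (x-X_i)\cdot K(x-y) = (x-y)\cdot K(x-y) + (y-X_i)\cdot K(x-y) = (y-X_i)\cdot K(x-y). \end{aligned}$$

On the set A we have \(|x-X_i|\ge \rho \) and \(|x-y|\ge \rho /2\), so that

$$\begin{aligned} |\eta '(|x-X_i|)|\,\frac{|y-X_i|}{|x-X_i|}\,|K(x-y)| \lesssim \frac{1}{r}\cdot \frac{|y-X_i|}{\rho }\cdot \frac{1}{\rho }, \end{aligned}$$

and integrating in x over the annulus produces the factor \(m_i(t,\rho )\), while integrating \(|y-X_i|\,\omega _i(y)\) in y produces \(W_i(t)\) by Jensen's inequality. On the set C, the Lipschitz bound \(|\nabla \eta (y-X_i)-\nabla \eta (x-X_i)|\lesssim |x-y|/r^2\) from (30) is combined with \(|x-y|\,|K(x-y)|\lesssim 1\), and one variable ranges over the annulus while the other lies outside \(B_{\rho /2}(X_i)\).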

Now we consider \(I_2\). We split \(F_i = F_i^L + F_i^B\),

$$\begin{aligned} \begin{aligned} I_2&= \int _{\mathbb {R}^2}\int _{\mathbb {R}^2} [F^L_i(y)-F^L_i(x)]\cdot \nabla \eta (x-X_i)\,\omega _i(y)\,\omega _i(x)\,dydx \\&\quad + \int _{\mathbb {R}^2}\int _{\mathbb {R}^2} [F^B_i(y)-F^B_i(x)]\cdot \nabla \eta (x-X_i)\,\omega _i(y)\,\omega _i(x)\,dydx \\&= I_{21} + I_{22}. \end{aligned} \end{aligned}$$

We use the Lipschitz-continuity of \(F_i^L\) given by Lemma 1, property (30), the triangle and Jensen’s inequalities and Proposition 1 to write

$$\begin{aligned} \begin{aligned} |I_{21}|&\lesssim \frac{1}{r} \int _{\mathbb {R}^2} \int _{\rho<|x-X_i|<\rho +r} |x - y| \, \omega _i(x)\,\omega _i(y)\,dx\,dy \\&\lesssim \frac{1}{r} \int _{\rho<|x-X_i|<\rho +r} |x - X_i| \, \omega _i(x) \,dx \\&\quad + \frac{1}{r} \left( \int _{|x - X_i|> \rho } \omega _i(x)\,dx \right) \left( \int _{\mathbb {R}^2} |y - X_i| \,\omega _i(y)\,dy \right) \\&\lesssim \left( 1+\frac{\rho }{r}\right) \, m_i(t,\rho ) + e^{\Lambda t} \frac{\varepsilon +\sqrt{\nu t}}{r}\, m_i(t,\rho ). \end{aligned} \end{aligned}$$

Now we consider \(I_{22}\). By our choice of r and \(\rho \), it holds that \(\rho + r \le d/6\), and thus the integrand vanishes when \(x\notin B_{d/6}(X_i)\). We can therefore split the integration domain of \(I_{22}\) into the following sets,

$$\begin{aligned} D = B_{d/6}(X_i) \times B_{d/6}(X_i), \quad E = B_{d/6}(X_i) \times B_{d/6}(X_i)^c, \end{aligned}$$

and investigate the two contributions \(I_{22}^D\) and \(I_{22}^E\) separately. On the set D, we use (30) so that

$$\begin{aligned} |I_{22}^D| \lesssim \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i))} \frac{1}{r} m_i(t,\rho ). \end{aligned}$$

Thanks to Lemma 1 and property (21), it holds that

$$\begin{aligned} \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i))} \lesssim (\varepsilon ^\alpha +(\nu t)^{\frac{\alpha }{2}})^\theta \sum _{j\not =i}\Vert \omega _j\Vert _{L^p}^{1-\theta }, \end{aligned}$$

and then, by the non-increasing property (8) of the \(L^p\) norm, the scaling assumption (7), condition (23) on \(\alpha \) and the bound (17) on \(\omega _i\), we find

$$\begin{aligned} \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i))} \lesssim {\left\{ \begin{array}{ll} \varepsilon ^{\alpha \theta -(1-\theta )\gamma }< \varepsilon &{}\text {if } \sqrt{\nu t} \le \varepsilon \\ (\nu t)^{\frac{\alpha \theta }{2}-(1-\theta )\frac{p-1}{p}} < \sqrt{\nu t} &{}\text {if } \sqrt{\nu t} > \varepsilon , \end{array}\right. } \end{aligned}$$
(36)

also because \(\nu t< 1\). Hence

$$\begin{aligned} |I_{22}^D| \lesssim \frac{\varepsilon +\sqrt{\nu t}}{r} m_i(t,\rho ). \end{aligned}$$

We estimate the contribution \(I_{22}^E\) due to the set E as

$$\begin{aligned} \begin{aligned} |I_{22}^E|&\lesssim \Vert F_i^B\Vert _{L^{\infty }} \frac{1}{r} m_i(t,\rho )m_i(t,d/6)\\&\lesssim \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i)^c)} \, m_i(t,d/6)\, \frac{1}{r} \,m_i(t,\rho ) + \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i))} \frac{1}{r} \,m_i(t,\rho ). \end{aligned} \end{aligned}$$

Now, thanks to the first inequality in (21) and Lemma 1, we have

$$\begin{aligned} \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i)^c)} \, m_i(t,d/6) \lesssim \Vert \omega (t)\Vert _{L^p} \, (\varepsilon ^\alpha +(\nu t)^{\frac{\alpha }{2}}). \end{aligned}$$

Invoking property (8), assumption (7), the heat-kernel-type bound (17), and condition (23) on \(\alpha \), we may hence write

$$\begin{aligned} \Vert F_i^B\Vert _{L^\infty (B_{d/6}(X_i)^c)} \, m_i(t,d/6) \lesssim {\left\{ \begin{array}{ll} \varepsilon ^{\alpha -\gamma }< \varepsilon &{}\quad \text {if } \sqrt{\nu t} \le \varepsilon \\ (\nu t)^{\frac{\alpha }{2}-1+\frac{1}{p}} < \sqrt{\nu t} &{}\quad \text {if } \sqrt{\nu t} > \varepsilon . \end{array}\right. } \end{aligned}$$

From this and (36), we see that

$$\begin{aligned} |I_{22}^E|\lesssim \frac{ \varepsilon +\sqrt{\nu t}}{r} \,m_i(t,\rho ). \end{aligned}$$

The term \(I_{3}\) can be easily estimated using the bound (30),

$$\begin{aligned} |I_3| \lesssim \frac{\nu }{r^2} m_i(t,\rho ). \end{aligned}$$

In conclusion, thanks to \(r\le \rho \lesssim 1\),

$$\begin{aligned} \left| \frac{d}{dt} \mu _i(t,\rho ) \right|&\lesssim \frac{e^{2\Lambda T}}{\rho ^2} \left( \frac{\varepsilon +\sqrt{\nu }}{r} + \frac{\varepsilon ^2+\nu }{r^2}\right) m_i(t,\rho )+\frac{\rho }{r} m_i(t,\rho ). \end{aligned}$$

Integrating in time, we obtain the thesis. \(\square \)

Iterating estimate (33), we come to the lower bound of T.

Proposition 2

Suppose that (29) holds and \(\beta \ge 3\alpha \). Then there exist \(\varepsilon _0\), \(\nu _0\in (0,1)\) and a constant c such that for any \(\varepsilon \le \varepsilon _0\) and \(\nu \le \nu _0\) it holds that \(T\ge c\). If \(T_c=\infty \) and \(\delta >0\) in (10), there is the stronger estimate

$$\begin{aligned} T\ge c\log \frac{1}{\varepsilon +\sqrt{\nu }}. \end{aligned}$$

Proof

We fix i such that (29) holds. Without loss of generality, we suppose that \(a_i=1\) and \(\omega _i\ge 0\). Moreover, we assume that

$$\begin{aligned} T\le \frac{1-2\delta }{4\Lambda }\log \frac{1}{\varepsilon +\sqrt{\nu }}, \end{aligned}$$
(37)

because otherwise, there is nothing left to prove.

Let \(M\in \mathbb {N}\) be a natural number. Our goal is to decrease the radii in M steps from d/6 to R, so that \(\rho _m = d/6-\sum _{n=1}^{m}r_n\) partitions \([R,d/6]\) for some \(r_n\in (0,R)\). We want to choose \(r_m\) so that

$$\begin{aligned} \varepsilon +\sqrt{\nu }\le r_m^{\frac{2}{1-2\delta }}, \end{aligned}$$
(38)

for all \(m\in \{1,\dots ,M\}\), which implies, also thanks to (37), that the constant in (34) is further estimated by

$$\begin{aligned} \kappa (\rho _m,r_m) \lesssim \frac{(\varepsilon +\sqrt{\nu })^{2\delta }}{\rho _m^2} +\frac{\rho _m}{r_m} . \end{aligned}$$
(39)
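The passage from (34) to (39) is a matter of counting exponents. In the following sketch we write \(s=\varepsilon +\sqrt{\nu }\) (a shorthand used only here) and assume \(s\le 1\) and \(0\le \delta <\frac{1}{2}\). Assumption (37) gives \(e^{2\Lambda T}\le s^{-\frac{1-2\delta }{2}}\), condition (38) gives \(s/r_m\le s^{\frac{1+2\delta }{2}}\), and \(\varepsilon ^2+\nu \le s^2\). Hence

$$\begin{aligned} e^{2\Lambda T}\,\frac{\varepsilon +\sqrt{\nu }}{r_m}&\le s^{-\frac{1-2\delta }{2}}\,s^{\frac{1+2\delta }{2}} = s^{2\delta },\\ e^{2\Lambda T}\,\frac{\varepsilon ^2+\nu }{r_m^2}&\le s^{-\frac{1-2\delta }{2}}\,s^{1+2\delta } = s^{\frac{1+6\delta }{2}} \le s^{2\delta }, \end{aligned}$$

and inserting both bounds into (34) with \(\rho =\rho _m\), \(r=r_m\) yields (39).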

To specify our choice of \(r_m\), we make the Ansatz

$$\begin{aligned} r_m = \frac{\sigma }{M}\left( \log \frac{1}{\varepsilon +\sqrt{\nu }}\right) ^{\xi } \rho _m , \end{aligned}$$

for some \(\sigma \in (0,1)\) and \(\xi \in \{0,1\}\), which we will both fix later. Solving the previous identity for \(r_m\), we find the iterative formula

$$\begin{aligned} r_m = \chi \left( \frac{d}{6}-\sum _{n=1}^{m-1}r_n\right) ,\quad \chi =\frac{\frac{\sigma }{M}\left( \log \frac{1}{\varepsilon +\sqrt{\nu }}\right) ^{\xi }}{1+\frac{\sigma }{M}\left( \log \frac{1}{\varepsilon +\sqrt{\nu }}\right) ^{\xi }}\in (0,1), \end{aligned}$$

from which we deduce that

$$\begin{aligned} r_m = \chi (1-\chi )^{m-1}\frac{d}{6}. \end{aligned}$$
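The closed formula for \(r_m\) can be verified by a short induction. In this sketch we abbreviate \(\tau =\frac{\sigma }{M}\left( \log \frac{1}{\varepsilon +\sqrt{\nu }}\right) ^{\xi }\) (a shorthand introduced only here), so that \(\chi =\frac{\tau }{1+\tau }\) and \(\frac{1}{1-\chi }=1+\tau \), and we set \(\rho _0=d/6\). Since \(\rho _m=\rho _{m-1}-r_m\), the Ansatz \(r_m=\tau \rho _m\) gives

$$\begin{aligned} r_m = \tau (\rho _{m-1}-r_m) \quad \Longrightarrow \quad r_m = \chi \,\rho _{m-1}, \qquad \rho _m = (1-\chi )\,\rho _{m-1}, \end{aligned}$$

and therefore \(\rho _m=(1-\chi )^m\frac{d}{6}\) and \(r_m=\chi (1-\chi )^{m-1}\frac{d}{6}\) for all \(m\in \{1,\dots ,M\}\).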

It is readily verified that the radii \(r_m\) are decreasing, \(r_{m+1}< r_m\). We then compute that

$$\begin{aligned} \rho _M= \frac{d}{6}-\sum _{m=1}^Mr_m = \left( 1-\chi \sum _{m=0}^{M-1}(1-\chi )^m\right) \frac{d}{6} =(1-\chi )^M\frac{d}{6} , \end{aligned}$$

and the expression on the right-hand side is larger than R only if

$$\begin{aligned} M\le \frac{\log \frac{d}{6R}}{\log \frac{1}{1-\chi }}, \end{aligned}$$
(40)

which we suppose from here on. It follows via (10) that the first term in (39) is bounded by \(1/R_0^2\sim 1\), while the second term is larger than 1 provided that

$$\begin{aligned} \sigma \left( \log \frac{1}{\varepsilon +\sqrt{\nu }}\right) ^{\xi } \le M. \end{aligned}$$
(41)

If M can be chosen this way, which we will verify later, it follows that

$$\begin{aligned} \kappa (\rho _m,r_m) \lesssim \frac{M}{\sigma } \left( \log \frac{1}{\varepsilon +\sqrt{\nu }}\right) ^{-\xi }. \end{aligned}$$
(42)
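To see how (42) follows from (39), note that by the Ansatz for \(r_m\) the ratio \(\rho _m/r_m\) does not depend on m. As a brief sketch, under (41) we have

$$\begin{aligned} \frac{\rho _m}{r_m} = \frac{M}{\sigma }\left( \log \frac{1}{\varepsilon +\sqrt{\nu }}\right) ^{-\xi }\ge 1, \end{aligned}$$

so that the first term in (39), which is of order one, is controlled by a multiple of the second term.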

We now use relation (31) and apply Lemma 5 to estimate

$$\begin{aligned} m_i(T,d/6) \le \mu _i(T,\rho _1,r_1) \le m_i(0,R) +\kappa (\rho _1,r_1)\int _0^T m_i(t_1,\rho _1)\, dt_1. \end{aligned}$$

Using (31) again in the sense that \(m_i(t_1,\rho _1) = m_i(t_1,\rho _2+r_2) \le \mu _i(t_1,\rho _2,r_2)\), and arguing iteratively, we find

$$\begin{aligned}&m_i(T,d/6) \\&\quad \le \sum _{m=0}^{M-1} \frac{1}{m!} \left( \prod _{n=1}^m\kappa (\rho _n,r_n)\right) T^m m_i(0,R)\\&\qquad + \left( \prod _{m=1}^M \kappa (\rho _m,r_m)\right) \int _{0}^{T}\int _{0}^{t_1}\ldots \int _{0}^{t_{M-1}}\mu _i(t_M,\rho _{M+1},r_M)\,dt_M\ldots dt_2 dt_1. \end{aligned}$$

Using the trivial estimate \(\mu _i(t,\rho ,r)\le 1\), which holds true for all t, \(\rho \) and r, invoking the decay assumption of the initial configuration in (9), the defining condition on T in (29) and the bound in (42), we obtain

$$\begin{aligned} \varepsilon ^{\alpha } + (\nu T)^{\frac{\alpha }{2}} = m_i(T,d/6) \le \sum _{m=0}^{M-1} \frac{1}{m!} \left( \frac{C M T }{\sigma \log ^{\xi }\frac{1}{\varepsilon +\sqrt{\nu }}}\right) ^m \varepsilon ^{\beta }+ \frac{1}{M!}\left( \frac{CMT}{\sigma \log ^{\xi }\frac{1}{\varepsilon +\sqrt{\nu }}}\right) ^M, \end{aligned}$$

for some universal constant C. Using the Stirling formula \(M^M < e^M M!\) and its generalization (51) in the appendix, we find that

$$\begin{aligned} \varepsilon ^{\alpha } + (\nu T)^{\frac{\alpha }{2}}\le & {} \left( 1+ \frac{CT}{\sigma \log ^{\xi } \frac{1}{\varepsilon +\sqrt{\nu }}}\right) ^M\varepsilon ^{\beta } + \left( \frac{CT}{\sigma \log ^{\xi }\frac{1}{\varepsilon +\sqrt{\nu }}}\right) ^M\nonumber \\\le & {} 2\left( \varepsilon ^{\frac{\beta }{M}} + \frac{CT}{\sigma \log ^{\xi }\frac{1}{\varepsilon +\sqrt{\nu }}}\right) ^M, \end{aligned}$$
(43)

for some new constant C. From here, we derive the desired bound by distinguishing two cases.
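Two steps in the derivation of (43) are elementary and can be sketched as follows; the treatment of the finite sum via (51) is not reproduced here. We abbreviate \(y=\frac{CT}{\sigma \log ^{\xi }\frac{1}{\varepsilon +\sqrt{\nu }}}\) (a shorthand used only in this sketch). For the remainder term, Stirling's formula \(M^M<e^MM!\) gives

$$\begin{aligned} \frac{1}{M!}(My)^M = \frac{M^M}{M!}\,y^M < (ey)^M, \end{aligned}$$

which accounts for the new constant C. For the last inequality in (43), since \(\varepsilon ^{\beta /M}\le 1\),

$$\begin{aligned} (1+y)^M\varepsilon ^{\beta } + y^M = \left( \varepsilon ^{\frac{\beta }{M}}+\varepsilon ^{\frac{\beta }{M}}y\right) ^M + y^M \le 2\left( \varepsilon ^{\frac{\beta }{M}}+y\right) ^M . \end{aligned}$$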

Relatively small viscosity. We deduce from (43) that

$$\begin{aligned} \varepsilon ^{\frac{\alpha }{M}} \le 2^{\frac{1}{M}}\left( \varepsilon ^{\frac{\beta }{M}} + \frac{CT}{\sigma \log ^{\xi }\frac{1}{\varepsilon +\sqrt{\nu }}}\right) , \end{aligned}$$
(44)

and the left-hand side is uniformly bounded from below, say \(\varepsilon ^{\frac{\alpha }{M}}= 1/2\) provided that

$$\begin{aligned} \frac{\alpha }{\log 2}\log \frac{1}{\varepsilon } = M . \end{aligned}$$
(45)

Before continuing, we want to make sure that such an M is consistent with all the hypotheses we made so far, i.e., (38), (40) and (41). Let us start by treating the case

$$\begin{aligned} \varepsilon \ge \sqrt{\nu }. \end{aligned}$$

First, regarding (40), we notice that

$$\begin{aligned} \frac{1}{1-\chi } = 1+\frac{\sigma }{M} \log ^{\xi }\frac{1}{\varepsilon +\sqrt{\nu }} \le 1+\frac{\sigma }{M} \log ^{\xi }\frac{1}{\varepsilon } = 1+ \frac{\sigma \log 2}{\alpha }\frac{1}{\log ^{1-\xi }\frac{1}{\varepsilon }}. \end{aligned}$$

Hence, in order to guarantee (40), it is enough to require that

$$\begin{aligned} \frac{\alpha }{\log 2} \log \left( 1+ \frac{\sigma \log 2}{\alpha }\frac{1}{\log ^{1-\xi }\frac{1}{\varepsilon }}\right) \log \frac{1}{\varepsilon }\le \log \frac{d}{6R}. \end{aligned}$$

The latter can be strengthened by estimating the logarithm linearly,

$$\begin{aligned} \sigma \log ^{\xi }\frac{1}{\varepsilon } \le \log \frac{d}{6R}. \end{aligned}$$
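The linear estimate used here is \(\log (1+x)\le x\) for \(x\ge 0\). As a quick check, with \(x=\frac{\sigma \log 2}{\alpha }\log ^{\xi -1}\frac{1}{\varepsilon }\) we obtain

$$\begin{aligned} \frac{\alpha }{\log 2}\,\log \left( 1+\frac{\sigma \log 2}{\alpha }\frac{1}{\log ^{1-\xi }\frac{1}{\varepsilon }}\right) \log \frac{1}{\varepsilon } \le \frac{\alpha }{\log 2}\cdot \frac{\sigma \log 2}{\alpha }\,\log ^{\xi -1}\frac{1}{\varepsilon }\cdot \log \frac{1}{\varepsilon } = \sigma \log ^{\xi }\frac{1}{\varepsilon }, \end{aligned}$$

so the displayed condition is indeed sufficient for the previous one.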

If now (10) holds true with \(\delta =0\), that is \(R\gtrsim 1\), we necessarily have to choose \(\xi =0\) and this estimate holds true for \(\sigma \) sufficiently small. Otherwise, if \(\delta >0\), we are allowed to choose \(\xi =1\) and we have to ensure that

$$\begin{aligned} \sigma \log \frac{1}{\varepsilon } \le \delta \log \frac{1}{\varepsilon } -c, \end{aligned}$$

for some universal constant \(c\ge 0\). This is possible whenever \(\sigma < \delta \) and \(\varepsilon \) sufficiently small.

We now turn to condition (41), which can be rewritten as

$$\begin{aligned} \sigma \log ^{\xi } \frac{1}{\varepsilon +\sqrt{\nu }} \le \frac{\alpha }{\log 2}\log \frac{1}{\varepsilon }, \end{aligned}$$

which can be strengthened if \(\sqrt{\nu }\) on the left-hand side is dropped and \(\xi \) is estimated by 1. In this case, choosing \(\sigma \) small does the job.

We finally turn to the worst case of (38), that is \(m=M\), which can be rewritten as

$$\begin{aligned} (M-1)\log \frac{1}{1-\chi } + \log \frac{1}{\chi } \le \frac{1-2\delta }{2} \log \frac{1}{\varepsilon +\sqrt{\nu }} +\log \frac{d}{6}. \end{aligned}$$

Estimating \(\varepsilon \le \varepsilon +\sqrt{\nu }\le 2\varepsilon \) and arguing similarly as above, we notice that this estimate can be strengthened to

$$\begin{aligned} \sigma \log ^{\xi }\frac{1}{\varepsilon } + \log \left( 1+ \frac{M}{\sigma } \log ^{-\xi } \frac{1}{2\varepsilon }\right) \le \frac{1-2\delta }{2}\log \frac{1}{\varepsilon } + \log \frac{d}{12}. \end{aligned}$$

In view of our definition of M, this estimate can be rewritten as

$$\begin{aligned} \sigma \log ^{\xi }\frac{1}{\varepsilon } + \log \left( 1+ \frac{\alpha }{\sigma \log 2 } \log ^{1-\xi } \frac{1}{2\varepsilon }\right) \le \frac{1-2\delta }{2}\log \frac{1}{\varepsilon } + \log \frac{d}{12}, \end{aligned}$$

which holds true for \(\xi \in \{0,1\}\), any fixed \(\sigma <\frac{1-2\delta }{2}\) and \(\varepsilon \) small enough.

Reviewing the previous arguments, it is easy to check that in the particular case \(\xi =\delta =0\), the choice of M is possible even if

$$\begin{aligned} \varepsilon \le \sqrt{\nu } \le \frac{c\sigma ^2}{\log ^2 \frac{1}{\varepsilon }} , \end{aligned}$$
(46)

for some constant \(c>0\). Indeed, (41) is simply equivalent to \(\sigma \le M\), while (40) becomes \(\left( 1+\sigma /M\right) ^M \le \frac{d}{6R}\), which, thanks to (13), holds uniformly in M for \(\sigma \le \log 2\). Moreover, for (38) it is enough to require that

$$\begin{aligned} \log \left( 1+\frac{M}{\sigma }\right) + M \log \left( 1+\frac{\sigma }{M}\right) \le \frac{1}{2}\log \frac{1}{\varepsilon +\sqrt{\nu }} + \log \frac{d}{6}. \end{aligned}$$

Estimating the second logarithm on the left-hand side linearly, and supposing that \(2 \sigma \le \log \frac{d}{6}\), the latter can be strengthened to

$$\begin{aligned} 1+ \frac{M}{\sigma } \le \left( \frac{d/6}{\varepsilon +\sqrt{\nu }}\right) ^{1/2}. \end{aligned}$$

Using our choice of M in (45) and solving for \(\nu \) gives (46).
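For the reader's convenience, the two reduction steps can be traced as follows. This is a sketch under a stated assumption: we read (45) as \(M=\frac{\alpha }{\log 2}\log \frac{1}{\varepsilon }\), which is how M enters the rewritten form of (41), and the explicit constant obtained below is only indicative. Since \(\log (1+x)\le x\), we have \(M\log \left( 1+\sigma /M\right) \le \sigma \le \frac{1}{2}\log \frac{d}{6}\), so that the logarithmic requirement follows from

$$\begin{aligned} \log \left( 1+\frac{M}{\sigma }\right) \le \frac{1}{2}\log \frac{d/6}{\varepsilon +\sqrt{\nu }}, \end{aligned}$$

which is the displayed bound after exponentiating. Using \(\varepsilon \le \sqrt{\nu }\) and \(\sigma \le M\), so that \(\varepsilon +\sqrt{\nu }\le 2\sqrt{\nu }\) and \(\left( 1+M/\sigma \right) ^2\le 4M^2/\sigma ^2\), that bound is in turn implied by

$$\begin{aligned} 2\sqrt{\nu } \le \frac{d}{6}\,\frac{\sigma ^2}{4M^2}, \quad \text {that is,}\quad \sqrt{\nu } \le \frac{d \log ^2 2}{48\alpha ^2}\, \frac{\sigma ^2}{\log ^2\frac{1}{\varepsilon }}, \end{aligned}$$

which has the form of the upper bound in (46).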

Now that we have proved that (45) is admissible, we want to pick \(\beta \) such that

$$\begin{aligned} \beta \ge \alpha + (M+1)\frac{\log 2}{\log \frac{1}{\varepsilon }}. \end{aligned}$$

This choice allows us to deduce from (44) that

$$\begin{aligned} 2^{M+1}\varepsilon ^{\beta } \le \varepsilon ^{\alpha }\le 2^{M+1}\left( \frac{C T}{\sigma \log ^{\xi }\frac{1}{\varepsilon +\sqrt{\nu }}}\right) ^M, \end{aligned}$$

and the second inequality in turn implies that

$$\begin{aligned} \sigma \log ^{\xi }\frac{1}{\varepsilon +\sqrt{\nu }} \lesssim T, \end{aligned}$$

where \(\xi =0\) if \(\delta =0\) and \(\xi =1\) if \(\delta >0\) in (10). It remains to notice that \(\beta =3\alpha \) is possible if \(\varepsilon \) is small.
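The admissibility of \(\beta =3\alpha \) can be checked by a short computation. As a sketch, assume again that (45) reads \(M=\frac{\alpha }{\log 2}\log \frac{1}{\varepsilon }\); if M is instead an integer rounding of this quantity, the computation changes only by a term of the same lower order. Taking logarithms in \(2^{M+1}\varepsilon ^{\beta }\le \varepsilon ^{\alpha }\) gives back the requirement on \(\beta \), and

$$\begin{aligned} \alpha + (M+1)\frac{\log 2}{\log \frac{1}{\varepsilon }} = 2\alpha + \frac{\log 2}{\log \frac{1}{\varepsilon }} \le 3\alpha \quad \text {as soon as}\quad \varepsilon \le 2^{-1/\alpha }. \end{aligned}$$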

Relatively large viscosities. Let us start with the particular situation where \(\xi =\delta =0\) and

$$\begin{aligned} \sqrt{\nu }\ge \frac{c \sigma ^2}{\log ^2\frac{1}{\varepsilon }} . \end{aligned}$$
(47)

We start again with (43), from which we derive this time the estimate

$$\begin{aligned} \nu T \le 2^{\frac{2}{\alpha }} \left( \varepsilon ^{\frac{\beta }{M}} +\frac{CT}{\sigma } \right) ^{\frac{2M}{\alpha }}, \end{aligned}$$

and we choose \(\beta \) such that \(\varepsilon ^{\frac{\beta }{M}}\le CT/\sigma \), that is

$$\begin{aligned} M\log \frac{\sigma }{CT} \le \beta \log \frac{1}{\varepsilon }. \end{aligned}$$
(48)

Then we obtain

$$\begin{aligned} \nu ^{\frac{\alpha }{2M-\alpha }} \le 2^{\frac{2M+2}{2M-\alpha }} \left( C/\sigma \right) ^{\frac{2M}{2M-\alpha }} T\le 4\left( C/\sigma \right) ^2T, \end{aligned}$$

and the last inequality is true as long as \(M\ge \alpha +1\). The left-hand side is uniformly bounded from below, say \(\nu ^{\frac{\alpha }{2M-\alpha }}=1/2\), if

$$\begin{aligned} \frac{\alpha }{2} \log \frac{2}{\nu }= M. \end{aligned}$$
(49)

In this case, we thus have that

$$\begin{aligned} T\ge \sigma ^2/ (8C^2). \end{aligned}$$
(50)

Notice that with the choice (49), the condition \(M\ge \alpha +1\) is automatically satisfied if \(\nu \) is sufficiently small.
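The relation between the normalization \(\nu ^{\frac{\alpha }{2M-\alpha }}=1/2\), the choice (49) and the bound (50) can be made explicit. The following sketch assumes that the logarithm in (49) is read to base 2, in which case the identity is exact; with the natural logarithm the left-hand side still remains bounded below by a positive constant for small \(\nu \), which only affects the numerical constant in (50). We have

$$\begin{aligned} \nu ^{\frac{\alpha }{2M-\alpha }}=\frac{1}{2} \;\Longleftrightarrow \; \alpha \log _2\frac{1}{\nu } = 2M-\alpha \;\Longleftrightarrow \; M=\frac{\alpha }{2}\log _2\frac{2}{\nu }, \end{aligned}$$

and inserting this value into the preceding estimate yields

$$\begin{aligned} \frac{1}{2}\le 4\left( C/\sigma \right) ^2T \;\Longleftrightarrow \; T\ge \frac{\sigma ^2}{8C^2}. \end{aligned}$$

Moreover, \(M\ge \alpha +1\) is equivalent to \(\log \frac{2}{\nu }\ge 2+\frac{2}{\alpha }\), which holds once \(\nu \) is small.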

We have to check if the choice (49) of M aligns with the assumption (38), recalling that (40) and (41) are readily verified for \(\xi =0\) if \(\sigma \) is sufficiently small. We rewrite (38) with \(m=M\) as

$$\begin{aligned} \left( \varepsilon +\sqrt{\nu } \right) ^{\frac{1-2\delta }{2}}\le \chi (1-\chi )^{M-1}\frac{d}{6} =\frac{\sigma }{M+\sigma } \left( \frac{M}{M+\sigma }\right) ^{M-1} \frac{d}{6}. \end{aligned}$$

Using \(\varepsilon \le \sqrt{\nu }\) and the linear estimate for the logarithm, we observe that this estimate follows from

$$\begin{aligned} \sigma + \log \left( 1+\frac{\alpha }{2 \sigma } \log \frac{2}{\nu }\right) \le \frac{1-2\delta }{2} \log \frac{1}{\sqrt{\nu }} + \log \frac{d}{12}, \end{aligned}$$

which holds true for \(\nu \) small enough.

Having established (50), we finally claim that (48) is satisfied with \(\beta =3\alpha \), i.e.,

$$\begin{aligned} M\log \frac{\sigma }{CT} \le 3\alpha \log \frac{1}{\varepsilon }. \end{aligned}$$

Using the definition of M in (49) and the lower bound on T in (50), we observe that the latter holds true if

$$\begin{aligned} \log \frac{2}{\nu }\log \frac{8C}{\sigma } \le 6\log \frac{1}{\varepsilon }. \end{aligned}$$

Fortunately, we have supposed an exponential bound on \(\varepsilon \) via (47), so that the previous estimate is implied by

$$\begin{aligned} \log \frac{2}{\nu } \log \frac{8C}{\sigma } \le 6\sigma \left( \frac{ c}{\sqrt{\nu }}\right) ^{1/2}. \end{aligned}$$

This, in turn, is true whenever \(\nu \) is small enough.
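For completeness, the passage from (47) to the last display is the following elementary manipulation, written here as a short worked step. Rearranging (47) gives

$$\begin{aligned} \log ^2\frac{1}{\varepsilon } \ge \frac{c\sigma ^2}{\sqrt{\nu }} \quad \Longrightarrow \quad 6\log \frac{1}{\varepsilon } \ge 6\sigma \left( \frac{c}{\sqrt{\nu }}\right) ^{1/2}. \end{aligned}$$

For fixed \(\sigma \), the left-hand side of the last display grows like \(\log \frac{1}{\nu }\) as \(\nu \rightarrow 0\), whereas its right-hand side grows like \(\nu ^{-1/4}\), which explains why the estimate holds for small \(\nu \).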

The last situation to consider is the case

$$\begin{aligned} \varepsilon \le \sqrt{\nu } \end{aligned}$$

and \(\delta >0\), so that \(\xi =1\). Since the case \(\delta =0\) is weaker than the case \(\delta >0\), cf. (9) and (10), the bound on T in (50) carries over to the case \(\delta >0\). Estimate (43) thus becomes

$$\begin{aligned} \left( \frac{\nu \sigma ^2}{8C^2}\right) ^{\frac{\alpha }{2M}} \le 2^{\frac{1}{M}} \left( \nu ^{\frac{\beta }{2M}} + \frac{CT}{\sigma \log \frac{1}{\varepsilon +\sqrt{\nu }}}\right) , \end{aligned}$$

and a lower bound of the form

$$\begin{aligned} T\gtrsim \log \frac{1}{\varepsilon +\sqrt{\nu }} \end{aligned}$$

can be established almost identically to the case \(\varepsilon \ge \sqrt{\nu }\). \(\square \)