1 Introduction

In recent decades, interacting particle systems have attracted a lot of attention from researchers in various fields such as swarming, pedestrian dynamics and opinion formation (cf. Albi and Pareschi 2013; Helbing and Molnár 1995; Toscani 2006; Totzeck 2020 and the references therein). In particular, a model hierarchy was established in Carrillo et al. 2010; Golse 2003. The main idea of the hierarchy is to model the same dynamics at different levels of accuracy, each having its own advantages and disadvantages. The model with the highest accuracy is the microscopic one. It describes the positions and velocities of each particle explicitly. For applications with many particles involved, this microscopic modelling requires a huge amount of computational effort and storage, especially when it comes to the optimization of problems with many particles Burger et al. 2020.

There is also an intermediate level of accuracy given by the mesoscopic description, see Albi and Pareschi 2013; Carrillo et al. 2010; Totzeck 2020. We do not want to give its details here; instead, we directly pass to the macroscopic level, where the velocities are averaged and a position-dependent density describes the probability of finding a particle of the dynamics at a given position. Of course, we lose the explicit information about each particle, but have the advantage of saving a lot of storage in the simulation of the dynamics. Despite the lower accuracy, many studies Albi and Pareschi 2013; Burger et al. 2020; Mahato et al. 2018 indicate that the evolution of the density yields a good approximation of the original particle system, see also Weissen et al. (2021), which proposed a limiting procedure that is considered in more detail below.

Moreover, the macroscopic description naturally involves boundary conditions, which play a crucial role in many engineering experiments with interacting particles; see for example the pedestrian dynamics data archive of the Civil Safety Research of the Forschungszentrum Jülich, Seyfried and Boltes (2021).

This observation motivates us to exploit the aforementioned relationship of microscopic and macroscopic models and propose a space mapping-based optimization scheme for interacting particle dynamics which are inappropriate for direct optimization. This might be the case for particle dynamics that involve a huge number of particles for which traditional optimization is expensive in terms of storage, computational effort and time. Another example is the optimization of particle dynamics in bounded domains, where the movement is restricted by obstacles or walls. In fact, systems based on ordinary differential equations (ODEs) do not have a natural prescription of zero-flux or Neumann boundary data, but those conditions are often critical in applications.

An exemplary application is the movement of large crowds Göttlich and Pfirsching 2018; Helbing et al. 2000; Helbing and Molnár 1995, which combines individual behavior with a herding instinct leading to collective motion. Systematic studies analyze the behavior of human crowds in crowded buildings or at event venues with a high number of participants, such as concerts Johnson 1987 or religious gatherings Haase et al. 2019. Dangerous overcrowding and jamming might occur at bottlenecks Helbing et al. 2000. The herding instinct entails that all individuals move in the same direction, possibly toward blocked pathways. Their movement becomes uncoordinated and, since paths are clogged, jams build up. If people get too close together, they start to push and interact with each other. The interaction between two pedestrians is modeled using an interaction force which pushes the pedestrians apart if they touch each other. It becomes the dominant force in this case, and the physical interaction within these crowds might even become dangerous Helbing et al. 2000. The interaction with walls is also modeled with a similar interaction force Helbing et al. 2000, and the geometry of the domain strongly influences the behavior of the crowd. Interesting scenarios are the evacuation of people from a hotel or an office building Okazaki and Matsushita 1993, football stadia Elliott and Smith 1993, passenger ships Klüpfel et al. 2001 or aircraft Miyoshi et al. 2012. All of the described settings naturally involve bounded domains, and the pedestrian flows depend on the geometry of the domain Helbing et al. 2002. Simulations show the regions where congestion and stagnation in the movement occur. This information can help designers to examine and verify evacuation routes. Another useful application is the transport of material on conveyor belts Göttlich et al. 2014; Göttlich and Pfirsching 2018.
Conveyor belts are often used in factories to transport large amounts of goods, e.g. in bottling plants Festa et al. 2019. The goods are transported on the belts and redirected by obstacles such as diverter equipment Prims et al. 2019. When parts are too close together, they exert repelling forces on each other. The same holds true for the interaction with obstacles or walls, through which parts experience forces that alter their transport direction.

While boundary interaction can be included in the ODE dynamics, its treatment is not self-evident. The same holds for optimization algorithms based on these simulations. To the authors’ knowledge, alternatives to black box optimization algorithms that allow for convenient optimization of ODE systems within bounded domains do not exist. In contrast to the microscopic ones, models based on partial differential equations (PDEs) require boundary conditions. Often zero-flux or Neumann type boundary conditions are prescribed. Further, numerical schemes naturally integrate these boundary conditions, and optimization algorithms can be formulated directly. The approach discussed in the following allows us to approximate the optimizer of microscopic dynamics with additional boundary behavior while only optimizing the macroscopic model with an adjoint-based scheme. It is therefore a sophisticated alternative to black box optimization approaches.

1.1 Modeling equations and general optimization problem

We begin with the general framework and propose the space mapping technique to approximate an optimal solution of the interacting particle system. In general, the interacting particle dynamics for \(N\in \mathbb N\) particles in the microscopic setting is given by the ODE system

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d} x_i}{\mathrm {d} t}&= v_i, \\ m\frac{\mathrm {d}v_i}{\mathrm {d}t}&= G(x_i,v_i) + A\sum _{j \ne i } F(x_i-x_j),\\ x_i(0)&= x_i^0, \quad v_i(0)=v_i^0, \end{aligned} \qquad i =1, \dots , N\end{aligned}$$
(1)

where \(x_i \in \mathbb {R}^2,v_i \in \mathbb {R}^2\) are the position and the velocity of particle i supplemented with initial condition \(x_i(0) = x_{i}^0, v_i(0) = v_{i}^0\) for \(i=1,\dots ,N\). Here, \(F\) denotes an interaction kernel which is often given as a gradient of a potential D’Orsogna et al. 2006a. For notational convenience, we define the state vector \(y= (x_i,v_i)_{i=1,\dots ,N}\) which contains the position and velocity information of all particles.

Remark 1

Note that there are models that include boundary dynamics with the help of soft core interactions, see for example Helbing and Molnár 1995. In general, these models allow for direct optimization. Nevertheless, for \(N\gg 1\) the curse of dimensionality applies and the approach discussed here may still be useful.

Sending \(N\rightarrow \infty\) and averaging the velocity, we formally obtain a macroscopic approximation of the ODE dynamics given by the PDE

$$\begin{aligned} \begin{aligned} \partial _t \rho + \nabla \cdot \left( \rho \overline{v}(x) - {k(\rho )} \nabla \rho \right)&=0, \qquad (x,t) \in \Omega \times [0,T] \\ \rho (x,0)&= \rho ^0(x), \qquad x \in \Omega \end{aligned} \end{aligned}$$
(2)

where \(\rho = \rho (x,t)\) denotes the particle density in the domain \(\Omega \subseteq \mathbb {R}^2\). The velocity \(\overline{v}\) is the averaged velocity depending on the position and \({k(\rho )}\) is the diffusion coefficient.

We consider constrained optimization problems of the form

$$\begin{aligned} \begin{aligned} \min _{u\in \mathcal {U}_{ad}}&J(u,y) \\ \text {subject to }~&E(u,y) = 0, \end{aligned} \end{aligned}$$

where J is the cost functional, \(\mathcal {U}_{ad}\) is the set of admissible controls and \(y\) are the state variables with \(E(u,y) = 0\). In the following, for a given control \(u\in \mathcal {U}_{ad}\), the constraint \(E(u,y)\) contains the modeling equations for systems of ODEs or PDEs. With the additional assumption that for a given control \(u\), the model equations have a unique solution, we can express \(y= y(u)\) and consider the reduced problem

$$\begin{aligned}&\min _{u\in \mathcal {U}_{ad}} J(u,y(u)). \end{aligned}$$
(3)

This is a nonlinear optimization problem, which we intend to solve for an ODE constraint \(E(u,y(u))\). To do this, one might follow a standard approach Hinze et al. 2009 and apply a gradient descent method based on adjoints Tröltzsch 2010 to solve the microscopic reduced problem iteratively. In contrast, the space mapping technique employs a cheaper substitute (coarse) model for the optimization of the fine model problem. Under the assumption that the optimization of the microscopic system is difficult while the optimization of the macroscopic system can be computed efficiently, we propose space mapping-based optimization. The main objective is to iteratively approximate an optimal control for the microscopic dynamics. To this end, we solve a related optimal control problem on the macroscopic level in each iteration.

1.2 Literature review and outline

Space mapping was originally introduced in the context of electromagnetic optimization Bandler et al. 1994. The original formulation has been subject to improvements and changes Bandler et al. 2004 and enhanced by classical methods for nonlinear optimization. The use of Broyden’s method to construct a linear approximation of the space mapping function, the so-called aggressive space mapping (ASM), was introduced in Bandler et al. (1995). We refer to Bakr et al. (2000); Bandler et al. (2004) for an overview of space mapping methods.

More recently, space mapping has been successfully used in PDE-based optimization problems. Banda and Herty (2011) presented an approach for dynamic compressor optimization in gas networks. Göttlich and Teuber (2018) used space mapping-based optimization to control the inflow in transmission lines. In both cases, the fine model is given by hyperbolic PDEs on networks and the main difficulty arises from the nonlinear dynamics induced by the PDE. These dynamics limit the possibility to efficiently solve the optimization problems. In their model hierarchy, a simpler PDE serves as the coarse model, and computational results demonstrate that such a space mapping approach makes it possible to compute accurate results efficiently.

Totzeck and Pinnau (2020) used space mapping for the optimization of a stochastic interacting particle system. In their approach, the deterministic state model was used as the coarse model and led to satisfying results.

Here, we employ a mixed hyperbolic-parabolic PDE as the coarse model in the space mapping technique to solve a control problem on the ODE level. Our optimization approach therefore combines different hierarchy levels. As discussed, the difficulty on the ODE level can arise due to boundaries in the underlying spatial domain or due to a large number of interacting particles. In contrast, the macroscopic equation naturally involves boundary conditions and its computational effort is independent of the particle number.

The outline of the paper is as follows: We introduce the space mapping technique in Sect. 2 together with the fine and coarse model descriptions in Sects. 2.1 and 2.2. Particular attention is paid to the solution approach for the discretized coarse model in Sect. 2.2.2, which is an essential step in the space mapping algorithm. The discretized fine model optimal control problem is presented in Sect. 3. We give an example where we can compare and validate our approach against a standard optimization approach for the fine model. We provide numerical examples in bounded domains in Sect. 4. In Sects. 4.1 and 4.2, the microscopic optimization approach cannot be applied due to the additional boundary interaction. However, the space mapping algorithm can still be applied and properly includes the boundary interaction. Various controls, such as the source of an eikonal field in evacuation dynamics, cf. Sect. 4.1, and the conveyor belt velocity in a material flow setting, cf. Sect. 4.2, demonstrate the versatility of the proposed space mapping approach. Our insights are summarized in the conclusion in Sect. 5.

2 Space mapping technique

Space mapping considers a model hierarchy consisting of a coarse and a fine model. Let \({\mathcal{G}}^{c} :{\mathcal{U}}^{c} _{{ad}} \to \mathbb{R}^{{n_{c} }} ,{\mathcal{G}}^{f} :{\mathcal{U}}^{f} _{{ad}} \to \mathbb{R}^{{n_{f} }}\) denote the operators mapping a given control \(u\) to a specified observable \({\mathcal{G}}^{c} (u)\) in the coarse and \({\mathcal {G}}^f(u)\) in the fine model, respectively. The idea of space mapping is to find the optimal control \(u^{f} _{*} \in {\mathcal{U}}^{f} _{{ad}}\) of the complicated (fine) model control problem with the help of a coarse model that is simple to optimize.

We assume that the fine model control problem

$$\begin{aligned} u^f_*= {{\,\mathrm{argmin}\,}}_{u\in {\mathcal{U}}^{f} _{{ad}}} \Vert{\mathcal{G}}^{f} (u) -\omega _*\Vert , \end{aligned}$$

where \({\omega _*} \in \mathbb {R}^n\) is a given target state, is inappropriate for direct optimization. In contrast, we assume that the optimal control \(u^c_*\in \mathcal {U}^c_{ad}\) of the coarse model control problem

$$\begin{aligned} u^c_*= {{\,\mathrm{argmin}\,}}_{u\in \mathcal {U}^c_{ad}} \Vert \mathcal {G}^c(u) - \omega _*\Vert , \end{aligned}$$

can be obtained with standard optimization techniques. While the coarse model is computationally cheaper to solve, it also helps to acquire information about the optimal control variables of the fine model. By exploiting the relationship of the models, space mapping combines the simplicity of the coarse model and the accuracy of the more detailed fine model very efficiently Bakr et al. 2001; Echeverría and Hemker 2005.

Definition 2.1

The space mapping function \(\mathcal {T}: {\mathcal{U}}^{f} _{{ad}}\rightarrow \mathcal {U}^c_{ad}\) is defined by

$$\begin{aligned} \mathcal {T}(u^f) = {{\,\mathrm{argmin}\,}}_{u\in \mathcal {U}^c_{ad}} \Vert \mathcal {G}^c(u)-\mathcal {G}^f(u^f)\Vert . \end{aligned}$$

The process of determining \(\mathcal {T}(u^f)\), the solution to the minimization problem in Definition 2.1, is called parameter extraction. It requires a single evaluation of the fine model \(\mathcal {G}^f(u^f)\) and a minimization in the coarse model to obtain \(\mathcal {T}(u^f) \in {\mathcal {U}^c_{ad}}\). Uniqueness of the solution to the optimization problem is desirable but in general not ensured since it strongly depends on the two models and the admissible sets of controls \({\mathcal{U}}^{f} _{{ad}}, \mathcal {U}^c_{ad}\), see Echeverría and Hemker 2005 for more details.

The basic idea of space mapping is that either the target state is reachable, i.e., \(\mathcal {G}^f(u_*^f) \approx \omega _*\) or both models are relatively similar in the neighborhood of their optima, i.e., \(\mathcal {G}^f(u_*^f) \approx \mathcal {G}^c(u_*^c)\). Then we have

$$\begin{aligned} \mathcal {T}(u_*^f) = {{\,\mathrm{argmin}\,}}_{u\in \mathcal {U}^c_{ad}} \Vert \mathcal {G}^c(u) -\mathcal {G}^f(u_*^f)\Vert \approx {{\,\mathrm{argmin}\,}}_{u\in \mathcal {U}^c_{ad}} \Vert \mathcal {G}^c(u)-\omega _*\Vert = u^c_*, \end{aligned}$$

compare Echeverría and Hemker 2005. In general, it is very difficult to establish the whole mapping \(\mathcal {T}\); we therefore only use evaluations of it. In fact, space mapping algorithms allow us to shift most of the model evaluations in an optimization process to the faster, coarse model. In particular, no gradient information of the fine model is needed to approximate the optimal fine model control Bakr et al. 2001. Figure 1 illustrates the main steps of the space mapping algorithm, see also Bandler et al. (2004); Koziel et al. (2011).

Fig. 1

Schematic representation of a space mapping algorithm

In the literature, many variants of the space mapping idea can be found Bandler et al. 2004. We will use the ASM algorithm, see Algorithm 1 in Appendix A or the references Bandler et al. 1995; Göttlich and Teuber 2018 for algorithmic details. Starting from the iterate \(u_1 = u^c_*\), the descent direction \(d_{k}\) is updated in each iteration \(k\) using the space mapping evaluation \(\mathcal {T}(u_k)\). The algorithm terminates when the parameter extraction maps the current iterate \(u_k\) (approximately) to the coarse model optimum \(u^c_*\), i.e., when \(\Vert \mathcal {T}(u_k) - u^c_*\Vert\) is smaller than a given tolerance in an appropriate norm \(\Vert \cdot \Vert\). The solutions \(u^c_*\) and \(\mathcal {T}(u_k)\) are computed using adjoints here and will be explained in Sect. 2.2.2.
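The ASM iteration just described can be sketched as follows. The parameter extraction \(\mathcal {T}\) and the coarse optimum \(u^c_*\) are assumed to be supplied by the caller; the Broyden update of the Jacobian approximation follows the references above, and all variable names are ours:

```python
import numpy as np

def aggressive_space_mapping(T, u_c_star, tol=1e-6, max_iter=50):
    """Sketch of the ASM iteration: T(u) is the parameter extraction of
    Definition 2.1 (one fine model evaluation plus a coarse optimization),
    u_c_star is the coarse model optimum. B is a Broyden approximation
    of the Jacobian of T."""
    u = np.asarray(u_c_star, dtype=float).copy()   # start from u_1 = u*_c
    B = np.eye(u.size)
    r = T(u) - u_c_star
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:                # stop when T(u_k) ≈ u*_c
            break
        d = np.linalg.solve(B, -r)                 # quasi-Newton direction d_k
        u = u + d
        r_new = T(u) - u_c_star
        # Broyden rank-one update of the Jacobian approximation
        B = B + np.outer(r_new - r - B @ d, d) / (d @ d)
        r = r_new
    return u
```

For an affine parameter extraction the iteration reaches the fixed point \(\mathcal {T}(u) = u^c_*\) after a few steps; in practice each call to `T` is the expensive part, since it contains a fine model evaluation.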

2.1 Fine model

We seek to control a general microscopic model for the movement of \(N\) particles with dynamics given by (1). We choose the velocity selection mechanism

$$\begin{aligned} G(x, v) = - \frac{(v - \overline{v}(x))}{\tau }, \end{aligned}$$

which describes the relaxation of the particle velocities towards an equilibrium velocity \(\overline{v}(x)\) with relaxation time \(\tau\). Such systems describe the movements of biological ensembles such as schools of fish, flocks of birds Armbruster et al. 2017; Chuang et al. 2007; D’Orsogna et al. 2006b, ants Boi et al. 2000 or bacterial colonies Koch and White 1998 as well as pedestrian crowds Göttlich et al. 2018; Helbing and Molnár 1995 and transport of material Göttlich et al. 2014, 2015. In general, the force \(F\) occurring in (1) is a pairwise interaction force between particle i and particle j. We choose to activate it whenever two particles overlap, i.e., whenever \(\Vert x_i - x_j\Vert _2 < 2R\). For \(\Vert x_i - x_j\Vert _2 \ge 2R\), the interaction force is assumed to be zero. In the following, we restrict ourselves to forces described by

$$\begin{aligned} F(x_i - x_j)&= {\left\{ \begin{array}{ll} b_F\left( \Vert x_i - x_j\Vert _2 - 2R\right) ^2 \frac{x_i - x_j}{\Vert x_i - x_j\Vert _2} &{}\text { if } \Vert x_i -x_j\Vert _2 \le 2R, \\ 0 &{}\text { otherwise. } \end{array}\right. } \end{aligned}$$
(4)

where \(b_F>0\).
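The force (4) translates directly into code. The following is an illustrative sketch; the values of \(R\) and \(b_F\) are placeholder choices:

```python
import numpy as np

def interaction_force(xi, xj, R=0.5, b_F=1.0):
    """Pairwise repulsive force (4): active only when the particles
    overlap, i.e. ||x_i - x_j||_2 < 2R, and zero otherwise."""
    d = xi - xj
    dist = np.linalg.norm(d)
    if dist >= 2 * R or dist == 0.0:
        return np.zeros_like(d)
    # magnitude grows quadratically with the overlap (dist - 2R)^2,
    # direction pushes particle i away from particle j
    return b_F * (dist - 2 * R) ** 2 * d / dist
```

Since the prefactor \((\Vert x_i - x_j\Vert _2 - 2R)^2\) is nonnegative, the force always points from \(x_j\) towards \(x_i\), i.e. it is purely repulsive.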

We consider the optimization problem (3) and set \(E(u,y) =0\) if and only if the microscopic model equations (1) are satisfied. This allows us to investigate various controls \(u\): for example, \(u\) being the local equilibrium velocity \(\overline{v}(x)\) of the velocity selection mechanism, or \(u\) being the factor \(A\) scaling the interaction force between the particles. The objective function under consideration in each of the scenarios is the squared deviation of the performance evaluation \(j(u,y(u))\) from the target value \(\omega _*\in \mathbb {R},\) that is

$$\begin{aligned} J(u,y(u)) = \frac{1}{2}\left( j(u,y(u)) - \omega _*\right) ^2. \end{aligned}$$
(5)

In the following, we discuss the macroscopic approximation which is used as a coarse model for the space mapping.

2.2 Coarse model

Reference Weissen et al. 2021 shows that in the many particle limit, \(N\rightarrow \infty\), the microscopic system (1) can be approximated by the advection-diffusion Eq. (2) with diffusion coefficient

$$\begin{aligned} {k(\rho )} = C\rho H(\rho -\rho _{crit}), \qquad C> 0. \end{aligned}$$

Here, \(\rho _{crit}= 1\) is a density threshold above which diffusion in the macroscopic model is activated. \(H\) denotes the Heaviside function

$$\begin{aligned} H(x) = {\left\{ \begin{array}{ll} 0 &{}\text { if } x <0, \\ 1 &{}\text { otherwise }. \end{array}\right. } \end{aligned}$$
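In numerical practice, \(H\) is smoothly approximated (cf. Sect. 2.2.1). A minimal sketch of the resulting diffusion coefficient, using a logistic smoothing whose width `eps` is our choice and not specified in the text, reads:

```python
import numpy as np

def smoothed_heaviside(x, eps=0.01):
    """Smooth approximation of the Heaviside function H; the text only
    states that H is smoothed, the tanh form and eps are our choice."""
    return 0.5 * (1.0 + np.tanh(x / eps))

def k(rho, C=1.0, rho_crit=1.0):
    """Diffusion coefficient k(rho) = C * rho * H(rho - rho_crit):
    diffusion switches on only above the critical density."""
    return C * rho * smoothed_heaviside(rho - rho_crit)
```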

At the boundary, we apply zero-flux boundary conditions for the advective and the diffusive flux

$$\begin{aligned} \begin{aligned} (\overline{v}\rho ) \cdot \vec {n}&= 0, \quad x \in \partial \Omega , \\ ( {k(\rho ) \nabla \rho }) \cdot \vec {n}&= 0, \quad x \in \partial \Omega , \end{aligned} \end{aligned}$$
(6)

where \(\vec {n}=(n^{(1)},n^{(2)})^T\) is the outer normal vector at the boundary \(\partial \Omega\).

The advection-diffusion Eq. (2) serves as the coarse model in the space mapping technique. To solve optimization problems in the coarse model, we pursue a first-discretize-then-optimize approach. In the following, we discretize the macroscopic model and derive the first order optimality system for the discretized macroscopic system.

Remark 2

We recommend choosing the optimization approach depending on the structure of the macroscopic equation. Here, the PDE is hyperbolic whenever no particles overlap; we therefore choose first-discretize-then-optimize. If the macroscopic equation were purely diffusive, one might employ a first-optimize-then-discretize approach instead.

2.2.1 Discretization of the macroscopic model

We discretize a rectangular spatial domain \((\Omega \cup \partial \Omega ) \subset \mathbb {R}^2\) with grid points \(x_{{ij}} = (x_{ij}^{(1)}, x_{ij}^{(2)})^T\), \((i,j) \in \mathcal {I}_\Omega = \lbrace 1, \dots , N_x\rbrace \times \lbrace 1, \dots , N_x\rbrace\) on a grid with step size \(\Delta x\) in both coordinate directions. The boundary \(\partial \Omega\) is described by the set of indices \(\mathcal {I}_{\partial \Omega }\subset \mathcal {I}_\Omega\). The time step of the coarse model is \(\Delta t^c\) and the grid constant is \(\lambda = \Delta t^c/ \Delta x\). We compute the approximate solution to the advection-diffusion Eq. (2) as follows

$$\begin{aligned} \rho (x,t) = \rho _{{ij}}^s ~ \text {for } {\left\{ \begin{array}{ll} x \in \mathcal {C}_{{ij}}, \\ t \in [t^{s}, t^{s+1}), \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} \mathcal {C}_{{ij}}&= \left[ x_{i-\frac{1}{2},j}^{(1)}, x_{i+\frac{1}{2},j}^{(1)} \right) \times \left[ x_{i, j-\frac{1}{2}}^{(2)}, x_{i, j+\frac{1}{2}}^{(2)} \right) , \\ t^s&= s \Delta t^c\quad \text {for }s=1, \dots , N_t^c. \end{aligned}$$

The discretization of the initial density in (2) is obtained from the microscopic initial positions using a kernel density approach, see e.g. Fan and Seibold (2013); Parzen (1962); Rosenblatt (1956). This means that the initial density is constructed from the microscopic positions which are smoothed with a Gaussian filter such that the initial density reads

$$\begin{aligned} \rho ^0(x) = \frac{\pi R^2}{2 \pi } \sum _{i=1}^N\exp \left( -\frac{1}{2} \Vert x- x_i^0\Vert _2^2 \right) . \end{aligned}$$
(7)
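Evaluated pointwise, formula (7) can be sketched as follows; the particle radius `R` is a parameter and the function name is ours:

```python
import numpy as np

def initial_density(x, particle_positions, R=0.5):
    """Evaluate the smoothed initial density (7) at a point x: a Gaussian
    kernel of unit bandwidth centered at each microscopic initial position
    x_i^0, scaled by pi*R^2 / (2*pi)."""
    x = np.asarray(x, dtype=float)
    sq_dists = np.sum((particle_positions - x) ** 2, axis=1)
    return (np.pi * R ** 2) / (2 * np.pi) * np.sum(np.exp(-0.5 * sq_dists))
```

In practice this is evaluated once per grid cell \(\mathcal {C}_{ij}\) to obtain the discrete initial data \(\rho ^0_{ij}\).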

To compute \(\rho _{ij}^s\) for \(s>0\), we solve the advection part with the upwind scheme and apply dimensional splitting. The diffusion part is solved implicitly

$$\begin{aligned} \begin{aligned} \tilde{\rho }_{ij}^{s}&= \rho _{ij}^s - \frac{\Delta t^c}{\Delta x} \left( \mathcal {F}_{ij}^{(1),s,+} - \mathcal {F}_{ij}^{(1),s,-} \right) , \\ \overline{\rho }_{ij}^{s}&= \tilde{\rho }_{ij}^s - \frac{\Delta t^c}{\Delta x} \left( \mathcal {F}_{ij}^{(2),s,+} - \mathcal {F}_{ij}^{(2),s,-}\right) , \\ \rho _{ij}^{s+1}&= \overline{\rho }_{ij}^s + \frac{\Delta t^c}{(\Delta x)^2 } B_{ij}^{s+1}, \end{aligned} \end{aligned}$$
(8)

where the following short notation is used

$$\begin{aligned} \mathcal {F}_{ij}^{(1),s,+}&= \mathcal {F}^{(1)}(\rho _{ij}^s,\rho _{i+1j}^s), \qquad \mathcal {F}_{ij}^{(1),s,-} = \mathcal {F}^{(1)}(\rho _{i-1j}^s, \rho _{ij}^s), \\ \mathcal {F}_{ij}^{(2),s,+}&= \mathcal {F}^{(2)}(\tilde{\rho }_{ij}^s,\tilde{\rho }_{ij+1}^s), \qquad \mathcal {F}_{ij}^{(2),s,-} = \mathcal {F}^{(2)}(\tilde{\rho }_{ij-1}^s,\tilde{\rho }_{ij}^s), \\ B_{ij}^{s+1}&= B\left( \rho _{i-1j}^{s+1}, \rho _{i+1j}^{s+1}, \rho _{ij}^{s+1}, \rho _{ij-1}^{s+1}, \rho _{ij+1}^{s+1} \right) . \end{aligned}$$

Moreover, the fluxes \(\mathcal {F}^{(1)},\mathcal {F}^{(2)}\) and B are given by

$$\begin{aligned} \mathcal {F}^{(1)}(\rho _{ij}^s, \rho _{i+1j}^s)&= {\left\{ \begin{array}{ll} \rho _{ij}^s \overline{v}^{(1)}_{ij}&{}\text { if } \overline{v}^{(1)}_{ij}\ge 0, (i+1,j) \in \mathcal {I}_\Omega \setminus \mathcal {I}_{\partial \Omega }, \\ \rho _{i+1j}^s \overline{v}^{(1)}_{ij}&{}\text { if } \overline{v}^{(1)}_{ij}< 0, (i,j) \in \mathcal {I}_\Omega \setminus \mathcal {I}_{\partial \Omega }, \\ 0 &{}\text { otherwise, } \end{array}\right. } \\ \mathcal {F}^{(2)}(\tilde{\rho }_{ij}^s, \tilde{\rho }_{ij+1}^s)&= {\left\{ \begin{array}{ll} \tilde{\rho }_{ij}^s \overline{v}^{(2)}_{ij} &{}\text { if } \overline{v}^{(2)}_{ij} \ge 0, (i,j+1) \in \mathcal {I}_\Omega \setminus \mathcal {I}_{\partial \Omega }, \\ \tilde{\rho }_{ij+1}^s\overline{v}^{(2)}_{ij} &{}\text { if } \overline{v}^{(2)}_{ij} < 0, (i,j) \in \mathcal {I}_\Omega \setminus \mathcal {I}_{\partial \Omega }, \\ 0 &{}\text { otherwise, } \end{array}\right. } \\ B(\rho _{i-1j}^{s+1}, \rho _{i+1j}^{s+1},&\rho _{ij}^{s+1}, \rho _{ij-1}^{s+1}, \rho _{ij+1}^{s+1}) = b_{i-1j}^{s+1} + b_{i+1j}^{s+1} - 4b_{ij}^{s+1} + b_{ij-1}^{s+1} + b_{ij+1}^{s+1}, \end{aligned}$$

where \(\overline{v}(x_{ij}) =\overline{v}_{ij}\), \(\overline{v}_{ij}= 0\) for all \((i,j) \in \mathcal {I}_{\partial \Omega }\), and \(b_{ij}^{s+1} = b(\rho _{ij}^{s+1})\) with \(b(\rho ) = \int _0^{\rho } Cz H(z-\rho _{crit}) \, dz\). The Heaviside function \(H\) is smoothly approximated, and the time step restriction for the numerical simulations is given by the CFL condition of the hyperbolic part

$$\begin{aligned} \Delta t^c\le \min _{(i,j)} \frac{1}{\frac{|\overline{v}_{ij}^{(1)}|}{\Delta x} + \frac{|\overline{v}_{ij}^{(2)}|}{\Delta x}}, \end{aligned}$$
(9)

compare Holden et al. 2000; Weissen et al. 2021. We denote by \(\varvec{\rho } = (\rho _{{ij}}^s)_{(i,j,s) \in \mathcal {I}_\Omega \times \lbrace 0, \dots , N_t^c\rbrace }\) the vector of density values. It collects the discretized solution (8) of the macroscopic Eq. (2), which depends on a given control \(u\). The vectors containing the intermediate density values \(\varvec{\tilde{\rho }}, \varvec{\overline{\rho }}\) and the Lagrange parameters \(\varvec{\mu }, \varvec{\tilde{\mu }}, \varvec{\overline{\mu }}\) used below are defined analogously.
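A minimal sketch of the first splitting sweep in (8), the upwind step in the x-direction, may look as follows; the y-sweep is analogous, and the implicit diffusion correction (a linear solve per time step) is omitted. Array names are ours; leaving the flux through the outermost interfaces at zero mimics the zero-flux condition (6) together with \(\overline{v}_{ij} = 0\) on boundary cells:

```python
import numpy as np

def upwind_sweep_x(rho, v1, lam):
    """First line of (8): one dimensional-splitting upwind sweep in the
    x-direction with lam = dt^c/dx. rho and v1 are 2-d arrays of cell
    densities and x-velocities."""
    Fp = np.zeros_like(rho)   # F_{ij}^{(1),s,+}: flux between cells (i,j) and (i+1,j)
    Fp[:-1, :] = np.where(v1[:-1, :] >= 0,
                          rho[:-1, :] * v1[:-1, :],   # upwind value from the left cell
                          rho[1:, :] * v1[:-1, :])    # upwind value from the right cell
    Fm = np.zeros_like(rho)   # F_{ij}^{(1),s,-} = F_{i-1,j}^{(1),s,+}
    Fm[1:, :] = Fp[:-1, :]
    return rho - lam * (Fp - Fm)
```

Because every interior interface flux appears once with a plus and once with a minus sign, the sweep is conservative: the total mass is unchanged up to the (here vanishing) boundary fluxes.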

2.2.2 Solving the coarse model optimization problem

Next, we turn to the solution of the coarse model optimization problem. The construction of a solution to this problem is central to the space mapping algorithm. We provide a short discussion of the adjoint method for the optimization problem (3) before we specify the macroscopic adjoints.

First order optimality system Let \(J(u, y(u))\) be an objective function which depends on the given control \(u\). We wish to solve the optimization problem (3) and apply a descent algorithm. In a descent algorithm, a current iterate \(u_k\) is updated in the direction of descent of the objective function J until the first order optimality condition is satisfied. An efficient way to compute the first order optimality conditions is based on adjoints, which we recall in the following. Let the Lagrangian function be defined as

$$\begin{aligned} L(u,y(u))&= J(u,y(u)) + \mu ^T E(u,y(u)), \end{aligned}$$

where \(\mu\) is called the Lagrange multiplier.

Solving \(\mathrm {d} L= 0\) yields the first order optimality system

  1. (i)

    \(E(u,y(u)) = 0\),

  2. (ii)

    \((\partial _y E(u,y(u)))^T \mu = - (\partial _y J(u,y(u)))^T\),

  3. (iii)

    \(\frac{\mathrm {d}}{\mathrm {d}u} J(u,y(u)) = \partial _uJ(u,y(u)) + \mu ^T \partial _uE(u,y(u)) = 0.\)

For nonlinear systems it is difficult to solve the coupled optimality system (i)–(iii) all at once. We therefore proceed iteratively: for the computation of the total derivative \(\frac{\mathrm {d}}{\mathrm {d}u} J(u,y(u))\), the system \(E(u,y(u)) = 0\) is solved forward in time. Then, the information of the forward solve is used to solve the adjoint system (ii) backwards in time. Lastly, the gradient is obtained from the adjoint state and the objective function derivative.
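The three steps above can be summarized in a generic routine. This is a sketch only; the callables (hypothetical names) stand in for the model-specific forward solver and the partial derivatives of \(J\) and \(E\):

```python
import numpy as np

def reduced_gradient(u, solve_forward, dJ_dy, dJ_du, dE_dy, dE_du):
    """Sketch of steps (i)-(iii): forward solve for the state y, adjoint
    solve for mu, then assembly of the reduced gradient dJ/du."""
    y = solve_forward(u)                                  # (i)   E(u, y) = 0
    mu = np.linalg.solve(dE_dy(u, y).T, -dJ_dy(u, y))     # (ii)  adjoint system
    return dJ_du(u, y) + dE_du(u, y).T @ mu               # (iii) reduced gradient
```

For time-dependent constraints, step (ii) corresponds to the backward-in-time solve described above; the dense linear solve is then replaced by a time-stepping loop.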

Nonlinear conjugate gradient method We use a nonlinear conjugate gradient method Dai and Yuan 1999; Fletcher and Reeves 1964 within our descent algorithm to update the iterate as follows

$$\begin{aligned} d_{k} = - \nabla J\left( u_{k},y(u_{k})\right) + \hat{\beta }_{k-1} \hat{d}_{k-1}, \qquad u_{k+1} = u_k + \sigma _k d_k. \end{aligned}$$
(10)

The step size \(\sigma _{k}\) is chosen such that it satisfies the Armijo rule Hinze et al. (2009); Nocedal and Wright (2006)

$$\begin{aligned} J\left( u_{k}+ \sigma _{k} d_{k},y(u_{k}+ \sigma _{k} d_{k}) \right) - J(u_{k},y(u_{k})) \le \sigma _{k} c_1\nabla J(u_{k},y(u_{k}))^T d_{k}, \end{aligned}$$
(11)

and the standard Wolfe condition Nocedal and Wright 2006

$$\begin{aligned} \nabla J(u_{k} + \sigma _{k} d_{k}, y(u_{k} +\sigma _{k} d_{k}))^T d_{k} \ge c_2\nabla J(u_{k},y(u_{k}))^T d_{k}, \end{aligned}$$
(12)

with \(0< c_1< c_2< 1\). We start from \(\sigma _{k} = 1\) and cut the step size in half until (11)–(12) are satisfied. The parameter \(\hat{\beta }_{k}\) is given by

$$\begin{aligned} \hat{\beta }_{k} = \frac{\Vert \nabla J(u_{k+1}, y(u_{k+1}))\Vert ^2}{d_{k}^T \hat{d}_{k}} \text { with } \hat{d}_{k} = \nabla J(u_{k+1}, y(u_{k+1})) - \nabla J(u_{k}, y(u_{k})), \end{aligned}$$

which together with conditions (11)–(12) ensures convergence to a minimizer Dai and Yuan 1999. We refer to this method as the adjoint method (AC). In the following, we apply this general strategy to our macroscopic equation.
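One iteration of this method, combining the update (10) with the step-size halving for (11)–(12), might be sketched as follows; the tolerances `c1`, `c2` are not specified in the text and are set to common defaults here:

```python
import numpy as np

def ncg_step(u, d_prev, g_prev, J, gradJ, c1=1e-4, c2=0.9):
    """One nonlinear CG update (10) with Dai-Yuan beta. J and gradJ
    evaluate the reduced objective and its gradient; d_prev, g_prev are
    the previous direction and gradient (None on the first iteration)."""
    g = gradJ(u)
    if d_prev is None:
        d = -g                               # first step: steepest descent
    else:
        d_hat = g - g_prev                   # hat{d}_{k-1} in the text
        beta = (g @ g) / (d_prev @ d_hat)    # Dai-Yuan beta_{k-1}
        d = -g + beta * d_prev
    sigma = 1.0                              # start from sigma = 1 and halve
    while (J(u + sigma * d) - J(u) > sigma * c1 * (g @ d)        # Armijo (11)
           or gradJ(u + sigma * d) @ d < c2 * (g @ d)):          # Wolfe (12)
        sigma *= 0.5
        if sigma < 1e-12:                    # safeguard against stagnation
            break
    return u + sigma * d, d, g
```

The returned direction and gradient are fed back into the next call, so that the Dai-Yuan coefficient can be formed without re-evaluating past gradients.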

Macroscopic Lagrangian We consider objective functions depending on the density, i.e., \(J^c(u,\varvec{\rho })\). The discrete Lagrangian \(L= L(u, \varvec{\rho }, \varvec{\tilde{\rho }}, \varvec{\overline{\rho }}, \varvec{\mu }, \varvec{\tilde{\mu }}, \varvec{\overline{\mu }})\) is given by

$$\begin{aligned} \begin{aligned} L= J^c(u,\varvec{\rho }) +&\sum _{s=0}^{N_t^c} \sum _{i=1}^{N_x} \sum _{j=1}^{N_x} \mu _{ij}^s \left( \frac{\tilde{\rho }_{ij}^s - \rho _{ij}^s}{\Delta t^c} + \frac{\mathcal {F}_{ij}^{(1),s,+} - \mathcal {F}_{ij}^{(1),s,-}}{\Delta x}\right) \\ +&\sum _{s=0}^{N_t^c} \sum _{i=1}^{N_x} \sum _{j=1}^{N_x} \tilde{\mu }_{ij}^s \left( \frac{\overline{\rho }_{ij}^s - \tilde{\rho }_{ij}^s}{\Delta t^c} + \frac{\mathcal {F}_{ij}^{(2),s,+} - \mathcal {F}_{ij}^{(2),s,-}}{\Delta x}\right) \\ +&\sum _{s=0}^{N_t^c} \sum _{i=1}^{N_x} \sum _{j=1}^{N_x} \bar{\mu }_{ij}^s \left( \frac{\rho _{ij}^{s+1} - \overline{\rho }_{ij}^s}{\Delta t^c} - \frac{ B_{ij}^{s+1}}{(\Delta x)^2}\right) . \end{aligned} \end{aligned}$$
(13)

We differentiate the Lagrangian with respect to \(\rho _{ij}^s\)

$$\begin{aligned} \partial _{\rho _{ij}^s} L&= \partial _{\rho _{ij}^s} J^c(u,\varvec{\rho }) \\&\quad - \mu _{ij}^s \left( \frac{1}{\Delta t^c} - \frac{\partial _{\rho _{ij}^s} \mathcal {F}_{ij}^{(1),s,+}}{\Delta x} + \frac{\partial _{\rho _{ij}^s} \mathcal {F}_{ij}^{(1),s,-}}{\Delta x}\right) \\&\quad + \mu _{i-1j}^s \frac{\partial _{\rho _{ij}^s} \mathcal {F}_{i-1j}^{(1),s,+}}{\Delta x} - \mu _{i+1j}^s \frac{\partial _{ \rho _{ij}^s} \mathcal {F}_{i+1j}^{(1),s,-}}{\Delta x} \\&\quad + \bar{\mu }_{ij}^{s-1} \left( \frac{1}{\Delta t^c} - \frac{\partial _{\rho _{ij}^s} B_{ij}^{s}}{(\Delta x)^2}\right) - \bar{\mu }_{i-1j}^{s-1} \frac{\partial _{\rho _{ij}^s} B_{i-1j}^{s}}{(\Delta x)^2 } \\&\quad - \bar{\mu }_{i+1j}^{s-1} \frac{\partial _{\rho _{ij}^s} B_{i+1j}^{s}}{(\Delta x)^2} - \bar{\mu }_{ij-1}^{s-1} \frac{\partial _{\rho _{ij}^s} B_{ij-1}^{s}}{(\Delta x)^2} - \bar{\mu }_{ij+1}^{s-1} \frac{\partial _{\rho _{ij}^s} B_{ij+1}^{s}}{(\Delta x)^2} \\&= 0. \end{aligned}$$

Rearranging terms yields

$$\begin{aligned} T^{i,j}(\overline{\mu }^{s-1})&= \overline{\mu }_{ij}^{s-1} - \frac{\Delta t^c}{(\Delta x)^2} \bigg ( \overline{\mu }_{i-1j}^{s-1} \partial _{\rho _{ij}^s} B_{i-1j}^{s} + \overline{\mu }_{i+1j}^{s-1} \partial _{ \rho _{ij}^s} B_{i+1j}^{s} \\&\qquad \qquad + \overline{\mu }_{ij}^{s-1} \partial _{\rho _{ij}^s} B_{ij}^{s} + \overline{\mu }_{ij-1}^{s-1} \partial _{\rho _{ij}^s} B_{ij-1}^{s} + \overline{\mu }_{ij+1}^{s-1} \partial _{ \rho _{ij}^s} B_{ij+1}^{s} \bigg ) \\&= - \Delta t^c\partial _{\rho _{ij}^s} J^c(u,\varvec{\rho }) + \mu _{ij}^s \left( 1 - \lambda \partial _{\rho _{ij}^s} \mathcal {F}_{ij}^{(1),s,+} + \lambda \partial _{\rho _{ij}^s} \mathcal {F}_{ij}^{(1),s,-} \right) \\&\qquad ~ - \mu _{i-1j}^s \lambda \partial _{\rho _{ij}^s} \mathcal {F}_{i-1j}^{(1),s,+} + \mu _{i+1j}^s \lambda \partial _{\rho _{ij}^s} \mathcal {F}_{i+1j}^{(1),s,-}. \end{aligned}$$

Using \(\partial _{\rho _{ij}^s} B_{i-1j}^{s} = \partial _{\rho _{ij}^s} B_{i+1j}^{s} = \partial _{\rho _{ij}^s} B_{ij-1}^{s} = \partial _{\rho _{ij}^s} B_{ij+1}^{s} = {k(\rho _{ij}^s)}\) and \(\partial _{ \rho _{ij}^s} B_{ij}^{s} = - 4 {k(\rho _{ij}^s)}\) on the left-hand side and (17)–(18), see "Appendix B", on the right-hand side leads to

$$\begin{aligned} T^{i,j}(\overline{\mu }^{s-1})&= \overline{\mu }_{ij}^{s-1} - \frac{\Delta t^c}{(\Delta x)^2} k (\rho _{ij}^s) \bigg ( \overline{\mu }_{i-1j}^{s-1} + \overline{\mu }_{i+1j}^{s-1} - 4\overline{\mu }_{ij}^{s-1} + \overline{\mu }_{ij-1}^{s-1} + \overline{\mu }_{ij+1}^{s-1} \bigg ) \nonumber \\&\overset{\tiny (17),(18)}{=} - \Delta t^c\partial _{\rho _{ij}^s} J^c(u,\varvec{\rho }) + \mu _{ij}^s \\&\qquad - \lambda \left( \left( \mu _{ij}^s - \mu _{i+1j}^s\right) \partial _{\rho _{ij}^s} \mathcal {F}_{ij}^{(1),s,+} - \left( \mu _{ij}^s - \mu _{i-1j}^s\right) \partial _{\rho _{ij}^s} \mathcal {F}_{ij}^{(1),s,-} \right) . \end{aligned}$$
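For illustration, one backward step of this coupled system can be assembled and solved as a sparse linear system, since the left-hand side is the identity minus a scaled 5-point stencil. The following Python sketch is our own illustration (function and variable names are not from the paper's implementation) and assumes zero values outside the grid in place of the boundary modifications:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def backward_diffusion_step(rhs, k_rho, dt, dx):
    """Solve T(mu) = rhs for the adjoint grid mu at time step s-1.

    T(mu)_ij = mu_ij - dt/dx^2 * k_rho_ij *
               (mu_{i-1,j} + mu_{i+1,j} - 4 mu_ij + mu_{i,j-1} + mu_{i,j+1}),
    where k_rho_ij plays the role of k(rho_ij^s).  Values outside the
    grid are taken as zero (a simplification of the boundary treatment).
    """
    nx, ny = rhs.shape
    n = nx * ny
    lap1d = lambda m: sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(m, m))
    lap2d = sp.kronsum(lap1d(ny), lap1d(nx), format="csr")   # 5-point stencil
    T = sp.eye(n, format="csr") - (dt / dx**2) * sp.diags(k_rho.ravel()) @ lap2d
    return spla.spsolve(T, rhs.ravel()).reshape(nx, ny)
```

For vanishing diffusion derivative, \(k \equiv 0\), the system reduces to the identity, matching the purely hyperbolic case discussed below.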

This system is solved backward in time to obtain the Lagrange parameters \((\overline{\mu }_{ij}^{s-1})_{(i,j) \in \mathcal {I}_\Omega }\). Note that the above expression \(T(\overline{\mu }^{s-1}) = \big (T^{i,j}(\overline{\mu }^{s-1})\big )_{(i,j) \in \mathcal {I}_\Omega }\) defines a system for the Lagrange parameters of time step \(s-1\) that is coupled in space and has to be solved in each time step. This system arises from the implicit treatment of the diffusion term in the forward system (8). It is the main difference from adjoints for purely hyperbolic equations, where the Lagrange parameters of step \(s-1\) in the backward system are simply obtained as a convex combination of those from step \(s\), see Erbrich et al. 2018. Proceeding further, we differentiate the Lagrangian with respect to \(\tilde{\rho }_{ij}^s\) to get

$$\begin{aligned} \partial _{\tilde{\rho }_{ij}^s} L&= \frac{\mu _{ij}^s}{\Delta t^c} - \tilde{\mu }_{ij}^s \left( \frac{1}{\Delta t^c} - \frac{\partial _{\tilde{\rho }_{ij}^s} \mathcal {F}_{ij}^{(2),s,+}}{\Delta x} + \frac{\partial _{ \tilde{\rho }_{ij}^s} \mathcal {F}_{ij}^{(2),s,-}}{\Delta x}\right) \\&\qquad + \tilde{\mu }_{ij-1}^s \frac{\partial _{\tilde{\rho }_{ij}^s} \mathcal {F}_{ij-1}^{(2),s,+}}{\Delta x} - \tilde{\mu }_{ij+1}^s \frac{\partial _{\tilde{\rho }_{ij}^s} \mathcal {F}_{ij+1}^{(2),s,-}}{\Delta x} \\&= 0. \end{aligned}$$

Again, rearranging terms yields

$$\begin{aligned} \mu _{ij}^s&= \tilde{\mu }_{ij}^s \left( 1 - \lambda \partial _{\tilde{\rho }_{ij}^s} \mathcal {F}_{ij}^{(2),s,+} + \lambda \partial _{\tilde{\rho }_{ij}^s} \mathcal {F}_{ij}^{(2),s,-} \right) \nonumber \\&\qquad - \tilde{\mu }_{ij-1}^s \lambda \partial _{\tilde{\rho }_{ij}^s} \mathcal {F}_{ij-1}^{(2),s,+} + \tilde{\mu }_{ij+1}^s \lambda \partial _{\tilde{\rho }_{ij}^s} \mathcal {F}_{ij+1}^{(2),s,-} \nonumber \\&\overset{\tiny (19),(20)}{=} \tilde{\mu }_{ij}^s - \lambda \left( \left( \tilde{\mu }_{ij}^s - \tilde{\mu }_{ij+1}^s \right) \partial _{\tilde{\rho }_{ij}^s} \mathcal {F}_{ij}^{(2),s,+} - \left( \tilde{\mu }_{ij}^s - \tilde{\mu }_{ij-1}^s\right) \partial _{\tilde{\rho }_{ij}^s} \mathcal {F}_{ij}^{(2),s,-}\right) . \end{aligned}$$

Finally, we differentiate the Lagrangian with respect to \(\overline{\rho }_{ij}^s\) to obtain

$$\begin{aligned} \partial _{\overline{\rho }_{ij}^s} L = \frac{\tilde{\mu }_{ij}^s}{\Delta t^c} - \frac{\overline{\mu }_{ij}^s}{\Delta t^c} = 0 \qquad \Rightarrow \qquad \tilde{\mu }_{ij}^s = \overline{\mu }_{ij}^s. \end{aligned}$$

The equality of the Lagrange parameters \(\tilde{\mu },\overline{\mu }\) stems from the fact that the diffusion is solved implicitly in the forward system (8). In the next section, we consider the diffusion coefficient \(C\) as control for the macroscopic system, \(u= C\). In this case, the derivative of the Lagrangian with respect to the control reads

$$\begin{aligned} \partial _{C} L= \sum _{s=0}^{N_t^c} \sum _{i=1}^{N_x} \sum _{j=1}^{N_x} - \frac{1}{C} \frac{\bar{\mu }_{ij}^s }{(\Delta x)^2} \left( b_{i-1j}^{s+1} + b_{i+1j}^{s+1} - 4b_{ij}^{s+1} + b_{ij-1}^{s+1} + b_{ij+1}^{s+1}\right) . \end{aligned}$$
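The sum above can be evaluated by accumulating the 5-point stencil of \(b^{s+1}\) against \(\bar{\mu }^s\). A minimal sketch, assuming zero ghost values outside the grid and illustrative names:

```python
import numpy as np

def grad_C(mu_bar, b, C, dx):
    """Evaluate the gradient formula above.

    `mu_bar` and `b` are lists of 2-D arrays; mu_bar[s] holds the adjoint
    grid at step s and the stencil is applied to b[s+1], so `b` must hold
    one more step than `mu_bar`.  Ghost values outside the grid are zero
    for this illustration.
    """
    total = 0.0
    for mb, bs in zip(mu_bar, b[1:]):
        p = np.pad(bs, 1)                        # zero ghost cells
        lap = (p[:-2, 1:-1] + p[2:, 1:-1] - 4.0 * bs
               + p[1:-1, :-2] + p[1:-1, 2:])     # 5-point stencil of b^{s+1}
        total += (mb * lap).sum()
    return -total / (C * dx**2)
```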

3 Validation of the approach

To validate the proposed approach, we consider a toy problem and compare the results of the space mapping method to optimal solutions computed directly on the microscopic level. For the toy problem, we control the potential strength \(A\) of the microscopic model. The macroscopic analogue is the diffusion coefficient \(C\).

3.1 Discrete microscopic adjoint

Let \(N_t^f\in \mathbb N\) and \(\Delta t^f\in \mathbb R\) be the number of time steps and the time step size, respectively. For simplicity, we normalize the particle mass in our example, i.e. \(m= 1\). We discretize the fine, microscopic model (1) in time to obtain

$$\begin{aligned} x_i^{s+1}&= x_i^{s} + \Delta t^fv_i^{s}, \qquad v_i^{s+1} = v_i^{s} + \Delta t^f\left( G(x_i^{{s+1}},v_i^{{s+1}}) + A\sum _{j \ne i} F{(x_i^{s+1}, x_j^{s+1})} \right) \end{aligned}$$

for \(s = 1, \dots , N_t^f\). We denote

$$\varvec{x} = (x_i^s)_{(i,s) \in \lbrace 1, \dots , N\rbrace \times \lbrace 0, \dots , N_t^f\rbrace } \quad \text { and }\quad \varvec{v} = (v_i^s)_{(i,s) \in \lbrace 1, \dots , N\rbrace \times \lbrace 0, \dots , N_t^f\rbrace }.$$
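The time discretization above can be advanced in closed form: since \(G(x,v) = -(v - \overline{v}(x))/\tau\) is linear in \(v\), the implicit velocity update reduces to a division. A minimal sketch with illustrative names (`vbar` and `force` are supplied by the caller):

```python
import numpy as np

def micro_step(x, v, dt, A, tau, vbar, force):
    """One step of the time discretization above.

    The position update is explicit; the velocity update is implicit in
    G(x, v) = -(v - vbar(x))/tau, which is linear in v and therefore
    solvable in closed form.  `vbar(x)` and `force(x)` return (N, 2)
    arrays; the interaction sum is evaluated at the new positions.
    """
    x_new = x + dt * v
    v_new = (v + dt * (vbar(x_new) / tau + A * force(x_new))) / (1.0 + dt / tau)
    return x_new, v_new
```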

Furthermore, let \(J^f(u, \varvec{x})\) be the microscopic objective function. The microscopic Lagrange function \(L(u, \varvec{x},\varvec{v}, \varvec{\mu }, \varvec{\tilde{\mu }}, \varvec{\overline{\mu }}, \varvec{\hat{\mu }})\) is then given by

$$\begin{aligned} \begin{aligned} L&= J^f(u,\varvec{x}) + \sum _{s=0}^{N_t^f} \sum _{i=1}^N\mu _i^s \left( \frac{x_i^{(1),s+1} - x_i^{(1),s}}{\Delta t^f} - v_i^{(1),s}\right) \end{aligned} \end{aligned}$$
$$\begin{aligned} \begin{aligned}&+ \sum _{s=0}^{N_t^f} \sum _{i=1}^N\tilde{\mu }_i^s \left( \frac{x_i^{(2),s+1} - x_i^{(2),s}}{\Delta t^f} - v_i^{(2),s} \right) \\&+ \sum _{s=0}^{N_t^f} \sum _{i=1}^N\overline{\mu }_i^s \left( \frac{v_i^{(1),s+1} - v_i^{(1),s}}{\Delta t^f} - G_i^{(1), {s+1}} - A\sum _{j \ne i} F_{ij}^{(1),{s+1}} \right) \\&+ \sum _{s=0}^{N_t^f} \sum _{i=1}^N\hat{\mu }_i^s \left( \frac{v_i^{(2),s+1} - v_i^{(2),s}}{\Delta t^f} - G_i^{(2),{s+1}} - A\sum _{j \ne i} F_{ij}^{(2),{s+1}} \right) , \end{aligned} \end{aligned}$$
(14)

where

$$\begin{aligned} G_i^{(l),{s}}(x_i^s,v_i^s)&= - \frac{v_i^{(l),s} -\overline{v}^{(l)}(x_i^s) }{\tau }, \\ F_{ij}^{(l),{s}}(x_i^s,x_j^s)&= {\left\{ \begin{array}{ll} \frac{b_F\left( \Vert x_i^s - x_j^s \Vert _2 - 2R\right) ^2 }{\Vert x_i^s - x_j^s \Vert _2} \left( x_i^{(l),s} - x_j^{(l),s} \right) &{}\text { if } \Vert x_i^s - x_j^s \Vert _2 < 2R, \\ 0 &{}\text {otherwise,} \end{array}\right. } \end{aligned}$$

for \(l=1,2\). The details of the derivatives of the force terms and the computation of the adjoint state can be found in "Appendix C". Moreover, the derivative of the Lagrangian with respect to the control \(u= A\) reads

$$\begin{aligned} \partial _AL&= - \sum _{s=0}^{N_t^f} \sum _{i=1}^N\sum _{j \ne i} \left( \overline{\mu }_i^s F_{ij}^{(1), {s+1}} + \hat{\mu }_i^s F_{ij}^{(2),{s+1}} \right) . \end{aligned}$$
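The interaction force \(F_{ij}\) defined above can be evaluated as follows; a NumPy sketch of our own (an \(O(N^2)\) loop, kept simple for clarity):

```python
import numpy as np

def interaction_force(x, R, b_F):
    """Summed pairwise repulsion, cf. the definition of F_ij above.

    `x` has shape (N, 2); for every particle i the contributions of all
    j != i with ||x_i - x_j|| < 2R are accumulated.
    """
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        d = x[i] - x                              # vectors x_i - x_j
        dist = np.linalg.norm(d, axis=1)
        mask = (dist > 0.0) & (dist < 2.0 * R)    # interacting j != i
        w = b_F * (dist[mask] - 2.0 * R) ** 2 / dist[mask]
        out[i] = (w[:, None] * d[mask]).sum(axis=0)
    return out
```

Note that the force on a particle points away from a close neighbour, i.e., the interaction is repulsive, and the total momentum is conserved since the pairwise contributions are antisymmetric.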

3.2 Comparison of space mapping to direct optimization

We apply ASM and the direct optimization approach AC to the optimization problem (3). In each iteration \(k\) of the adjoint method for the fine model, a computation of the gradient \(\nabla J^f\) for the stopping criterion as well as several objective function and gradient evaluations for the computation of the step size \(\sigma _{k}\) are required. These evaluations are (mostly) shifted to the coarse model in ASM. Let \(\Omega = [-5,5]^2\) be the domain and \(\overline{v}(x) = - x\) the velocity field of our toy example. We investigate whether the macroscopic model is an appropriate coarse model for the space mapping technique. For the microscopic interactions, we use the force term (4) with \(b_F= 1/ R^5\). Without interaction forces, \(A= 0\), all particles are transported to the center of the domain \(\left( x^{(1)}, x^{(2)}\right) = (0,0)\) within finite time and eventually overlap. With increasing interaction parameter \(A\), particles encounter stronger forces as they collide. Therefore, scattering occurs and the spatial spread increases. We penalize the spatial spread of the particle ensemble at \(t=T\) in the microscopic model, leading to a cost

$$\begin{aligned} j^f(A,\varvec{x})&= \frac{1}{N} \sum _i^{N} \langle x_i^{N_t^f}, x_i^{N_t^f} \rangle , \end{aligned}$$

and the objective function derivative with respect to the state variables \(x_i\) is given by

$$\begin{aligned} \partial _{x_i^{(l),s}} J^f(A,\varvec{x})&= {\left\{ \begin{array}{ll} \left( \frac{1}{N} \sum _i \langle x_i^{N_t^f}, x_i^{N_t^f} \rangle -\omega _*\right) \frac{2 x_i^{(l),s}}{N} &{}\text { if } s = N_t^f, \\ 0 &{}\text { otherwise. } \end{array}\right. } \end{aligned}$$

We choose \(A\), the scaling parameter of the interaction force, as microscopic control. The coarse, macroscopic model is given by (2) and the spatial spread of the density at \(t=T\) is given by

$$\begin{aligned} j^c(C, \varvec{\rho })&= \frac{1}{M} \sum _{(i,j)} \rho _{ij}^{N_t^c} \langle x_{ij}, x_{ij}\rangle , \\ \partial _{\rho _{ij}^s} J^c(C, \varvec{\rho })&={\left\{ \begin{array}{ll} \frac{\langle x_{ij}, x_{ij}\rangle }{M} \left( \left( \frac{1}{M}\sum _{(i,j)} \rho _{ij}^{N_t^c} \langle x_{ij}, x_{ij}\rangle \right) - \omega _*\right) &{}\text { if } s = N_t^c, \\ 0 &{}\text { otherwise, } \end{array}\right. } \end{aligned}$$

where \(M\) is the total mass, i.e., \(M=\sum _{(i,j)} \rho _{ij}^0 (\Delta x)^2\). We consider the parameters in Table 1. The macroscopic model requires spatial and time step sizes. Here, we choose a large spatial step size \(\Delta x\) such that the macroscopic model can be computed sufficiently fast. The time step size \(\Delta t^c\) satisfies the CFL condition (9). The simulation of the microscopic model is more involved. For the simulation of pedestrian dynamics, compare e.g. Helbing et al. 2000, the parameter \(\tau\) is chosen small such that deviations of individual velocities from the velocity field \(\overline{v}\) are corrected quickly, and the parameter \(b_F\) in the interaction force is chosen large such that the interaction force becomes dominant when pedestrians are close together. Since the particles move toward the center of the domain and the interaction force becomes dominant when particles are close together, we have to choose a small time step size \(\Delta t^f\) (Fig. 2).
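The macroscopic objective above can be sketched as follows; `centers` holds the grid cell centers \(x_{ij}\), and the total mass is computed from the supplied density (an assumption relying on mass conservation, since the text uses \(\rho ^0\)):

```python
import numpy as np

def j_macro(rho, centers, dx):
    """Coarse objective j^c: density-weighted spread of the cell centres.

    `rho` is the density at the final time step, `centers` an
    (Nx, Nx, 2) array of cell centres x_ij.
    """
    M = (rho * dx ** 2).sum()
    return (rho * (centers ** 2).sum(axis=-1)).sum() / M
```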

Table 1 Model parameters

Two particle collectives with \(N/2 = 100\) particles each are placed in the domain, see Fig. 3a. The macroscopic representation (7) of the particle groups is shown in Fig. 3b. We set box constraints on the controls, \(0 \le A,C\le 10\), and compare the number of iterations of the two approaches needed to obtain a given accuracy of \(\Vert J^f(u_{k}, \varvec{x})\Vert _2 < 10^{-5}\). The step sizes \(\sigma _{k}\) for AC are chosen such that they satisfy the Armijo rule and the standard Wolfe condition (11)–(12) with \(c_1= 0.01, c_2= 0.9\). If an iterate violates the box constraints, it is projected into the feasible set.

In the space mapping algorithm, the parameter extraction \(\mathcal {T}(u_{k})\) is the solution of an optimization problem in the coarse model space, see Definition 2.1. In each iteration, the parameter extraction identifies the coarse model control \(C\) that best matches the microscopic model with control \(A\) by solving an optimization problem in the coarse model. The optimization is solved via adjoint calculus with \(c_1,c_2\) as chosen above and \(u_{start} = \mathcal {T}(u_{k-1})\), which we expect to be close to \(\mathcal {T}(u_{k})\). Further, to determine the step size \(\sigma _{k}\) for the control update, we consider step sizes such that \(u_{k+1} = u_{k} + \sigma _{k} d_{k}\) satisfies \(\Vert \mathcal {T}(u_{k+1})-u_*^c\Vert _2 < \Vert \mathcal {T}(u_{k})-u_*^c\Vert _2\) and thus decreases the distance of the parameter extraction to the coarse model optimal control from one space mapping iteration to the next. The optimization results and computation times (obtained as the average computation time of 20 runs on an Intel(R) Core(TM) i7-6700 CPU, 3.40 GHz, 4 cores) for target values \(\omega _*\in \lbrace 1,2,3 \rbrace\) are compared in Table 2. Both optimization approaches start far from the optima at \(u_0 = 8\). The optimal controls \(u_*^{AC}\) and \(u_*^{ASM}\) closely match. The objective function evaluations \(J^f(u_*^{AC},\varvec{x})\) and \(J^c(u_*^c, \varvec{\rho })\) describe the accuracy at which the fine and coarse model control problems are solved, respectively. \(J^f(u_*^{ASM}, \varvec{x})\) denotes the accuracy of the space mapping optimal control when this control is plugged into the fine model and the fine model objective function is evaluated. Note that the ASM approach in general does not ensure a descent in the microscopic objective function value \(J^f(u_{k}, \varvec{x})\) during the iterative process and relies purely on reducing the distance \(\Vert \mathcal {T}(u_{k}) - u_*^c\Vert _2\). 
However, ASM also generates small target values \(J^f(u_*^{ASM},\varvec{x})\) and therefore validates the proposed approach. Moreover, the model responses of the optimal controls illustrate the similarity of the fine and the coarse model, see Fig. 3c–d.
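The iteration described above can be sketched as follows. This is a simplified illustration of ASM: the Broyden approximation of the space-mapping Jacobian used in full ASM is replaced by the identity for brevity, and the parameter extraction `T_map` as well as the coarse optimum `u_star_c` are supplied by the caller:

```python
import numpy as np

def asm(u0, T_map, u_star_c, step_candidates=(1.0, 0.5, 0.25),
        tol=1e-3, max_iter=20):
    """Aggressive space mapping sketch following the update rule above.

    A step size is accepted only if it decreases ||T(u) - u_c*||, as
    described in the text.
    """
    u = np.atleast_1d(np.asarray(u0, dtype=float))
    Tu = T_map(u)
    for _ in range(max_iter):
        r = Tu - u_star_c
        if np.linalg.norm(r) < tol:
            break
        d = -r                          # quasi-Newton direction with B_k = I
        for sigma in step_candidates:
            u_try = u + sigma * d
            Tu_try = T_map(u_try)
            if np.linalg.norm(Tu_try - u_star_c) < np.linalg.norm(r):
                u, Tu = u_try, Tu_try
                break
        else:
            break                       # no admissible step size found
    return u
```

For a parameter extraction that is a pure shift, \(\mathcal {T}(u) = u + c\), as in the obstacle experiments below, a single full step already reaches the solution.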

Table 2 Aggressive Space Mapping (ASM) vs. Adjoint Calculus (AC)

The space mapping iteration finishes within two to three iterations and therefore needs fewer iterations than the pure optimization on the microscopic level here, see Fig. 2. Note that each of the space mapping iterations involves the solution of the coarse optimal control problem. Hence, the comparison of iteration counts may be misleading, and we consider the computation times as an additional criterion. It turns out that the iteration times vary, so this data does not allow us to favor either approach based on computation time. Obviously, the times depend on the number of particles and on the space and time discretizations. However, very few iterations within the space mapping approach are needed to obtain the results. The space mapping technique is validated in the sense that we can compare ASM solutions to solutions computed with the microscopic adjoints. Next, we apply ASM to scenarios in bounded domains, where solutions cannot be computed on the microscopic level anymore due to the boundary interactions.

Fig. 2 Objective function value of iterates

Fig. 3 Initial conditions and space mapping solution for \(\omega _*= 3\)

4 Space mapping in bounded domains

In the following, we consider problems with dynamics restricted to a spatial domain with boundaries. For the microscopic simulations we add artificial boundary behaviour, tailored for each application, to the ODEs.

4.1 Crowd dynamics

Interesting dynamics arise in the modeling of pedestrians when groups cross at intersections, crowds pass through corridors or bottlenecks, or people try to reach a staircase, an elevator or an escalator in case of an emergency. People trying to move can be injured when they are pushed into obstacles in their way Helbing et al. 2002. To apply the space mapping technique to pedestrian dynamics, we consider a scenario similar to the evacuation of \(N\) individuals from a domain with obstacles. The goal is to gather as many individuals as possible at an assembly point \(x_s\in \Omega \subset \mathbb R^2\) up to the time \(T\). The control is the assembly point \(x_s=(x_s^{(1)},x_s^{(2)})\). We model this task with the help of the following cost functions

$$\begin{aligned} j^f(x_s,\varvec{x})&= \frac{1}{N} \sum _{(i)} \langle x_i^{N_t^f}- x_s, x_i^{N_t^f} - x_s\rangle , \\ j^c(x_s, \varvec{\rho })&= \frac{1}{M} \sum _{(i,j)} \rho _{ij}^{N_t^c} \langle x_{ij}- x_s, x_{ij}- x_s\rangle , \end{aligned}$$

for the fine and coarse model, respectively. They measure the spread of the crowd at time \(t=T\) with respect to the location of the source.

The velocity \(\overline{v}(x)\) is based on the solution to the eikonal equation with point source \(x_s\). In more detail, we solve the eikonal equation

$$\begin{aligned} |\nabla T(x)| = \frac{1}{f(x)},\qquad x \in \Omega , \qquad T(x_s) = 0, \end{aligned}$$

where \(T(x)\) is the minimal amount of time required to travel from x to \(x_s\) and \(f(x)\) is the speed of travel. We choose \(f(x) = 1\) and set the velocity field to

$$\begin{aligned} \overline{v}(x) = \frac{\nabla T(x)}{\Vert \nabla T(x)\Vert _2} \min \lbrace {\Vert x-x_s\Vert _2,1 \rbrace }. \end{aligned}$$
(15)

In this way, the velocity vectors point into the direction of the gradient of the solution to the eikonal equation and the speed depends on the distance of the particle to \(x_s\). The particles are expected to slow down when approaching \(x_s\), and the maximum velocity is bounded by \(\Vert \overline{v}(x)\Vert _2 \le 1\). The solution to the eikonal equation on the 2-D Cartesian grid is computed using the fast marching algorithm implemented in C with a Matlab interface. The travel time isolines of the eikonal equation and the corresponding velocity field are illustrated in Fig. 4. Note that we have to set the travel time inside the boundary to a finite value to obtain a smooth velocity field.
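As a stand-in for the fast marching code referenced above, the eikonal equation can also be solved with a few Gauss-Seidel sweeps in alternating directions (fast sweeping). The following pure-NumPy sketch with Godunov upwind updates is our illustration, not the paper's solver:

```python
import numpy as np

def eikonal_fast_sweeping(speed, src, h, n_sweeps=8):
    """Fast-sweeping solver for |grad T| = 1/f with a point source.

    `src` is the (i, j) grid index of the source and `h` the grid
    spacing; obstacle cells can be modelled by a tiny positive speed.
    """
    nx, ny = speed.shape
    T = np.full((nx, ny), np.inf)
    T[src] = 0.0
    sweeps = [(range(nx), range(ny)),
              (range(nx - 1, -1, -1), range(ny)),
              (range(nx), range(ny - 1, -1, -1)),
              (range(nx - 1, -1, -1), range(ny - 1, -1, -1))]
    for _ in range(n_sweeps):
        for xs, ys in sweeps:
            for i in xs:
                for j in ys:
                    if (i, j) == src:
                        continue
                    a = min(T[i - 1, j] if i > 0 else np.inf,
                            T[i + 1, j] if i < nx - 1 else np.inf)
                    b = min(T[i, j - 1] if j > 0 else np.inf,
                            T[i, j + 1] if j < ny - 1 else np.inf)
                    if np.isinf(a) and np.isinf(b):
                        continue
                    f = h / speed[i, j]
                    if abs(a - b) >= f:              # one-sided update
                        t = min(a, b) + f
                    else:                            # two-sided update
                        t = 0.5 * (a + b + np.sqrt(2.0 * f * f - (a - b) ** 2))
                    T[i, j] = min(T[i, j], t)
    return T
```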

Fig. 4 Solution of the eikonal equation in a bounded domain

The derivative of the macroscopic Lagrangian (13) with respect to the location of the point source, \(u= x_s\), is given by

$$\begin{aligned} \partial _{x_s^{(l)}} L&= \sum _{s=0}^{N_t^c} \sum _{i=1}^{N_x} \sum _{j=1}^{N_x} \frac{\mu _{ij}^s}{\Delta x} \left( \partial _{x_s^{(l)}} \mathcal {F}_{ij}^{(1),s,+} - \partial _{x_s^{(l)}} \mathcal {F}_{ij}^{(1),s,-} \right) \\&\qquad + \sum _{s=0}^{N_t^c} \sum _{i=1}^{N_x} \sum _{j=1}^{N_x} \frac{\tilde{\mu }_{ij}^s}{\Delta x} \left( \partial _{x_s^{(l)}} \mathcal {F}_{ij}^{(2),s,+} - \partial _{x_s^{(l)}} \mathcal {F}_{ij}^{(2),s,-} \right) , \end{aligned}$$

where

$$\begin{aligned} \partial _{x_s^{(l)}} \mathcal {F}_{ij}^{(1),s,+}&= {\left\{ \begin{array}{ll} \rho _{ij}^s \partial _{x_s^{(l)}} \overline{v}_{ij}^{(1)} &{}\text { if } \overline{v}_{ij}^{(1)} \ge 0, (i+1,j) \in \mathcal {I}_\Omega \setminus \mathcal {I}_{\partial \Omega },\\ \rho _{i+1j}^s \partial _{x_s^{(l)}} \overline{v}_{ij}^{(1)} &{}\text { if } \overline{v}_{ij}^{(1)}< 0, (i,j) \in \mathcal {I}_\Omega \setminus \mathcal {I}_{\partial \Omega }, \\ 0 &{}\text { otherwise, } \end{array}\right. }&\qquad l=1,2, \\ \partial _{x_s^{(l)}} \mathcal {F}_{ij}^{(1),s,-}&= {\left\{ \begin{array}{ll} \rho _{i-1j}^s \partial _{x_s^{(l)}} \overline{v}_{i-1j}^{(1)} &{}\text { if } \overline{v}_{i-1j}^{(1)} \ge 0, (i,j) \in \mathcal {I}_\Omega \setminus \mathcal {I}_{\partial \Omega }, \\ \rho _{ij}^s \partial _{x_s^{(l)}} \overline{v}_{i-1j}^{(1)} &{}\text { if } \overline{v}_{i-1j}^{(1)} <0, (i-1,j) \in \mathcal {I}_\Omega \setminus \mathcal {I}_{\partial \Omega }, \\ 0 &{}\text { otherwise, } \end{array}\right. }&\qquad l=1,2, \end{aligned}$$

and \(\partial _{x_s^{(l)}} \mathcal {F}_{ij}^{(2),s,+}, \partial _{x_s^{(l)}} \mathcal {F}_{ij}^{(2),s,-}\) are defined analogously.

To obtain the partial derivatives \(\partial _{x_s^{(l)}} \overline{v}_{ij}^{(k)}\), the travel-time source derivative of the eikonal equation is required. It is approximated numerically with finite differences

$$\begin{aligned} \partial _{x_s^{(l)}} \overline{v}_{ij}^{(k)} \approx \frac{\overline{v}_{ij}^{(k)}(x_s+ \Delta xe^{(l)}) - \overline{v}_{ij}^{(k)}(x_s-\Delta xe^{(l)} )}{2\Delta x},\qquad k=1,2, \end{aligned}$$

where \(e^{(1)} = (1,0)^T, e^{(2)} = (0,1)^T\) denote the unit vectors.
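The central difference above can be written compactly as follows; `vbar_of_source` is an assumed callable that recomputes the (grid) velocity field for a shifted source:

```python
import numpy as np

def source_derivative(vbar_of_source, x_s, dx, l):
    """Central difference of the velocity field w.r.t. the source x_s^(l).

    `vbar_of_source(x_s)` returns the (array-valued) velocity field for a
    given source location; e^(l) is the l-th unit vector as in the text.
    """
    e = np.zeros(2)
    e[l] = dx
    return (vbar_of_source(x_s + e) - vbar_of_source(x_s - e)) / (2.0 * dx)
```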

4.1.1 Discussion of the numerical results

To investigate the robustness of the space mapping algorithm, we consider different obstacles in the microscopic and macroscopic setting. Let \(\Omega = [-8,8]^2\) be the domain. For the microscopic model we define an internal boundary \(2 \le x^{(1)}\le 3, 1 \le x^{(2)}\le 8\), see Fig. 6a. For the macroscopic setting the obstacle is shifted by \(gap \ge 0\) in the \(x^{(2)}\)-coordinate. Additionally, we shift the initial density by the same gap, see Fig. 6b. It is interesting to see whether the space mapping technique is able to recognize the linear shift between the microscopic and the macroscopic model. This is not trivial due to the non-linearities in the models and the additional non-linearities induced by the boundary interactions. Macroscopically, we use the zero-flux conditions (6) at the boundary. Microscopically, a boundary correction is applied: a particle that would hypothetically enter the boundary is reflected back into the domain, see Fig. 5.

Fig. 5 Reflection at the boundary

Fig. 6 Initial conditions with \(gap = 2\)

For computational simplicity, we restrict the admissible set of the controls

$$\begin{aligned} \mathcal {U}_{ad}^f= \mathcal {U}_{ad}^c= [-8, 2] \times [-8,8], \end{aligned}$$

i.e., the point source is located to the left-hand side of the obstacle.

The velocity \(\overline{v}(x)\), given by (15), is restricted to the grid with spatial step size \(\Delta x= 0.5\) for the macroscopic model. To obtain the velocity field on the grid, the source location \(x_s\in \mathcal {C}_{ij}\) is projected to the center of the corresponding cell

$$\begin{aligned} P(x_s) = x_{ij}, \qquad x_s\in \mathcal {C}_{ij}. \end{aligned}$$
(16)
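Assuming the domain is discretized into cells of width \(\Delta x\) starting at a lower-left corner \(x_0\), the projection (16) amounts to snapping each coordinate to the center of its cell; a minimal sketch:

```python
import numpy as np

def project_to_cell_center(x_s, x0, dx):
    """Projection (16): snap the source location to its cell centre.

    `x0` is the lower-left corner of the domain, so the centre of cell k
    lies at x0 + (k + 1/2) dx; works componentwise for 2-D sources.
    """
    k = np.floor((np.asarray(x_s, dtype=float) - x0) / dx)
    return x0 + (k + 0.5) * dx
```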

The continuous velocity field of the microscopic model is approximated by the eikonal solution on a grid with smaller grid size (Fig. 6).

We choose the parameters from Sect. 3.2, Table 1 except for T which is set to \(T=5\). Moreover, we consider \(A,C=0.87\) for which the macroscopic and microscopic model behavior match well in the situation without boundary interactions, see Table 1 in Sect. 3.1.

We apply the space mapping method to the described scenario with \(gap \in \lbrace 0,1,2,3 \rbrace\). Due to the grid approximation, we formally move from continuous optimization problems to discrete ones, which we approximately solve by applying ASM (and AC for the parameter extraction within ASM) for continuous optimization and projecting each iterate to the grid using (16). In general, due to the grid approximation we cannot ensure that arbitrarily small step sizes \(\sigma _{k} \ge 0\) exist for which the Armijo condition is satisfied in the parameter extraction with \(c_1> 0\). Therefore, we choose \(c_1= 0, c_2= 0.9\) and formally lose the convergence of our descent algorithm to a minimizer. Nevertheless, it is still ensured that the distance to the coarse model optimum in ASM is nonincreasing, since the step size is chosen such that

$$\begin{aligned} \Vert \mathcal {T}(u_{k} + \sigma _{k} d_{k}) - u_*^c\Vert _2 \le \Vert \mathcal {T}(u_{k}) - u_*^c\Vert _2 \end{aligned}$$

holds.

As starting point for the parameter extraction, we choose \(u_{start} = u_*^c\), and the tolerance is set to \(10^{-5}\). We remark that the parameter extraction does not have a unique solution here; therefore, providing \(u_{start} = u_*^c\) as starting value steers the parameter extraction toward a solution \(\mathcal {T}(u_{k})\) near \(u_*^c\).

Table 3 Iterates of ASM

The macroscopic optimal solution with the corresponding gap is given by \(u_*^c= [1.5, -0.5 + gap]\), compare Table 3. For \(gap = 0\), we have \(\mathcal {T}(u_*^c) = u_*^c\) and the space mapping is finished at \(k=1\) since the model optima coincide. For \(gap >0\), the parameter extraction identifies a shift between the modeling hierarchies since the coarse model optimum is not optimal for the fine model. Indeed, the application of the coarse model optimal control leads to collisions of the particles with the boundary and therefore delays the gathering of the particles around the source location \(u_*^c\), see Fig. 7b. The particles are spread more widely, because the boundary as a physical obstacle prevents the crowd from gathering in a circular shape around the source location \(u_*^c\), as is the case for the macroscopic model, compare Fig. 7a. Space mapping for \(gap \in \{1,3\}\) finishes within one iteration since the parameter extraction of \(u_1\) is given by \(\mathcal {T}(u_1) = u_1 + [0, gap]\) and \(\mathcal {T}(u_2) = u_*^c\). For \(gap = 2\), the first parameter extraction underestimates the shift in the \(x^{(2)}\)-direction and thus two iterations are needed to obtain the optimal solution, see Table 3.

Fig. 7 Solutions of the space mapping iterates at \(t=T\) with \(gap = 2\)

We investigated the need for additional iterations in more detail. It turned out that the behavior is caused by the discretization of the optimization problem on the macroscopic grid. We have \(j^c([1.5, 3.0], \varvec{\rho }) = 4.4370\) and \(j^c([1.5, 3.5], \varvec{\rho }) = 5.3451\), which indicates that the true (continuous) value \(\mathcal {T}([1.5,1.5])\) lies between the two grid values. However, the discrete optimization for the parameter extraction terminates with \(\mathcal {T}([1.5,1.5]) = [1.5, 3.0]\), because it is closer to the microscopic simulation result \(j^f([1.5,1.5], \varvec{x})\). The microscopic optimal solution is shown in Fig. 7c. In comparison to the result with the control \(u_*^c\) shown in Fig. 7b, we observe in Fig. 7c that the crowd is gathered together more closely and has a smaller spread, i.e., \(j^f(u_{3}, \varvec{x}) < j^f(u_{1}, \varvec{x})\), compare Table 3.

4.2 Material flow

In the following, the control of a material flow system with a conveyor belt is considered. Similar control problems have been investigated in Erbrich et al. (2018). We use the microscopic model proposed in Göttlich et al. (2014) that describes the transport of homogeneous parts with mass m and radius \(R\) on a conveyor belt \(\Omega \subset \mathbb {R}^2\) with velocity \(v_T= (v_T^{(1)},0)^T\in \mathbb {R}^2\). The bottom friction

$$\begin{aligned} G(v) = - \gamma _b(v - v_T), \end{aligned}$$

with bottom damping parameter \(\gamma _b\ge 0\) corrects deviations of the parts’ velocities from the conveyor belt velocity. The interaction force \(F\) modelling interparticle repulsion is given by

$$\begin{aligned} F(x) = {\left\{ \begin{array}{ll} c_m(2R-\Vert x\Vert _2) \frac{x}{\Vert x\Vert _2} &{}\text { if } \Vert x\Vert _2 \le 2 R, \\ 0 &{}\text { otherwise, } \end{array}\right. } \end{aligned}$$

where \(c_m>0\) scales the interaction force and depends on the material of the parts.

We investigate the control of the material flow via the conveyor belt velocity \(v_T^{(1)}\). The particles (goods) are redirected at a deflector to channel them. A common way to describe such boundary interactions is to apply obstacle forces which are modeled similarly to the interaction force between particles Helbing and Molnár 1995. Here, we consider

$$\begin{aligned} F_{obst}(x) = {\left\{ \begin{array}{ll} c_{obst}(R-\Vert x\Vert _2) \frac{x}{\Vert x\Vert _2} &{}\text { if } \Vert x\Vert _2 \le R, \\ 0 &{}\text { otherwise, } \end{array}\right. } \end{aligned}$$

where x is the distance to the closest point of the boundary. Note that this is a slight variation of Helbing and Molnár (1995) as the interaction takes place with the closest boundary point only, see also Remark 3. Further note that the computation of adjoint states analogous to Sect. 3.1 can become very complicated for this boundary interaction. We therefore avoid the computation of the microscopic optimal solution \(u_*^f\) and use the proposed space mapping approach instead.
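The obstacle force above can be sketched as follows; `d` plays the role of the vector from the closest boundary point to the particle, and the zero-distance case is handled separately as a guard (our assumption, not specified in the text):

```python
import numpy as np

def obstacle_force(d, R, c_obst):
    """Obstacle force F_obst from the formula above.

    The soft-core force pushes the particle back into the domain once
    the distance to the closest boundary point drops below R.
    """
    dist = np.linalg.norm(d)
    if dist == 0.0 or dist > R:
        return np.zeros_like(d)
    return c_obst * (R - dist) * d / dist
```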

The performance evaluation used here is the number of goods in the domain \(\Omega\) at time T given by

$$\begin{aligned} j^f(v_T^{(1)},\varvec{x}) = \sum _{i=1}^{N} \, \mathbbm {1}_{ \left( x_i^{N_t^f} \in \Omega \right) }. \end{aligned}$$

The transport is modeled macroscopically with the advection-diffusion Eq. (2). The corresponding macroscopic performance evaluation is given by

$$\begin{aligned} j^c(v_T^{(1)},\varvec{\rho }) = \frac{N}{M} \sum _{(i,j): x_{ij}\in \Omega } \rho _{ij}^{N_t^c} (\Delta x)^2. \end{aligned}$$

We apply zero-flux boundary conditions (6) for the advective and the diffusive flux at the deflector.

Remark 3

Note that if the boundary were discretized with stationary points and the boundary interaction were modeled with the help of soft-core interaction forces in the microscopic setting, as for example in Helbing and Molnár (1995), the model would allow for direct optimization. Nevertheless, many applications involve a huge number of (tiny) goods, for example the production of screws. The pairwise microscopic interactions would blow up the computational effort; hence it makes sense to consider a macroscopic approximation for optimization tasks.

4.2.1 Dependency on the diffusion coefficient

We investigate the robustness of the space mapping technique for different diffusion coefficients \(C\) and examine whether variations in the diffusion coefficient affect the performance of the space mapping algorithm or the accuracy of the final result. We set \(\Omega =[0, 0.64] \times [0, 0.4]\), \(N=100\), \(\omega _*=25\). The parameters \(R\), \(m\), \(c_m\), \(c_{obst}\), \(\gamma _b\) are given in Table 4 and are validated against real data with a classical fourth-order Runge-Kutta method, see Göttlich and Pfirsching (2018, Section 4.1). We set \(u_0=0.5\) and compute the space mapping solution with the ASM algorithm and stopping criterion \(\Vert \mathcal {T}(u_{k}) - u_*^c\Vert _2 < 10^{-3}\). The diffusion constants \(C\in \lbrace 0, 0.1, 0.5,1 \rbrace\) are tested to see whether the space mapping approach is successful for all values of \(C\). The results of our space mapping approach are summarized in Table 5. Each parameter extraction uses \(u_{start} = \mathcal {T}(u_{k-1})\) and has an optimality tolerance of \(10^{-5}\).

Table 4 Model parameters
Table 5 Space mapping with different diffusion coefficients \(C\)

For every diffusion coefficient, space mapping finishes in fewer than seven iterations, and Table 5 shows that a microscopic optimal solution is obtained for \(u_*^f\in [0.6093,0.6185]\). In all cases, space mapping generates optimal solutions. Even for \(C=0\), which corresponds to pure advection (without diffusion) in the macroscopic model, the ASM algorithm is able to identify a solution. This underlines the robustness of the space mapping algorithm and emphasizes that even a very rough depiction of the underlying process can serve as coarse model. However, the advection-diffusion equations with \(C>0\) clearly match the microscopic situation better and portray the spread of particles in front of the obstacle more realistically, see Fig. 8 for \(C=0.5\).

Fig. 8 Density with \(u=0.6612\) and particles with \(u=0.6093\) at \(t=0.7\)

5 Conclusion

We proposed space mapping-based optimization algorithms for interacting particle systems. The coarse model of the space mapping is chosen to be the macroscopic approximation of the fine model, which considers every single particle. The algorithm was validated with the help of a toy problem that allows for the direct computation of optimal controls on the particle level. Further, the algorithm was tested in scenarios where the direct computation of microscopic gradients is infeasible due to boundary conditions that do not naturally appear in the particle system formulation. Numerical studies underline the feasibility of the approach and motivate its use in further applications and more intricate domains.