1 Introduction

Digital maps provide a range of important information for many applications in Highly Automated Driving (HAD). However, the information contained in these maps is often not sufficient for some applications. We therefore present a method to compute additional, locally and temporarily valid map data that hold for a specific vehicle in a specific situation. We create a model based on the digital map data and on vehicle information, and we develop a global optimization approach that derives the additional data from this model.

1.1 Technical background

Driver assistance functions and HAD are closely linked to future mobility. The resulting challenges in perception and recognition place heavy demands on connectivity and data management, and they open a broad field of research. It is important to provide high definition maps (Seif and Hu 2016; Liu et al. 2020; Schwab and Kolbe 2019) that consist of large amounts of street topology data, containing not only all lanes and boundaries, but also logical information about their connections. In addition, landmarks, e.g., street lamps and traffic signs, or other objects of interest, are included for further information. Apart from the given fixed map data, in the following referred to as primary information, we intend to derive additional scenario-based information, which we refer to as secondary information.

In this paper, we compute optimal emergency stops for next-generation driver assistance functions and automated driving functions. The suggested novel approach for deriving secondary information from high definition maps is, however, designed to be generic, so that it can also be applied and adapted to many other use cases. We formulate a maximization problem with a nonlinear, nonconcave objective function that is explicitly defined by the emergency stop scenario. The global solutions to this problem describe new stopping points that are added to an additional secondary information layer of the digital map. We present two deterministic global optimization strategies, one of which is faster than all other deterministic algorithms that we have encountered.

The small highway section in Fig. 1 indicates the large amount of map data that is handled by the optimization algorithm for a larger, more realistic map excerpt. The curves of different types in Fig. 1B are depicted in various colors. The visualized map data contribute to the objective function shown in Fig. 1A. Only the map elements that belong to a certain surrounding of the vehicle's own position, in the following referred to as the ego position \(X_0\), are taken into account. The structure of the digital maps helps to filter for the relevant data, since they are usually divided into separate tiles, of which only the tiles closest to \(X_0\) are used. The geometric parametrization of the curves differs between the existing map formats. However, it is very beneficial for the derivation of secondary information to define the lanes by means of cubic B-spline curves (Piegl and Tiller 1997, chs. 2, 3).

Fig. 1 A The objective function is based on the digital map data below. The Cartesian coordinates are given in meters. B The landmarks (two types, represented by red and green cylinders) and curves of different types in the digital orthophoto (©BKG 2021) define the above objective function

The following requirements on the solution method are relevant from the perspective of a car manufacturer or developer of automated driving functions:

R.1 Three-dimensionality: Digital maps have evolved from two to three dimensions because the height information is important for certain use cases. Therefore, the problem specifications explicitly require an optimization model that is defined not only along the map elements, but in the full three-dimensional space, to maintain the possibility of solutions that lie off the available map elements.

R.2 Genericity: Although we focus on a highway emergency stop here, the valuation of the map data and the developed algorithm must be able to cover general scenarios, e.g., urban scenarios. This allows synergy effects to be exploited, as multiple functions can be built upon the same algorithmic foundation. Moreover, the development of this kind of functionally safe optimization algorithm is very costly, so the development costs can be shared across multiple modules.

R.3 Determinism: The derivation of secondary information belongs to the category of applications for automated driving functions with high demands on system safety. Thus, the optimization algorithm must be deterministic, according to the demanded Automotive Safety Integrity Level (ASIL) (International Organization for Standardization 2018). This functionally safe algorithm can also be used for assistance functions with lower safety requirements.

R.4 Real time: Real-time performance is essential, since decisions in a driving vehicle often must be made in a fraction of a second. The solutions are only temporarily valid, because a driving car changes its position continuously.

Although there exist practically efficient heuristic approaches to compute global optima (Hedar and Fukushima 2004; Hansen and Ostermeier 2001; Csendes et al. 2008), those algorithms lack the reproducibility of the solution required in R.3. As a consequence, we develop a new hybrid optimization method that combines a deterministic global optimization method with local higher-order approaches. This requires enhancements in a number of aspects that exploit the structure of the considered class of optimization models to improve upon the results of the existing algorithms. The main focus lies on a fast algorithm (cf. R.4).

1.2 State of the art

Rigorous optimization algorithms (Floudas 2005, chs. 3–5, 11–13; Kearfott 1996, chs. 1, 5; Pintér 2002; Rios and Sahinidis 2009; Neumaier 2004) identify global solutions with absolute certainty up to a predefined tolerance. They are often based on branch-and-bound type methods (Mitten 1970) that divide the initial search space into smaller subsets (usually boxes) step by step (branching) and discard the irrelevant ones that do not contain a global optimum (bounding). Branch-and-bound methods often use convex underestimators or concave overestimators, such as those obtained by the \(\alpha \)BB technique (Adjiman et al. 1998; Adjiman and Floudas 1996) or by bounding schemes for multilinear and other functions (Bao et al. 2015; Rikun 1997; Ryoo and Sahinidis 2001). They can be further enhanced, e.g., by range reduction techniques that generate tightened valid inequalities (Ryoo and Sahinidis 1996; Zhang et al. 2020). One of the most popular methods is the DIRECT (DIviding RECTangles) algorithm by Jones et al. (1993), which was developed to overcome convergence speed problems of Shubert's algorithm (Shubert 1972), a sequential method using a global Lipschitz constant. Instead of applying this Lipschitz constant globally to estimate upper bounds for the function value on each box (in the maximization case), a locally valid positive constant is used to find potentially optimal boxes. Since then, there have been various attempts to improve or modify the DIRECT algorithm for specific types of objective functions (for instance, Gablonsky and Kelley 2001; Chiter 2006; Liu et al. 2015; di Serafino et al. 2011). Another rigorous method is the Multilevel Coordinate Search (MCS) (Huyer and Neumaier 1999), which is regarded as a branch-without-bound type method that, like DIRECT, evaluates a single point per box. Besides the box size and the subdivision level, the branching rules of MCS include the computation of an expected gain by a quadratic model that is based on a few point evaluations. Furthermore, the MULTK algorithm (Sergeyev and Kvasov 2017, ch. 4) uses locally approximated Lipschitz gradients for differentiable objective functions.

Moreover, global interval optimization algorithms, introduced by Hansen (1980) and further investigated in Gau and Schrage (2004), (Hansen and Walster 2004, chs. 7–8, 12–13), and Neumaier (2015), use interval arithmetic (Hickey et al. 2001; Hansen 1975), (Moore et al. 2009, chs. 2, 4, 5), so that the bounds on the function value ranges can be used to make optimality statements for whole boxes at once. Based on the interval values, one can discard those subsets that cannot contain a global optimum. There are various refinements and extensions of the original interval algorithm with new bounding strategies, e.g., the Krawczyk–Newton approach (Krawczyk and Neumaier 1986), interval constraint propagation (Kjøller et al. 2007), McCormick-based relaxations (Mitsos et al. 2009), and relatively recent implementations such as GOP (Pál and Csendes 2009).

Apart from these deterministic and rigorous approaches, we want to mention two stochastic algorithms. GLOBAL (Csendes et al. 2008) is a multistart clustering algorithm that uses the best locally minimizing samples to gain information about the regions of attraction, whereas CMA-ES (Hansen and Ostermeier 2001) is an evolution strategy that iteratively adapts the covariance matrix for the sampling of new points. We apply these methods later on to compare them with the deterministic algorithms and to get an impression of how well our solutions compete with stochastic approaches.

1.3 Key contributions of the paper

In this paper, we construct a system to derive secondary information for the given map data, i.e., we design a new type of model that values the various map elements, and we develop optimization approaches that exploit the structure of this model as much as possible. Thereby, we build two hybrid branch-and-bound optimization algorithms that primarily use interval arithmetic, based on Hansen's considerations (Hansen 1980). We illustrate and verify the advantages of the interval-valued evaluation in comparison to the simple point evaluations of other global methods. Second-order bounding methods, applied only to certain boxes in the interval algorithm, then extend both algorithms to generate better function value bounds. Several modifications in both algorithms are needed because the objective function is only piecewise differentiable.

The \(\alpha \)BB approach locally overestimates the objective function by a concave function whose maximum can easily be computed. The techniques by Gerschgorin (1931) or Hertz (2009) require good estimates for the corresponding interval Hessian matrices to compute concave overestimators. The interval Hessian and the overestimation term have been modified by Meyer and Floudas (2005), Skjäl et al. (2012), and Skjäl and Westerlund (2014) to construct tighter overestimators. In our concrete problem, the interval-valued outcomes for the function evaluation and its derivatives are significantly overestimated, and, therefore, these interval matrix approaches do not provide meaningful overestimators. A key point of this paper is to obtain sufficiently tight estimates for the parameter \(\alpha \), which defines the concave overestimator, without the need to compute interval matrices as in the mentioned approaches. To reach this goal, we exploit the form of the model and estimate the \(\alpha \)-values for all map element functions separately. For the curves, in particular, we present a novel method that locally linearizes the curves in the digital map and computes the maximum curvature of this locally very accurate approximation of the part of the objective function corresponding to the curve (which, for short, we call the curve function).

In the second approach, we build local quadratic Taylor models of the objective function to benefit from existing theory on quadratic programming with box constraints (Burer and Letchford 2009; De Angelis et al. 1997). It is essential to bound the Taylor approximation error to obtain a proper interval bound, as presented, e.g., by Bompadre et al. (2013). We again apply interval arithmetic to the map element functions individually, including the local curve linearizations for the curve functions, to calculate a tighter bound for the overall Taylor error than, e.g., in Mitsos et al. (2009). Thereby, the distances between geometric primitives, e.g., the branch-and-bound boxes and the lines approximating the curves, are used to speed up the function evaluation and to improve the result of an interval-valued evaluation significantly.

1.4 Structure of the paper

The proposed pipeline for deriving secondary information by means of a mathematical optimization is shown in Fig. 2. It depicts the particular subtasks with references to the corresponding sections. This paper is structured as follows:

Fig. 2 The pipeline presents the various steps to derive secondary information from primary map data. We also refer to the corresponding sections that cover the details

In Sect. 2, we derive the underlying optimization problem for the derivation of secondary information from digital map data and show how to adapt the generic formulation to the specific use case of deriving suitable emergency stops. Section 3 deals with suitable approaches for solving the optimization problem derived in Sect. 2. We introduce an interval algorithm as well as the computation of locally valid concave overestimators and quadratic approximations, which are combined and extended to construct a suitable algorithm. Additionally, we describe a one-dimensional optimization method that provides good initial values to reduce the computational effort of the interval algorithm significantly. In Sect. 4, we focus on aspects that are relevant for an efficient function evaluation, in particular exploiting the B-spline form of the curves. In Sect. 5, we evaluate our prototypical implementation for a highly relevant practical scenario of deriving emergency stops for advanced driver assistance and automated driving functions. Furthermore, we point out how the alterations of the optimization algorithms significantly improve their suitability and performance, and we give guidance for further optimizations. In the last section, we summarize our results, discuss them, and give an outlook.

2 Mathematical problem formulation

The input data must be converted into a three-dimensional mathematical model in order to compute the optimal emergency stops. We define a suitable objective function that is based upon the map elements, the scenario, and the ego position. For every type of map element, we construct a generic function model into which we insert the elements of the specific data type. Then, the position and the scenario-dependent weight of each specific map element, together with the ego position, define the map element functions, which are all summed up to define the objective function of the corresponding maximization problem.

2.1 Representation of the input data

Let the disjoint union \({\mathcal {X}}= {\mathcal {X}}_M~{\dot{\cup }}~ {\mathcal {X}}_C\) denote the set containing all map elements, where \({\mathcal {X}}_M\) is the set of all sorts of landmarks, and \({\mathcal {X}}_C\) is the set of all lanes in the digital map, i.e.,

$$\begin{aligned} \begin{aligned} {\mathcal {X}}_M=&\{y\in {\mathbb {R}}^3 ~;~ y\text { is a landmark}\},\\ {\mathcal {X}}_C=&\{\gamma \in C^2([0,1],{\mathbb {R}}^3) ~;~ \Gamma =\gamma ([0,1]) \text { is a lane}\}. \end{aligned} \end{aligned}$$
(2.1)

We assume that the cardinality of \({\mathcal {X}}\) is finite and write \({\mathcal {X}}=\{{\chi _{i}}~;~ i\in I\}\), where \(|I|=|{\mathcal {X}}|<\infty \) and \(({\chi _{i}})_{i\in I}\) enumerates the elements of \({\mathcal {X}}\). In our setting, landmarks in \({\mathcal {X}}_{M}\) are points, whereas lanes are \(C^2\)-curves, indicated by the subscript \(C\). We would like to stress that our model and our optimization method can be extended to the case where curves are also allowed as landmarks, but for the remainder of the paper, we only consider the case of points as landmarks.

As part of the scenario \({\mathcal {S}}\), which in our case means performing an emergency stop on the highway, each element \({\chi _{i}}\) is assigned a constant weight \(w({\chi _{i}},{\mathcal {S}})\in {\mathbb {R}}\) which models the strength of the contribution of the particular map element to the objective function, i.e., \({\chi _{i}}\in {\mathcal {X}}\mapsto w({\chi _{i}},{\mathcal {S}})\in {\mathbb {R}}.\) In practice, this choice is made based on the categories of the map elements, which arise from the element properties in the digital map; examples are shown in Table 1. According to those categories, \({\mathcal {X}}_C\) is partitioned into \(L\in {\mathbb {N}}\) disjoint subsets, each containing lanes of equal weight, analogous to the weighting on the basis of the map data properties, i.e.,

$$\begin{aligned} {\mathcal {X}}_C= \underset{l=1,\dots ,L}{{\dot{\bigcup }}} C_{l}\quad \text {with}\quad w({\chi _{i}},{\mathcal {S}})= w_{l}~~\text { for all }~~ {\chi _{i}}\in C_{l},~~~l=1,\dots ,L. \end{aligned}$$
(2.2)

Furthermore, the function \(w_{X_0,{\mathcal {S}}}:{\mathbb {R}}^3 \rightarrow {\mathbb {R}}\) weights the distance to the ego position \(X_0\) by \(w_{X_0,{\mathcal {S}}}(x) = \omega _{{\mathcal {S}}}(\Vert x-X_0\Vert _2)\) with a function \(\omega _{{\mathcal {S}}}:{\mathbb {R}}_{\ge 0} \rightarrow {\mathbb {R}}\).

With regard to the evaluation techniques in Sect. 4, we choose cubic B-spline curves (Torrente et al. 2015; Chen et al. 2010; Min et al. 2019) of the form \( \gamma (t) = \sum _{j=1}^{n-4} P_{j} N_{j,4}(t)\), \(t \in [0,1], \) for the representation of the curves in \({\mathcal {X}}_C\). They are defined by the control points \(P_1,\dots ,P_{n-4} \in {\mathbb {R}}^3\) and the cubic B-spline functions \(N_{j,4}: {\mathbb {R}}\rightarrow {\mathbb {R}}_{\ge 0},~j=1,\dots ,n-4\), which are piecewise cubic polynomials that are explicitly defined by a unique knot vector \(\tau _{\gamma }:= [\tau _1,\dots ,\tau _n] \in {\mathbb {R}}^{n}_{\ge 0}\) with \(\tau _1=\dots =\tau _4=0,~\tau _{n-3}=\dots =\tau _{n}=1,~\tau _j \le \tau _k \text { for } j<k\). Since \(\gamma \) lies in the convex hull that is spanned by the control points, the set \(P_{\gamma }:= \{P_j;~j=1,\dots ,n-4\}\) provides an approximation of the shape of \(\gamma \) without evaluating the \(C^2\)-functions \(N_{j,4}\). Because \(\text {supp}~ N_{j,4}\subset [\tau _{j},\tau _{j+4})\), we need to evaluate at most four spline functions to compute \(\gamma (t)\) for some \(t\in [0,1]\), independent of the length of \(\tau _{\gamma }\).
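
As a concrete illustration, the following minimal Python sketch (our own, with made-up knots and control points rather than real map data) evaluates such a clamped cubic B-spline lane using SciPy's BSpline class.

```python
# Sketch: evaluating a cubic B-spline lane representation. The clamped
# knot vector follows the convention tau_1=...=tau_4=0 and
# tau_{n-3}=...=tau_n=1; with n = 10 knots there are n - 4 = 6 control
# points in R^3. All numbers are made-up example data.
import numpy as np
from scipy.interpolate import BSpline

tau = np.array([0, 0, 0, 0, 0.3, 0.6, 1, 1, 1, 1])
P = np.array([[0, 0, 0], [10, 1, 0.1], [20, 3, 0.2],
              [30, 6, 0.2], [40, 8, 0.3], [50, 9, 0.3]], dtype=float)

gamma = BSpline(tau, P, k=3)      # gamma: [0, 1] -> R^3
gamma_d = gamma.derivative()      # gamma'(t), as needed in Eq. (2.7)

print(gamma(0.5), gamma_d(0.5))
# Local support: at any t, at most four basis functions N_{j,4} are
# nonzero, so the cost per evaluation is independent of len(tau).
```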

2.2 Formulation of the objective function

We are targeting an objective function that models only the interaction between the lanes and the landmarks, i.e., curves of different categories should not influence each other. Accordingly, every function evaluation can be assigned to a specific category \(C_{l},~l=1,\dots ,L\) from Equation (2.2). Since a category weight represents the maximum contribution of that category to the objective function, the value obtained by aggregating all curve evaluations of the same category shall not exceed this weight. Therefore, we define the objective function \(F_{{\mathcal {S}}}:{\mathbb {R}}^3\rightarrow {\mathbb {R}}\) by

$$\begin{aligned} F_{{\mathcal {S}}}(x) ~=~ \underset{l=1,\dots ,L}{\max }~ f_{l}(x) + f_{M}(x), \end{aligned}$$
(2.3)

with the category functions \(f_{l}:{\mathbb {R}}^3 \rightarrow {\mathbb {R}}\) and the landmark function \(f_{M}:{\mathbb {R}}^3 \rightarrow {\mathbb {R}}\) that are given by

$$\begin{aligned} f_{l}(x)= & {} \min \left\{ w_{l}, f_{C_{l}}(x) \right\} ~\text {with}~ f_{C_{l}}(x) = \sum _{{\chi _{i}}\in C_{l}} f_{{\chi _{i}}}(x) ~\text {and}~\nonumber \\ f_{M}(x)= & {} \sum _{{\chi _{i}}\in {\mathcal {X}}_M} f_{{\chi _{i}}}(x). \end{aligned}$$
(2.4)

The cut-off \(w_{l}>0\) bounds the function value of \(f_{C_{l}}\), which sums up all curve functions of category \(l\). We point out that \(f_{C_{l}}(x)>w_{l}\) only occurs near points where different curves are very close to each other. \(F_{{\mathcal {S}}}\) is composed of the leading curve category value, given by \(f_{l}\), and the values for the landmarks, given by \(f_{M}\). Every function \(f_{{\chi _{i}}}: {\mathbb {R}}^3\rightarrow {\mathbb {R}}\) models the contribution of a map element \({\chi _{i}}\in {\mathcal {X}}\) to \(F_{{\mathcal {S}}}\). Since the \(f_{{\chi _{i}}}\) are at least \(C^2\), the objective function is piecewise continuously differentiable, depending on the active category and on whether the cut-off \(w_{l}\) in \(f_{l}\) is active. Where the cut-off is active, the local derivative of \(f_{l}\) vanishes, and the derivative of \(F_{{\mathcal {S}}}\) equals the derivative of \(f_{M}\). This piecewise differentiability is exploited in Sect. 3.2.3 for the optimization algorithm.
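
The composition in Equations (2.3) and (2.4) can be summarized in a few lines of Python; in this sketch of ours, the per-element functions \(f_{{\chi _{i}}}\) are assumed to be given as callables, since their concrete kernel-based form is only derived below.

```python
# Sketch of the composition in Eqs. (2.3)-(2.4). categories: one list of
# curve functions per C_l; weights: the cut-offs w_l; landmark_fns: the
# landmark functions f_y. All f's map a 3-vector to a float.
def make_objective(categories, weights, landmark_fns):
    def F(x):
        f_M = sum(f(x) for f in landmark_fns)
        # per category: min{w_l, sum of its curve functions}
        f_cats = [min(w_l, sum(f(x) for f in curve_fns))
                  for curve_fns, w_l in zip(categories, weights)]
        return max(f_cats) + f_M      # leading category plus landmarks
    return F
```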

Since we perform the maximization in three dimensions to generate solutions that need not coincide with the given map data, we have to develop a generic function model that extends the weights \(w({\chi _{i}},{\mathcal {S}})\) from the map elements \({\chi _{i}}\) to the map element functions \(f_{{\chi _{i}}}\). Even if the optimization were performed only along the curves contained in \({\mathcal {X}}_{C}\), such an extension would be required to obtain a global interplay between the different map elements in forming a global objective function. One idea could be to extend \(w({\chi _{i}},{\mathcal {S}})\) to any point \(x\in {\mathbb {R}}^3\) based on the distance between \(x\) and \({\chi _{i}}\). However, computing this distance for a fixed point \(x\) requires a global optimization along \({\chi _{i}}\), and, if \({\chi _{i}}\) is a curve that is not a line segment, the distance function is in general nonsmooth with respect to \(x\). Due to these drawbacks, a radial function \(\phi ({\chi _{i}},{\mathcal {S}}): {\mathbb {R}}^3 \rightarrow {\mathbb {R}}\) is introduced to extend \(w({\chi _{i}},{\mathcal {S}})\) to \({\mathbb {R}}^3\).

We convolve the resulting weighted kernel function \(w_{X_0,{\mathcal {S}}}w({\chi _{i}},{\mathcal {S}})\phi ({\chi _{i}},{\mathcal {S}})\) with a Dirac-type measure (Brokate and Kersting 2015, chs. 3–5), (Kanwal 1998, ch. 1) supported on \({\chi _{i}}\) to obtain the desired extension to the three-dimensional space. For general \(d\in {\mathbb {N}}\) and \(p\in {\mathbb {R}}^d\), the Dirac \(\delta \)-function over \(C^{\infty }({\mathbb {R}}^d)\) for \(p\) is defined as \( \left\langle \delta _{p}, \xi \right\rangle := \int _{{\mathbb {R}}^d} \xi (x) d \delta _{p}(x) = \int _{{\mathbb {R}}^d} \delta (x-p) \xi (x) dx = \xi (p) \) with a \(C^{\infty }\)-function \(\xi :{\mathbb {R}}^d\rightarrow {\mathbb {R}}\). A well-defined generalization from \(p\) to \({\chi _{i}}\) is presented in Onural (2006):

$$\begin{aligned} \left\langle \delta _{{\chi _{i}}}, \xi \right\rangle = \int _{{\mathbb {R}}^3} \delta _{{\chi _{i}}}(y) \xi (y) dy:= \int _{{\chi _{i}}} \xi (y) d{{\chi _{i}}}. \end{aligned}$$
(2.5)

Note that, in this formula, \({\chi _{i}}\) is an abbreviation for \({\chi _{i}}([0,1])\) if \({\chi _{i}}\in {\mathcal {X}}_C\). We define the test function \(\xi _{x}:{\chi _{i}}\rightarrow {\mathbb {R}}\) as the weighted kernel centered at a point \(x\in {\mathbb {R}}^3\), i.e., \(\xi _{x}(y) =w_{X_0,{\mathcal {S}}}(x)w({\chi _{i}},{\mathcal {S}}) \phi ({\chi _{i}}, {\mathcal {S}})(y-x)\). Therefore, the corresponding map element function for a landmark \({\chi _{i}}=y\in {\mathcal {X}}_M\) looks as follows:

$$\begin{aligned} f_{y}(x)= \left\langle \delta _{y},w_{X_0,{\mathcal {S}}}(x) w(y,{\mathcal {S}})\phi (y,{\mathcal {S}})(\cdot - x)\right\rangle = w_{X_0,{\mathcal {S}}}(x)w(y,{\mathcal {S}}) \phi (y,{\mathcal {S}})(y-x).\nonumber \\ \end{aligned}$$
(2.6)

The \(\cdot \!~\)-placeholder marks the variable on which the function in the second entry of \(\langle \cdot , \cdot \rangle \) depends. Analogously, Equation (2.5) yields the following function for curves \({\chi _{i}}=\gamma \in {\mathcal {X}}_C\):

$$\begin{aligned} f_{\gamma }(x)&= \left\langle \delta _{\gamma }, w_{X_0,{\mathcal {S}}}(x) w(\gamma ,{\mathcal {S}})\phi (\gamma ,{\mathcal {S}})(\cdot - x) \right\rangle = w_{X_0,{\mathcal {S}}}(x) w(\gamma ,{\mathcal {S}})\int _{\gamma } \phi (\gamma ,{\mathcal {S}})(\cdot -x) ds \nonumber \\&=w_{X_0,{\mathcal {S}}}(x) w(\gamma ,{\mathcal {S}})\int _{0}^{1} \phi (\gamma ,{\mathcal {S}})(\gamma (t) -x) \cdot \Vert \gamma '(t) \Vert ~dt. \end{aligned}$$
(2.7)

The kernel function \(\phi ({\chi _{i}},{\mathcal {S}})\) must be monotonically decreasing to model the diminishing influence of \({\chi _{i}}\) with growing distance between \(x\) and \({\chi _{i}}\). The norm \(\Vert \cdot \Vert \) in the integral expression depends on the structure of the kernel function and will be discussed later. Due to the convolution, the curve function \(f_{\gamma }\) falls short of its target value \(w(\gamma ,{\mathcal {S}})\) near the ends of \(\gamma \). Fortunately, this behavior cancels out when two curves with coinciding weights, or rather of the same category, are connected via a \(C^2\)-transition: from the mathematical point of view, the sum of the two curve functions equals the curve function of the longer merged curve.

The construction of the map element functions \(f_{{\chi _{i}}}\) requires that each kernel function takes its maximum at the origin and vanishes with increasing distance. The probability density function of the multivariate Gaussian distribution (Vinga 2004), also referred to as the Gaussian function, of the form \(\Phi _{\mu ,\Sigma }(x) = 1/{\sqrt{(2\pi )^3|\Sigma |}} \cdot \exp (- (x-\mu )^T \Sigma ^{-1}(x-\mu )/2)\), \(x,\mu \in {\mathbb {R}}^3\), \(\Sigma \in {\mathbb {R}}^{3\times 3}\), fulfills these requirements under the assumption that tiny function values numerically equal zero. We set \(\mu =0\) and choose the covariance matrix \(\Sigma = \Sigma ({\chi _{i}},{\mathcal {S}})\) to be diagonal with values \((\sigma _{{\mathcal {S}},1}^2,\sigma _{{\mathcal {S}},2}^2,\sigma _{{\mathcal {S}},3}^2)\). The standard deviation \(\sigma _{{\mathcal {S}},k}\) indicates the size of the support in \(x_k\)-direction, \(k=1,2,3\), depending on \({\chi _{i}}\). In the following, we abbreviate \(\sigma _{{\mathcal {S}},k}\) by \(\sigma _{k}\). The global coordinate system and the ellipsoidal shape of the support of the Gaussian function are depicted in Fig. 5 in Sect. 4. \(\phi ({\chi _{i}},{\mathcal {S}})\) is supposed to be invariant under rotations of the coordinate system in the \(x_1\)-\(x_2\)-plane, so we choose equal values for \(\sigma _{1}\) and \(\sigma _{2}\); then the objective function does not depend on the orientation of the map. \(\sigma _{3}\) is small, since the vehicle motion in \(x_3\)-direction is restricted by the elevation profile of the street.

With a normalization factor \(c_{\chi _{i}}\) ensuring that \(f_{{\chi _{i}}}\) restricted to \({\chi _{i}}\) approximately equals \(w({\chi _{i}},{\mathcal {S}})\) (disregarding the factor \(w_{X_0,{\mathcal {S}}}\) for the ego position \(X_0\)), the kernel functions are defined in the following way with \(\varphi (r) = \exp (-r^2/2)\) for each \(x\in {\mathbb {R}}^3\):

$$\begin{aligned} \phi ({\chi _{i}},{\mathcal {S}})(x) = c_{\chi _{i}}\cdot {\sqrt{(2\pi )^3|\Sigma |}} \cdot \Phi _{0,\Sigma ({\chi _{i}},{\mathcal {S}})}(x) = c_{\chi _{i}}\cdot \varphi (\Vert x \Vert _{\Sigma ^{-1}}). \end{aligned}$$
(2.8)

We use \({\sqrt{(2\pi )^3|\Sigma |}}\) to compensate the normalization factor of the Gaussian function. The norm \(\Vert \cdot \Vert _{\Sigma ^{-1}}\) is induced by the scalar product \(\langle x,y \rangle _{\Sigma ^{-1}}:= x^T \Sigma ^{-1}y\). For a landmark \(y\in {\mathcal {X}}_M\), this constant simply equals \(1\), because of \(f_{y}(y) = c_{y} w_{X_0,{\mathcal {S}}}(y)w(y,{\mathcal {S}})\). In the case of a curve \(\gamma \in {\mathcal {X}}_C\), we choose \(c_{\gamma }=1/\sqrt{2\pi }\) based on the following considerations: We specify the norm \(\Vert \cdot \Vert \) in Equation (2.7) as \(\Vert \cdot \Vert _{\Sigma ^{-1}}\) to remove the dependence on the curve direction, so that the actual function value does not depend on the orientation of the curve, but mainly on the distance to the curve. If we approximate the curve \(\gamma \) locally with a straight line \({\bar{\gamma }}= vt + y\), \(t \in (-\infty ,\infty )\), \(y \in \Gamma \), \(c_{{\bar{\gamma }}}= c_{\gamma }\), then

$$\begin{aligned} \begin{aligned}&f_{{\bar{\gamma }}}(y) = w_{X_0,{\mathcal {S}}}(y) w({\bar{\gamma }},{\mathcal {S}}) \int _{-\infty }^{\infty } c_{{\bar{\gamma }}} \varphi (\Vert y - {\bar{\gamma }}(t) \Vert _{\Sigma ^{-1}}) \Vert {\bar{\gamma }}'(t) \Vert _{\Sigma ^{-1}} dt\\&\qquad \quad = w_{X_0,{\mathcal {S}}}(y) w(\gamma ,{\mathcal {S}}) c_{{\bar{\gamma }}} \int _{-\infty }^{\infty } \exp \big (-\Vert v \Vert _{\Sigma ^{-1}}^2 t^2 /2\big ) \Vert v \Vert _{\Sigma ^{-1}} dt\\&\qquad \quad = w_{X_0,{\mathcal {S}}}(y) w(\gamma ,{\mathcal {S}}). \end{aligned}\nonumber \\ \end{aligned}$$
(2.9)

Hence, we are able to state the contribution of the landmarks and the curves to the objective function.
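
As a numerical plausibility check of Equation (2.9), the following sketch (our own, with made-up covariance values and line direction) integrates the kernel along a straight line with \(c_{{\bar{\gamma }}}=1/\sqrt{2\pi }\) and recovers the value \(1\).

```python
# Numerical check of Eq. (2.9): for a straight line gamma(t) = v*t + y,
# the integral of c * exp(-||v||^2 t^2 / 2) * ||v|| over t (norms taken
# w.r.t. Sigma^{-1}) equals 1 for c = 1/sqrt(2*pi).
import numpy as np

sigma = np.array([2.0, 2.0, 0.5])        # made-up diag(Sigma) = sigma^2
Sinv = np.diag(1.0 / sigma**2)
v = np.array([1.0, 0.3, 0.0])            # made-up line direction

norm_v = np.sqrt(v @ Sinv @ v)
c = 1.0 / np.sqrt(2.0 * np.pi)

t = np.linspace(-60.0, 60.0, 20001)      # truncation of (-inf, inf)
integrand = c * np.exp(-0.5 * (norm_v * t)**2) * norm_v
print(np.trapz(integrand, t))            # ~ 1.0
```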

3 Optimization approaches for solving the problem formulation

The maxima of the derived objective function \(F_{{\mathcal {S}}}\) in Equations (2.3) and (2.4) represent points of interest for the given scenario \({\mathcal {S}}\), i.e., the optimal locations for an emergency stop in our example. We determine them by solving the maximization problem

$$\begin{aligned} \begin{aligned} \underset{x \in X}{\max }~ F_{{\mathcal {S}}}(x)&=\underset{x \in X}{\max }~\big ( \underset{l=1,\dots ,L}{\max }~ \min \big \{ w_{l}, \sum _{{\chi _{i}}\in C_{l}} f_{{\chi _{i}}}(x) \big \} + \sum _{{\chi _{i}}\in {\mathcal {X}}_M} f_{{\chi _{i}}}(x) \big ), \end{aligned} \end{aligned}$$
(3.1)

where \(X\subset {\mathbb {R}}^3\) is a closed box (Cartesian product of closed intervals) that contains all relevant points. We target methods that provide certificates of global optimality within a specifiable tolerance, as already mentioned in Sect. 1. Since this problem is nonsmooth, we will reformulate it to be able to compute higher-order derivatives. We need those derivatives to formulate optimality criteria and to apply higher-order optimization methods, which reduce the number of function evaluations significantly.

The maximization problem is decomposed into \(L\) subproblems of the form:

$$\begin{aligned} \max _{x \in X_{l}}~F_{l}(x),\quad \text {with}\quad F_{l}(x) = f_{l}(x) + f_{M}(x), \quad \text {for all}\quad l= 1,\dots ,L. \end{aligned}$$
(3.2)

Here, \(X_{l}\subset X\) is a large enough closed box containing all relevant map elements that are necessary for the definition of \(F_{l}\) and in particular all maximizers of \(F_{l}\). We resolve the nonsmoothness of \(\min \{w_{l},\cdot \}\) in Problem (3.2) by reformulating it as an equivalent constrained optimization problem with an auxiliary variable \(u_{l}\). For \( l= 1,\dots ,L\), we obtain

$$\begin{aligned} \max _{x \in X_{l}, u_{l}\in {\mathbb {R}}} ~u_{l}+ f_{M}(x) \quad \text {s.t.}\quad u_{l}\le w_{l},\quad u_{l}\le f_{C_{l}}(x). \end{aligned}$$
(3.3)

Hence, the global solution set \(X^{*}\) of Problem (3.1) is a subset of \(\bigcup _{l=1,\dots ,L}X^{*}_{l}\), where \(X^{*}_{l} \subset {\mathbb {R}}^3\) is the solution set of the \(l\)-th subproblem (3.3). \(X^{*}\) consists of all \(x\in \bigcup _{l=1,\dots ,L}X^{*}_l\) that fulfill \(F_{{\mathcal {S}}}(x) \ge F_{{\mathcal {S}}}(y)\) for all \(y\in \bigcup _{l=1,\dots ,L}X^{*}_l\). The subproblems are independent of each other, i.e., they can be solved in parallel to reduce the total run time. The number of subproblems is induced by the number of categories \(C_{l}\), i.e., it is based on the road model complexity.
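
The following sketch indicates one way to exploit this independence in Python; solve_subproblem is a toy stand-in (a coarse grid search) for the rigorous solver developed below, and all names and data layouts are our own assumptions.

```python
# Sketch: solving the L subproblems of Eq. (3.2) concurrently. The grid
# search only illustrates the decomposition; it is not the rigorous
# branch-and-bound method of Sect. 3.2.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def solve_subproblem(task):
    F_l, (lo, hi) = task                      # objective and box X_l
    axes = [np.linspace(l, h, 25) for l, h in zip(lo, hi)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    vals = np.apply_along_axis(F_l, -1, grid)
    idx = np.unravel_index(np.argmax(vals), vals.shape)
    return vals[idx], grid[idx]               # best value and point

def solve_all(tasks):                         # tasks: list of (F_l, X_l)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(solve_subproblem, tasks))
    return max(results, key=lambda r: r[0])   # best category wins
```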

As we have seen in Sect. 2.2, Problem (3.3) is nonlinear and nonconvex, but sufficiently smooth to apply second-order optimization methods (Bertsekas 1999, chs. 1, 4), since we use smooth kernel functions and \(C^2\)-curves. Due to the nonconvexity of these \(L\) subproblems, ascent methods (Boyd and Vandenberghe 2004, ch. 9) are in general not able to determine the global optima; rather, they provide only local solutions that depend on the initial point. In fact, we need a measure for approximate global optimality to obtain certified solutions that contribute to the rigorous algorithm. We develop a hybrid optimization method, specially tailored to the modeled problem, that combines a rigorous deterministic branch-and-bound global maximization method with local approaches for selected boxes. For each of the \(L\) categories, we globally solve Problem (3.2), or Problem (3.3), to a prescribed accuracy. The following theory is used to achieve a rigorous branch-and-bound algorithm.

3.1 Interval arithmetic

Standard interval arithmetic (Hickey et al. 2001; Hansen 1975), (Moore et al. 2009, chs. 2, 4, 5), (Neumaier 1991, ch. 1) is used to obtain a box \({\textbf{y}}\) that is an enclosure, i.e., \({\textbf{y}}\supset g({\textbf{x}})\), for the image of a box \({\textbf{x}}\) under a map g. A box \({\textbf{x}}\) as a subset of \({\mathbb {R}}^d\) is a d-dimensional interval of the form \({\textbf{x}} = [x^L,x^U]=[x^L_1,x^U_1]\times \cdots \times [x^L_d,x^U_d]\) with lower bound \(\inf ({\textbf{x}})=x^L\in {\mathbb {R}}^d\) and upper bound \(\sup ({\textbf{x}}) = x^U \in {\mathbb {R}}^d\). The principal goal of interval arithmetic is to evaluate a function on a whole interval \({\textbf{x}}\) at once without having to evaluate the function at every single point \(x \in {\textbf{x}}\). This is done by computing an interval (the tighter the better) that contains the image of \({\textbf{x}}\) under this function. Hence, we build an interval extension of our objective function and, where required, of its derivatives. However, it must be noted that this approach in general leads to an overestimation of the interval-based function value since possible dependencies between intervals are usually lost.

Given a function \(g:{\mathbb {R}}^d \rightarrow {\mathbb {R}},~d\in {\mathbb {N}},\) and some interval \({\textbf{x}} \subset {\mathbb {R}}^d\), it holds \(g({\textbf{x}})\subset [g]({{\textbf{x}}})\), but in general not “\(=\)”, where \(g({\textbf{x}}):=\{g(x);~ x \in {\textbf{x}}\}\) and \([g]({{\textbf{x}}})\) refers to the output of interval arithmetic when we evaluate \(g\) for the box \({\textbf{x}}\), i.e., \([g]\) is an interval extension for \(g\). For instance, interval extensions of \(+\) and \(\cdot \) yield for \({\textbf{x}}=[-1,1]\):

$$\begin{aligned}{}[g]({{\textbf{x}}})= & {} [{\textbf{x}}+{\textbf{x}}\,\cdot \,{\textbf{x}}] = [[-1,1]+[[-1,1]\cdot [-1,1]]]\nonumber \\= & {} [[-1,1]+[-1,1]]=[-2,2]. \end{aligned}$$
(3.4)

In comparison, the optimal enclosure (or interval hull, i.e., the smallest containing interval) for \(g({\textbf{x}})=\big (\cdot + (\cdot )^2\big )({\textbf{x}})\) would be \([-1/4,2]\). If instead of multiplication we use \((\cdot )^2\) then \([{\textbf{x}}^2]=[0,1]\) (this is tight) and \([{\textbf{x}}+{\textbf{x}}^2]=[[-1,1]+[0,1]]=[-1,2]\), which is better but still not tight. A tight result can be achieved by the alternative formula \([({\textbf{x}}+1/2)^2-1/4]=[([-1/2,3/2])^2-1/4]=[-1/4,2]\). We see that, in general, we can only expect promising results from interval arithmetic if the sequence of interval calculations is arranged and processed in an attentive way.
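
The toy implementation below (a minimal interval class of our own) reproduces these numbers and makes the dependency problem tangible.

```python
# Minimal interval class with +, * and a tight square, illustrating the
# dependency problem from Eq. (3.4).
class Interval:
    def __init__(self, lo, hi): self.lo, self.hi = lo, hi
    def __add__(self, o): return Interval(self.lo + o.lo, self.hi + o.hi)
    def __mul__(self, o):
        ps = [self.lo*o.lo, self.lo*o.hi, self.hi*o.lo, self.hi*o.hi]
        return Interval(min(ps), max(ps))
    def sqr(self):   # tight enclosure of x^2, unlike the product x * x
        lo, hi = self.lo, self.hi
        if lo <= 0.0 <= hi: return Interval(0.0, max(lo*lo, hi*hi))
        return Interval(min(lo*lo, hi*hi), max(lo*lo, hi*hi))
    def __repr__(self): return f"[{self.lo}, {self.hi}]"

x = Interval(-1.0, 1.0)
print(x + x * x)        # [-2.0, 2.0]: the dependency on x is lost
print(x + x.sqr())      # [-1.0, 2.0]: better, but still not tight
print(Interval(-0.5, 1.5).sqr() + Interval(-0.25, -0.25))  # [-0.25, 2.0]
```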

Due to the form of the objective function, we compute several interval extensions of all map element functions \(f_{{\chi _{i}}}\) in order to derive an expression for \([F_{l}]\). In particular, defining suitable inclusion functions is a non-trivial task because of the convolution of the map elements with the Gaussian kernel function, and, therefore, we need an interval extension \([\phi ({\chi _{i}},{\mathcal {S}})]\) to define \([f_{{\chi _{i}}}]\). This results in a one-dimensional problem, since \( \phi ({\chi _{i}},{\mathcal {S}})(x) = c_{\chi _{i}}\prod _{k=1}^{3}\varphi _{0,\sigma _{k}}(x_k)\) with \(\varphi _{\mu ,\sigma ^{2}}: {\mathbb {R}}\rightarrow {\mathbb {R}}, z\mapsto 1/{\sqrt{2\pi \sigma ^2}} \exp (-{(z-\mu )^2}/({2\sigma ^2})), \) and \(\mu \in {\mathbb {R}},~\sigma \in {\mathbb {R}}_{>0}\). The monotonicity properties of \(\varphi _{\mu ,\sigma ^2}\) and \(\varphi '_{\mu ,\sigma ^2}\), which help to derive tight bounds for the interval-based evaluation, are deduced from \( \varphi '_{\mu ,\sigma ^2}(z) = -({z-\mu })/{\sigma ^2} \cdot \varphi _{\mu ,\sigma ^2}(z) \) and \( {\varphi }''_{\mu ,\sigma ^2}(z) = ( {(z-\mu )^2}/{\sigma ^4} - {1}/{\sigma ^2} )\cdot \varphi _{\mu ,\sigma ^2}(z). \)

Therefore, we are able to calculate optimal enclosures for \(\phi ({\chi _{i}},{\mathcal {S}})\), i.e., \([\phi ({\chi _{i}},{\mathcal {S}})]({{\textbf{x}}})=\phi ({\chi _{i}},{\mathcal {S}})({\textbf{x}})\), and for the interval evaluation of \(\nabla \phi ({\chi _{i}},{\mathcal {S}})\). With the constant \(\eta _1:= \exp (-1/2)/(\sqrt{2\pi }\sigma ^2)\), the one-dimensional interval-valued Gaussian function and its derivative have the following form for \(\textbf{z} = [z^L,z^U]\):

$$\begin{aligned} \begin{aligned} \varphi _{\mu ,\sigma ^2}(\textbf{z})&= \left\{ \begin{aligned}&[\varphi _{\mu ,\sigma ^2}(z^L),\varphi _{\mu ,\sigma ^2}(z^U)],{} & {} \text { if } z^U \le \mu ,\\&[\varphi _{\mu ,\sigma ^2}(z^U),\varphi _{\mu ,\sigma ^2}(z^L)],{} & {} \text { if } z^L \ge \mu ,\\&[\min \{\varphi _{\mu ,\sigma ^2}(z^L),\varphi _{\mu ,\sigma ^2}(z^U)\},1/(\sqrt{2\pi \sigma ^2})],{} & {} \text { else, } \end{aligned} \right. \\ \varphi '_{\mu ,\sigma ^2}(\textbf{z})&= \left\{ \begin{aligned}&[\varphi '_{\mu ,\sigma ^2}(z^L),\varphi '_{\mu ,\sigma ^2}(z^U)],{} & {} \!\!\text {if } (z^L \ge \mu +\sigma ) \vee (z^U \le \mu -\sigma ),\\&[\varphi '_{\mu ,\sigma ^2}(z^U),\varphi '_{\mu ,\sigma ^2}(z^L)],{} & {} \!\!\text {if } \textbf{z} \subset [\mu -\sigma ,\mu +\sigma ],\\&[-\eta _1,\eta _1],{} & {} \!\!\text {if } [\mu -\sigma ,\mu +\sigma ] \subset \textbf{z}, \\&[\min \{\varphi '_{\mu ,\sigma ^2}(z^L),\varphi '_{\mu ,\sigma ^2}(z^U)\},\eta _1],{} & {} \!\!\text {if } (z^L \le \mu -\sigma ) \wedge (z^U \!\!\in \![\mu -\sigma ,\mu +\sigma ]), \\&[-\eta _1, \max \{\varphi '_{\mu ,\sigma ^2}(z^L),\varphi '_{\mu ,\sigma ^2}(z^U)\}],{} & {} \!\!\text {else.} \end{aligned} \right. \end{aligned}\nonumber \\ \end{aligned}$$
(3.5)

The exact evaluation of the interval Gaussian function also provides optimal enclosures for the interval extensions of the landmark functions \({f_{y}},~y\in {\mathcal {X}}_M\).
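
A possible implementation of the first case distinction in Equation (3.5), exploiting the monotonicity of the density on either side of the mean, could look as follows (our own sketch):

```python
# Tight enclosure [phi_{mu,s2}]([zL, zU]) of the 1-D Gaussian density,
# following the first case distinction in Eq. (3.5).
import numpy as np

def phi(z, mu, s2):
    return np.exp(-(z - mu)**2 / (2.0*s2)) / np.sqrt(2.0*np.pi*s2)

def phi_interval(zL, zU, mu, s2):
    if zU <= mu:     # density is increasing left of the mean
        return phi(zL, mu, s2), phi(zU, mu, s2)
    if zL >= mu:     # density is decreasing right of the mean
        return phi(zU, mu, s2), phi(zL, mu, s2)
    # mean inside the box: maximum at the mode, minimum at an endpoint
    return (min(phi(zL, mu, s2), phi(zU, mu, s2)),
            1.0 / np.sqrt(2.0*np.pi*s2))

print(phi_interval(-1.0, 2.0, mu=0.0, s2=1.0))
```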

Unfortunately, we encounter widely overestimating interval extensions \([f_{\gamma }]\) for curves \(\gamma \in {\mathcal {X}}_{C}\), for the following reason: the integration along a curve \(\gamma \) in Equation (2.7) is performed with a quadrature formula. We apply the trapezoidal rule, because higher-order Newton-Cotes formulas (Davis and Rabinowitz 1984) do not provide better results for our purposes. Hence, we define a set \({\mathcal {Q}}:=\{(t_j,\lambda _j) ~|~ t_j \in [0,1],~\lambda _j\ge 0,~j=0,\dots ,N_{{\mathcal {Q}}},~N_{{\mathcal {Q}}}\in {\mathbb {N}}\}\) of quadrature nodes and weights to provide a numerical approximation of \(f_{\gamma }\) by

$$\begin{aligned} f_{\gamma }^{{\mathcal {Q}}}(x)= w_{X_0,{\mathcal {S}}}(x) w(\gamma ,{\mathcal {S}})\sum _{j=0}^{N_{{\mathcal {Q}}}} \lambda _j~ \phi (\gamma ,{\mathcal {S}})(\gamma (t_j) -x) ~ \Vert \gamma '(t_j) \Vert _{\Sigma ^{-1}}. \end{aligned}$$
(3.6)

For an interval \({\textbf{x}} \subset {\mathbb {R}}^3\), the summation of the weighted enclosures \([\phi (\gamma ,{\mathcal {S}})]({\gamma (t_j) -{\textbf{x}}})\) results in significant overestimation of \([f^{{\mathcal {Q}}}_{\gamma }]({{\textbf{x}}})\) compared to \(f^{{\mathcal {Q}}}_{\gamma }({\textbf{x}})\), although \([\phi (\gamma ,{\mathcal {S}})]({\gamma (t_j) -{\textbf{x}}}) = \phi (\gamma ,{\mathcal {S}})(\gamma (t_j) -{\textbf{x}})\) for all \(j=1,\dots ,N_{{\mathcal {Q}}}\). The \({\textbf{x}}\)-dependence between the summands is lost, and the interval sum is expansive. In general, the larger \(N_{{\mathcal {Q}}}\), i.e., the more quadrature points are used to cover \(\gamma \), the larger the size of the output interval. Upper-bounding \([f_{C_{l}}]({{\textbf{x}}})\) by \(w_{l}\) avoids inaccurate upper limits for the interval enclosure, since for \([f_{C_{l}}]({{\textbf{x}}})=[a,b] \subset {\mathbb {R}}\) and \(w\in {\mathbb {R}}\) it holds that \(\min \{ w, [f_{C_{l}}]({{\textbf{x}}})\} = [\min \{w,a\},\min \{w,b\}]\). The cut-off with \(w_{l}\) also removes small oscillations of \(f_{C_{l}}\) along its ridges (cf. Fig. 1A) and at the same time greatly improves the quality of the interval extension of \(\min \{f_{C_{l}},w_{l}\}\) compared to the extension of \(f_{C_{l}}\).
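
The toy experiment below demonstrates the effect in one dimension: each per-node enclosure is tight, yet the interval sum widens with the number of nodes, while the true range of the quadrature over the box stays small. All numbers are made up.

```python
# Overestimation of the interval quadrature sum from Eq. (3.6), toy 1-D
# setting: phi is a standard Gaussian kernel, nodes play the role of
# gamma(t_j), and [xL, xU] is the box x.
import numpy as np

phi = lambda z: np.exp(-0.5 * z**2)

def phi_box(t, xL, xU):      # tight enclosure of phi(t - x), x in box
    zL, zU = t - xU, t - xL
    hi = phi(0.0) if zL <= 0.0 <= zU else max(phi(zL), phi(zU))
    return min(phi(zL), phi(zU)), hi

nodes = np.linspace(-5.0, 5.0, 51)
xL, xU = -0.5, 0.5

lo = sum(phi_box(t, xL, xU)[0] for t in nodes)
hi = sum(phi_box(t, xL, xU)[1] for t in nodes)

xs = np.linspace(xL, xU, 2001)   # true range of the summed quadrature
vals = phi(nodes[None, :] - xs[:, None]).sum(axis=1)
print((lo, hi), (vals.min(), vals.max()))   # interval sum is far wider
```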

3.2 INT: Interval optimization approach

With this interval arithmetic knowledge, we now specify our rigorous branch-and-bound maximization algorithm based on the considerations for interval optimization (Hansen 1980), (Hansen and Walster 2004, chs. 7–8, 12–13), concave overestimation (Adjiman et al. 1998; Adjiman and Floudas 1996), and quadratic approximation (Burer and Letchford 2009; De Angelis et al. 1997). The basic methodology of the branch-and-bound algorithm and the advanced bounding methods will be adapted to the emergency stop scenario by using Problem (3.2) as well as the smoother reformulation in Problem (3.3).

After an initialization step, the branching and bounding strategies are repeatedly applied until a stopping criterion is fulfilled. The general pattern for the interval algorithm INT, with various refinements that are described later, has the following form:

3.2.1 Initialization

Let \(\mathbf {x_\text {init}}={X_{l}}\) be the initial interval box containing the feasible set of Problem (3.2) for the \(l\)-th category, thus \(X^{*}_{l}\subset \mathbf {x_\text {init}}\), with interval enclosure \([F_{l}]({\mathbf {x_\text {init}}}) = [f^L_{\mathbf {x_\text {init}}},f^U_{\mathbf {x_\text {init}}}]\). Instead of computing a reference value assumed by \(F_{l}\) on \(\mathbf {x_\text {init}}\), we perform a simplified search restricted to \({\mathcal {X}}\) in Sect. 3.4 that approximates the maximum of \(F_{{\mathcal {S}}}\) on \({\mathcal {X}}\) before investigating Problem (3.2) for all categories. Let \({\bar{f}}_{\text {best},l}\) denote the current best value for category \(l\) that is initialized with the result of this lower-dimensional maximization on \({\mathcal {X}}\). We define \({\mathcal {B}}:=\{\mathbf {x_\text {init}}\}\) to be the set of active interval boxes which remain to be investigated.

3.2.2 Branching

In every iteration, we select a box \(\mathbf {x_\text {current}}\in {\mathcal {B}}\) with \(f^U_{\mathbf {x_\text {current}}} = {\max }_{{\textbf{x}} \in {\mathcal {B}}} ~f^U_{{\textbf{x}}}\), provided \({\mathcal {B}}\ne \emptyset \). Then, \(\mathbf {x_\text {current}}\) is subdivided by multisection (Csallner et al. 2000) into four subboxes \(\{{\textbf{x}}^j_{\text {sub}}\}_{j=1,\dots ,4}\) through its midpoint along the two directions with the longest side lengths, so that \(\mathbf {x_\text {current}}= \bigcup _{j=1,\dots ,4} {\textbf{x}}^j_{\text {sub}}\). \({\mathcal {B}}\) is updated to \(({\mathcal {B}}\backslash \{\mathbf {x_\text {current}}\}) \cup \{{\textbf{x}}^j_{\text {sub}}\}_{j=1,\dots ,4}\). We compute the interval enclosures \([F_{l}]({{\textbf{x}}^j_{\text {sub}}}) = [f^L_{{\textbf{x}}^j_{\text {sub}}},f^U_{{\textbf{x}}^j_{\text {sub}}}]\) and reference values \({\bar{f}}_{{\textbf{x}}^j_{\text {sub}}} = F_{l}(x)\) at the midpoint \(x\in {\textbf{x}}^j_{\text {sub}}\) for all \(j=1,\dots ,4\), and update the current best function value \({\bar{f}}_{\text {best},l}\) if \( {\bar{f}}_{{\textbf{x}}^j_{\text {sub}}}>{\bar{f}}_{\text {best},l}\) for some \(j\). For efficiency reasons, we compute the reference value only when the box is used for further considerations, i.e., when \(\mathbf {x_\text {current}}= {\textbf{x}}^j_{\text {sub}}\) holds.
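
A compact sketch of this multisection step (our own illustration) reads:

```python
# Multisection: split a box through its midpoint along the two longest
# edges, yielding four subboxes (Sect. 3.2.2).
import numpy as np

def multisect(lo, hi):
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    k1, k2 = np.argsort(hi - lo)[-2:]       # two longest directions
    mid = 0.5 * (lo + hi)
    boxes = []
    for a in ((lo[k1], mid[k1]), (mid[k1], hi[k1])):
        for b in ((lo[k2], mid[k2]), (mid[k2], hi[k2])):
            l, h = lo.copy(), hi.copy()
            l[k1], h[k1] = a
            l[k2], h[k2] = b
            boxes.append((l, h))
    return boxes                            # the four subboxes

print(multisect([0.0, 0.0, 0.0], [4.0, 2.0, 1.0]))
```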

3.2.3 Bounding

Every interval \({\textbf{x}} \in {\mathcal {B}}\) with \(f^U_{{\textbf{x}}} < {\bar{f}}_{\text {best},l}\) is removed from \({\mathcal {B}}\): since \(F_{l}(x)\) is smaller than \({\bar{f}}_{\text {best},l}\) for every point \(x \in {\textbf{x}}\), the box \({\textbf{x}}\) clearly cannot contain a global optimum. Apart from this derivative-free bounding, we also apply a gradient check (Hansen and Walster 2004, sec. 12.4). The interval gradient is used to identify whether \(F_{l}\) is monotone along at least one coordinate direction in \({\textbf{x}}\), so that there is no local solution in the interior of the box and we can remove it from \({\mathcal {B}}\). Since \(F_{l}\) is not differentiable everywhere in \(\mathbf {x_\text {init}}\) due to the upper cut-off, special attention must be paid to the construction of the interval gradient \([\nabla F_{l}]({\mathbf {x_\text {current}}})\).

For \(x\in \mathbf {x_\text {current}}\) either \(F_{l}(x) = f_{C_{l}}(x) + f_{M}(x)\) or \(F_{l}(x) = w_{l}+ f_{M}(x)\) holds. Therefore, the gradient of \(F_{l}\) has the following form for all \(x\) not lying on the boundary of \(\mathbf {x_\text {init}}\) or on the boundary of the cutting area \(R^{l}(\mathbf {x_\text {current}}):= \{x \in \mathbf {x_\text {current}};~ f_{l}(x) = w_{l}\}\):

$$\begin{aligned} \nabla F_{l}(x) = \left\{ \begin{aligned}&\nabla f_{C_{l}}(x) + \nabla f_{M}(x),{} & {} \text {if}~ f_{l}(x) = f_{C_{l}}(x),\\&\nabla f_{M}(x),{} & {} \text {if}~ f_{l}(x) = w_{l}. \end{aligned} \right. \end{aligned}$$
(3.7)

For \(x\in \partial R^{l}(\mathbf {x_\text {current}})\), \(F_{l}\) is in general only directionally differentiable. In order to combine the gradient criterion with the different expressions for \(\nabla F_{l}\), the interval gradient check proceeds with

$$\begin{aligned} \begin{aligned}&[\nabla F_{l}]({{\textbf{x}}}) = \sum _{\gamma \in C_{l}} [\nabla f_{\gamma }]({{\textbf{x}}}) + \sum _{y\in {\mathcal {X}}_{M}} [\nabla f_{y}]({{\textbf{x}}}) \quad \text {and}\quad&[\nabla F_{l}]({{\textbf{x}}}) = \sum _{y\in {\mathcal {X}}_{M}} [\nabla f_{y}]({{\textbf{x}}}). \end{aligned}\nonumber \\ \end{aligned}$$
(3.8)

The gradient criterion must be fulfilled for both interval extensions \([\nabla F_{l}]\) in order to be applicable without explicit knowledge of \(R^{l}(\mathbf {x_\text {current}})\). We compute those gradient interval enclosures in the branching step for every new box resulting from the subdivision.

Since the boxes in \({\mathcal {B}}\) always cover the set of global optima \(X^{*}_{l}\), the union of the boxes in \({\mathcal {B}}\) converges monotonically decreasing (in terms of set inclusion) to \(X^{*}_{l}\), provided that the maximum size of the boxes, i.e., the length of the longest edge of all boxes, tends to zero. The gap between the best function value \({\bar{f}}_{\text {best},l}\) and the upper bound of a box's interval function value defines a certificate of global optimality for that box. In the algorithm, we determine the global optimal value only up to a guaranteed accuracy \(\epsilon _f\), i.e., we stop branching if this gap is smaller than \(\epsilon _f\). Furthermore, we also stop branching if the box size is smaller than a predefined minimum box size \(\epsilon _x \in {\mathbb {R}}^3_{>0}\), because the limited accuracies of the vehicle localization and the map data make further box examinations needless. Finally, the category \(l^{*}\) with \({\bar{f}}_{\text {best}}:={\bar{f}}_{\text {best},l^{*}} \ge {\bar{f}}_{\text {best},l}\) for all \(l=1,\dots ,L\) provides the global optimal value \({\bar{f}}_{\text {best}}\) and the global solution box set.
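
The following sketch condenses the initialization, branching, and bounding steps into a simplified loop; enclose and F are assumed inputs, the gradient check and the solution box bookkeeping are omitted, and multisect is the routine sketched above.

```python
# Compact sketch of the INT loop (Sects. 3.2.1-3.2.3) under simplifying
# assumptions: enclose(box) returns (fL, fU) for [F_l](box), F is the
# point evaluation of F_l, and f_best is the initial reference value.
import numpy as np

def interval_maximize(enclose, F, box0, f_best, eps_f=1e-3, eps_x=1e-2):
    active = [(box0, enclose(box0)[1])]          # pairs (box, fU)
    while active:
        active.sort(key=lambda b: b[1])
        (lo, hi), fU = active.pop()              # box with largest fU
        if fU - f_best < eps_f or np.max(hi - lo) < eps_x:
            break                                # certificate reached
        for sub in multisect(lo, hi):            # branching step
            _, sU = enclose(sub)
            f_best = max(f_best, F(0.5 * (sub[0] + sub[1])))
            if sU >= f_best:                     # bounding: keep or drop
                active.append((sub, sU))
        active = [b for b in active if b[1] >= f_best]
    return f_best
```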

3.3 Two local second-order methods

We consider the set of active boxes \({\mathcal {B}}\) after several steps of the interval optimization algorithm. Since the interval extensions \([f_{\gamma }]\) of the curve functions \(f_{\gamma }\) significantly overestimate the interval enclosure for large boxes \({\textbf{x}}\), the upper limit of \([f_{l}]({{\textbf{x}}})\) often matches \(w_{l}\), i.e., the upper bounds do not improve by further branching, and the number of removed boxes decreases. Therefore, we investigate two different second-order methods, \(\alpha \)BB and a quadratic approximation approach, which we apply to boxes that fulfill specific, subsequently introduced criteria. Hence, we derive the algorithms INT-\(\alpha \)BB (INTerval \(\alpha \)BB) and INT-QUAP (INTerval QUadratic APproximation).

3.3.1 INT-\(\alpha \)BB: an \(\alpha \)-based branch-and-bound method

The idea of \(\alpha \)BB (Adjiman et al. 1998; Adjiman and Floudas 1996; Meyer and Floudas 2005) is to compute locally valid concave overestimators of \(F_{l}\) using second-order curvature information, in order to generate better upper bounds than the interval extension \([F_{l}]\) that we use in Sect. 3.2. The \(\alpha \)BB algorithm is summarized in Algorithm 3.1. \(F_{l}\) is regularized by a nonnegative quadratic term, weighted by \(\alpha \ge 0\), to construct a local overestimator \(L_{{\textbf{x}}}:{\textbf{x}} \rightarrow {\mathbb {R}},~L_{{\textbf{x}}}(x) = F_{l}(x) + {\alpha }/{2} \cdot \sum _{k=1}^{3} (x_k-x^L_k)(x^U_k-x_k)\) on the interval \({\textbf{x}}=[x^L,x^U] \in {\mathcal {B}}\). If \(\alpha \) is large enough, then the Hessian \(H{L_{{\textbf{x}}}}(x) = H{F_{l}}(x) -\alpha I\) is negative semi-definite and, therefore, \(L_{{\textbf{x}}}\) is concave. The maximum value \(\max _{x\in {\textbf{x}}}L_{{\textbf{x}}}\) can be computed exactly by local searches for the concave function, e.g., with SQP methods (Boggs and Tolle 1995). This maximum value, just as \(f^U_{{\textbf{x}}}\), is an upper bound for \(F_{l}({\textbf{x}})\). In our case, \(F_{l}\) is not smooth enough to compute such an \(\alpha \). For this reason, we present a method that combines \(\alpha \)BB with the cut-off for \(f_{C_{l}}\) to derive a concave overestimator of \(F_{l}\) in \({\textbf{x}}\).

Instead of a single \(\alpha \)-value, we compute a vector \((\alpha _1,\dots ,\alpha _p)\in {\mathbb {R}}^p_{\ge 0}\) of \(\alpha \)-values with \(p=|{\mathcal {X}}_M\cup C_{l}|\) to obtain individual concave overestimators \(L^{{\textbf{x}}}_{{\chi _{i}}}\) for the functions \(f_{{\chi _{i}}}\), which define a concave overestimator \(L^{{\textbf{x}}}_{l}\) of \(F_{l}\) whose maximum value we then compute. We define

$$\begin{aligned} L^{{\textbf{x}}}_{l}(x)&= \min \Big \{ w_{l}, \sum _{{\chi _{i}}\in C_{l}} L^{{\textbf{x}}}_{{\chi _{i}}}(x) \Big \} + \sum _{{\chi _{i}}\in {\mathcal {X}}_{M}} L^{{\textbf{x}}}_{{\chi _{i}}}(x) \quad \text {with}\quad \nonumber \\ L^{{\textbf{x}}}_{{\chi _{i}}}(x)&= f_{{\chi _{i}}}(x) + \frac{\alpha _{i}}{2} \sum _{k=1}^{3} (x_k-x^L_k) (x^U_k-x_k), \quad \alpha _i\ge \max \big \{0,\underset{x\in {\textbf{x}}}{\max }~ \lambda _{\max }(H{f_{{\chi _{i}}}}(x))\big \}, \end{aligned}$$
(3.9)

(cf. Maranas and Floudas 1994), where \(\lambda _{\max }(H{f_{{\chi _{i}}}}(x))\) is the maximum eigenvalue of \(H{f_{{\chi _{i}}}}(x)\).
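
A one-dimensional toy example (our own sketch, with a made-up stand-in for a map element function) illustrates the overestimation property and the maximum gap that appears in Equation (3.11) below.

```python
# alphaBB overestimator L(x) = f(x) + alpha/2 (x - xL)(xU - x) for a
# toy 1-D function f; alpha bounds the largest second derivative of f
# on the box, cf. Eq. (3.9).
import numpy as np

f = lambda x: np.exp(-0.5 * x**2)       # made-up "kernel" contribution
f2 = lambda x: (x**2 - 1.0) * f(x)      # its second derivative

xL, xU = -2.0, 2.0
xs = np.linspace(xL, xU, 2001)
alpha = max(0.0, f2(xs).max())          # alpha >= max "eigenvalue"

L = f(xs) + 0.5 * alpha * (xs - xL) * (xU - xs)
assert np.all(np.diff(L, 2) <= 1e-12)   # L is concave on the box
# maximum gap alpha/8 * (xU - xL)^2, attained at the box midpoint:
print((L - f(xs)).max(), alpha / 8.0 * (xU - xL)**2)
```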

Fig. 3 The function \(L^{{\textbf{x}}}_{{\chi _{i}}}\) is a concave overestimator of \(f_{{\chi _{i}}}\) in \({\textbf{x}}\) provided by the \(\alpha \)BB method. \(d^{{\textbf{x}}}_{{\chi _{i}}}\) pictures the maximum distance between \(f_{{\chi _{i}}}\) and \(L^{{\textbf{x}}}_{{\chi _{i}}}\), which is used to measure the quality of this overestimation

An example of \(\alpha \)BB is visualized in Fig. 3. The function \(f_{{\chi _{i}}}\) is overestimated by the concave function \(L^{{\textbf{x}}}_{{\chi _{i}}}\) in the box \({\textbf{x}}= [-2,2]^2\). Their maximum distance \(d^{{\textbf{x}}}_{{\chi _{i}}}=(L^{{\textbf{x}}}_{{\chi _{i}}}- f_{{\chi _{i}}})(x_0)\) is attained at the center \(x_0\) of the box \({\textbf{x}}\). Similar to Problem (3.3), the maximum value \({\bar{L}}_{l}:=\max _{x\in {\textbf{x}}}L^{{\textbf{x}}}_{l}(x)\) results from the concave maximization problem

$$\begin{aligned} \max _{{x \in {\mathbb {R}}^3},{u_{l}\in {\mathbb {R}}}}~ u_{l}+ \sum _{{\chi _{i}}\in {\mathcal {X}}_{M}} L^{{\textbf{x}}}_{{\chi _{i}}}(x) \quad \text {s.t.} \quad x^L\le x \le x^U, ~u_{l}\le w_{l}, ~u_{l}\le \sum _{{\chi _{i}}\in C_{l}} L^{{\textbf{x}}}_{{\chi _{i}}}(x).\nonumber \\ \end{aligned}$$
(3.10)

We use the maximum difference \(L^{{\textbf{x}}}_{l}-F_{l}\) to decide whether it is promising to solve this problem. Since \(\sum _{{\chi _{i}}\in C_{l}} L^{{\textbf{x}}}_{{\chi _{i}}}(x) \ge f_{C_{l}}(x)\) for every \(x\in {\textbf{x}}\), it holds that \( \min \{ w_{l}, \sum _{{\chi _{i}}\in C_{l}} L^{{\textbf{x}}}_{{\chi _{i}}}(x) \} - \min \{ w_{l}, f_{C_{l}}(x)\} \le \sum _{{\chi _{i}}\in C_{l}} L^{{\textbf{x}}}_{{\chi _{i}}}(x) - f_{C_{l}}(x) \), and, therefore, with Equation (3.9) the maximum gap between \(F_{l}\) and \(L^{{\textbf{x}}}_{l}\) in \({\textbf{x}}\) is bounded by

$$\begin{aligned} \max _{x\in {\textbf{x}}}\big (L^{{\textbf{x}}}_{l}(x)-F_{l}(x)\big )\le & {} \max _{x\in {\textbf{x}}} \sum _{i=1}^{p} \tfrac{\alpha _i}{2} \sum _{k=1}^{3} (x_k-x^L_k)(x^U_k-x_k)\nonumber \\= & {} \tfrac{1}{8}\Vert x^U-x^L \Vert _2^2 \sum _{i=1}^{p} \alpha _i=: d^{\alpha BB}({\textbf{x}}). \end{aligned}$$
(3.11)

For large \(d^{\alpha BB}({\textbf{x}})\), it is unlikely that \({\bar{L}}_{l}\) yields a tighter upper bound than \(f^U_{{\textbf{x}}}\), and, thus, \({\bar{L}}_{l}\) is only computed if \(d^{\alpha BB}({\textbf{x}}) < \epsilon _d\) holds for some predefined threshold \(\epsilon _d > 0\). Nevertheless, this still requires calculating \((\alpha _1,\dots ,\alpha _p)\) for each interval \({\textbf{x}}\).

These \(\alpha _i\) can be computed as follows: We have to determine the maximum eigenvalues of the interval Hessians \([Hf_{{\chi _{i}}}]({{\textbf{x}}}),~i\in I\) (cf. Equation (3.9)), which are symmetric matrices with interval enclosures as entries. However, the interval Hessians \([Hf_{\gamma }]({{\textbf{x}}})\) for the curves \(\gamma \in {\mathcal {X}}_{C}\) are usually widely overestimated, for the same reasons as explained for the interval-valued function evaluation in Sect. 3.1. As a remedy, we replace \(w_{X_0,{\mathcal {S}}}\) by its maximum value \(w^{\max }_{X_0,{\mathcal {S}}}({\textbf{x}}):= {\max }_{x \in {\textbf{x}}} ~w_{X_0,{\mathcal {S}}}(x)\) on \({\textbf{x}}\) to reduce the complexity of the function. This modified curve function overestimates \(f_{\gamma }\) only slightly, by the function

$$\begin{aligned} {\bar{f}}_{\gamma }(x):= w^{\max }_{X_0,{\mathcal {S}}}({\textbf{x}}) w(\gamma ,{\mathcal {S}})\int _{0}^{1} \phi (\gamma ,{\mathcal {S}})(\gamma (t) -x) \cdot \Vert \gamma '(t) \Vert _{\Sigma ^{-1}} ~dt, \end{aligned}$$
(3.12)

since the slope of \(w_{X_0,{\mathcal {S}}}\) in \({\textbf{x}}\) is negligible in small boxes that qualify for \(\alpha \)BB.

Furthermore, we can expect the size of those boxes to be smaller than the size of the kernel function support, which is roughly comparable to the width of a lane. In combination with our highway scenario, we assume that, in this case, the curve \(\gamma \) is approximately a straight line \({\bar{\gamma }}\). We derive the curvature of the corresponding function \({\bar{f}}_{{\bar{\gamma }}}\) in Appendix A. It depends only on the distance to \({\bar{\gamma }}\), and, therefore, it is sufficient to determine \([\Vert \cdot -\text {proj}_{\Vert \cdot \Vert _{\Sigma ^{-1}}}(\cdot ,{\bar{\gamma }}) \Vert _{\Sigma ^{-1}}]({{\textbf{x}}})\), which contains all \(\Sigma ^{-1}\)-distances between the points in \({\textbf{x}}\) and their projections onto the line \({\bar{\gamma }}\). If we insert this enclosure into the result of Appendix A, we get an interval extension for the maximum eigenvalue of \(H{\bar{f}}_{{\bar{\gamma }}}\) in \({\textbf{x}}\), i.e., an upper bound for the eigenvalues of \({H{\bar{f}}_{{\bar{\gamma }}}}({\textbf{x}})\). In fact, an optimal interval enclosure for the second derivative of the Gaussian function, analogous to Equation (3.5), in Equation (A.7) provides a tight bound for \(\alpha _{{\bar{\gamma }}} = \max _{\Vert u\Vert _2=1,~\!x\in {\textbf{x}}}~u^T H{\bar{f}}_{{\bar{\gamma }}}(x) u\). However, this is only a tight bound for the overestimator \({\bar{f}}_{{\bar{\gamma }}}\), while, strictly speaking, we are looking for a bound \(\alpha _{\gamma }\) for \(f_{\gamma }\).

Algorithm 3.1

3.3.2 INT-QUAP: local quadratic approximation approach

In the following, we describe an alternative method, besides the \(\alpha \)BB theory, to extend the interval algorithm. The local linearization of a curve \(\gamma \) is used again, but the objective function is locally approximated instead of overestimated. We approximate the objective function \(F_{l}\) locally in a box \({\textbf{x}}\) by a quadratic function \(Q_{l}^{{\textbf{x}}}:{\textbf{x}} \rightarrow {\mathbb {R}}\). The maximum value of \(Q_{l}^{{\textbf{x}}}\) is used to improve the upper bound provided by \([F_{l}]({{\textbf{x}}})\), although we have to add the approximation error \(R_{l}^{{\textbf{x}}}\) to this value. In the following, we construct the quadratic approximation, determine the remainder term, and present the computation of their upper bounds.

The Taylor expansion at a point \(x_0=(x_{0,1},x_{0,2},x_{0,3})^T \in {\textbf{x}}\) provides quadratic approximations \(Q_{{\chi _{i}}}^{{\textbf{x}}}:{\textbf{x}} \rightarrow {\mathbb {R}}\),

$$\begin{aligned} Q_{{\chi _{i}}}^{{\textbf{x}}}(x)= f_{{\chi _{i}}}(x_0) + \nabla f_{{\chi _{i}}}^T(x_0) (x-x_0) + \tfrac{1}{2} (x-x_0)^T Hf_{{\chi _{i}}}(x_0) (x-x_0) \end{aligned}$$
(3.13)

of \(f_{{\chi _{i}}}\) for every \({\chi _{i}}\in {\mathcal {X}}\), so that \(f_{{\chi _{i}}}(x) = Q_{{\chi _{i}}}^{{\textbf{x}}}(x) + R_{{\chi _{i}}}^{{\textbf{x}}} (x;x_0)\) with Taylor remainders

$$\begin{aligned} R_{{\chi _{i}}}^{{\textbf{x}}} (x;x_0) = \sum _{\vert \nu \vert = 3} \big ( \tfrac{1}{\nu ! } \partial ^{\nu } f_{{\chi _{i}}}(x_0 + \theta (x-x_0)) \prod _{k=1,2,3} (x_k-x_{0,k})^{\nu _k} \big ) \end{aligned}$$
(3.14)

for some \(\theta =\theta (x,x_0) \in [0,1]\) and a multi-index \(\nu = (\nu _1,\nu _2,\nu _3),~\vert \nu \vert = \nu _1 + \nu _2 + \nu _3,~ \nu ! = \nu _1!\nu _2!\nu _3!\). Then, we can write \(F_{l}(x) = \min \{Q_{l,1}^{{\textbf{x}}}(x) + R_{l,1}^{{\textbf{x}}}(x;x_0), Q_{l,2}^{{\textbf{x}}}(x) + R_{l,2}^{{\textbf{x}}}(x;x_0)\}\) with

$$\begin{aligned} \begin{aligned}&Q_{l,1}^{{\textbf{x}}}(x) = w_{l}+ \sum _{{\chi _{i}}\in {\mathcal {X}}_{M}} Q_{{\chi _{i}}}^{{\textbf{x}}}(x),{} & {} R_{l,1}^{{\textbf{x}}}(x;x_0) = \sum _{{\chi _{i}}\in {\mathcal {X}}_{M}} R_{{\chi _{i}}}^{{\textbf{x}}}(x;x_0), \\&Q_{l,2}^{{\textbf{x}}}(x) = \sum _{{\chi _{i}}\in ({C_{l}\cup {\mathcal {X}}_M})} Q_{{\chi _{i}}}^{{\textbf{x}}}(x),{} & {} R_{l,2}^{{\textbf{x}}}(x;x_0) = \sum _{{\chi _{i}}\in ({C_{l}\cup {\mathcal {X}}_M})} R_{{\chi _{i}}}^{{\textbf{x}}}(x;x_0). \end{aligned} \end{aligned}$$
(3.15)

Each argument in this \(\min \)-representation of \(F_{l}\) consists of a quadratic function and a remainder term. Since the minimum is bounded above by either argument, bounding a single argument from above already provides an upper bound for \(F_{l}\) in \({\textbf{x}}\):

$$\begin{aligned} F_{l}({\textbf{x}}) \le \underset{x \in {\textbf{x}}}{\max }~Q_{l,m}^{{\textbf{x}}}(x) + \underset{x \in {\textbf{x}}}{\max }~R_{l,m}^{{\textbf{x}}}(x;x_0), \quad \text {for}~ m = 1,2. \end{aligned}$$
(3.16)

In practice, we compute the exact maximum value of the quadratic function \(Q_{l,m}^{{\textbf{x}}}\) in this formula, but we determine only interval enclosures of all the remainders \(R_{{\chi _{i}}}^{{\textbf{x}}}({{\textbf{x}};x_0})\) to get an upper bound for the remainder \(R_{l,m}^{{\textbf{x}}}\).

However, we can use the cut-off \(\min \{w_{l}, [f_{C_{l}}]({{\textbf{x}}})\}=:[a,b]\) to reduce the computational effort. If the upper bound \(w_{l}\) for \(f_{C_{l}}\) is not active anywhere in \({\textbf{x}}\), i.e., \(b<w_{l}\), then \(F_{l}(x) = Q_{l,2}^{{\textbf{x}}}(x) + R_{l,2}^{{\textbf{x}}}(x;x_0)\) for all \(x\in {\textbf{x}}\). On the other hand, if the upper bound \(w_{l}\) is active for every \(x\in {\textbf{x}}\), i.e., \(a=b=w_{l}\), then \(f_{l}= w_{l}\) in \({\textbf{x}}\), and \(F_{l}(x) = Q_{l,1}^{{\textbf{x}}}(x) + R_{l,1}^{{\textbf{x}}}(x;x_0)\). If only \(b\) equals \(w_{l}\), then we cannot determine whether or where the cut-off is active in \({\textbf{x}}\). Therefore, we compute the right-hand side of Equation (3.16) for both \(m=1\) and \(m=2\), and we choose the smaller value as an upper bound for \(F_{l}({\textbf{x}})\). An example of this case is visualized in Fig. 4, which shows the partial cut-off in \({\textbf{x}}\) and the resulting versions of the Taylor approximation. It should be noted that this figure gives just a qualitative description, since the approximation error here is too large for a real application.
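The following small sketch summarizes this case distinction (hypothetical names; `bound_m1()` and `bound_m2()` stand for evaluating the right-hand side of Equation (3.16) with \(m=1\) and \(m=2\)):

```python
def upper_bound_with_cutoff(a, b, w_l, bound_m1, bound_m2):
    """Case distinction on the enclosure [a, b] = min(w_l, [f_{C_l}](x))."""
    if b < w_l:
        # Cut-off inactive everywhere in the box: F_l = Q_{l,2} + R_{l,2}.
        return bound_m2()
    if a == w_l:
        # Then a == b == w_l: cut-off active everywhere, F_l = Q_{l,1} + R_{l,1}.
        return bound_m1()
    # Only b == w_l: activity undecided, so take the better of both bounds.
    return min(bound_m1(), bound_m2())
```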

Fig. 4

A The function \(F_{l}\) is bounded above by \(f_{C_{l}}+ f_{M}\). They differ only inside the red surrounded area, where the cut-off with \(w_{l}\) is active. B The functions \(w_{l}+f_{M}\) and \(f_{C_{l}}+ f_{M}\) are locally approximated by the quadratic Taylor expansions \(Q_{l,1}^{{\textbf{x}}}\) and \(Q_{l,2}^{{\textbf{x}}}\). We maximize these quadratic functions and add the approximation error to get an upper bound for \(F_{l}\) in \({\textbf{x}}\)

This quadratic approximation approach is also only performed on sufficiently small boxes (cf. Sect. 3.3.1), i.e., if the approximation error in Equation (3.16) is sufficiently small. Analogously to the \(\alpha \)BB method, we simplify our problem by using the locally valid function \({\bar{f}}_{{\bar{\gamma }}}\) (cf. Equation (A.3)) with the linearized curve \({\bar{\gamma }}\), since the direct interval evaluation of the derivatives of \(f_{\gamma }\) widely overestimates the true enclosure in general. If we apply \(\partial _{x_m}(\tfrac{1}{2} \Vert x-p\Vert ^2_{\Sigma ^{-1}}) = (x_m-p_m)/\sigma ^2_m\) from the first line of Equation (A.4) and recall that \(\Vert v \Vert _{\Sigma ^{-1}}=1\), then the third-order partial derivatives \(\partial ^{\nu } {\bar{f}}_{{\bar{\gamma }}}= \partial ^{\nu _1}_{x_1} \partial ^{\nu _2}_{x_2} \partial ^{\nu _3}_{x_3} {\bar{f}}_{{\bar{\gamma }}}\) with \(\vert \nu \vert =3\) and \(m,q,z = 1,2,3,~m\ne q,~m\ne z,~q\ne z\) are

$$\begin{aligned} \partial ^3_{x_m} {\bar{f}}_{{\bar{\gamma }}}(x)&= \tfrac{{\bar{f}}_{{\bar{\gamma }}}(x)}{\sigma ^6_m} \big [ -(x_m-p_m)^3 + 3(x_m-p_m)( \sigma ^2_m - v_m^2)\big ],\\ \partial ^2_{x_m} \partial _{x_q} {\bar{f}}_{{\bar{\gamma }}}(x)&= \tfrac{{\bar{f}}_{{\bar{\gamma }}}(x)}{\sigma ^4_m \sigma ^2_q} \big [ -(x_m-p_m)^2 (x_q-p_q) + (x_q-p_q) (\sigma ^2_m - v_m^2) - 2(x_m-p_m) v_m v_q \big ],\\ \partial _{x_m} \partial _{x_q} \partial _{x_z} {\bar{f}}_{{\bar{\gamma }}}(x)&= -\tfrac{{\bar{f}}_{{\bar{\gamma }}}(x)}{\sigma ^2_m \sigma ^2_q \sigma ^2_z} \big [ \big (\textstyle\prod _{k=m,q,z}(x_k-p_k)\big ) + (x_m-p_m) v_q v_z + (x_q-p_q) v_m v_z + (x_z-p_z) v_m v_q \big ]. \end{aligned}$$
(3.17)

Inserting these formulas into Equation (3.14) provides an integral-free interval extension \([R_{{\bar{\gamma }}}^{{\textbf{x}}}]({{\textbf{x}},x_0})\) with \(\varvec{\theta } = [0,1]\) and an estimate for \([\Vert \cdot -\text {proj}_{\Vert \cdot \Vert _{\Sigma ^{-1}}}(\cdot ,{\bar{\gamma }}) \Vert _{\Sigma ^{-1}}]({{\textbf{x}}})\). Then, the second term on the right-hand side of Equation (3.16) is bounded by

$$\begin{aligned}&\sup ([R_{l}^{{\textbf{x}}}]({{\textbf{x}}})) =: d^{\text {QUAP}}_{l}({\textbf{x}}) \qquad \text {with} \quad [R_{l}^{{\textbf{x}}}]({{\textbf{x}}}) = [R_{l,1}^{{\textbf{x}}}]({{\textbf{x}}}) ~\text { or }~ [R_{l,2}^{{\textbf{x}}}]({{\textbf{x}}}),\\&[R_{l,1}^{{\textbf{x}}}]({{\textbf{x}}}) =\sum _{y\in {\mathcal {X}}_{M}} [R_{y}^{{\textbf{x}}}]({{\textbf{x}};x_0}),\qquad [R_{l,2}^{{\textbf{x}}}]({{\textbf{x}}}) =\sum _{y\in {\mathcal {X}}_{M}} [R_{y}^{{\textbf{x}}}]({{\textbf{x}};x_0}) + \sum _{{\bar{\gamma }}\in {C_{l}}} [R_{{\bar{\gamma }}}^{{\textbf{x}}}]({{\textbf{x}};x_0}), \end{aligned}$$
(3.18)

where the choice of the interval extension \([R_{l}^{{\textbf{x}}}]({{\textbf{x}}})\) for the Taylor remainder depends on which of the two representations from Equation (3.15) applies. It remains to show how to maximize the quadratic function \(Q_{l}^{{\textbf{x}}}\).

In general, quadratic optimization problems with box constraints,

$$\begin{aligned} \underset{y \in {\textbf{y}}}{\max }~Q(y)= \underset{y \in {\textbf{y}}}{\max }~\tfrac{1}{2}y^TAy+b^T y+c, \qquad \text {box }{\textbf{y}} \subset {\mathbb {R}}^d,~ A \in {\mathbb {R}}^{d \times d},~ b \in {\mathbb {R}}^{d},~ c\in {\mathbb {R}}, \end{aligned}$$
(3.19)

are NP-hard for general \(d \in {\mathbb {N}}\) (Burer and Letchford 2009). Instead of approaching the solution iteratively (De Angelis et al. 1997), we solve Problem (3.19) analytically by subdividing it into subproblems of dimension \(0\) to \(d\). The decomposition of a \(d\)-dimensional box creates \(3^d\) lower-dimensional components in total (Banchoff 1990, ch. 4). For a box \({\textbf{x}} = [x^L,x^U] \subset {\mathbb {R}}^3\), this subdivision is given by

$$\begin{aligned} {\textbf{x}} = \text {int}({\textbf{x}}) ~{\dot{\cup }}~ \big ({\dot{\bigcup }}_{j_f=1,\dots ,6} ~\text {int}(F_{j_f})\big ) ~{\dot{\cup }}~ \big ({\dot{\bigcup }}_{j_e=1,\dots ,12} ~\text {int}(E_{j_e})\big ) ~{\dot{\cup }}~ \big ({\dot{\bigcup }}_{j_v=1,\dots ,8} ~V_{j_v}\big ),\nonumber \\ \end{aligned}$$
(3.20)

where \(\text {int}({\textbf{x}})\) is the interior of \({\textbf{x}}\), \(F_{j_f},~j_f=1,\dots ,6\), are the two-dimensional faces of \({\textbf{x}}\), \(E_{j_e},~j_e = 1,\dots ,12\), are the one-dimensional edges, and \(V_{j_v},~j_v=1,\dots ,8\), are the vertices of \({\textbf{x}}\). Except for the vertices, the decomposition provides \(3^3-2^3=19\) quadratic maximization problems on open sets. Since every solution is either a vertex of \({\textbf{x}}\) or a stationary point in the relative interior of \({\textbf{x}}\) or of one of its one- or two-dimensional faces, we can compute all candidates for global solutions by solving linear systems of dimension at most three and comparing the candidates to all the vertices \(V_{j_v}\). Even though this approach is inefficient in higher dimensions, it is computationally feasible for \(d=3\). This completes our description of an upper-bounding approach for the objective function \(F_{l}\) on \({\textbf{x}}\), based on maximizing quadratic Taylor polynomials and bounding the remainder terms. The resulting method is summarized in Algorithm 3.2. In the worst case, we must apply this optimization method twice per box to maximize the quadratic functions \(Q_{l,1}^{{\textbf{x}}}\) and \(Q_{l,2}^{{\textbf{x}}}\).

[Algorithm 3.2: pseudocode listing (figure b)]
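A self-contained Python sketch of this enumeration might look as follows (all names are ours; \(A\) is assumed symmetric, and the routine works for any small dimension \(d\), visiting the \(3^3 = 27\) components of Equation (3.20) for \(d=3\)):

```python
import itertools
import numpy as np

def max_quadratic_on_box(A, b, c, lo, hi):
    """Maximize Q(y) = 0.5*y^T A y + b^T y + c over the box [lo, hi].

    Every maximizer is a vertex or a stationary point in the relative
    interior of the box or of one of its faces/edges, so we enumerate
    all 3^d components of the decomposition (3.20)."""
    d = len(lo)
    best_val, best_y = -np.inf, None
    # Per coordinate: fixed at lo (0), fixed at hi (1), or free (None).
    for pattern in itertools.product((0, 1, None), repeat=d):
        fixed = [i for i, p in enumerate(pattern) if p is not None]
        free = [i for i, p in enumerate(pattern) if p is None]
        y = np.where([p == 1 for p in pattern], hi, lo).astype(float)
        if free:
            # Stationary point of Q restricted to the free coordinates:
            # A_ff y_f = -(b_f + A_fx y_x), a linear system of dim <= d.
            A_ff = A[np.ix_(free, free)]
            rhs = -(b[free] + A[np.ix_(free, fixed)] @ y[fixed])
            try:
                y_free = np.linalg.solve(A_ff, rhs)
            except np.linalg.LinAlgError:
                continue  # singular block: no isolated stationary point here
            if np.any(y_free < lo[free]) or np.any(y_free > hi[free]):
                continue  # stationary point lies outside this open component
            y[free] = y_free
        val = 0.5 * y @ A @ y + b @ y + c
        if val > best_val:
            best_val, best_y = val, y.copy()
    return best_val, best_y
```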

Analogous to the concave overestimator in \(\alpha \)BB, we could merge these quadratic problems and obtain a quadratically constrained quadratic program of the form

$$\begin{aligned} \max _{x \in {\mathbb {R}}^3,\,u_{l}\in {\mathbb {R}}} ~u_{l}+ \sum _{{\chi _{i}}\in {\mathcal {X}}_{M}} Q_{{\chi _{i}}}^{{\textbf{x}}}(x) \quad \text {s.t.}\quad x \ge x^L,~~ x \le x^U,~~ u_{l}\le w_{l},~~ u_{l}\le \sum _{{\chi _{i}}\in {C_{l}}} Q_{{\chi _{i}}}^{{\textbf{x}}}(x). \end{aligned}$$
(3.21)

However, this problem cannot be solved with the previous box-constrained approach. Polynomial optimization (Lasserre 2001) was also considered, but it was significantly slower than our approach for this application.

3.4 1D optimization along the map data

A large current best function value \({\bar{f}}_{\text {best}}\) is essential for the bounding step in the interval algorithm. Therefore, we perform a one-dimensional maximization along all curves \(\gamma \in {\mathcal {X}}_C\) to obtain an a priori reference value (i.e., a lower bound) for the interval algorithm. For every curve \(\gamma \in {\mathcal {X}}_C\), we solve the one-dimensional problem

$$\begin{aligned} \underset{t \in [0,1]}{\max }~ f^{\text {1D}}_{\gamma }(t), ~\text { with } f^{\text {1D}}_{\gamma }: [0,1] \rightarrow {\mathbb {R}},~ f^{\text {1D}}_{\gamma }(t) = w_{X_0,{\mathcal {S}}}(\gamma (t))w(\gamma ,{\mathcal {S}}) + f_{M}(\gamma (t)),\nonumber \\ \end{aligned}$$
(3.22)

where \(f^{\text {1D}}_{\gamma }\) approximates the function values along \(\gamma \). Since the solutions of these problems serve only as initial values for the interval algorithm, the problems do not need to be solved rigorously. We use the idea of topographical global optimization (TGO) (Endres et al. 2018) with equidistant sampling, which compares the function values of neighboring samples to identify subintervals of \([0,1]\) in which local optima must occur. We extend this procedure by also computing the derivative \( d/dt ~{f^{\text {1D}}_{\gamma }}(t) = w(\gamma ,{\mathcal {S}})\nabla {w_{X_0,{\mathcal {S}}}(\gamma (t))}^{T} \gamma '(t) + \nabla f_{M}(\gamma (t))^{T}\gamma '(t) \) at the samples.

[Algorithm 3.3: pseudocode listing (figure c)]

Let \(\{t_1,\dots ,t_N;~t_1 = 0,~t_N = 1,~t_i< t_j \text { for }i<j\}\) be a set of \(N\in {\mathbb {N}}\) sampling points; we choose a subset of the quadrature knots that are used for the evaluation of the curve functions. We define the set \( S_{\gamma } = \{ (t_j,f_j,df_j);~ f_j = {f^{\text {1D}}_{\gamma }}(t_j),~ df_j = d/dt~{f^{\text {1D}}_{\gamma }}(t_j),~j = 1,\dots ,N \} \) for every \(\gamma \in {\mathcal {X}}_C\). Note that \(df_{1}\) and \(df_{N}\) are only one-sided derivatives. Our goal is to determine for every interval \([t_j,t_{j+1}],~j=1,\dots ,N-1\), whether it contains a local maximum. To this end, we state three criteria, each of which proves the existence of a local optimum by the differentiability (and continuity) of \(f^{\text {1D}}_{\gamma }\):

  1. The value at the right bound is at least as large as the value at the left bound, and the function is non-increasing at the right bound: \((f_j \le f_{j+1}) \wedge (df_{j+1} \le 0)\).

  2. The value at the right bound is not larger than the value at the left bound, and the function is non-decreasing at the left bound: \((f_j \ge f_{j+1}) \wedge (df_{j} \ge 0)\).

  3. The function is non-decreasing at the left bound and non-increasing at the right bound: \(\left( df_j \ge 0\right) \wedge (df_{j+1} \le 0)\).

On the boundary intervals \([t_1,t_2]\) and \([t_{N-1},t_{N}]\), we check the derivative-free criteria \(f_{1} \ge f_{2}\) or \(f_{N-1} \le f_{N}\), respectively, to ensure the existence of local optima. For the intervals that demonstrably contain local solutions, we apply Brent's algorithm (Brent 1973, chs. 3–5), which combines golden-section search (Kiefer 1953) with parabolic interpolation.
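A schematic Python version of this sampling-and-refinement step could look as follows (our own sketch: `f` and `df` stand for vectorized evaluations of \(f^{\text {1D}}_{\gamma }\) and its derivative, and SciPy's bounded Brent-type search stands in for the Brent implementation used here):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def candidate_intervals(f, df, N=200):
    """Mark subintervals of [0, 1] that provably contain a local maximum
    according to the three criteria above plus the boundary checks."""
    t = np.linspace(0.0, 1.0, N)
    fv, dfv = f(t), df(t)
    hits = []
    for j in range(N - 1):
        c1 = fv[j] <= fv[j + 1] and dfv[j + 1] <= 0
        c2 = fv[j] >= fv[j + 1] and dfv[j] >= 0
        c3 = dfv[j] >= 0 and dfv[j + 1] <= 0
        if c1 or c2 or c3:
            hits.append((t[j], t[j + 1]))
    if fv[0] >= fv[1]:          # derivative-free boundary criteria
        hits.append((t[0], t[1]))
    if fv[-1] >= fv[-2]:
        hits.append((t[-2], t[-1]))
    return hits

def best_local_maximum(f, intervals):
    """Refine each candidate interval with a bounded Brent-type search."""
    best = -np.inf
    for a, b in intervals:
        res = minimize_scalar(lambda s: -f(s), bounds=(a, b), method="bounded")
        best = max(best, -res.fun)
    return best
```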

However, it is necessary to compare \(f^{\text {1D}}_{\gamma }(t)\) with \(F_{{\mathcal {S}}}(\gamma (t))\) for all local solutions to overcome discrepancies between \(f_{\gamma }\) and its approximation \(f^{\text {1D}}_{\gamma }\). The maximum of these optimal values and the values \(F_{{\mathcal {S}}}(y)\) for all landmarks \(y\in {\mathcal {X}}_{M}\) is chosen as an initial reference value for the full-dimensional interval approach.

3.5 Overall algorithms INT-\(\varvec{\alpha }\)BB and INT-QUAP

The steps of our global optimization algorithm are shown in Algorithm 3.4. We perform the one-dimensional maximization along the curve data \({\mathcal {X}}_{C}\) in Line 5 before starting the interval algorithm according to Sect. 3.2. We either apply INT-\(\alpha \)BB, which utilizes concave overestimation, or INT-QUAP, which incorporates quadratic approximations, to extend the interval algorithm INT with second-order methods. Note that we use two different rules, based on \(d_l^{\alpha \textrm{BB}}(\textbf{x})\) or \(d_l^{\textrm{QUAP}}(\textbf{x})\), in Line 17 to decide whether we apply the second-order step of INT-\(\alpha \)BB or INT-QUAP in \({\textbf{x}}\) at all.

The design of the algorithm allows the solution to be computed via independent optimization problems for the different categories. Here, we do not exploit this parallelism, since it would require additional synchronization effort to preserve determinism. We use the optimal value \({\bar{f}}_{\text {best},l}\) of category \(l\) as an initial value for the category \(l+1\) problem in Line 13, even though this creates dependencies between the categories. The solutions \(x^{\text {best},l}\) are compared in Line 34 to obtain the globally optimal solution \(x^{\text {best}}\).

Interval Newton (Krawczyk and Neumaier 1986), interval constraint propagation (Kjøller et al. 2007), and non-diagonal shift matrices for \(\alpha \)BB (Skjäl and Westerlund 2014; Skjäl et al. 2012) were also considered for the sake of completeness, but these approaches did not improve the performance of the overall algorithm for our problem.

[Algorithm 3.4: pseudocode listing (figure d)]

4 Function evaluation improvement by curve linearization

An efficient evaluation of the objective function is decisive for a fast execution of the optimization. The effort strongly depends on the number of point and curve functions (cf. Eqs. (2.6) and (2.7)) that are derived for each map element. We use geometrical observations to decide in advance whether we can validly set the value of a map element function without an explicit evaluation. These geometrical checks reduce the computation time significantly, in particular for the curve evaluations.

4.1 Underestimating the distance to a curve

The functions \(f_{{\chi _{i}}}\) in Equation (2.4) become negligibly small with increasing distance to the map elements \({\chi _{i}}\in {\mathcal {X}}\) due to the convolution with the Gaussian kernel function \(\phi ({\chi _{i}},{\mathcal {S}})\). With Equation (2.8), it can be shown that \( \phi ({\chi _{i}},{\mathcal {S}})(y - x) < \epsilon _{\phi }~~~\text {if}~~~ \Vert x-y \Vert ^2_{\Sigma ^{-1}} > 2 \ln (c_{\chi _{i}}/\epsilon _{\phi }) =: r_{\phi ,{\chi _{i}}}^2 \) for some small \(\epsilon _{\phi }> 0 \). In other words, if the distance between \(x \in {\mathbb {R}}^3\) and \(y\) is larger than \(r_{\phi ,{\chi _{i}}}\) for all \(y\in {\chi _{i}}\), then \(f_{{\chi _{i}}}(x)\) has no impact on the objective function, and we can set the value to zero. It is straightforward to verify this condition for the map points. In the following, we present a method for performing this relevance check for the map curves \(\gamma \) by using their B-spline representation.
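For instance, the relevance radius can be computed directly from this inequality (the function name and the example constants are ours):

```python
import math

def relevance_radius(c_chi, eps_phi):
    """r_{phi,chi}: phi(chi, S)(y - x) < eps_phi whenever the squared
    Sigma^{-1}-distance of x and y exceeds 2 * ln(c_chi / eps_phi)."""
    return math.sqrt(2.0 * math.log(c_chi / eps_phi))

# e.g. c_chi = 1.0 and eps_phi = 1e-3 give a radius of about 3.72
print(relevance_radius(1.0, 1e-3))
```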

We want to provide a simple lower bound for \({\min }_{t\in [\tau _{j},\tau _{j+1}]} \Vert x-\gamma (t)\Vert _{\Sigma ^{-1}}\) for every B-spline knot interval \([\tau _{j},\tau _{j+1}],~j=4,\dots ,n-3\), and compare it with \(r_{\phi ,\gamma }\) in order to decide whether or not to compute \(f_{\gamma }(x)\) explicitly. To this end, we approximate \(\gamma \) by the polygonal chain (polyline) \(s_{\gamma }\) through the vertices \(S_j\):

$$\begin{aligned} s_{\gamma }:[0,1] \rightarrow {\mathbb {R}}^3,\quad s_{\gamma }(t) = \tfrac{\tau _{j+1} -t}{\tau _{j+1} - \tau _{j}} S_j+ \tfrac{t - \tau _{j}}{\tau _{j+1} - \tau _{j}} S_{j+1} ~~~~~\text {for }t\in [\tau _{j},\tau _{j+1}],~j=4,\dots ,n-3.\nonumber \\ \end{aligned}$$
(4.1)

The polyline \(s_{\gamma }\) interpolates \(\gamma \) and, hence, provides a better approximation than the polyline induced by the B-spline control points \(P_{\gamma }\) of \(\gamma \). Since \(\Vert x-\gamma (t)\Vert _{\Sigma ^{-1}} \ge \Vert x-s_{\gamma }(t)\Vert _{\Sigma ^{-1}} - \Vert \gamma (t)-s_{\gamma }(t)\Vert _{\Sigma ^{-1}}\) for all \(t \in [0,1]\), the minimum distance between \(x\) and the curve segment \(\gamma \mid _{[\tau _{j},\tau _{j+1}]}:=\{\gamma (t);~t\in [\tau _{j},\tau _{j+1}]\}\) is bounded below by

$$\begin{aligned} \underset{t \in [\tau _{j},\tau _{j+1}]}{\min } \Vert x-\gamma (t)\Vert _{\Sigma ^{-1}} \ge \underset{t \in [\tau _{j},\tau _{j+1}]}{\min } \Vert x-s_{\gamma }(t)\Vert _{\Sigma ^{-1}} - \underset{t \in [\tau _{j},\tau _{j+1}]}{\max } \Vert \gamma (t)-s_{\gamma }(t)\Vert _{\Sigma ^{-1}}.\nonumber \\ \end{aligned}$$
(4.2)

This lower-bounding property is visualized in Fig. 5.

Here, the minimum distance \(d^{j}_{s_{\gamma }}(x)\) between \(x\) and the line segment \([S_{j}, S_{j+1}]:=\{s_{\gamma }(t);~t\in [\tau _{j},\tau _{j+1}]\}\) equals \(\Vert x - \text {proj}_{\Vert \cdot \Vert _{\Sigma ^{-1}}}(x,[S_{j}, S_{j+1}]) \Vert _{\Sigma ^{-1}}\), where the projection onto the line segment \([S_{j}, S_{j+1}]\) has the form

$$\begin{aligned} \text {proj}_{\Vert \cdot \Vert _{\Sigma ^{-1}}}(x, [S_{j}, S_{j+1}]) = \left\{ \begin{aligned}&S_{j},{} & {} \mu _{j}<0,\\&S_{j} + \mu _{j}(S_{j+1}-S_{j}),{} & {} \mu _{j}\in [0,1],\\&S_{j+1},{} & {} \mu _{j}> 1, \end{aligned} \right. \end{aligned}$$
(4.3)

with \(\mu _{j}= {\langle x - S_{j}, S_{j+1} - S_{j} \rangle _{\Sigma ^{-1}}}/ \Vert S_{j+1} - S_{j} \Vert _{\Sigma ^{-1}}^2\). The second term on the right-hand side of Equation (4.2) is bounded above by \(d^{j}_{\max }\), which is given in Equation (B.3) in Appendix B. In total, we decide to set \(f_{\gamma }(x) = 0\) if \( d^{j}_{s_{\gamma }}(x) - d^{j}_{\max }> r_{\phi ,\gamma }\) for every knot interval \(j\), because \( d^{j}_{s_{\gamma }}(x) - d^{j}_{\max }\) is a lower bound for \({\min }_{t\in [\tau _{j},\tau _{j+1}]} \Vert x-\gamma (t)\Vert _{\Sigma ^{-1}}\). The values for \(d^{j}_{\max }\) and \(r_{\phi ,\gamma }\) are independent of \(x\) and can be computed a priori.
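A minimal Python sketch of this pointwise check could read as follows (we assume a diagonal \(\Sigma \) as in Table 1, pass the diagonal of \(\Sigma ^{-1}\) explicitly, and, for brevity, use a single precomputed remainder bound `d_max` for all segments; all names are ours):

```python
import numpy as np

def ip_sigma(u, v, inv_sigma2):
    # <u, v>_{Sigma^{-1}} for a diagonal Sigma with variances 1 / inv_sigma2
    return float(np.sum(u * v * inv_sigma2))

def seg_dist(x, S0, S1, inv_sigma2):
    """Sigma^{-1}-distance from x to the segment [S0, S1], Equation (4.3)."""
    d = S1 - S0
    mu = ip_sigma(x - S0, d, inv_sigma2) / ip_sigma(d, d, inv_sigma2)
    p = S0 if mu < 0 else (S1 if mu > 1 else S0 + mu * d)
    return np.sqrt(ip_sigma(x - p, x - p, inv_sigma2))

def curve_contributes(x, vertices, inv_sigma2, d_max, r_phi):
    """Set f_gamma(x) = 0 only if the lower bound seg_dist - d_max exceeds
    the relevance radius r_phi for every polyline segment."""
    return any(seg_dist(x, S0, S1, inv_sigma2) - d_max <= r_phi
               for S0, S1 in zip(vertices[:-1], vertices[1:]))
```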

Analogously, we need a lower bound for \(d^{j}_{s_{\gamma }}({\textbf{x}})\) to apply this rule to a box \({\textbf{x}}\). An efficient computation of the exact minimum distance \(\inf (d^{j}_{s_{\gamma }}({\textbf{x}}))\) between a box and a line segment is demonstrated in (Schneider and Eberly 2002, sec. 10.9.4). Then, we set \([f_{\gamma }]({{\textbf{x}}})=0\) if \(\inf (d^{j}_{s_{\gamma }}({\textbf{x}})) - d^{j}_{\max } > r_{\phi ,\gamma }\). In particular, this result carries over to subsets of \({\textbf{x}}\), i.e., we can also set \(f_{\gamma }\) to zero for points in \({\textbf{x}}\) and for subboxes of \({\textbf{x}}\), which substantially reduces the computational effort in the branch-and-bound algorithm. For every box, we store the active curve segments that contribute to the objective function.

Fig. 5

We depict some distances between the point \(x\), the curve segment \(\gamma \mid _{[\tau _{j},\tau _{j+1}]}\) and the line segment \([S_{j}, S_{j+1}]\). They are used to derive a lower bound for \(d_1\). The angles marked with a right-angle sign indicate perpendicularity with respect to \(\langle \cdot ,\cdot \rangle _{\Sigma ^{-1}}\). The ellipsoid around \(x\) symbolizes the support of the Gaussian function; \(f_{\gamma }(x)\) is negligibly small if the ellipsoid does not intersect \(\gamma \)

4.2 Intersection between a box and a curve

Analogously, we can use geometrical tools to identify whether \(f_{\gamma }\) takes its maximum value inside \({\textbf{x}}\). If the curve \(\gamma \) intersects the box \({\textbf{x}}\), then we set \([f_{\gamma }]({{\textbf{x}}}) = [0, u_{\gamma }^{{\textbf{x}}}]\) with \(u_{\gamma }^{{\textbf{x}}}:= w^{\max }_{X_0,{\mathcal {S}}}({\textbf{x}})~\! w(\gamma ,{\mathcal {S}})\) according to the line approximation in Equation (2.9). Here, \(u_{\gamma }^{{\textbf{x}}}\) is an upper bound for \(\max _{x \in {\textbf{x}}} f_{{\bar{\gamma }}}(x)\), where \({\bar{\gamma }}\) is a line approximation of \(\gamma \) in \({\textbf{x}}\). The difference between \(\max _{x \in {\textbf{x}}} f_{\gamma }(x)\) and \(u_{\gamma }^{{\textbf{x}}}\) depends on the slope of \(w_{X_0,{\mathcal {S}}}\), on the curvature of \(\gamma \) in \({\textbf{x}}\), and, most of all, on whether the intersection of \(\gamma \) and \({\textbf{x}}\) is close to one of the ends of the curve, because \(f_{\gamma }\) fails to approximate \(w(\gamma ,{\mathcal {S}})\) there, as discussed below Equation (2.7). We want to stress that we use this curve intersection approach especially for large boxes to avoid the numerical integration, whose computational effort is high due to the number of required quadrature points.

Since, in practice, we determine only the intersection of the polyline \(s_{\gamma }\) with \({\textbf{x}}\) (this is much easier than for B-spline curves), we define a subbox \(\mathbf {x_\text {sub}}\subset {\textbf{x}}\) so that \(\gamma \) intersects \({\textbf{x}}\) if \(s_{\gamma }\) intersects \(\mathbf {x_\text {sub}}\). Consider again the line segment \([S_{j},S_{j+1}]\) and the curve segment \(\gamma |_{[\tau _{j},\tau _{j+1}]}\), and assume that \({\bar{x}} \in [S_{j},S_{j+1}] \cap \mathbf {x_\text {sub}}\). For \({\bar{y}} = \text {proj}_{\Vert \cdot \Vert _{\Sigma ^{-1}}}({\bar{x}}, \gamma \mid _{[\tau _{j},\tau _{j+1}]})\), it holds that \(\vert {\bar{x}}_k- {\bar{y}}_k\vert \le {\max }_{t_k\in [\tau _{j},\tau _{j+1}]} \vert q_{jk}(t_k) \vert =: e_{k}\) for every component \(k=1,2,3\), according to the considerations for Equation (B.3). Therefore, we define \( \mathbf {x_\text {sub}}= [x^L+e,x^U-e]\) with \(e= (e_1,e_2,e_3)^T\) and \({\textbf{x}} = [x^L,x^U]\). The side lengths of \({\textbf{x}}\) must be larger than \(2e\) so that \(\mathbf {x_\text {sub}}\) is well-defined and this box intersection criterion can be applied. The distance computation of the previous section detects these intersections when it returns a zero distance.
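In code, the criterion reduces to clipping each polyline segment against the shrunken subbox; a sketch under the same assumptions as before (all names ours):

```python
import numpy as np

def segment_intersects_box(S0, S1, lo, hi, eps=1e-12):
    """Clip the segment S0 + t*(S1 - S0), t in [0, 1], against the
    axis-aligned box [lo, hi] (Liang-Barsky style slab clipping)."""
    d = S1 - S0
    t0, t1 = 0.0, 1.0
    for k in range(len(lo)):
        if abs(d[k]) < eps:
            if S0[k] < lo[k] or S0[k] > hi[k]:
                return False  # parallel to this slab and outside it
        else:
            ta, tb = sorted(((lo[k] - S0[k]) / d[k], (hi[k] - S0[k]) / d[k]))
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return False
    return True

def curve_intersects_box(vertices, xL, xU, e):
    """Sect. 4.2 criterion: gamma intersects [xL, xU] whenever the polyline
    s_gamma intersects the shrunken subbox [xL + e, xU - e]."""
    lo, hi = xL + e, xU - e
    if np.any(hi <= lo):
        return False  # box too small relative to 2e: criterion not applicable
    return any(segment_intersects_box(S0, S1, lo, hi)
               for S0, S1 in zip(vertices[:-1], vertices[1:]))
```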

5 Prototypical implementation and evaluation

The optimization program has been implemented in MATLAB R2018b, running on an Ubuntu 16.04 virtual machine with a four-core processor and 16 GB RAM. The underlying hardware is an Intel(R) Core(TM) i7-8850H 2.60 GHz processor with six cores and 32 GB RAM in total. Apart from MATLAB built-in functionalities, we use the CORA software package (Althoff and Grebenyuk 2017), which provides an implementation of interval arithmetic in MATLAB. In practice, basic arithmetic operations (addition, subtraction, multiplication, division) and elementary functions (cos, sin, exp, ...) are overloaded such that they take intervals as inputs and map them to the smallest interval that contains the image of the input set (or, if this is too costly, to a reasonably tight upper estimate). Furthermore, we integrate external libraries into MATLAB by using MEX files. The Geometric Tools Engine (Eberly 2014, chs. 5–7) helps to compute the distances between the boxes and the polyline segments. We also use the interval arithmetic functionality of the C-XSC library (Hofschuster and Krämer 2004) to accelerate the expensive interval Gaussian evaluation described in Sect. 3.2 by means of compiled code. For production-ready code, a single implementation of interval arithmetic is advisable.
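The overloading principle can be illustrated by a toy interval type (this is purely illustrative and not the API of CORA or C-XSC, which additionally perform outward rounding for rigor):

```python
import math

class Interval:
    """Toy interval arithmetic illustrating operator overloading."""
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __mul__(self, other):
        p = (self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi)
        return Interval(min(p), max(p))
    def exp(self):
        # exp is monotone, so the interval image is exact
        return Interval(math.exp(self.lo), math.exp(self.hi))

# The dependency effect: the image of [-1, 2] under x*x is [0, 4], but
# interval multiplication yields [-2, 4] -- the kind of overestimation
# discussed in Sect. 3.1.
x = Interval(-1.0, 2.0)
print((x * x).lo, (x * x).hi)  # -2.0 4.0
```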

The fminbnd function of the MATLAB Optimization Toolbox performs the one-dimensional maximization described in Sect. 3.4. Furthermore, we use the fmincon function to apply the SQP algorithm to the concave overestimator. We limit the maximum number of steps to \(20\) due to the fast convergence. We choose the knot parameters in \(\tau _{\gamma }\) (cf. Sect. 2.1) for the quadrature to be piecewise equidistant in such a way that \(\Vert y_{m} -y_{m+1} \Vert _{\Sigma ^{-1}} \approx 10^{-1}\) for two neighboring quadrature points \(y_{m}=\gamma (t_{m}),~y_{m+1}=\gamma (t_{m+1})\) (cf. Sect. 3.1). These parameters must be determined a priori to ensure the continuity of \(f_{\gamma }\).

Fig. 6

All 76 lanes and all 21 landmarks (emergency call boxes and kilometer signs) of the emergency stop example (cf. Sect. 5) are highlighted in the above map, as well as the vehicle positions for the examples Ego 1 and Ego 2

5.1 Framework for the emergency stop example

We define two examples for an emergency stop while driving on the highway to evaluate the developed algorithm and compare the results. For this scenario, we observe a short highway section with a length of around five kilometers on the Autobahn A9 near Ingolstadt, Bavaria, Germany. This highway section reflects the usual complexity of a German highway: there are long monotonous passages, but it also contains complex areas, e.g., highway entries and exits, as well as a service stationFootnote 2. The map excerpt and the map elements are visualized in Fig. 6. The evaluation is performed for two distinct ego vehicle positions: \(X_0^1\) in the northern direction, referred to as Ego 1, and \(X_0^2\) in the southern direction of the highwayFootnote 3, referred to as Ego 2. Apart from the lanes along the route of the vehicle, we also reward emergency call boxes and kilometer posts, since they help to call the breakdown service or to localize the emergency stop. In total, there are 76 lane objects, split into 11 categories, and 21 landmarks in our example. We perform a further selection of the relevant map data based on the ego position, since, e.g., opposite lanes and road sections behind the vehicle are not considered. This reduces the map data to 12 lanes (8 categories) and six landmarks for Ego 1, and to 23 lane objects (10 categories) and six landmarks for Ego 2.

For an explicit formulation of the objective function, we need to define a weighting for the curve categories and the landmark types, as well as the variances \((\sigma _{1}^2,\sigma _{2}^2,\sigma _{3}^2)\) of the Gaussian functions \(\phi ({\chi _{i}},{\mathcal {S}})\). Here, we have chosen a plausible weighting, shown in Table 1; explicit rules for this weighting would need to be established for a real application.

Table 1 For the curves and landmarks, we list all types of map elements in \({\mathcal {X}}\) as well as the corresponding category weights and the variances for the kernel functions that are required to define the map element functions

We determine the ego position factor \(w_{X_0,{\mathcal {S}}}(x) = w_{\delta }(\Vert x - X_0\Vert _2)\) by the decreasing radial function \(w_{\delta }: {\mathbb {R}}_{\ge 0} \rightarrow {\mathbb {R}}\), parametrized by the positive, scenario-dependent constant \(\delta =\delta ({\mathcal {S}}) > 0\):

$$\begin{aligned} w_{\delta }(r) = 1/\big ({1 + {r}/{\delta }}\big ),\quad w_{\delta }'(r) = -1/\big ({\delta (1 + {r}/{\delta })^2}\big ),\quad w_{\delta }''(r) = 2/\big ({\delta ^2(1 + {r}/{\delta })^3}\big ).\nonumber \\ \end{aligned}$$
(5.1)

The smaller \(\delta \) is, the faster the contribution of the map data to the objective function decays with increasing distance from the ego position. We set \(\delta _{X_0^1} = 5000\) for a larger relevant area around \(X_0^1\) and \(\delta _{X_0^2} = 1000\) for a smaller surrounding of \(X_0^2\). Furthermore, we choose \(\epsilon _f = 10^{-3}\) for the maximum error of the guaranteed optimal function value, \(\epsilon _x = 0.05\) for the minimum box side length, and \(\epsilon _d = 10^{-1}\) (\(\alpha \)BB) and \(\epsilon _d = 5\cdot 10^{-1}\) (quadratic approximation), respectively, for the threshold that activates the second-order method.
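As a quick numerical illustration of Equation (5.1) and the chosen \(\delta \) values (function name ours):

```python
def w_delta(r, delta):
    """Ego-position weight w_delta(r) = 1 / (1 + r / delta), Equation (5.1)."""
    return 1.0 / (1.0 + r / delta)

# With delta = 1000 (Ego 2), the weight halves one kilometer away from the
# ego position; with delta = 5000 (Ego 1), the same drop occurs at 5 km.
print(w_delta(1000.0, 1000.0))  # 0.5
print(w_delta(5000.0, 5000.0))  # 0.5
print(w_delta(4000.0, 1000.0))  # 0.2
```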

Table 2 The results of the Example Ego 1 demonstrate that INT-\(\alpha \)BB is faster than INT-QUAP here. For each category, we count the different described enhancements of the interval algorithm and the number of function evaluations

5.2 Example Ego 1

Table 2 shows the results of INT-\(\alpha \)BB and INT-QUAP for the first example Ego 1. For every category, the weight \(w_{l}\) and the number of active curves \(\vert C_{l}\vert \) are shown, as well as the number of iterations and boxes. Tables A and B also list the number of boxes that fulfill the box intersection criterion of Sect. 4.2, the gradient monotonicity criterion, and the second-order method criterion for applying concave overestimation or quadratic approximation in Algorithm 3.4. Furthermore, the numbers of real-valued and interval-valued function evaluations of the landmark function \(f_{M}\) and the curve functions \(f_{l}\) are listed, next to the number of evaluations of the concave overestimators \(L^{{\textbf{x}}}_{l}\) (in the \(\alpha \)BB case). The last columns show the run times for each single category and the total algorithm run time.

We start the algorithm with the category of the highest weight \(w_{l}\), because this category is most likely to provide the globally optimal solution. Since the result for the first category already returns the global optimum here, the interval algorithms for the remaining categories stop after a few steps. INT-\(\alpha \)BB performs better than INT-QUAP: a closer examination reveals that \(\alpha \)BB always removes irrelevant boxes, whereas the quadratic approximation leaves some active boxes that do not contain a global optimum. Since the higher-order approaches are expensive, those unsuccessful quadratic approximations affect the run time negatively.

Fig. 7

Visualization of the results for the example Ego 1. A The figure shows all boxes of INT-\(\alpha \)BB for the topmost category, which provides the globally optimal solution. Digital orthophoto: ©GeoBasis-DE/BKG 2020. B The zoom into the optimal solution near the emergency call box shows in green those boxes that qualify for applying \(\alpha \)BB

The maximization for the leading category is visualized in Fig. 7. Figure 7A shows all boxes of the interval algorithm with \(\alpha \)BB; the curves that belong to the specific category are marked in pink. In Fig. 7B, all boxes for which the \(\alpha \)BB method is applied are colored green; in practice, these boxes must be sufficiently small. The optimal solution is close to an emergency lane, with a slight offset towards the emergency call box, as the weighting of the map elements already suggested. This is indeed a plausible emergency stop.

5.3 Example Ego 2

For the second example Ego 2 with the vehicle position \(X_0^2\), the computation of the global maximum results in an emergency stop that is close to the next kilometer sign in the driving direction. Note that \(\delta _{X_0^2}\) has been reduced compared to the first example. We use this example to compare the performance of our algorithm to existing global optimization methods. The results are shown in Table 3.

Table 3 We compare different global optimization methods for the second example Ego 2 by examining the averages of the number of function evaluations, the maximum value and the run time for the different methods. INT-\(\alpha \)BB performs the best among the deterministic algorithms
Fig. 8

We perform 100 runs of the example Ego 2 for each of the nine algorithms (only five runs for GOP) to compare the total numbers of function evaluations and the resulting optimal values. Each dot represents one algorithm run. For the deterministic optimization methods, the values do not vary across runs. A The total numbers of function evaluations. B The optimal values

Apart from our algorithms INT-\(\alpha \)BB and INT-QUAP, we examine INT without any second-order methods (third method) and the \(\alpha \)BB algorithm that computes concave overestimators for every observed box (fourth method). Only the parameter \(\epsilon _d\) needs to be adjusted in our implementation to enforce these borderline cases. Furthermore, we apply several global optimization algorithms implemented in MATLAB by handing over the objective function as a black box. These include MCS (Huyer and Neumaier 1999) and DIRECT (Finkel 2003) (cf. Sect. 1.2), as well as the GOP algorithm (Pál and Csendes 2009), which uses the INTLAB software package (Rump 1999). We apply this advanced interval algorithm to compare against the performance of a state-of-the-art implementation that does not receive specific properties of our objective function. Furthermore, we compare the deterministic algorithms to the stochastic algorithms GLOBAL (Csendes et al. 2008) and CMA-ES (Hansen and Ostermeier 2001).

Table 3 presents the number of function evaluations (in the same detail as Table 2) and the run time of the different methods, as well as the returned optimal value (the average values as well as the observed ranges of optimal values). The plots in Fig. 8 depict the ranges of the total number of function evaluations and of the resulting optimal values, which are especially interesting for the stochastic algorithms. Every dot corresponds to one of the 100 runs that we perform for each of the compared algorithms (only five runs for GOP). We can see that INT-\(\alpha \)BB is the overall winner. This hybrid algorithm is almost twice as fast as our interval algorithm INT without the \(\alpha \)BB enhancement. The large number of pointwise function evaluations in pure \(\alpha \)BB results from the maximization of the overestimators. In this example, INT-QUAP is even a bit worse than INT, which does not use second-order approaches: it creates fewer boxes but spends considerable effort on the quadratic approximation. However, we have also investigated this example with \(\epsilon _f = 10^{-2}\) (rather than \(\epsilon _f = 10^{-3}\)). Then, without reporting details here, INT-\(\alpha \)BB and INT-QUAP process approximately the same number of boxes, and, in these cases, INT-QUAP is slightly faster, since solving the quadratic problem with box constraints is cheaper than maximizing the concave overestimator. Despite processing more boxes, INT then required roughly the same run time as INT-\(\alpha \)BB.

GOP fails here with regard to the run time, although we even provided two function handles, one for the pointwise and one for the interval-based evaluation. Hence, we see the benefit of our customized implementation for this problem. The run time of MCS is comparable to our quadratic approximation interval approach, but the optimal value is a bit worse, and achieving a better value would require a significantly higher run time of MCS. We used \(X_0^2\) as the initial point. Applying MCS to various other examples showed that, in some situations, MCS does not find a sufficiently good approximate global solution within an acceptable run time. We made similar observations for the DIRECT algorithm. Both methods can have difficulties finding the narrow ridges around the map elements where good candidates for global solutions are located. Also, they do not offer a certificate of approximate global optimality, and thus their stopping criteria are based on other conditions that do not guarantee that a sufficiently good solution is always returned.

The stochastic methods GLOBAL and CMA-ES are suitable alternatives to the interval \(\alpha \)BB method in terms of computation time, but they are not deterministic, and, therefore, the range of the returned optimal values is significantly larger. The plots in Fig. 8 show their ranges of the number of function evaluations and of the optimal values. CMA-ES performs fastest, and it is also able to return the globally optimal value to the predefined accuracy. However, this stochastic algorithm does not provide the true optimal value in all of the test runs (cf. last row in Fig. 8). Furthermore, this result has only been reached for an explicitly chosen covariance matrix \(\Sigma \); when we use this choice for the example Ego 1, CMA-ES misses the globally optimal value by more than 0.5. Overall, algorithms that only use pointwise evaluations to find the global optima have limited success here: the support of the objective function is small compared to the search space, and the solutions are close to the map elements. The large gradients of the curve functions \(f_{\gamma }\) impede the success of algorithms that mainly use single samples to infer the behavior of the function, e.g., MCS or DIRECT.

6 Conclusion and outlook

We developed a novel optimization-based approach for deriving secondary information from high definition digital maps based on a given scenario \({\mathcal {S}}\) of lanes, landmarks and weights. Thereby, new points of interest are generated by solving a global maximization problem derived from the scenario. To this end, we introduced a new rigorous deterministic global maximization algorithm that combines and, in several aspects, extends state-of-the-art approaches, enhancing them by exploiting specific properties of our problem. The resulting algorithm achieves a performance that compares favorably with other global optimization methods.

We proposed two variants of our method that both build on an interval algorithm. INT-\(\alpha \)BB outperforms other deterministic algorithms and shows a performance that is competitive with non-deterministic approaches (cf. Table 3). However, the latter do not provide certificates of global optimality, and they would not meet the safety requirements demanded for autonomous driving applications. Although INT-QUAP cannot fully compete with INT-\(\alpha \)BB here, we have observed scenarios with an increased tolerance \(\epsilon _f\) for the optimal function value, in which both algorithms examine roughly the same number of boxes. In that case, the advantage of INT-QUAP of solving box-constrained quadratic problems results in lower run times. In the future, higher-order polynomials could reduce the approximation error, but they would require more elaborate methods for the optimization with box constraints.

The local linearization of the curves has been fundamental for applying second-order techniques. This approach is valid for the highway scenario, since sufficiently small curvatures are guaranteed there. We plan to extend our results to urban scenarios, which are expected to be more complex and challenging due to the increased information density. On the other hand, this can be partly alleviated, as the foresight could be significantly reduced. Therefore, we aim to investigate the validity of the local curve linearization for urban maps. As the suggested optimization approach can be highly parallelized, we plan to exploit this in future work. We will also focus on specifically tailored data structures for the management of the quadrature points.