1 Introduction

Despite being a classical subject, minimum-lap-time planning (MLTP) still stirs very lively research activity, as documented in the recent extensive survey [1]. Besides approaches based on quasi-steady-state (QSS) and transient vehicle models, that survey also summarizes the fundamentals of nonlinear optimal control problems (OCPs), road modeling, and vehicle positioning.

In the recent literature the approaches that tackle MLTP as the solution of an OCP seem the most widely adopted. Among these, direct methods [2–4], based on multiple shooting or collocation, are usually preferred over indirect ones, which typically rely on Pontryagin’s maximum principle, as in [5, 6]. Within the OCP context, the proposed variations mainly consist of the model simplifications tolerated. These can be broadly classified as follows: a priori description of the vehicle’s trajectory [7–9] (as opposed to leaving the trajectory free [3, 10]), QSS models [8, 10, 11] (as opposed to dynamic models [6, 12]), and simplified tire and aerodynamics models [3, 13] (as opposed to elaborate tire models [14] and aerodynamic maps [15]). A further computational classification criterion is the particular OCP suite adopted, such as MUSCOD-II [16], GPOPS-II [17], ICLOCS2 [18], or CasADi [19], and the specific backend solver that handles the resulting NLP; this choice is typically restricted to SNOPT [20], IPOPT [21], or WORHP [22].

The characteristic common to all of the above approaches is that the resulting NLP is solved as a single problem. This inevitably imposes an upper bound on the combined growth of vehicle-model complexity and track length, mainly due to the conceivably high memory footprint and the large number of variables in the resulting NLP. In the present paper such standard approaches are designated as serial methods. Even if some computational steps in the solution of the NLP can take advantage of multicore CPU architectures, no specific MLTP reformulation is devised from the outset to profitably spread the computational load across different processors according to a divide-and-conquer strategy.

For problems whose number of variables grows at an unprecedented scale, distributed optimization algorithms seem the only viable solution [23]. In particular, the alternating direction method of multipliers (ADMM) has proved capable of efficiently tackling, in a distributed manner, global consensus optimization problems in different research contexts, such as control of microgrid electricity networks and smart grids [24, 25], estimation problems solved by networks of agents [26], and optimal asset utilization in power market operation [27], to mention only a few.

In the consensus-based ADMM method the original cost function \(f(\boldsymbol{x})\) is split additively as \(\sum f_{i}(\boldsymbol{x}_{i})\), where each \(f_{i}(\boldsymbol{x}_{i})\) pertains to a specific agent (or worker) that retains a local representation \(\boldsymbol{x}_{i}\) of the global variable \(\boldsymbol{x}\). Specifically, in a distributed computing environment, the agents represent the CPU cores performing the optimization in parallel, and the term “consensus” denotes the requirement for each worker to reach an agreement with its neighbors. One of the nodes, called the master, is responsible for updating the so-called consensus variable \(\boldsymbol{z}\). Each worker seeks to minimize its local objective (based on its data subset) but is driven by a coordination mechanism toward consensus with the other workers. After completing the local optimization, each worker sends its updated local copy \(\boldsymbol{x}_{i}\) to the master. The master, in turn, updates the consensus variable \(\boldsymbol{z}\) so that the \(\boldsymbol{x}_{i}\) reach a consensus and then distributes the updated value back to the workers. The process is repeated until convergence.
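The mechanics of this loop can be illustrated with a minimal, self-contained Python sketch (a toy example of our own, not the MLTP formulation): each worker holds a scalar quadratic objective \(f_{i}(x_{i}) = (x_{i} - a_{i})^{2}\), for which the local augmented-Lagrangian minimization has a closed form, and the master averages the updated copies.

```python
def consensus_admm(a, rho=1.0, iters=100):
    """Global-consensus ADMM on a toy problem:
    minimize sum_i (x_i - a[i])^2  subject to  x_i = z for all i.
    Each 'worker' i holds a local copy x_i; the 'master' updates z."""
    n = len(a)
    x = [0.0] * n   # local copies held by the workers
    y = [0.0] * n   # dual variables (multipliers)
    z = 0.0         # consensus variable held by the master
    for _ in range(iters):
        # workers: argmin of (x-a_i)^2 + y_i*(x - z) + (rho/2)*(x - z)^2,
        # i.e., the stationarity condition 2(x-a_i) + y_i + rho(x-z) = 0
        x = [(2 * a[i] + rho * z - y[i]) / (2 + rho) for i in range(n)]
        # master: the z-update averages the shifted local copies
        z = sum(x[i] + y[i] / rho for i in range(n)) / n
        # dual ascent on the consensus residual x_i - z
        y = [y[i] + rho * (x[i] - z) for i in range(n)]
    return z

z = consensus_admm([1.0, 2.0, 6.0])
```

On this toy problem the consensus value converges to the average of the \(a_{i}\), which is the exact minimizer of \(\sum _{i} f_{i}\).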

Although convergence proofs of the ADMM algorithm are available only under rather special conditions [28], in practice ADMM has proved successful in countless scenarios, and clever adjustments have been proposed in the literature to increase its robustness and to speed up its convergence [29–33]. Many studies have also investigated how convergence speed and robustness are affected by relevant factors such as the topology of the connections between workers [34] and the synchronous/asynchronous flow of information among them [35].

In the context of trajectory optimization, which is the broad area of our work, previous contributions exist, but they pertain to different domains, such as human Mars-entry trajectories for landing missions [36], multiagent/multirobot trajectory synthesis [37, 38], obstacle avoidance for kinematic vehicle models [39], legged locomotion [40], and multicontact model predictive control for robotic manipulation [41]. To the best of our knowledge, however, no prior contribution can be counted among distributed approaches for solving MLTPs with reliable dynamic models from a vehicle engineer’s perspective.

In the present paper, in contrast to prior work on MLTP, we focus on the specific structure of MLTPs and exploit it via ADMM [23]. In more detail, the optimal trajectory planning problem on a long track is broken into several segments, each pertaining to a given sector of the track. Each segment becomes a subproblem with a reduced number of variables and is discretized using the direct collocation method. The subproblems are solved classically, but in parallel on distinct CPU cores. This ensures a reduced memory footprint for each subproblem and increased robustness thanks to the reduced dimensionality. The solutions are iteratively forced to adhere to a single trajectory via properly devised consensus constraints, which progressively enforce continuity of state and input variables across segments through an augmented Lagrangian mechanism. This scheme offloads the overall problem complexity by distributing it among segments (and CPU cores) and enables synthesizing optimal trajectories on very long tracks with dense discretizations. It is worth noting that the proposed formulation is compatible with any existing (serial) trajectory optimizer for solving each subproblem.

Finally, it is worth remarking that MLTPs are of great importance in the automotive field. The capability to effectively address the minimum-lap-time challenge equips the automotive industry with the means to systematically assess different vehicle configurations and determine the most efficient option while providing guidelines for drivers. Furthermore, in the motorsport context, MLTP can be used as a powerful tool for investigating the impact of specific vehicle parameters on the driver’s behavior, as evidenced in [3]. Here we aim to lay the groundwork for extending these benefits to large-scale problem domains.

2 Vehicle model

In this section, we introduce the dynamic vehicle model that serves as a benchmark to assess the potential of the ADMM methodology in solving MLTP problems. The proposed model is described in detail in [42], whereas here, to ensure clarity, we provide only the main highlights.

While the vehicle model used here has previously been applied to an MLTP on the Nordschleife circuit, as documented in [42], our current work introduces a novel algorithm tailored to parallelizing the solution of extensive MLTPs, particularly in multilap scenarios. In contrast to [42], where the primary focus was the description of the vehicle model, the central objective of this paper is to present an original parallel approach, which is crucial for enhancing computational efficiency when dealing with large-scale MLTPs.

Although our vehicle model is relatively accurate, its complexity falls in between a classic double-track model and a fully fledged multibody system. By exploiting mathematical tools mostly used in robotics, namely Lie-group and Lie-algebra methods [43, 44] and recursive algorithms [45], the dynamic equations are systematically structured in the form of a serial kinematic chain. In particular, the recursive algorithm described in [42] is employed, which builds upon the articulated-body algorithm (ABA) in [45]. Although our model does not possess a huge number of degrees of freedom, the ABA offers a systematic approach with \(O(n)\) algorithmic complexity, i.e., scaling linearly with the number of degrees of freedom. This leads to a significant reduction in the volume of algebra during the assembly of the dynamic equations. In contrast, the classical Lagrange-equation-based approach, with its \(O(n^{3})\) complexity, is not considered in this analysis due to its inferior performance.

2.1 Track parameterization

The first ingredient of the proposed model is the track parameterization, which is schematically illustrated in Fig. 1b. The track centerline \(\boldsymbol{c}(q_{1}) = \begin{bmatrix} x_{c}(q_{1}) & y_{c}(q_{1}) & z_{c}(q_{1}) \end{bmatrix} ^{T} \in \mathbb{R}^{3}\), with \(q_{1} \in [0, 1]\), is described by a NURBS curve as a function of the parameter \(q_{1}\), which does not necessarily coincide with the curvilinear abscissa. The local frame \(\{B_{1}\}\), which follows the curve, is obtained using a special orthonormal parameterization that also takes into account the track banking and slope. Specifically, once the tangent vector \(\boldsymbol{t}\) is computed, its projection \(\boldsymbol{t}_{\Pi _{\boldsymbol{k}_{0}}}\) onto the \((xy)\)-plane \(\Pi _{\boldsymbol{k}_{0}}\) of the ground-fixed frame \(\{B_{0}\}\) is employed to derive the intermediate normal vector \(\boldsymbol{v}= \boldsymbol{k}_{0} \times \boldsymbol{t}_{\Pi _{ \boldsymbol{k}_{0}}}\). The final normal and binormal vectors are obtained by rotating \(\boldsymbol{k}_{0}\) and \(\boldsymbol{v}\) by the banking angle \(\nu \) around \(\boldsymbol{t}\). This formulation, in contrast to the classical Frenet–Serret formulas, in addition to accounting for banking, also avoids undesirable direction changes of \(\boldsymbol{n}\).
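The construction above can be sketched numerically as follows. This is a hypothetical plain-Python helper of our own (names and data layout are not from the paper), assuming a unit tangent and that the final normal \(\boldsymbol{n}\) stems from \(\boldsymbol{v}\) while the binormal stems from \(\boldsymbol{k}_{0}\):

```python
import math

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

def rotate(v, axis, angle):
    """Rodrigues' rotation of vector v about a unit axis by 'angle'."""
    c, s = math.cos(angle), math.sin(angle)
    k_dot_v = sum(a * b for a, b in zip(axis, v))
    kxv = cross(axis, v)
    return [v[i] * c + kxv[i] * s + axis[i] * k_dot_v * (1 - c) for i in range(3)]

def track_frame(t, nu):
    """Normal/binormal of the track frame {B1} from a unit tangent t
    and banking angle nu (sketch of the construction in the text)."""
    # project t onto the (x,y)-plane of {B0} and normalize
    t_proj = [t[0], t[1], 0.0]
    norm = math.hypot(t_proj[0], t_proj[1])
    t_proj = [c / norm for c in t_proj]
    k0 = [0.0, 0.0, 1.0]
    v = cross(k0, t_proj)   # intermediate normal v = k0 x t_proj
    n = rotate(v, t, nu)    # final (banked) normal
    b = rotate(k0, t, nu)   # final (banked) binormal
    return n, b

n, b = track_frame([1.0, 0.0, 0.0], 0.0)
```

With zero banking and a tangent along \(x\), the frame reduces to \(\boldsymbol{n} = (0,1,0)\) and \(\boldsymbol{b} = \boldsymbol{k}_{0}\), as expected.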

Fig. 1 Schematics of the vehicle model embedded into the track

In the context of our model, the pose transformation from \(\{B_{0}\}\) to \(\{B_{1}\}\) represents the effect of a special nonelementary joint contained in the overall kinematic chain that represents the vehicle. This joint accounts for the sliding motion of the vehicle along the track centerline and has an associated twist (generalized velocity) induced by the curvilinear path constraint.

2.2 Vehicle parameterization

The model has six degrees of freedom (DoFs), associated with variables \(\boldsymbol{q}\), and is composed of two main parts: a kart-like part, which includes the unsprung masses and the wheels, and a chassis body, which represents the sprung masses. The schematics are shown in Fig. 1a. The first three virtual joints of the kinematic chain, configured as two prismatic joints followed by a revolute joint, constrain the position and orientation of the kart along the track surface. The remaining three joints, consisting of a prismatic joint followed by two revolute joints, connect the kart to the sprung chassis. These last joints also capture the first-order information of the overall suspension geometry as they contain springs and dampers equivalent to bounce \(k\), pitching \(k_{\theta}\), and rolling \(k_{\varphi}\) stiffness coefficients (and their corresponding damping coefficients \(c\), \(c_{\theta}\), and \(c_{\varphi}\)).

Overall, we define seven reference frames denoted as \(\{B_{i}\}\) (\(i = 0,\dots ,6\)), attached to the corresponding \(i\)th link. The frames \(\{B_{0}\}\) and \(\{B_{1}\}\) are the already mentioned ground-fixed frame and the track reference frame.

The relative 1-DoF motions (with the exception of the first one) are kinematically described by the unitary twists \(X_{i} \in \mathbb{R}^{6}\) \((i = 2,\dots , 6)\). These are associated with standard joints (either prismatic or revolute), which are used in the local product-of-exponentials (POE) formula [44] to define each relative homogeneous transformation matrix \(g_{i-1,i}(\boldsymbol{q}) \in SE(3)\) \((i=2,\dots ,6)\), between the links of the chain. Specifically, the last five joints represent (i) the lateral displacement \(q_{2}\) of the vehicle relative to the track centerline, (ii) the yaw angle \(q_{3}\) between the tangent vector of the track centerline and the kart frame \(\{B_{3}\}\), (iii) the vertical displacement \(q_{4}\) of the chassis relative to the ground, (iv) the pitch angle \(q_{5}\) of the chassis, and (v) the roll angle \(q_{6}\) of the chassis.

Among all the body velocities in the chain, the distal twist of \(\{ B_{3} \}\), denoted by \(V_{3}^{3} = [ v_{3_{x}}^{3}\ v_{3_{y}}^{3}\ v_{3_{z}}^{3}\ \omega _{3_{x}}^{3}\ \omega _{3_{y}}^{3}\ \omega _{3_{z}}^{3}]^{T}\), is of special importance when computing tire slips and aerodynamic forces. Furthermore, \(v_{3_{x}}^{3}\) and \(v_{3_{y}}^{3}\) represent the classical longitudinal and lateral velocities usually analyzed in telemetry inspections. The calculation of \(V_{3}^{3}\) is performed during a recursive forward propagation of body poses and twists along the chain, which is needed in the subsequent derivation of the dynamic equations.

2.3 Dynamic equations

The dynamic equations are obtained by adapting Featherstone’s articulated-body algorithm (ABA) [45], which provides an efficient factorization of the direct dynamics of a kinematic chain. Including also the reconstruction equations, the algorithm gives

$$\begin{aligned} \dot{\boldsymbol{x}} = \begin{bmatrix} \dot{\boldsymbol{q}} \\ \dot{\boldsymbol{q}}_{v} \end{bmatrix} = \begin{bmatrix} \boldsymbol{q}_{v} \\ \text{ABAdyn}(\boldsymbol{q}, \boldsymbol{q}_{v}, f_{xa}, f_{xb}, \delta , \boldsymbol{w}) \end{bmatrix} = F(\boldsymbol{x}). \end{aligned}$$
(1)

Here the state is defined as \(\boldsymbol{x}= [\boldsymbol{q}^{T}\ \boldsymbol{q}_{v}^{T}]^{T}\), and \(\boldsymbol{q}_{v}\) and \(\dot{\boldsymbol{q}}_{v}\) represent the joint velocities and accelerations, respectively. By \(f_{xa}\) and \(f_{xb}\) we denote the total accelerating and braking forces exchanged between the tires and the road, respectively. The steer angle \(\delta \) is an additional input to the system. To solve the algebraic loops that arise during the calculation of the load transfers, we introduce additional algebraic variables \(\boldsymbol{w}\in \mathbb{R}^{7}\), whose terms are detailed later.

The interactions between the tire and track, globally represented by the resultant wrench \(W_{3}^{3} \in \mathbb{R}^{6}\) (generalized force acting on \(\{ B_{3} \}\)), can be split into two contributions, the in-plane (plane locally tangent to the road) \(W_{3_{E}}^{3} = [f_{3_{x}}^{3}\ f_{3_{y}}^{3} \ 0 \ 0 \ 0 \ m_{3_{z}}^{3} ]^{T}\) and out-of-plane \(W_{3_{J}}^{3} = [ 0\ 0 \ f_{3_{z}}^{3} \ m_{3_{x}}^{3}\ m_{3_{y}}^{3}\ 0 ]^{T}\) components. Since the vehicle is parameterized as a serial kinematic chain, the out-of-plane components arise, in the ABA framework, from the structural constraints exerted by the third (virtual) joint on \(\{ B_{3} \}\), whereas the in-plane components act as external “driving” forces. It is worth remarking that \(W_{3}^{3} = W_{3_{E}}^{3} + W_{3_{J}}^{3}\) has its in-plane components evaluated as the resultants of the forces exchanged between the road and each tire, encoded in \(f_{ij_{x}}\), \(f_{ij_{y}}\), and \(f_{ij_{z}}\) for the \(ij\)th wheel.

Assuming that a rear-wheel drive vehicle is equipped with an open differential, the longitudinal tire forces can be expressed as

$$\begin{aligned} f_{11_{x}} &= f_{12_{x}} = \frac{1}{2}f_{xb}k_{b}, \end{aligned}$$
(2)
$$\begin{aligned} f_{21_{x}} = f_{22_{x}} &= \frac{1}{2}f_{xb}(1-k_{b}) + \frac{1}{2}f_{xa}, \end{aligned}$$
(3)

where \(k_{b}\) is the braking ratio.
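Equations (2)–(3) amount to a simple per-axle force split, transcribed below as a sketch (function name and test values are our own):

```python
def longitudinal_forces(f_xa, f_xb, k_b):
    """Per-wheel longitudinal tire forces of Eqs. (2)-(3) for a
    rear-wheel-drive vehicle with an open differential.
    f_xa: total accelerating force, f_xb: total braking force,
    k_b: braking ratio (front share of the braking force)."""
    f_front = 0.5 * f_xb * k_b                      # f_11x = f_12x
    f_rear = 0.5 * f_xb * (1 - k_b) + 0.5 * f_xa    # f_21x = f_22x
    return f_front, f_rear

# illustrative values (not from the paper)
ff, fr = longitudinal_forces(f_xa=2000.0, f_xb=-3000.0, k_b=0.6)
```

By construction the four wheel forces sum to the total \(f_{xa} + f_{xb}\).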

The lateral forces \(f_{ij_{y}}\) are derived from Pacejka’s magic formula [46] as functions of the vertical forces \(f_{ij_{z}}\) acting on each wheel.

2.4 Additional algebraic equations

The out-of-plane components present the issue of giving rise to an algebraic loop: the tire vertical forces \(f_{ij_{z}}\) depend on the overall system variables \(\boldsymbol{q}\) and their derivatives \(\boldsymbol{q}_{v}\) and \(\dot {\boldsymbol{q}}_{v}\), which, in turn, depend on the in-plane forces. In the present implementation, we employ the following algebraic variables \(\boldsymbol{w}\) to conveniently cut open this loop:

$$ \boldsymbol{w}= \begin{bmatrix} f_{11_{z}}&f_{12_{z}}&f_{21_{z}}&f_{22_{z}}& f_{3_{x}}^{3} &f_{3_{y}}^{3} &m_{3_{z}}^{3}\end{bmatrix} ^{T}, $$
(4)

where \(f_{3_{x}}^{3}\) and \(f_{3_{y}}^{3}\) are the in-plane resultant forces (longitudinal and lateral), and \(m_{3_{z}}^{3}\) is the corresponding resultant moment along \(\boldsymbol{k}_{3}\) (yaw moment), which are the nonzero components of \(W_{3_{E}}^{3}\).

Therefore seven additional algebraic equations are required. The first four equations are associated with the vertical loads and are expressed as follows:

$$ f_{ij_{z}} = f_{z_{i0}} + f_{z_{ia}} + \Delta f_{z} + (-1)^{j}\Delta f_{z_{i}}, $$
(5)

where the notation is borrowed from [47]. The terms are, in order, the static load, the aerodynamic load, and the longitudinal and lateral load transfers. These terms are linked to the kinematic quantities and structural reaction loads of the third joint, as well as to the in-plane forces, and can be easily obtained during the unfolding of the ABA algorithm. The remaining three equations, expressing the resultants of the in-plane components, are

$$\begin{aligned} f_{3_{x}}^{3} &= (f_{11_{x}} + f_{12_{x}})\cos \delta -(f_{11_{y}} + f_{12_{y}}) \sin \delta + f_{21_{x}} + f_{22_{x}}, \end{aligned}$$
(6)
$$\begin{aligned} f_{3_{y}}^{3} &= (f_{11_{y}} + f_{12_{y}})\cos \delta +(f_{11_{x}} + f_{12_{x}}) \sin \delta + f_{21_{y}} + f_{22_{y}}, \end{aligned}$$
(7)
$$\begin{aligned} m_{3_{z}}^{3} &= \left [(f_{11_{y}} + f_{12_{y}})\cos \delta +(f_{11_{x}} + f_{12_{x}})\sin \delta \right ]a_{1} -(f_{21_{y}} + f_{22_{y}})a_{2}. \end{aligned}$$
(8)

The quantities \(a_{1}\) and \(a_{2}\) in (8) represent the longitudinal distances of \(G_{6}\) from the front and rear axles, respectively.
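As a numerical check, Eqs. (6)–(8) can be transcribed directly. The sketch below is our own; the dictionary-based data layout, keyed by \((i,j)\) wheel indices, is merely a convention for illustration:

```python
import math

def inplane_resultants(fx, fy, delta, a1, a2):
    """Resultant in-plane components f3x, f3y, m3z of Eqs. (6)-(8).
    fx, fy: per-wheel forces keyed by (i, j); i=1 front axle, i=2 rear axle.
    delta: steer angle; a1, a2: distances of G6 from front/rear axles."""
    cf, sf = math.cos(delta), math.sin(delta)
    fx_f = fx[(1, 1)] + fx[(1, 2)]   # front-axle longitudinal forces
    fy_f = fy[(1, 1)] + fy[(1, 2)]   # front-axle lateral forces
    fx_r = fx[(2, 1)] + fx[(2, 2)]   # rear-axle longitudinal forces
    fy_r = fy[(2, 1)] + fy[(2, 2)]   # rear-axle lateral forces
    f3x = fx_f * cf - fy_f * sf + fx_r                 # Eq. (6)
    f3y = fy_f * cf + fx_f * sf + fy_r                 # Eq. (7)
    m3z = (fy_f * cf + fx_f * sf) * a1 - fy_r * a2     # Eq. (8)
    return f3x, f3y, m3z

# illustrative values (not from the paper)
fx = {(1, 1): 100.0, (1, 2): 100.0, (2, 1): 200.0, (2, 2): 200.0}
fy = {(1, 1): 50.0, (1, 2): 50.0, (2, 1): 80.0, (2, 2): 80.0}
f3x, f3y, m3z = inplane_resultants(fx, fy, delta=0.0, a1=1.2, a2=1.4)
```

For \(\delta = 0\) the expressions reduce to plain sums, so the transcription is easy to verify by hand.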

Additional constraints are also imposed during the MLTP construction to account for power limits, adherence, complementarity constraints between the accelerating and braking forces, and path constraints required to remain within the track bounds, as detailed in Sect. 4. Further details of the model employed are given in [42].

3 ADMM approach to the solution of MLTPs

The aim of our work is to solve an MLTP on a given track with a specific vehicle model. The discretized optimal control problem takes the form of a general nonlinear programming problem (NLP) described by the following equations:

$$\begin{aligned} \underset{\boldsymbol{x}}{\text{minimize}}\;&f(\boldsymbol{x}), \end{aligned}$$
(9a)
$$\begin{aligned} \text{s.t.}\hspace{6mm} &\boldsymbol{g}(\boldsymbol{x}) = 0, \end{aligned}$$
(9b)
$$\begin{aligned} &\boldsymbol{h}(\boldsymbol{x}) \leq 0, \end{aligned}$$
(9c)

where \(\boldsymbol{x}\in \mathbb{R}^{N}\) are the optimization variables, \(f(\boldsymbol{x}) \in \mathbb{R}\) is the cost function, \(\boldsymbol{g}(\boldsymbol{x}) \in \mathbb{R}^{N_{g}}\) is the set of equality constraints, and \(\boldsymbol{h}(\boldsymbol{x}) \in \mathbb{R}^{N_{h}}\) is the set of inequality constraints.

Among the many techniques available to discretize and solve OCPs (see, e.g., [48]), in this study we use the direct collocation method (more details are provided in Sect. 4.2). Clearly, the problem dimension can increase dramatically when the minimum-lap-time planning horizon grows large, as in the case of long race tracks. The maximum size of the MLTP that can be solved is limited by both the complexity of the dynamic model and the performance of the hardware used. Within such limits, the problem can be solved as a single, monolithic one. In this case, we refer to such an approach as a serial approach and to its corresponding solution as a serial solution.

When these limits are exceeded, a parallel approach becomes mandatory. In this case, we refer to such an approach as a parallel approach and to its corresponding solution as a parallel solution. In this perspective, a set of multicore processors can provide not only an effective computational improvement but may also represent the only way to handle huge problems, as demonstrated in many applications; see, e.g., [24–27]. Among the most successful approaches to tackle large-scale optimization problems in parallel is the alternating direction method of multipliers (ADMM). This is the preferred choice in the present paper due to its relative ease of implementation and intrinsic robustness. The main steps of the ADMM and the original adaptations specific to our OCP are described hereafter.

3.1 Partition of the variables in the parallel approach

The original problem is divided into \(N_{p}\) subproblems. The discretized states, controls, and algebraic variables are organized so that, internally to the \(i\)th sector, the sequence of states \(\boldsymbol{s}_{i}=\{ \boldsymbol{s}_{i,0}\ \boldsymbol{s}_{i,1}\ \cdots \ \boldsymbol{s}_{i,n_{i}} \}\) consists of \(n_{i}+1\) discretized samples of \(\boldsymbol{s}\in \mathbb{R}^{n_{s}}\), the sequence of controls \(\boldsymbol{u}_{i}=\{ \boldsymbol{u}_{i,0}\ \boldsymbol{u}_{i,1}\ \cdots\ \boldsymbol{u}_{i,{n_{i}-1}} \}\) represents \(n_{i}\) discretized samples of \(\boldsymbol{u}\in \mathbb{R}^{n_{u}}\), and the sequence of algebraic variables \(\boldsymbol{w}_{i}=\{ \boldsymbol{w}_{i,0}\ \boldsymbol{w}_{i,1}\ \cdots\ \boldsymbol{w}_{i,n_{i}-1} \}\) stands for \(n_{i}\) samples of \(\boldsymbol{w}\in \mathbb{R}^{n_{w}}\). Therefore \(\boldsymbol{s}_{i}\in \mathbb{R}^{N_{s_{i}}}\), \(\boldsymbol{u}_{i}\in \mathbb{R}^{N_{u_{i}}}\), and \(\boldsymbol{w}_{i}\in \mathbb{R}^{N_{w_{i}}}\), where \(N_{s_{i}}=(n_{i}+1) n_{s}\), \(N_{u_{i}}=n_{i} n_{u}\), and \(N_{w_{i}} = n_{i} n_{w}\). For convenience, we cast the internal augmented sequence in the \(i\)th sector as \(\boldsymbol{x}_{i}=\{ \boldsymbol{s}_{i}\ \boldsymbol{u}_{i}\ \boldsymbol{w}_{i} \}\in \mathbb{R}^{N_{i}}\), where \(N_{i}=(n_{i}+1)n_{s} + n_{i} (n_{u}+n_{w})\).
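The resulting bookkeeping is straightforward. The sketch below (a helper of our own) computes the per-sector variable counts \(N_{i}\); the sample dimensions \(n_{s}=12\), \(n_{u}=3\), and \(n_{w}=7\) are our reading of the model in Sect. 2 (twelve states \([\boldsymbol{q}^{T}\ \boldsymbol{q}_{v}^{T}]^{T}\), the inputs \(f_{xa}\), \(f_{xb}\), \(\delta\), and \(\boldsymbol{w}\in \mathbb{R}^{7}\)), while the sector lengths are purely illustrative:

```python
def sector_sizes(n_list, n_s, n_u, n_w):
    """Number of NLP variables per sector: sector i holds n_i + 1 state
    samples and n_i control/algebraic samples, so
    N_i = (n_i + 1) * n_s + n_i * (n_u + n_w)."""
    return [(n_i + 1) * n_s + n_i * (n_u + n_w) for n_i in n_list]

# three sectors with 100, 120, and 90 discretization intervals (illustrative)
sizes = sector_sizes([100, 120, 90], n_s=12, n_u=3, n_w=7)
```

Summing the entries of `sizes` gives the total number of variables handled in parallel, each subproblem being much smaller than the monolithic NLP.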

To account for the states, controls, and algebraic variables at the boundaries between neighboring sectors, it is convenient to define the head (h) and tail (t) sequences. With reference to the \(i\)th sector, assuming that \(o\) is the length of the sequence of states in the overlapping area, we define the head quantities \(\boldsymbol{s}_{i,h}=\{ \boldsymbol{s}_{i,-o} \ \boldsymbol{s}_{i,-o+1}\ \cdots \ \boldsymbol{s}_{i,0} \}\), \(\boldsymbol{u}_{i,h}=\{ \boldsymbol{u}_{i,-o}\ \boldsymbol{u}_{i,-o+1}\ \cdots\ \boldsymbol{u}_{i,-1} \}\), and \(\boldsymbol{w}_{i,h}=\{ \boldsymbol{w}_{i,-o}\ \boldsymbol{w}_{i,-o+1}\ \cdots \ \boldsymbol{w}_{i,-1} \}\). Similarly, we define the tail quantities \(\boldsymbol{s}_{i,t}=\{ \boldsymbol{s}_{i,n_{i}} \ \boldsymbol{s}_{i,n_{i}+1}\ \cdots \ \boldsymbol{s}_{i,n_{i}+o} \}\), \(\boldsymbol{u}_{i,t}=\{ \boldsymbol{u}_{i,n_{i}}\ \boldsymbol{u}_{i,n_{i}+1}\ \cdots \ \boldsymbol{u}_{i,n_{i}+o-1} \}\), and \(\boldsymbol{w}_{i,t}=\{ \boldsymbol{w}_{i,n_{i}}\ \boldsymbol{w}_{i,n_{i}+1}\ \cdots \ \boldsymbol{w}_{i,n_{i}+o-1} \}\). It is handy to cast the augmented head and tail sequences in the \(i\)th sector as \(\boldsymbol{x}_{i,h}=\{ \boldsymbol{s}_{i,h}\ \boldsymbol{u}_{i,h}\ \boldsymbol{w}_{i,h} \}\) and \(\boldsymbol{x}_{i,t}=\{ \boldsymbol{s}_{i,t}\ \boldsymbol{u}_{i,t}\ \boldsymbol{w}_{i,t} \}\), respectively.

With reference to Fig. 2, considering as an example the situation at the boundary between sectors \(i\) and \(i+1\), it is worth observing that \(\boldsymbol{x}_{i,t}\) and \(\boldsymbol{x}_{i+1,h}\) are local representations of the augmented states of the \(i\)th and \((i+1)\)th sectors in the transition area. Then, to gradually enforce coherence in a consensus fashion, the consensus augmented states \(\boldsymbol{z}_{i}\) are introduced, whose role is to negotiate the possibly conflicting requirements of adjacent sectors. In particular, with reference to the transition between sectors \(i\) and \(i+1\), coherence is attained when the following conditions are met: \(\boldsymbol{x}_{i,t}=\boldsymbol{x}_{i+1,h}=\boldsymbol{z}_{i}\).

Fig. 2 Conceptual scheme for the allocation of variables among adjacent sectors (subproblems running in parallel) in the ADMM setting. The upper panel shows a starting condition. The lower panel depicts a typical situation obtained upon convergence, where adjacent states and the corresponding consensus variables have reached an agreement

We also introduce the extended head (H) and extended tail (T) sequences. Assuming that \(e\) is the length of the sequence of states in the extended areas, we define the extended head quantities \(\boldsymbol{s}_{i,H}=\{ \boldsymbol{s}_{i,-e} \ \boldsymbol{s}_{i,-e+1}\ \cdots\ \boldsymbol{s}_{i,-1} \}\), \(\boldsymbol{u}_{i,H}=\{ \boldsymbol{u}_{i,-e}\ \boldsymbol{u}_{i,-e+1}\ \cdots \ \boldsymbol{u}_{i,-1} \}\), and \(\boldsymbol{w}_{i,H}=\{ \boldsymbol{w}_{i,-e}\ \boldsymbol{w}_{i,-e+1}\ \cdots\ \boldsymbol{w}_{i,-1} \}\). Similarly, we define the extended tail quantities \(\boldsymbol{s}_{i,T}=\{ \boldsymbol{s}_{i,n_{i}+1} \ \boldsymbol{s}_{i,n_{i}+2}\ \cdots \ \boldsymbol{s}_{i,n_{i}+e} \}\), \(\boldsymbol{u}_{i,T}=\{ \boldsymbol{u}_{i,n_{i}}\ \boldsymbol{u}_{i,n_{i}+1}\ \cdots \ \boldsymbol{u}_{i,n_{i}+e-1} \}\), and \(\boldsymbol{w}_{i,T}=\{ \boldsymbol{w}_{i,n_{i}} \ \boldsymbol{w}_{i,n_{i}+1}\ \cdots\ \boldsymbol{w}_{i,n_{i}+e-1} \}\). For brevity, the extended augmented head and tail sequences in the \(i\)th sector are cast as \(\boldsymbol{x}_{i,H}=\{ \boldsymbol{s}_{i,H} \ \boldsymbol{u}_{i,H}\ \boldsymbol{w}_{i,H} \}\) and \(\boldsymbol{x}_{i,T}=\{ \boldsymbol{s}_{i,T} \ \boldsymbol{u}_{i,T}\ \boldsymbol{w}_{i,T} \}\), respectively. It is worth noting that incorporating segments of variables that extend into neighboring sectors can aid in promoting agreement among the independent solutions of adjacent sectors right from the initial ADMM iteration, as detailed in Sect. 4.3.2.

It is also convenient to introduce \({\hat{\boldsymbol{x}}}_{i}=\{ \boldsymbol{x}_{i,H} \ \boldsymbol{x}_{i}\ \boldsymbol{x}_{i,T}\}\) and \({{\check{\boldsymbol{x}}}_{i}}= \{ \boldsymbol{z}_{i-1}\ \boldsymbol{x}_{i}\}\). It is worth noting that, while building \({{\check{\boldsymbol{x}}}_{i}}\), the duplication of \(\boldsymbol{s}_{i,0}\), contained in both \(\boldsymbol{z}_{i-1}\) and \(\boldsymbol{x}_{i}\), is avoided by counting it only once during the concatenation. Hereafter, for concision, both the augmented sequences \(\boldsymbol{x}_{i}\) and the extended augmented sequences \({\hat{\boldsymbol{x}}}_{i}\) will be equivalently referred to as variables (\(\boldsymbol{x}_{i}\)) and extended variables (\({ \hat{\boldsymbol{x}}}_{i}\)), respectively.

3.2 A naive parallel approach

Having partitioned the variables \(\boldsymbol{x}\) as described above, the solution of the original problem (9a)–(9c) can be recovered by properly piecing together the solutions for all sectors. In principle, for the \(i\)th sector, the solution \({\hat{\boldsymbol{x}}}^{*}_{i}\) can be computed independently of the others as follows:

$$\begin{aligned} {\hat{\boldsymbol{x}}}^{*}_{i} =\, & \underset{{\hat{\boldsymbol{x}}}_{i}}{\operatorname{arg min}}\; f_{i}({ \hat{\boldsymbol{x}}}_{i}), \end{aligned}$$
(10a)
$$\begin{aligned} \text{s.t.}\hspace{6mm} &\boldsymbol{x}_{i,h}- \boldsymbol{z}_{i-1} = 0, \end{aligned}$$
(10b)
$$\begin{aligned} &\boldsymbol{x}_{i,t}- \boldsymbol{z}_{i} = 0, \end{aligned}$$
(10c)
$$\begin{aligned} &\boldsymbol{g}({\hat{\boldsymbol{x}}}_{i}) = 0, \end{aligned}$$
(10d)
$$\begin{aligned} &\boldsymbol{h}({\hat{\boldsymbol{x}}}_{i}) \leq 0 , \end{aligned}$$
(10e)

where (10b) and (10c) are, as explained above, the equality constraints necessary to reestablish coherence in the transitions between sectors \(i-1\) and \(i\) and between sectors \(i\) and \(i+1\). Hereafter, for concision, these constraints will simply be referred to as consensus constraints. Note, however, that the optimal values of the consensus variables \(\boldsymbol{z}_{i}\) and \(\boldsymbol{z}_{i-1}\) are not known in advance. Therefore a mechanism to make them progress toward the unknown optimal transition points must be devised.

3.3 A consensus-based ADMM parallel approach

With reference to [23], to solve in parallel problems of the form (10a)–(10e), one clever mechanism is provided by the alternating direction method of multipliers algorithm. It is worth remarking that the method presented hereafter is our original adaptation of the classical algorithm found in [23].

At a generic \(k\)th iteration of the ADMM algorithm, the following augmented Lagrangian is introduced for the \(i\)th sector:

$$ \begin{aligned} & L_{i}({\hat{\boldsymbol{x}}}_{i}; \boldsymbol{z}_{i-1}^{k}, \boldsymbol{z}_{i}^{k}; \boldsymbol{y}_{i,t}^{k}, \boldsymbol{y}_{i-1,h}^{k}) = f_{i}({\hat{\boldsymbol{x}}}_{i}) + \phi (\boldsymbol{x}_{i,t}, \boldsymbol{x}_{i,h}, \boldsymbol{z}_{i-1}^{k}, \boldsymbol{z}_{i}^{k}; \boldsymbol{y}_{i,t}^{k}, \boldsymbol{y}_{i-1,h}^{k}), \end{aligned} $$
(11)

where we defined the auxiliary function \(\phi (\cdot )\) as follows:

$$ \begin{aligned} \phi (\cdot ) = \boldsymbol{y}_{i,t}^{k T}(\boldsymbol{x}_{i,t}- \boldsymbol{z}_{i}^{k}) + \frac{\rho _{i,t}^{k}}{2}\lVert \boldsymbol{x}_{i,t}-\boldsymbol{z}_{i}^{k}\rVert _{2}^{2} + \boldsymbol{y}_{i-1,h}^{k T}(\boldsymbol{x}_{i,h}-\boldsymbol{z}_{i-1}^{k}) + \frac{\rho _{i,h}^{k}}{2}\lVert \boldsymbol{x}_{i,h}-\boldsymbol{z}_{i-1}^{k} \rVert _{2}^{2}. \end{aligned} $$
(12)

The vectors \(\boldsymbol{y}_{i,t}^{k}\), \(\boldsymbol{y}_{i-1,h}^{k}\) are the dual variables (multipliers), and \(\rho ^{k}_{i,t}, \rho ^{k}_{i,h} > 0\) are penalty parameters. This new cost is composed of the partial cost \(f_{i}({\hat{\boldsymbol{x}}}_{i})\) and the term \(\phi (\cdot )\), which penalizes the mismatch at the boundaries of sector \(i\), i.e., between \(\boldsymbol{x}_{i,t}\) and \(\boldsymbol{z}_{i}^{k}\) and between \(\boldsymbol{x}_{i,h}\) and \(\boldsymbol{z}_{i-1}^{k}\), via an augmented Lagrangian strategy.
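For concreteness, the auxiliary function of Eq. (12) translates into a few lines. This is a sketch of our own over plain Python lists; argument names and data layout are illustrative conventions:

```python
def phi(x_t, x_h, z_prev, z_cur, y_t, y_h, rho_t, rho_h):
    """Auxiliary penalty of Eq. (12): for sector i, a linear (dual) term
    plus a quadratic penalty on the tail residual x_{i,t} - z_i and on
    the head residual x_{i,h} - z_{i-1}."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    r_t = [a - b for a, b in zip(x_t, z_cur)]    # tail mismatch
    r_h = [a - b for a, b in zip(x_h, z_prev)]   # head mismatch
    return (dot(y_t, r_t) + 0.5 * rho_t * dot(r_t, r_t)
            + dot(y_h, r_h) + 0.5 * rho_h * dot(r_h, r_h))
```

When both boundary mismatches vanish, \(\phi(\cdot) = 0\) and the augmented Lagrangian (11) reduces to the partial cost \(f_{i}({\hat{\boldsymbol{x}}}_{i})\).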

One ADMM iteration is composed of three fundamental steps. In the first step, the following problem is solved:

$$\begin{aligned} {\hat{\boldsymbol{x}}}_{i}^{k+1} = \underset{{\hat{\boldsymbol{x}}}_{i}}{\operatorname{arg min}} \; &L_{i}({ \hat{\boldsymbol{x}}}_{i}; \boldsymbol{z}_{i-1}^{k}, \boldsymbol{z}_{i}^{k}; \boldsymbol{y}_{i,t}^{k}, \boldsymbol{y}_{i-1,h}^{k}), \end{aligned}$$
(13a)
$$\begin{aligned} \text{s.t.}\hspace{6mm} &\boldsymbol{g}({\hat{\boldsymbol{x}}}_{i}) = 0, \end{aligned}$$
(13b)
$$\begin{aligned} &\boldsymbol{h}({\hat{\boldsymbol{x}}}_{i}) \leq 0. \end{aligned}$$
(13c)

Let us denote by \({}^{\sharp}G_{i}^{k}(\boldsymbol{z}_{i}) = {^{\sharp}}L_{i}^{k}( \boldsymbol{z}_{i}) + {^{\sharp}}L_{i+1}^{k}(\boldsymbol{z}_{i})\) the sum of those cost functions where the \(i\)th consensus variable \(\boldsymbol{z}_{i}\) comes into play. The notation implies that the evaluation of the function \(G_{i}^{k}(\boldsymbol{z}_{i})\) takes place using quantities at the \(k\)th ADMM iteration except for the argument \({\hat{\boldsymbol{x}}}_{i}^{k}\), which is replaced with its best update \({\hat{\boldsymbol{x}}}_{i}^{k+1}\), computed from (13a)–(13c). Then, as the second step of the ADMM procedure, the best a posteriori update for the consensus \(\boldsymbol{z}_{i}^{k+1}\) is computed as follows:

$$ \boldsymbol{z}_{i}^{k+1} = \underset{\boldsymbol{z}_{i}}{\operatorname{arg min}} {^{\sharp}}G_{i}^{k}(\boldsymbol{z}_{i}). $$
(14)

An analytical solution for (14) can be derived, which defines the following simple update rule:

$$\begin{aligned} \boldsymbol{z}_{i}^{k+1} = \frac{\left (\rho _{i,t}^{k}\boldsymbol{x}_{i,t}^{k+1} + \rho _{i+1,h}^{k}\boldsymbol{x}_{i+1,h}^{k+1} \right ) + \left ( \boldsymbol{y}_{i,t}^{k}+ \boldsymbol{y}_{i,h}^{k}\right )}{\rho _{i,t}^{k} + \rho _{i+1,h}^{k}}. \end{aligned}$$
(15)
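For completeness, the update rule (15) can be verified from the first-order optimality condition of (14): setting the gradient of \({}^{\sharp}G_{i}^{k}\) with respect to \(\boldsymbol{z}_{i}\) to zero gives

$$ \nabla _{\boldsymbol{z}_{i}}\, {}^{\sharp}G_{i}^{k} = -\boldsymbol{y}_{i,t}^{k} - \rho _{i,t}^{k}\left (\boldsymbol{x}_{i,t}^{k+1} - \boldsymbol{z}_{i}\right ) - \boldsymbol{y}_{i,h}^{k} - \rho _{i+1,h}^{k}\left (\boldsymbol{x}_{i+1,h}^{k+1} - \boldsymbol{z}_{i}\right ) = \boldsymbol{0}, $$

and, since \({}^{\sharp}G_{i}^{k}\) is strictly convex in \(\boldsymbol{z}_{i}\) for positive penalties, solving this linear equation for \(\boldsymbol{z}_{i}\) yields exactly (15).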

As the third and last step of the ADMM procedure, the update of the dual variables is performed following an integral law as follows:

$$\begin{aligned} \boldsymbol{y}_{i,h}^{k+1}&= \boldsymbol{y}_{i,h}^{k}+ \rho _{i+1,h}^{k} ( \boldsymbol{x}_{i+1,h}^{k+1} - \boldsymbol{z}_{i}^{k+1} ), \end{aligned}$$
(16a)
$$\begin{aligned} \boldsymbol{y}_{i,t}^{k+1}&= \boldsymbol{y}_{i,t}^{k}+ \rho _{i,t}^{k} ( \boldsymbol{x}_{i,t}^{k+1} - \boldsymbol{z}_{i}^{k+1} ). \end{aligned}$$
(16b)

To summarize, the ADMM algorithm consists of the steps described in Eqs. (13a)–(13c), (15), and (16a)–(16b). The algorithm proceeds until certain convergence criteria are met; in particular, a stopping criterion for the ADMM algorithm is presented in Sect. 3.4. A general idea of the flow of information involved in the update steps in Eqs. (16a)–(16b) can be gathered from Fig. 3.
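To make the three steps concrete, the following Python sketch runs consensus ADMM on a toy two-sector problem. It is purely illustrative (the paper's implementation uses the MATLAB interface to CasADi with IPOPT solving each sector NLP): here each sector NLP (13a)–(13c) is replaced by a scalar quadratic whose minimizer is known in closed form, so the x-, z-, and y-updates of Eqs. (13a)–(13c), (15), and (16a)–(16b) can be followed end to end. All function names and the quadratic costs are our own assumptions.

```python
# Toy consensus ADMM: sector 1 minimizes (x - a)^2, sector 2 minimizes
# (x - b)^2, and a single scalar consensus variable z couples sector 1's
# tail x1t to sector 2's head x2h.

def solve_sector(c, y, rho, z):
    # Step 1, cf. (13a)-(13c): argmin_x (x - c)^2 + y*(x - z) + rho/2*(x - z)^2.
    # In the real MLTP this argmin is an IPOPT call on the sector NLP.
    return (2.0 * c - y + rho * z) / (2.0 + rho)

def admm(a, b, rho_t=1.0, rho_h=1.0, iters=60):
    z, y_t, y_h = 0.0, 0.0, 0.0
    for _ in range(iters):
        x1t = solve_sector(a, y_t, rho_t, z)   # tail of sector 1
        x2h = solve_sector(b, y_h, rho_h, z)   # head of sector 2
        # Step 2, cf. (15): analytic consensus update.
        z = (rho_t * x1t + rho_h * x2h + y_t + y_h) / (rho_t + rho_h)
        # Step 3, cf. (16a)-(16b): integral update of the multipliers.
        y_t += rho_t * (x1t - z)
        y_h += rho_h * (x2h - z)
    return x1t, x2h, z
```

Since the coupled problem minimizes \((x-a)^{2} + (x-b)^{2}\) under consensus, the boundary values should agree and converge to \((a+b)/2\).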

Fig. 3

Block-diagram representation of our ADMM approach tailored to the solution of MLTPs. From left to right the three ADMM steps described in Eqs. (13a)–(13c), (14), and (16a)–(16b) are illustrated. From top to bottom it is possible to see how the consensus variables and multipliers are shared between adjacent sectors and how they interact with each other in the unfolding computations (Color figure online)

3.4 Proposed stopping criterion

Necessary and sufficient optimality conditions for the ADMM problem are presented by Boyd et al. [23], along with a reasonable termination criterion to determine ADMM convergence. Their criterion has been widely used and tested in the literature; see, e.g., [27, 31], or [39].

Here, inspired by their approach, we propose an adaptation, which is more practical in the context of MLTP solutions.

The convergence of ADMM is characterized in terms of the residuals

$$ \boldsymbol{r}^{k}_{i,h}= \boldsymbol{x}_{i,h}^{k} - \boldsymbol{z}_{i-1}^{k},\qquad \boldsymbol{r}^{k}_{i,t}= \boldsymbol{x}_{i,t}^{k} - \boldsymbol{z}_{i}^{k},\qquad \boldsymbol{d}^{k}_{i}= \boldsymbol{z}_{i}^{k + 1} - \boldsymbol{z}_{i}^{k}, $$
(17)

where \(\boldsymbol{r}^{k}_{i,h}\) and \(\boldsymbol{r}^{k}_{i,t}\) are the head and tail primal residuals, respectively, and \(\boldsymbol{d}^{k}_{i}\) is the dual residual. It is worth pointing out a peculiarity of the first step of the proposed formulation. In the original formulation by Boyd et al. [23], the constraints (13b) and (13c) are restricted to linear equality constraints, which are treated with a penalty approach that incorporates them into the cost function; each of them then plays a role in the primal residual. In the present study, we take a different approach. Since constraints (13b) and (13c) are nonlinear and lack a direct dependence on the consensus variables, they are fulfilled at each iteration of the ADMM algorithm by separate interior-point solver (IPOPT) instances that solve problems (13a)–(13c) (in parallel). As a consequence, the primal residuals are computed only as the error between the states \(\boldsymbol{x}_{i,h}^{k}\) and \(\boldsymbol{x}_{i,t}^{k}\), which always comply with constraints (13b) and (13c), and their corresponding consensus variables \(\boldsymbol{z}_{i-1}^{k}\) and \(\boldsymbol{z}_{i}^{k}\).

As discussed in [23], a reasonable termination criterion is that the primal and dual residuals must be small. Considering that in our problem the states, controls, and algebraic parameters have precise physical meanings for the vehicle engineer, it makes more sense to declare ADMM convergence when the following inequalities hold for each sector:

$$ |\boldsymbol{r}^{k}_{i,h}| \leq \boldsymbol{\epsilon }_{r}, \qquad | \boldsymbol{r}^{k}_{i,t}| \leq \boldsymbol{\epsilon }_{r}, \qquad | \boldsymbol{d}^{k}_{i}| \leq \boldsymbol{\epsilon }_{d}, $$
(18)

where \(\boldsymbol{\epsilon }_{r}\) and \(\boldsymbol{\epsilon }_{d}\) are vectors of tolerance values (possibly with different components), and \(|\cdot |\) returns a vector with the absolute values of each component of its argument; \(\boldsymbol{\epsilon }_{r}\) is defined as the primal residual tolerance and \(\boldsymbol{\epsilon }_{d}\) as the dual residual tolerance. This allows a finer-grained treatment of the convergence for all state, control, and algebraic variables.
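The component-wise test (18) can be sketched as follows. This is an illustrative Python fragment, not the paper's code: the names are our own, and the circular indexing (sector 1's head coupled to the last consensus variable, as on a closed lap) is an assumption; an open chain of sectors would simply skip the head residual of the first sector.

```python
# Component-wise stopping test, cf. (18): each physical variable
# (speed, force, steering angle, ...) gets its own tolerance entry.

def within(vec, tol):
    return all(abs(v) <= t for v, t in zip(vec, tol))

def admm_converged(x_h, x_t, z_old, z_new, eps_r, eps_d):
    n = len(x_t)
    for i in range(n):
        # z_old[i - 1] wraps around for i = 0 (closed-lap assumption).
        r_h = [a - b for a, b in zip(x_h[i], z_old[i - 1])]  # head primal residual
        r_t = [a - b for a, b in zip(x_t[i], z_old[i])]      # tail primal residual
        d = [a - b for a, b in zip(z_new[i], z_old[i])]      # dual residual
        if not (within(r_h, eps_r) and within(r_t, eps_r) and within(d, eps_d)):
            return False
    return True
```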

4 Optimal control problem formulation

In this section, we outline the key aspects of the minimum lap-time problem formulation, including (i) the components of the cost function, (ii) the structure of the inequality, equality, and terminal constraints, and (iii) the initial guess and variable scaling. These considerations are essential for achieving a practical and efficient solution.

4.1 States, controls, and algebraic variables

With reference to the model illustrated in Sect. 2, the states, controls, and algebraic parameters of the dynamic model are related to the vectors of variables introduced in Sect. 3 as follows:

$$\begin{aligned} \boldsymbol{s}&= [q_{1}, q_{2}, q_{3}, q_{4}, q_{5}, q_{6}, \dot{q}_{1}, \dot{q}_{2}, \dot{q}_{3}, \dot{q}_{4}, \dot{q}_{5}, \dot{q}_{6}] \in \mathbb{R}^{12}, \end{aligned}$$
(19)
$$\begin{aligned} \boldsymbol{u}&= [f_{xa}, f_{xb}, \delta ] \in \mathbb{R}^{3}, \end{aligned}$$
(20)
$$\begin{aligned} \boldsymbol{w}&= [f_{11_{z}}, f_{12_{z}}, f_{21_{z}}, f_{22_{z}}, f_{3_{x}}^{3}, f_{3_{y}}^{3}, m_{3_{z}}^{3}] \in \mathbb{R}^{7}. \end{aligned}$$
(21)

It is important to highlight that our MLTP is formulated in the spatial domain using the track coordinate \(q_{1}\) (defined in Sect. 2) as the independent variable, as thoroughly described in our previous work [49]. Therefore the vehicle position \(q_{2}\) and orientation \(q_{3}\) (along with the bounce, pitch, and roll motions \(q_{4}\), \(q_{5}\), and \(q_{6}\)) with respect to the track reference frame depend on the track coordinate \(q_{1}\), highlighted in Fig. 1. To reconcile the differential equations obtained through the ABA algorithm, which naturally depend on time \(t\) as the independent variable, with this setting, we need to rewrite them as functions of the independent track coordinate. The spatial formulation of the dynamics is easily recovered by computing \(\boldsymbol{s}_{,q_{1}} = d\boldsymbol{s}/dq_{1}\) as follows:

$$ \boldsymbol{s}_{,q_{1}}(q_{1}) = \dot{\boldsymbol{s}}/\dot{q_{1}} = F( \boldsymbol{s}(q_{1}), \boldsymbol{u}(q_{1}), \boldsymbol{w}(q_{1}))/ \dot{q_{1}}, $$
(22)

where \(F(\cdot )\) is the dynamic vector field defined in Eq. (1), in which the reconstruction equations \(\dot{\boldsymbol{q}} = \boldsymbol{q}_{v}\) are pieced together with the accelerations \(\dot{\boldsymbol{q}}_{v}\) obtained through the articulated body algorithm. The explicit dependence on the variable \(q_{1}\) instead of time \(t\) is emphasized.
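The change of independent variable in (22) amounts to a single division of the time-domain vector field by \(\dot{q}_{1}\). A minimal Python sketch, assuming the state layout of (19) so that \(\dot{q}_{1}\) sits at index 6:

```python
# Spatial-domain dynamics, cf. (22): s' = ds/dq1 = F(s, u, w) / q1_dot.
# Assumption: s = [q1..q6, q1_dot..q6_dot] as in (19).

def spatial_dynamics(F, s, u, w):
    q1_dot = s[6]              # rate of progression along the track coordinate
    assert q1_dot != 0.0       # the vehicle must keep moving forward
    return [ds_dt / q1_dot for ds_dt in F(s, u, w)]
```

As a quick check with a toy vector field that only reconstructs \(\dot{\boldsymbol{q}} = \boldsymbol{q}_{v}\) (zero accelerations), the first component \(dq_{1}/dq_{1}\) comes out as 1 by construction.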

4.2 OCP discretization via direct collocation

The generic OCP written for the \(i\)th sector is discretized following a direct collocation strategy, as described in [3]. Its peculiarity is that the original OCP is transformed into a large (but sparse) nonlinear program (NLP).

We now focus on the structure of the \(i\)th subproblem (corresponding to the \(i\)th sector of the track). First, an equally spaced grid of track coordinates (\(n_{i}\) mesh intervals) is sampled such that, within sector \(i\), \(q_{1_{j}} = q_{1_{0}} + j h_{q}\) \((j = 0, \dots , n_{i})\) with \(h_{q} = (q_{1_{n_{i}}} - q_{1_{0}})/n_{i}\), where \(q_{1_{0}}\) and \(q_{1_{n_{i}}}\) are the starting and final values, respectively. However, according to the definitions in Sect. 3.1, the discretized augmented states \({\hat{\boldsymbol{x}}}_{i}\) include extended head and tail instances, which ease convergence by offering the \(i\)th problem a landscape beyond its boundaries. Therefore we have \({\hat{\boldsymbol{x}}}_{i}=\{{\hat{\boldsymbol{x}}}_{i,j}|j=-e, \ldots , n_{i}+e\}\), where \({\hat{\boldsymbol{x}}}_{i}(q_{1_{j}}) = {\hat{\boldsymbol{x}}}_{i,j}\) is the discretized set of the optimization variables associated with the \(j\)th node. In agreement with the dimensions of the controls, states, and algebraic vectors stemming from the dynamic model in Sect. 2, each \({\hat{\boldsymbol{x}}}_{i,j}\in \mathbb{R}^{22}\).

The first step in (13a)–(13c) of the ADMM, at its \(k\)th iteration, can now be reshaped as

$$\begin{aligned} {\hat{\boldsymbol{x}}}_{i}^{k+1} = \underset{{\hat{\boldsymbol{x}}}_{i}}{\operatorname{arg min}}\; &\left [ \phi ( \boldsymbol{x}_{i,t}, \boldsymbol{x}_{i,h}, \boldsymbol{z}_{i-1}^{k}, \boldsymbol{z}_{i}^{k}; \boldsymbol{y}_{i,t}^{k}, \boldsymbol{y}_{i-1,h}^{k}) + \sum _{j = -e}^{n_{i}-1+e} f_{i,j}({\hat{\boldsymbol{x}}}_{i,j}, { \hat{\boldsymbol{x}}}_{i,j+1}, {\hat{\boldsymbol{v}}}_{i,j})\right ] \end{aligned}$$
(23a)
$$\begin{aligned} \text{s.t.}\hspace{6mm} & \dots \\ &\boldsymbol{g}({\hat{\boldsymbol{x}}}_{i,j}, {\hat{\boldsymbol{x}}}_{i,j+1}, {\hat{\boldsymbol{v}}}_{i,j}) = 0, \end{aligned}$$
(23b)
$$\begin{aligned} &\boldsymbol{h}({\hat{\boldsymbol{x}}}_{i,j}, {\hat{\boldsymbol{x}}}_{i,j+1}, {\hat{\boldsymbol{v}}}_{i,j}) \leq 0. \\ & \dots \end{aligned}$$
(23c)

It is worth noting that, as usual, a set of \(d\) collocation points \(q_{1_{j,m}}\) \((m = 1, \dots , d)\) can be chosen in each interval \([q_{1_{j}}; q_{1_{j+1}}]\), allowing for a \(d\)th-degree polynomial representation of the state trajectory within each \(j\)th interval. Therefore the collocation states have dimension \({\hat{\boldsymbol{v}}}_{i,j}\in \mathbb{R}^{22(d+1)}\).

The equality constraints \(\boldsymbol{g}(\cdot )\) include the dynamic equations (22) and the path algebraic equations (here omitted for brevity) involving the variables \(\boldsymbol{w}\) and \(\boldsymbol{u}\). It is worth noting that this set of constraints also contains the complementarity constraint \(f_{xa}f_{xb} = 0\), which prevents the traction and braking forces from acting simultaneously.

The inequalities \(\boldsymbol{h}(\cdot )\leq 0\) (23c) involve all path constraints limiting states, controls, and algebraic parameters. Power limits, adherence constraints, and bounds on the lateral displacement \(q_{2}\), necessary to remain within track bounds, are all included in this form. Finally, the cost function is approximated in each interval by a quadrature formula. The typical stage cost \(f_{i,j}(\cdot )\) for the \(i\)th sector takes the form

$$ f_{i,j} = (h_{q}/\dot{q}_{1_{i,j}})^{2} + K_{\delta}(\delta _{i,j+1} - \delta _{i,j})^{2}, $$
(24)

where the first term penalizes the lap time, and the second term prevents abrupt variations of the steering angle, weighted by the coefficient \(K_{\delta}\). It is worth noting that, comparing (11) with (23a)–(23c), \(f_{i}(\cdot )\) is simply \(\sum f_{i,j}\) \((j=-e,\ldots ,n_{i}-1+e)\).
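Written out, the stage cost (24) is a one-liner; the following Python fragment is illustrative, and the default value of \(K_{\delta}\) is a placeholder, not the paper's calibration.

```python
# Stage cost, cf. (24): squared interval traversal time plus a steering
# regularization term.

def stage_cost(h_q, q1_dot_j, delta_j, delta_j1, K_delta=1e-2):
    time_term = (h_q / q1_dot_j) ** 2                   # (h_q / q1_dot)^2
    steer_term = K_delta * (delta_j1 - delta_j) ** 2    # penalizes abrupt steering changes
    return time_term + steer_term
```

For instance, an interval of 14 m crossed at \(\dot{q}_{1} = 7\) m/s with constant steering contributes \((14/7)^{2} = 4\) to the cost.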

4.3 Choice of the ADMM parameters

We shift now our attention to the calibration of the ADMM parameters, whose choice plays an important role on the convergence performance. The parameters are (i) the number of subproblems \(N_{p}\) in which the original MLTP problem is divided, (ii) the number of discretization intervals \(n_{i}\) in each sector, (iii) the length of the overlapping areas measured by the number of samples \(o\) (see Sect. 3), (iv) the lengths of the extended head and the extended tail measured by the number of samples \(e\), (v) the penalty parameters \(\rho _{i,t}\) and \(\rho _{i,h}\) for the head and tail of sector \(i\), and (vi) the tolerance vectors \(\boldsymbol{\epsilon }_{r}\) and \(\boldsymbol{\epsilon }_{d}\).

4.3.1 Number of subproblems \(N_{p}\)

It is useful to highlight that in the MLTP framework, when the optimization of a single lap of a given track is considered, the number of subproblems \(N_{p}\) corresponds to the number of sectors \(N_{s}\) in which the track is divided, i.e., \(N_{p} = N_{s}\). Instead, when considering multiple laps \(N_{\text{lap}}\), the number of subproblems is \(N_{p} = N_{s} N_{\text{lap}}\). The sectors may or may not be of the same length, depending on the user settings. For simplicity, in this paper, we divide the track into sectors of equal length.

A small value of \(N_{p}\) is the natural choice when performing the computations on a laptop with a low CPU core count. This results in high-dimensional NLP subproblems (13a)–(13c), which can be slow to solve. On the other hand, a small value of \(N_{p}\) generally requires few ADMM iterations to converge since consensus among a small number of interfaces (i.e., few sectors) is more likely to be achieved (see conditions in (18)). On the contrary, a high value of \(N_{p}\) is the obvious choice on a cluster with many CPU cores since it drastically cuts the size of the NLP subproblems (13a)–(13c), thus promoting their fast convergence. However, the introduction of many consensus variables may require many ADMM iterations for convergence (18). Therefore, as is clear from the results discussed in Sect. 5, the optimal trade-off must be decided on a case-by-case basis. When the dimensionality of a problem makes it difficult to solve as a whole, splitting it into smaller parts becomes the only viable option.

4.3.2 Discretization intervals \(n_{i}\), length of overlapping areas \(o\), and length of extended head/tail \(e\)

The proposed algorithm has three setup parameters, \(n_{i}\), \(o\), and \(e\). In our approach, a uniform value of \(n_{i}\) is set for all subproblems. To determine the appropriate value of \(n_{i}\) for a given track, a convergence analysis of the optimal time is performed through a serial solution.

The value of \(o\), which indicates the number of points in the mesh grid to be considered for coherence between sectors, is set to 1 to ensure fast convergence while maintaining continuity. The positions of the overlapping areas along the track depend on \(N_{p}\) and \(n_{i}\). In practice, they should be located where the vehicle dynamics are smoother, for example, along the straights, as this can improve ADMM convergence. It is worth noting that we did not implement any specific strategy for the location of the overlapping areas, precisely to test the robustness of our framework; finding their optimal placement remains open for future investigation.

Our work also makes an original contribution through the incorporation of extended heads and tails of length \(e\) in the algorithm. This parameter does not appear in classical ADMM algorithms [23], but its presence in our work is motivated by the resulting speed-up of the convergence process. Incorporating segments of variables that extend into neighboring sectors helps promote agreement among the independent solutions of adjacent sectors right from the first ADMM iteration. The key observation is that, considering the neighboring solutions of sectors \(i\) and \(i+1\), if the extended tail \(\boldsymbol{x}_{i,T}\) and head \(\boldsymbol{x}_{i+1,H}\) extend sufficiently beyond the overlapping areas \(\boldsymbol{x}_{i,t}\) and \(\boldsymbol{x}_{i+1,h}\), then at the solution they will be near the (unique) global optimum and hence already close to each other. Therefore the two consecutive subproblems \(i\) and \(i+1\) will quickly reach consensus at their interface. However, a good compromise between the length \(o\) of the overlapping areas and the extension \(e\) of the subproblem extremities is necessary for fast convergence: a small value of \(e\) may not provide this margin, whereas an excessive value may increase the problem dimension and computational time unnecessarily.

4.3.3 Update rules for the adaptive penalty parameters \(\rho _{i,t}\) and \(\rho _{i,h}\)

Nonconstant penalty parameters and adaptive laws to update them are already available in the literature [23, 31, 33]. In the present work, due to the particular structure of the connection graph between subproblems, two distinct values \(\rho _{i,t}\) and \(\rho _{i,h}\) have been introduced in (12) with the goal of improving convergence in practice. Their update strategy is reminiscent of that proposed in [23]; for the \(i\)th subproblem, we set

$$\begin{aligned} \rho _{i,j}^{k+1} = \textstyle\begin{cases} \tau \rho _{i,j}^{k} &\text{if} \hspace{5mm} \| r_{i,j}^{k} \|_{2} > \mu \| d_{i}^{k} \|_{2}, \\ \rho _{i,j}^{k}/\tau &\text{if} \hspace{5mm} \| d_{i}^{k} \|_{2} > \mu \| r_{i,j}^{k} \|_{2}, \\ \rho _{i,j}^{k} &\text{otherwise,} \end{cases}\displaystyle \end{aligned}$$
(25)

where \(\mu > 1\) and \(\tau > 1\) are parameters, and \(j \in \{t,h\}\) selects the tail or head penalty, respectively. Here we set \(\mu = 10\) and \(\tau = 2\). This approach tends to keep the primal and dual residual norms within a factor of \(\mu \) of one another as they converge to zero.
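The update rule (25) for a single interface can be sketched directly; this Python fragment is a minimal transcription with our own names, using Euclidean norms of the residual vectors.

```python
# Adaptive penalty update, cf. (25), for one interface (j = tail or head).

from math import sqrt

def norm2(v):
    return sqrt(sum(x * x for x in v))

def update_rho(rho, r, d, mu=10.0, tau=2.0):
    if norm2(r) > mu * norm2(d):    # primal residual dominates: stiffen the penalty
        return tau * rho
    if norm2(d) > mu * norm2(r):    # dual residual dominates: relax the penalty
        return rho / tau
    return rho                      # residuals balanced: leave rho unchanged
```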

Finally, the tolerance vectors \(\boldsymbol{\epsilon }_{r}\) and \(\boldsymbol{\epsilon }_{d}\) have been assigned for each optimization variable based on sensible engineering accuracy.

4.4 Warm start & scaling

The success of the ADMM algorithm is closely tied to the convergence of each subproblem during the \(k\)th iteration, as missing even one of these convergences can cause the algorithm to fail. Therefore it is crucial to properly “warm start” the ADMM. In this work, this is achieved through a homotopy approach [50]. The key idea of this method is to set up a sequence of optimal control problems that are easier to solve, gradually transforming them into the original problem until the original NLP is solved. In this process the previous solution provides a “warm start” to initialize the subsequent problem. Note that while the homotopy technique increases robustness, it also increases the computational time required to obtain the optimal solution. In this work, we perform three homotopy iterations before starting the ADMM. It is worth noting that the homotopy technique has not been employed to warm start the serial solution, as our numerical tests revealed that this was not necessary. Including it anyway would have added unnecessary ballast to the serial solution and led to an unfair comparison with the parallel (ADMM) solution.

As said, before starting the actual ADMM iterations, three homotopy steps are performed where the Lagrangian function \(L_{i}^{k}(\cdot )\) of the \(i\)th subproblem in Eq. (11) is substituted with

$$ \tilde{f}_{i}({\hat{\boldsymbol{x}}}_{i}) = \lambda f_{i}({ \hat{\boldsymbol{x}}}_{i}) + (1-\lambda )J_{i}({\hat{\boldsymbol{x}}}_{i}), $$
(26)

where \(J_{i}({\hat{\boldsymbol{x}}}_{i}) = (v_{3_{x}}^{3} - \bar{v}_{3_{x}}^{3})^{2} + q_{2}^{2}\). According to the model notation (see Sect. 2), minimizing the cost function \(J_{i}(\cdot )\) would lead to a solution where the vehicle maintains a constant speed \(\bar{v}_{3_{x}}^{3}\) while progressing along the track centerline. The scalar parameter \(\lambda \in [0,1]\) is such that when \(\lambda = 0\), Eq. (26) defines an artificial problem whose solution is easy to obtain, whereas \(\lambda = 1\) recovers the original cost function. It is worth remarking that while performing these iterations, the term \(\phi (\cdot )\) defined in (12), related to the consensus constraints in (11), is removed, whereas \(\phi (\cdot )\) is reintroduced coherently from the first ADMM iteration onward.

Another variation with respect to problem (13a)–(13c) is that the complementarity constraint is also relaxed through a tolerance \(\epsilon _{ab} \geq 0\) such that \(-\epsilon _{ab} \leq f_{xa}f_{xb} \leq \epsilon _{ab}\). Therefore our three homotopy iterations start with \(\lambda = 0\) and \(\epsilon _{ab} > 0\) and end with \(\lambda = 1\) and \(\epsilon _{ab} = 0\).
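A minimal Python sketch of this warm-start machinery follows. The blended cost reproduces (26); the linear schedule driving \(\lambda\) from 0 to 1 while \(\epsilon _{ab}\) shrinks to 0 is an illustrative assumption, as the paper does not state the intermediate values used.

```python
# Homotopy cost blend, cf. (26): lam = 0 gives the easy constant-speed/
# centerline problem, lam = 1 recovers the original lap-time cost.

def blended_cost(lam, f_orig, f_easy):
    return lam * f_orig + (1.0 - lam) * f_easy

def homotopy_schedule(n_steps=3, eps_ab_max=1.0):
    # Linear ramp (assumed): lambda 0 -> 1 while the complementarity
    # relaxation -eps_ab <= f_xa * f_xb <= eps_ab tightens to 0.
    lams = [k / (n_steps - 1) for k in range(n_steps)]
    return [(lam, eps_ab_max * (1.0 - lam)) for lam in lams]
```

The last step of the schedule restores the exact problem (\(\lambda = 1\), \(\epsilon _{ab} = 0\)), whose solution then initializes the first ADMM iteration.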

In line with the strategy adopted for the serial problem, during the first homotopy iteration the provided initial guess assumes a constant speed of the vehicle along the track centerline. The guess for the remaining optimization variables (controls and algebraic variables) is estimated via inverse dynamics. The final solutions, obtained after this preliminary process, are used to initialize the first ADMM iteration. In this way, a very robust warm start is provided.

To conclude, it is worth mentioning that scaling is also crucial for avoiding numerical issues and improving the convergence rate of NLPs, as they contain states, controls, and algebraic variables with different ranges. Therefore a normalization is performed on all discretization points using the corresponding expected maximum values, resulting in the scaled variables falling within the range \([-1,1]\).
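The normalization amounts to a per-variable division by the expected maximum magnitude. A short Python sketch, with illustrative magnitudes of our own choosing:

```python
# Per-variable scaling: divide each variable by its expected maximum
# magnitude so that scaled values lie in [-1, 1] whenever the expected
# bound is respected.

def scale(values, max_abs):
    return [v / m for v, m in zip(values, max_abs)]

def unscale(scaled, max_abs):
    return [s * m for s, m in zip(scaled, max_abs)]
```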

5 Results

The proposed consensus-based ADMM algorithm is validated and put to the test on the Nürburgring race track, one of the longest and most famous circuits in motorsport. In particular, the MLTP is solved for the Nordschleife version of the Nürburgring, whose length is \(\simeq 21\) km.

The optimal solution of the MLTP is obtained and discussed for a Formula SAE vehicle. This choice was driven by the availability of accurate model parameters for this setup. For more details on the vehicle parameters, omitted here for brevity, we refer the interested reader to our previous work [42].

The ADMM algorithm is first validated against the serial solution on one lap of the Nordschleife. In the serial approach the MLTP is solved as a single problem; it is considered here the ground-truth solution since its reliability has already been shown in [42].

Then, to evaluate the performance of the ADMM approach in solving high-dimensional optimization problems, a series of multilap problems are formulated and solved. The aim of these problems is to investigate whether the ADMM approach offers a more efficient solution when faced with a significant increase in problem dimensionality compared to the traditional serial approach.

Finally, a single lap is solved multiple times with different ADMM settings to investigate its performance when the number of subproblems \(N_{p}\) is increased but the size of the overall problem is not excessively large.

The validation process is performed on a laptop with a 2.30 GHz (boosted at 4.5 GHz) Intel(R) Core(TM) i7-10875H CPU and 32 GB RAM, representative of a standard portable personal computer. The multilap problems, instead, are solved and the lap splitting is analyzed on a cluster workstation equipped with four sockets, each containing an Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40 GHz (boosted at 3.4 GHz), and 3.70 TB RAM, made available by the Sistema Informatico Dipartimentale (SID) of the Università di Pisa. The latter has 96 CPU cores overall and was chosen as the ideal platform for exploiting distributed computations. Hereafter, for concision, we refer to the above hardware settings as laptop and cluster, respectively.

The optimal control problems are coded in a scripting environment using the MATLAB interface to the open-source CasADi framework [51], which provides building blocks to efficiently formulate and solve large-scale optimization problems. To solve each NLP in (13a)–(13c), the IPOPT [21] solver is used.

5.1 Validation of the ADMM approach

5.1.1 Mesh size calibration

The validation process begins with a convergence analysis necessary to accurately determine the appropriate mesh size in the discretization of the track. This is accomplished by solving the MLTP on Nordschleife as a single problem while gradually increasing the number of mesh intervals. According to the ADMM notation, this involves setting \(N_{p} = 1\) and increasing the number of discretization points \(n_{1}\). The results of this analysis, which was conducted on the laptop configuration, are presented in Table 1.

Table 1 Convergence analysis with respect to the overall number \(n_{1}\) of discretization intervals on a single lap of the Nordschleife circuit, solved as a single (serial) problem. Hardware configuration: laptop. Whereas the optimal time \(t_{\text{opt}}\) does not change significantly while increasing \(n_{1}\), the computational time \(T_{\text{s}}\) and the number of iterations vary substantially. The mesh \(n_{1} = 1500\) is the best compromise between the speed of the solution process, measured by \(s/\text{Iter}\), and its quality, measured by \(t_{\text{opt}}\). This value is therefore chosen to perform the subsequent comparisons between the serial and ADMM solutions

During the analysis, various key performance indicators (KPIs) are monitored as \(n_{1}\) is increased. These include (i) the optimal time \(t_{\text{opt}}\) required for the vehicle to complete one lap, (ii) the computational time \(T_{\text{s}}\) required by IPOPT to solve the problem, (iii) the number Iter of IPOPT iterations to convergence, (iv) the average number of seconds per iteration \(s/\text{Iter}\), which gives insight into the speed of the solution process, and (v) the total number of NLP variables, which is given by \(N_{\text{var}} = n_{1}(n_{s}(d+1) + n_{u} + n_{z}) + n_{s}\) and is helpful in tracking the problem dimension.

Table 1 shows how the optimal time \(t_{\text{opt}}\), along with the other KPIs, varies as a function of the size of the mesh. The best compromise between the speed of the solution process, measured by \(s/\text{Iter}\), and its quality, measured by \(t_{\text{opt}}\), is given by setting \(n_{1} = 1500\). This value is fixed in all the tests performed hereafter for the validation and testing of ADMM on the Nordschleife track. Unlike \(n_{1}=1000\), for which the error in \(t_{\text{opt}}\) of 0.5 s (518.1 vs 517.6 s) may not be acceptable, this choice does not trade accuracy for efficiency. On the other hand, selecting \(n_{1}=3000\) would almost double the number of variables \(N_{\text{var}}\) (from 69012 to 138012) and almost double \(s/\text{Iter}\) (from 6.07 to 12.1), only to improve \(t_{\text{opt}}\) by a mere 0.1 s (\(0.02\%\)). With the chosen value \(n_{1}=1500\), the mesh size corresponds to one discretization point every 14 meters along the track.
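The tabulated problem sizes can be reproduced from the formula for \(N_{\text{var}}\) given above. In this Python check, \(n_{s} = 12\), \(n_{u} = 3\), and \(n_{z} = 7\) follow from (19)–(21), whereas the collocation degree \(d = 2\) is inferred here from the tabulated values rather than stated explicitly in the text.

```python
# NLP size for the serial problem: N_var = n1*(n_s*(d+1) + n_u + n_z) + n_s,
# with n_s = 12 states, n_u = 3 controls, n_z = 7 algebraic variables,
# and d = 2 (inferred, not stated explicitly).

def n_var_serial(n1, n_s=12, n_u=3, n_z=7, d=2):
    return n1 * (n_s * (d + 1) + n_u + n_z) + n_s
```

With these values, \(n_{1} = 1500\) gives \(N_{\text{var}} = 69012\) and \(n_{1} = 3000\) gives \(N_{\text{var}} = 138012\), matching the figures quoted above.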

5.1.2 ADMM parameters employed

Turning our attention to the setup of the ADMM parameters, the optimal settings for the Nordschleife track have been determined through numerical tests by the authors. These settings consist of \(N_{p}=4\), the number of sector elements \(n_{i}\) coherent with the selected mesh for the track, and \(e=40\) points, so that both \(\boldsymbol{x}_{i,H}\) and \(\boldsymbol{x}_{i,T}\) span approximately 560 meters beyond the consensus interface. The distance pertaining to each \(i\)th subproblem and corresponding to the variable \({\hat{\boldsymbol{x}}}_{i}\) is computed by dividing the track length by \(N_{p}\) and adding the extended head and tail lengths. These internal sectors corresponding to \({\hat{\boldsymbol{x}}}_{i}\) (\(i=2,\ldots , N_{p} -1\)) have an optimized distance of approximately 6300 meters. Instead, the first (\(i=1\)) and last (\(i=N_{p}\)) subproblems, which have only either an extended tail or an extended head, respectively, present an optimized distance of approximately 5750 meters.

5.1.3 Discussion of the validation process

Using the above-discussed setup parameters, the ADMM results are presented in a concise format in Table 2. This table encompasses the KPIs that pertain specifically to the parallel approach. These comprise (i) the optimal time \(t_{\text{opt}}\), required for the vehicle to complete one lap, (ii) the computational time \(T_{\text{ADMM}}\), required to execute the ADMM iterations, (iii) the homotopy time \(T_{h}\), i.e., the time spent in performing the three homotopy iterations necessary to warm start the problem, (iv) the total computational time \(T_{p}\) of the parallel approach, with \(T_{p} = T_{h} + T_{\text{ADMM}}\), (v) the number of iterations of the ADMM algorithm, denoted as Iter (or \(k\)), (vi) the average amount of seconds per ADMM iteration \(s/\text{Iter}\), computed as \(T_{\text{ADMM}}/k\), and (vii) the total number of NLP variables \(N_{\text{var}}\), given by \(\sum _{i = 1}^{N_{p}} N_{\text{var},i}\). Here the number of variables of the \(i\)th subproblem is computed as \(N_{\text{var},i} = (n_{i} + n_{e} e)(n_{s}(d+1) + n_{u} + n_{z}) + n_{s}\), where \(n_{e} = 2\) (\(i = 2, \dots , N_{p} - 1\)), and \(n_{e} = 1\) (for \(i=1\) and \(i=N_{p}\), i.e., the first and last subproblems), which have only either an extended tail or an extended head, respectively.

Table 2 Numerical results obtained through the ADMM algorithm for an overall number of discretization intervals \(n_{1} = 1500\). Hardware configuration: laptop. The overall computational time \(T_{p}\) for the parallel approach encompasses the homotopy time \(T_{h}\) and the ADMM time \(T_{\text{ADMM}}\), i.e., \(T_{p} = T_{h} + T_{\text{ADMM}}\). Although the ADMM algorithm is very efficient, requiring only three iterations \(k=3\), the overhead of a warm start process via homotopy causes the parallel approach to be slower than the serial approach

Upon comparing Table 2 with the second row of Table 1, two key observations are in order. The first is that, for the same overall number of track discretization points, the optimal lap-time values provided by the ADMM algorithm and the serial solution are exactly the same, i.e., \(t_{\text{opt}}=517.7\) s. This validates the ADMM algorithm against the classical solution. The second is that the parallel approach requires a computational time \(T_{p} = 581.8\) s (see the 5th column of Table 2), larger than the \(T_{s} = 516.1\) s required by the serial solution.

The latter result can be attributed mainly to three different factors: (i) In the first place, the homotopy approach, used to warm start the ADMM algorithm, but absent in the serial solution, increases the total computational time \(T_{p}\) since \(T_{p} = T_{\text{ADMM}} + T_{h}\); however, this is inevitable at this stage to ensure that a fair comparison is performed between two equally robust approaches to the solution; (ii) The laptop configuration employed in the validation process may not be very efficient in managing parallel computations due to its hardware architecture, thereby favoring the serial computation at the expense of the solution of the ADMM; note in support of this particular reason that when the cluster configuration is employed, the situation is reversed (see results in Table 3 and the corresponding discussion in Sect. 5.2); (iii) The NLP dimension considered for the validation (\(N_{\text{var}}=69012\) for the serial case; see Table 1) is not large enough to cause the serial approach either to max out the computational resources of the laptop configuration or to fail completely.

Table 3 Results for MLTPs on multilap scenarios for both serial and ADMM approaches. Hardware configuration: cluster. The hardware architecture allows the ADMM algorithm to be more efficient with respect to the serial approach already on \(N_{\text{lap}} = 1\) lap. As the number of laps \(N_{\text{lap}}\) increases and, as a result, the problem size \(N_{\text{var}}\) grows, the ADMM algorithm outperforms the serial one and represents the sole feasible option to maintain computational times within acceptable bounds

As said, the additional computational burden due to point (i) cannot be avoided, since the homotopy approach is required to increase ADMM robustness. However, to demonstrate the computational advantages of the parallel approach when run on an appropriate hardware configuration, the issues raised in points (ii) and (iii) are explored in greater detail in Sects. 5.2 and 5.3. In particular, in Sect. 5.2 the NLP dimension is increased by optimizing over a multilap horizon, whereas the parallel and serial optimizations are both performed using the cluster configuration.

To demonstrate the consistency of the solutions provided by the serial and parallel approaches in the validation case, in addition to obtaining the same optimal lap time \(t_{\text{opt}}\), a compelling comparison is reported in Figs. 4 and 5. Specifically, Fig. 4 depicts the optimal longitudinal speed profiles of the vehicle for the serial and ADMM solutions. Here the red curve represents the serial solution, whereas the marker-style thick lines (in the background) refer to the ADMM solution. The different colors refer to the distributed solutions computed in parallel for the \(N_{p}=4\) sectors. It is worth noting that the solutions correctly match at the interfaces, where the primal and dual residual vectors \(\boldsymbol{r}^{k}_{i,h}\), \(\boldsymbol{r}^{k}_{i,t}\), and \(\boldsymbol{d}^{k}_{i}\) satisfy conditions (18).
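Conditions (18) are not reproduced here, but the structure of such a per-subproblem consensus stopping test can be sketched as follows. This is a minimal illustrative sketch: the tolerance names `eps_pri` and `eps_dual` and the use of the infinity norm are assumptions, not the paper's exact criterion.

```python
import numpy as np

def consensus_converged(r_head, r_tail, d, eps_pri, eps_dual):
    """Illustrative ADMM stopping test for one subproblem i at iteration k:
    both primal residuals (the head- and tail-interface mismatches, playing
    the role of r_{i,h}^k and r_{i,t}^k) and the dual residual d_i^k must
    fall below their tolerances. Norm choice and names are assumptions."""
    primal_ok = max(np.linalg.norm(r_head, np.inf),
                    np.linalg.norm(r_tail, np.inf)) <= eps_pri
    dual_ok = np.linalg.norm(d, np.inf) <= eps_dual
    return bool(primal_ok and dual_ok)
```

In the distributed setting, the algorithm would stop only once this test passes simultaneously for every subproblem \(i\).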

Fig. 4
figure 4

[Upper panel] Comparison of the optimal speed profiles computed as the serial solution (thin red line) and the parallel (ADMM) solution (marker-style thick lines). Hardware configuration: laptop. Different colors refer to different sectors computed in parallel in the ADMM setting. The vertical dashed lines mark the interfaces between adjacent sectors. [Lower panel] The error profile \(e_{v_{x}}\), computed as the difference between the serial and ADMM solutions (thin orange line), demonstrates agreement between the two solutions everywhere, even considering the small ripples at the interfaces (with a negligible maximum value on the order of 5 mm/s) (Color figure online)

Fig. 5
figure 5

Optimal trajectory comparison of the serial solution (red) and the parallel solution (blue). The trajectories overlap perfectly, underlining the equivalence of the two solutions (Color figure online)

Fig. 4 also shows a scaled version of the profile of the longitudinal speed error \(e_{v_{x}}\) between the serial and ADMM solutions. The small ripples at the interfaces, with a maximum value on the order of 5 mm/s, support the claim of a practically perfect agreement between the two solutions everywhere. It is worth noting, in passing, that the intervals over which the speed error vanishes exponentially may hold significant implications. These intervals may represent a form of extinction length, indicating how deeply perturbations at a sector's boundary affect the solution in the sector interior. Note that neither the error magnitude nor the extinction length remains constant across different interfaces, as the local curvature, inclination, and banking of the track at the boundaries may have a notable influence. Consequently, additional research is required to fully elucidate the true significance of this phenomenon.

Additionally, Fig. 5 provides a qualitative comparison of the optimal trajectories for a particular track sector of the Nurburgring circuit, underscoring again the equivalence of the two solutions.

These findings thus verify the efficacy of the ADMM algorithm in providing a solution identical to that of a traditional serial approach.

5.2 Comparison of ADMM and serial solutions in multilap scenarios

In this section, we set out to verify quantitatively whether the ADMM-based approach delivers on its promise of increased efficiency when solving problems of very large dimensions. To this end, starting from the usual horizon of one lap (\(N_{\text{lap}}=1\)), a series of multilap problems (\(N_{\text{lap}}=2, 3, 4, 8, 16\)) are formulated and solved both in a serial fashion and in a parallel fashion via the ADMM approach.

With reference to Sect. 4.3, in the ADMM framework each lap is divided into \(N_{s}=4\) sectors, the number of sector elements \(n_{i}\) is consistent with the selected mesh for the track, and \(e=40\) points, so that both \(\boldsymbol{x}_{i,H}\) and \(\boldsymbol{x}_{i,T}\) span approximately 560 meters. With reference to Table 3, the number of ADMM subproblems is \(N_{p} = N_{s} N_{\text{lap}}\), so that \(N_{p}=4, 8, 12, 16, 32, 64\).
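This bookkeeping can be reproduced in a few lines. A minimal sketch; the 14 m mesh step is taken from the track-mesh description in Sect. 5.3, and the function names are illustrative:

```python
MESH_STEP_M = 14  # one discretization point every 14 m (Sect. 5.3)

def num_subproblems(n_s, n_lap):
    """Number of ADMM subproblems: N_p = N_s * N_lap."""
    return n_s * n_lap

def head_tail_span_m(e):
    """Arc length covered by an extended head or tail of e mesh points."""
    return e * MESH_STEP_M

# Reproduce the values quoted above (N_s = 4 sectors per lap, e = 40)
counts = [num_subproblems(4, n) for n in (1, 2, 3, 4, 8, 16)]
span = head_tail_span_m(40)  # 560 m, i.e., "approximately 560 meters"
```

Evaluating `counts` yields the sequence \(N_{p}=4, 8, 12, 16, 32, 64\) quoted above.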

Table 3 shows the results of the tests run on the cluster configuration for both serial and ADMM approaches. The first row corresponds to a single-lap MLTP (\(N_{\text{lap}}=1\)), whereas rows 2–6 correspond to multilap MLTP problems (\(N_{\text{lap}}=2, 3, 4, 8, 16\)). The optimal solutions obtained with the two methods are consistent, as seen from the corresponding (minimum-time) \(t_{\text{opt}}\) columns. Moreover, when multicore CPU architectures are employed, the ADMM approach beats the serial one already for the single-lap problem (\(N_{\text{lap}}=1\)), with \(T_{p}=567.1\) s vs \(T_{s}=633.8\) s.

Furthermore, it is apparent that, as the number of laps increases, the ADMM significantly outperforms the serial approach. Comparing the \(T_{s}\) and \(T_{p}\) columns, \(T_{s}\) increases almost linearly with \(N_{\text{lap}}\) up to \(N_{\text{lap}}=8\) and peaks at \(N_{\text{lap}}=16\) (i.e., \(N_{\text{var}}=1{,}104{,}012\) variables), where a huge \(T_{s}\simeq 13\) h 14 min is registered. In contrast, \(T_{p}\) remains almost constant with increasing \(N_{\text{lap}}\) up to \(N_{\text{lap}}=8\), with a slight increase for \(N_{\text{lap}}=16\), where \(T_{p}\simeq 15\) min, which is still a remarkable performance; this problem contains \(N_{\text{var}}=1{,}336{,}608\) variables. The increase of \(T_{p}\) with \(N_{\text{lap}}\), which is not expected theoretically given the availability of a large number of CPU cores, is mostly due to inter-CPU communication overhead and is unavoidable. The key takeaway, however, is that for large-scale problems (over 1 million variables), which could also result from planning on shorter horizons but with larger dynamic models, the proposed distributed approach is the only viable option to keep computational times within acceptable limits.

5.3 Testing ADMM performances for \(N_{\text{lap}}=1\) with varying \(N_{p}\)

In this section, we evaluate the performance of ADMM as the number of subproblems increases while the optimization horizon is kept constant. For this purpose, we fix the optimization horizon to one lap (\(N_{\text{lap}} = 1\)) and solve the associated MLTP by dividing the track into \(N_{p} = 2, 4, 8, 16, 32\) sectors. By doing so, we can analyze the effect of increasing the number of subproblems on the ADMM performance without changing the optimization horizon on a small-scale problem.

For these test cases, the number of sector elements \(n_{i}\) is consistent with the selected mesh for the track, i.e., one discretization point every 14 meters (similarly to the choice in Sect. 5.1.1). The length of the extended tail and head is the same for each test case and is set according to the length of the shortest subproblem. For example, considering that for \(N_{p} = 32\) the distance pertaining to the \(i\)th subproblem (corresponding to the variable \({\check{\boldsymbol{x}}}_{i}\)) is approximately 647 meters, we set \(e = 22\). Hence both \(\boldsymbol{x}_{i,H}\) and \(\boldsymbol{x}_{i,T}\) span approximately 305 meters. With these settings, the internal sectors corresponding to \({\hat{\boldsymbol{x}}}_{i}\) (\(i=2,\ldots , N_{p} -1\)) have an optimization distance of approximately 11,000 meters when \(N_{p} = 2\) and of 1260 meters when \(N_{p} = 32\). It is worth remarking that the optimization distance for the other cases (\(N_{p} = 4, 8, 16\)) assumes intermediate values in that range.
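The sector-length bookkeeping above can be sketched as follows. The total track length is an assumption inferred from the quoted figures (32 subproblems of approximately 647 m each), not a value stated explicitly in this section:

```python
# Assumed values, reverse-engineered from the figures quoted in the text
TRACK_LENGTH_M = 32 * 647  # ~20.7 km, consistent with N_p = 32 sectors of ~647 m
MESH_STEP_M = 14           # one discretization point every 14 m

def internal_sector_span_m(n_p, e):
    """Approximate optimization distance of an internal subproblem: its own
    sector (track length / N_p) plus an extended head and an extended tail
    of e mesh points each."""
    return TRACK_LENGTH_M / n_p + 2 * e * MESH_STEP_M
```

With \(e=22\), `internal_sector_span_m(2, 22)` gives about 10,968 m and `internal_sector_span_m(32, 22)` about 1263 m, matching the approximate 11,000 m and 1260 m quoted above.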

Table 4 presents the results of the lap-splitting analysis in condensed form. Four key aspects of the parallel approach are apparent from the table. The first is highlighted in the 2nd row, where \(N_{p} = 4\). A comparison of this row with the ADMM outcome in the 1st row of Table 3 reveals the impact of the parameter \(e\): halving the length of the extended tail and head causes a four-fold increase in both the computational time \(T_{p}\) and the number of iterations \(k\). This finding suggests that the chosen value of \(e\) may not provide a sufficient distance beyond the overlapping area to ensure quick consensus convergence (refer to Sect. 4.3.2).

Table 4 The table shows the results for the lap-splitting analysis. Here the planning horizon is fixed to \(N_{\text{lap}} = 1\) lap, and the number of subproblems \(N_{p}\) is increased. Hardware configuration: cluster. A high value of \(N_{p}\) leads to a rise in the number of iterations \(k\) to convergence while the computational time required for each ADMM iteration, encoded in \(s/\text{Iter}\), decreases

Figure 6 shows a pictorial representation of the convergence process with \(N_{p} = 4\). For clarity, the extended tail and head have been trimmed in the plots. With reference to Figs. 6a, 6b, 6c, and 6d, we can observe that the most significant changes in the trajectory occur during the first four iterations. A careful observation shows that the extended head traj\(_{2,H}\) is displaced from the right of traj\(_{1}\) (at \(k=1\), Fig. 6a) to its left (at \(k=4\), Fig. 6d). The variations in the shape of the trajectories then reduce gradually, as evident from the sequence of Figs. 6e–6i for \(k=8,\ldots ,13\). During this phase, ADMM addresses the strict constraints on the primal and dual residuals, driving the consensus to its optimal configuration.

Fig. 6
figure 6

A highlight of the ADMM convergence process for the optimal trajectory when \(N_{p} = 4\) (Color figure online)

A second key observation can be drawn from the 6th column of Table 4. Increasing the number of subproblems leads to an increase in the number of ADMM iterations required for convergence, since a larger value of \(N_{p}\) implies more consensus interfaces, which may require more iterations to reach an agreement. This phenomenon can be attributed to two factors. First, increasing the number of consensus interfaces raises the likelihood of some of them being located in areas with nonsmooth vehicle dynamics, such as the center of a turn. Second, as \(N_{p}\) increases, the consensus variables associated with the head and tail interfaces become closer, so that subsequent changes to the consensus variables during the ADMM iterations can significantly impact the optimal solution \(\boldsymbol{x}_{i}\).
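The consensus mechanics underlying these observations can be illustrated on a deliberately minimal toy problem that has nothing to do with vehicle dynamics: a scaled-form global-consensus ADMM where several local quadratic objectives must agree on a single shared variable. This is a sketch of the algorithm class only, not the paper's MLTP formulation:

```python
import numpy as np

def toy_consensus_admm(a, rho=1.0, iters=300):
    """Scaled-form global-consensus ADMM for min_x sum_i (x - a_i)^2.
    The optimum is the mean of a. Each 'subproblem' i keeps a local copy
    x_i that must reach agreement with the consensus variable z."""
    a = np.asarray(a, dtype=float)
    x = np.zeros_like(a)   # local variables (one per subproblem)
    u = np.zeros_like(a)   # scaled dual variables on the agreement constraints
    z = 0.0                # consensus variable
    for _ in range(iters):
        # Local updates (in the MLTP setting these would run in parallel):
        # x_i = argmin (x - a_i)^2 + (rho/2)(x - z + u_i)^2
        x = (2.0 * a + rho * (z - u)) / (2.0 + rho)
        z = np.mean(x + u)  # consensus (averaging) step
        u = u + x - z       # dual update on the agreement constraint
    return z

z_star = toy_consensus_admm([1.0, 2.0, 3.0, 4.0])  # converges to the mean, 2.5
```

Even in this toy setting, adding more local copies adds more agreement constraints that the dual variables must settle, which is the same qualitative effect that drives the iteration growth with \(N_{p}\) discussed above.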

The third, rather intuitive, aspect is confirmed by the 7th column of Table 4: as parallelization increases (larger \(N_{p}\)), the computational time per iteration \(s/\text{Iter}\) decreases. As discussed in Sect. 4.3.1, high values of \(N_{p}\) can drastically cut the size of the NLP subproblems.

However, a fourth observation is in order: in terms of overall computational time \(T_{p}\), it is not always clear which factor dominates between the increased number of ADMM iterations (Iter) required to reach consensus and the reduction in the computational time per iteration (\(s/\text{Iter}\)) as \(N_{p}\) increases. Indeed, from the 3rd column of Table 4, in this particular test case the ADMM computational time \(T_{p}\) shows no clear trend as \(N_{p}\) increases. This is observed when comparing the results for \(N_{p} = 16\), for which \(T_{p}=2146\) s, with those for \(N_{p} = 8\) and 32, where \(T_{p}=2655\) s and \(T_{p}=2793\) s are registered, respectively. As a general rule, however, when the problem dimension is small, as in the case of Table 4, using ADMM is not necessarily a wise choice, as the serial approach may be more efficient in providing satisfactory results. Based on our numerical test cases, the ADMM approach should be reserved for large- and huge-scale problems or for more complex models. By more complex models we refer to advanced multibody vehicle models, which include features such as individual wheel handling, suspension system dynamics, aerodynamic maps, and comprehensive Pacejka or thermo-mechanical tire models. This complexity may also extend to hybrid or electric powertrain systems, including battery dynamics. In those cases the presented distributed optimization becomes more of a necessity.
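The lack of a monotone trend can be seen directly from the \(T_{p}\) values quoted above. A plain bookkeeping sketch using only the figures stated in the text:

```python
# ADMM computational times T_p [s] quoted above for N_lap = 1 (from Table 4)
T_p = {8: 2655.0, 16: 2146.0, 32: 2793.0}

# The best degree of parallelization lies at neither extreme: doubling N_p
# beyond 16 costs more in extra iterations than it saves per iteration.
best_n_p = min(T_p, key=T_p.get)
```

Here `best_n_p` is 16, confirming that \(T_{p} \approx \text{Iter} \times s/\text{Iter}\) is minimized at an intermediate \(N_{p}\) for this problem size.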

6 Conclusions

In this paper, we presented a parallel approach to minimum-lap-time problems whose number of variables grows at an unprecedented scale. Starting from the classical serial formulation, in which the resulting NLP is solved as a single problem, we presented a consensus-based alternating direction method of multipliers (ADMM) approach to solve MLTPs. To the best of our knowledge, this is the first time that MLTPs involving dynamic models that are accurate from a vehicle engineer's perspective have been solved in a distributed fashion.

The aim of our work is to demonstrate the convenience and, in some cases, the necessity of the parallel approach. To this end, various tests are presented involving a race car on the Nurburgring track, taking into account both slope and banking.

First and foremost, the ADMM approach is shown to converge successfully, locating optimal consensus interfaces and adhering to user-defined tolerances. Indeed, a comparison between the parallel optimization and the serial one, employed as the reference solution, yields identical outcomes. However, although the parallel and serial approaches achieve the same optimal trajectories, they differ significantly in computational performance.

Based on the validation procedure (Sect. 5.1) and analysis of the multilap scenarios (Sect. 5.2), two interwoven conditions favor the convenience of the ADMM over the serial approach. The first one, evident from the multilap analysis (see Table 3), is the problem dimension. For large-scale problems (over 1 million variables), which could also occur when dealing with shorter horizons but with larger dynamic models, the proposed distributed approach outperforms the serial one and allows us to keep computational time within acceptable limits. The second one is the availability of multicore CPU architectures, in which the distributed algorithm can be efficiently deployed.

The analysis of the lap-splitting scenario (Sect. 5.3) highlights the importance of two parameters that must be carefully tuned when employing the ADMM approach: the number of subproblems \(N_{p}\) and the extended horizon \(e\) beyond the consensus interface. A high degree of parallelization, associated with a large value of \(N_{p}\), should be reserved for large- and huge-scale problems. Indeed, increasing the number of subproblems in other situations may lead to a higher number of ADMM iterations and larger computational times than a standard serial approach. Special attention should also be paid to the preview length associated with \(e\): reducing the length of the extended tail or head may drastically increase the computational time of the parallel algorithm, as clearly shown when comparing the single-lap results with \(N_{p} = 4\) obtained in Sect. 5.2 with those in Sect. 5.3.

To sum up, the key finding of this study is that the ADMM is particularly beneficial for MLTPs when planning over long horizons or using complex models. In such cases the ADMM approach on multicore CPU architectures can drastically reduce the computational time, and when pushing toward very large- and huge-scale problems, resorting to a distributed approach such as ADMM becomes mandatory. A final caveat is in order, however: when using the proposed parallel approach, careful attention must be paid to its setup parameters. Further insights in this direction are left for future work.