1 Introduction

Over the past decade, the penetration of non-gasoline vehicles such as electric, hybrid, and plug-in hybrid vehicles in Germany has multiplied (Fig. 12.1). At present, there are close to 240,000 registered electric and plug-in hybrid vehicles in Germany. Moreover, according to the German Association of Energy and Water Industries (BDEW), as of March 2020 there were 27,730 publicly available charging stations serving Germany's electric and plug-in hybrid vehicle fleet (VDA, 2020). Today, the vast majority of electricity-powered vehicles in Germany are private cars. As the German government has ambitious plans to encourage electric mobility in the coming years, the penetration of electricity-powered vehicles is expected to grow steadily.

Fig. 12.1
The growth of electric, hybrid, and plug-in hybrid vehicle penetration in Germany from 2011 to 2020. (Data: Kraftfahrt-Bundesamt)

Electrification has profound economic and ecological impacts on the future of the mobility sector. New business models are arising from the provision of different mobility services and the sharing economy. Moreover, a complete ecosystem is emerging around mobility, driven by converging trends in the physical and digital domains and in innovation and business models (Dia, 2019; Barreto et al., 2020).

This article focuses on the digital domain, which provides the playing field on which stakeholders, e.g., electric vehicles (EVs), autonomous vehicles (AVs), charging stations, the smart grid, and fleet management, interact seamlessly. The efficiency of these complex interactions determines how close to optimal the outcome is for each participating stakeholder. This article presents an application based on deep reinforcement learning that optimizes agent interactions and decision-making in an IoT-enabled smart mobility ecosystem. The optimization objective is the aggregate utility of all interacting agents, expressed as a weighted sum. The algorithm also defines a set of soft and hard constraints. Agents must always adhere to the hard constraints, whereas the soft constraints may be violated conditionally under extreme circumstances. The methodology follows an automatic template-based approach implemented through a CI/CD framework using a GitLab runner.
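The weighted-sum objective with hard and soft constraints described above can be illustrated with a minimal Python sketch. The names `Outcome` and `aggregate_utility`, the stakeholder weights, and the linear penalty for soft-constraint violations are assumptions made for illustration; the article does not specify how violations are penalized.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Hypothetical result of one joint decision of all agents."""
    utilities: dict[str, float]  # utility per stakeholder, e.g. in [0, 1]
    hard_violations: int         # number of violated hard constraints
    soft_violation: float        # total magnitude of soft-constraint violations


def aggregate_utility(outcome: Outcome,
                      weights: dict[str, float],
                      soft_penalty: float = 10.0) -> float:
    """Weighted sum of stakeholder utilities.

    Hard constraints make the outcome infeasible (-inf); soft constraints
    only reduce the score through an assumed linear penalty.
    """
    if outcome.hard_violations > 0:
        return float("-inf")
    weighted = sum(weights[name] * u for name, u in outcome.utilities.items())
    return weighted - soft_penalty * outcome.soft_violation


# Illustrative weights for the two stakeholders in focus (assumed values).
weights = {"ev_user": 0.6, "grid_operator": 0.4}
feasible = Outcome({"ev_user": 0.8, "grid_operator": 0.5}, 0, 0.0)
score = aggregate_utility(feasible, weights)
```

Treating hard constraints as infeasibility rather than as a large penalty keeps the two constraint classes clearly separated, which mirrors the distinction drawn in the text.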

Previous research has addressed specific aspects of optimal agent interactions in a smart mobility landscape. For example, Lin et al. (2016) present a linear programming model of an optimal routing problem that takes charging station location and cost into consideration. Chen et al. (2018) present a weighted-sum multi-objective optimization model that takes the different preferences of the user into account. The mixed-integer quadratic programming model presented in that study is solved using a commercial optimizer such as CPLEX or Gurobi. Bessler and Grønbæk (2012) use a heuristic-based approach. Their algorithm evaluates the optimal routing plan for EVs by considering the distance to the charging stations in the vicinity, the traffic situation, and the feasible charging patterns. However, to our knowledge, multiple stakeholder perspectives have not previously been considered simultaneously.

2 Architecture

This section describes the proposed architecture in detail, including the addition of a continuous integration/continuous deployment (CI/CD)-based approach (Sharif et al. 2020b). We also explain how the algorithm processes contextual information to meet the necessary optimality conditions for each stakeholder. With the concept of smart mobility in mind, we identify four types of stakeholders: the EV end-user, the grid operator, the fleet operator, and the charging station maintainer, all of which are explained thoroughly in our previous publication (Sharif et al. 2020a). Two of them, the EV end-user and the grid operator, are the focus of the smart mobility use case in this article.

As shown on the left-hand side of the proposed architecture in Fig. 12.2, each stakeholder provides a set of individual inputs (Xi, Xj, ..., Xm), such as “daily travel activities, routing suggestion(s), car-battery, and environment,” with associated actions (Ai, Aj, ..., Am). The rewards (Ri, Rj, ..., Rm) are the computed outputs, such as “charging type, the distance towards charging station, charging cost, etc.” Furthermore, some inputs may be shared by more than one stakeholder, each with different priorities and constraints.

$$ Q\left(s,a\right)=r+\gamma \max_{a^{\prime}} Q\left(s^{\prime},a^{\prime}\right) $$
Fig. 12.2
ARaaS: a deep reinforcement learning-based architecture

This information is then processed by a deep reinforcement learning algorithm built on the Bellman equation above. The Q-value of taking action a in state s, Q(s, a), must equal the immediate reward r obtained from that action plus the Q-value of the best possible next action a′ in the next state s′, multiplied by a discount factor γ. The discount factor is a hyper-parameter with values in the range (0, 1].
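As a hedged illustration of the Bellman relation, the following Python sketch performs a single tabular Q-learning update. It is a deliberate simplification: the article uses a deep network to approximate Q, whereas a lookup table is used here, and the learning rate `alpha`, the table size, and all numeric values are assumptions introduced for the example.

```python
import numpy as np


def q_update(q: np.ndarray, s: int, a: int, r: float, s_next: int,
             gamma: float = 0.9, alpha: float = 0.5) -> float:
    """Move Q(s, a) towards the Bellman target r + gamma * max_a' Q(s', a').

    gamma is the discount factor in (0, 1]; alpha is an assumed learning
    rate (with alpha = 1 the update sets Q(s, a) to the target directly).
    """
    target = r + gamma * np.max(q[s_next])
    q[s, a] += alpha * (target - q[s, a])
    return float(q[s, a])


# Toy table with 3 states and 2 actions (illustrative values only).
q = np.zeros((3, 2))
q[1] = [1.0, 2.0]  # current estimates for the next state s' = 1
new_value = q_update(q, s=0, a=1, r=1.0, s_next=1)
```

A small `gamma` makes the agent favour immediate rewards, while a value close to 1 gives long-term rewards almost the same weight as immediate ones.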

We also need to decide how much weight to assign to short-term versus long-term rewards (Tang et al., 2020; Nguyen et al., 2020a; François-Lavet et al., 2018). The right-hand side of Fig. 12.2 shows the specific output generated for each environment. For example, the EV end-user receives the best schedule and route selection according to the needs of the car’s battery and the environment, with appropriate personal convenience. Grid operators receive electricity demand forecasts for a specific region based on charging station reservations, which helps reduce fluctuations in the electric load. The system continuously learns from its environment and observes state(s) by interpolating weights, etc. (Li et al., 2020; Nguyen et al., 2020b; Wang et al., 2013).

The core functionality is exposed as a component to stakeholders from other domains, each with a distinct objective, through a self-developed middleware-as-a-service component (see the right-hand side of Fig. 12.2). This extension allows us to demonstrate the performance of our model at the urban scale, where high-dimensional data and scalable models are required. For example, “Stakeholder ‘X’ would like to collect information coming from EV end-users, fleet-managers, charging stations, and the power-grid for a certain area. Use an algorithm to process this data over high computing nodes and suggest the best trade-off for all actors in the eco-system.” To support this type of user scenario, we developed our electric-vehicle middleware, in which we promote a smart mobility use case built on the interaction between the stakeholders depicted in Fig. 12.2, where stakeholders from different domains exchange information according to their objectives. The middleware-as-a-service utility provides a set of services through which each service application (SAx, SAy, SAz, etc.) communicates with high-performance computing nodes such as Nx, Ny, and so on. Each service request is executed on these high-performance computing nodes, and the results calculated by the algorithm are then returned to the respective application (Sharif et al., 2017; Amogh Vardhan et al., 2019; Espeholt et al., 2018; Jiang et al., 2019). The initial objective of the algorithm is to find the optimal trade-off scenario for EV charging by considering the conflicting interests of multiple stakeholders acting in the smart mobility ecosystem (Fig. 12.2).

Suppose that the City Council of Stuttgart, Germany, would like to organize an event in which people from all over the country are expected to participate. We continue with the same set of stakeholders. Because of the popularity of electric cars, the event organizer expects that many people from neighboring cities will travel by personal transportation, often by EV. The event organizer needs to distribute mobility resources optimally, which is a very challenging task (Alyousef et al., 2018). To meet this expectation, organizers require frequent and timely updates on the resource distribution.

We propose an automated mobility service that runs the algorithm(s) continuously, using a GitLab CI/CD runner to hand computation-intensive tasks over to a high-performance computing cluster and find the best trade-off for all actors in the ecosystem. In the next section, we present a user scenario that illustrates the proposed methodology.

3 User Scenario

Tina lives in Tübingen and owns an electric car. She regularly drives to Stuttgart, where she spends much of her time working and networking. Once, on the way to Stuttgart, she notices that her battery is low on charge, and she would like to locate a charging station near her current location P(x, y). She finds the charging stations CS1(x′, y′), CS2(x′, y′), CS3(x′, y′), and CS4(x′, y′), each of which offers different charging options (i.e., fast charging or slow charging) at different charging costs. The prices of charging the EV (in EUR/kWh) at the four charging stations are a1(t)–a4(t) (see Fig. 12.3). Note that charging prices are given as a function of time to account for time-varying electricity prices. To locate the optimal charging location that matches her requirements, she uses the algorithm proposed in this paper. Optimality here is subjective: it depends on how the user prioritizes a set of conflicting interests.

Fig. 12.3
The optimal charging station’s route selection

In this example, Tina gives high priority to requirements such as charging station availability, charging price, distance to the charging station, charging time, and potential service disruptions. Once Tina has chosen her priorities, the algorithm processes her requirements and recommends the charging location that fits them best. Moreover, the algorithm can compare the charging station of choice with the other charging stations in the vicinity. The application allows her to make an informed decision about the charging station that best fits her needs and also gives her the option to reserve a charging point ahead of time to confirm availability. Once the reservation is complete, the application ensures that the charging point remains available at the specified time (see Fig. 12.4).
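The recommendation step can be sketched as a priority-weighted ranking of the stations in the vicinity. The sketch below is not the reinforcement learning algorithm itself but a simplified stand-in: the `Station` fields, the min-max normalization, the treatment of availability as a hard filter, and every numeric value are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    name: str
    price_eur_per_kwh: float
    distance_km: float
    charging_time_min: float
    available: bool


def normalise(values: list[float]) -> list[float]:
    """Min-max scale to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rank_stations(stations: list[Station],
                  weights: dict[str, float]) -> list[tuple[Station, float]]:
    """Return available stations sorted by weighted cost (lower is better)."""
    candidates = [s for s in stations if s.available]  # availability filter
    if not candidates:
        return []
    criteria = {
        "price": normalise([s.price_eur_per_kwh for s in candidates]),
        "distance": normalise([s.distance_km for s in candidates]),
        "time": normalise([s.charging_time_min for s in candidates]),
    }
    scores = [sum(weights[k] * criteria[k][i] for k in criteria)
              for i in range(len(candidates))]
    return sorted(zip(candidates, scores), key=lambda pair: pair[1])


# Hypothetical data for the four stations in the scenario.
stations = [
    Station("CS1", 0.40, 2.0, 240.0, True),
    Station("CS2", 0.55, 5.0, 45.0, True),
    Station("CS3", 0.45, 1.0, 60.0, False),
    Station("CS4", 0.60, 8.0, 40.0, True),
]
price_first = rank_stations(stations, {"price": 0.5, "distance": 0.3, "time": 0.2})
time_first = rank_stations(stations, {"price": 0.1, "distance": 0.1, "time": 0.8})
```

With these assumed numbers, a price-focused user is pointed to CS1, while a user who prioritizes charging time is pointed to CS2, which illustrates how the recommendation depends on the chosen priorities.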

Fig. 12.4
Use case elaboration as per stakeholder’s perspective

On the other hand, the local grid operator monitors how the electricity demand forecast (Objy in Fig. 12.3) varies due to EV charging requirements. For example, Wolfgang works for the local grid operator and is responsible for the uninterrupted electricity supply in his control area. Given the rapid adoption of EVs and the many public charging stations set up to serve them, he knows that electricity demand can increase suddenly at peak times. He has several strategies for dealing with such peaks; for example, he can activate reserve power supplies or a demand response plan. However, without an accurate forecast or an advance warning, activating demand response or reserve power can be more expensive.

The application can provide the grid operator with a forecast of the electricity demand due to vehicle charging over the next 15–60 minutes. Note that a 15-minute forecast is likely to be more precise than a 60-minute forecast because of uncertainty, which the application takes into account. Based on the grid operator’s constraints, the application can also highlight where potential demand-supply bottlenecks may occur. This information helps Wolfgang plan, in advance, the best course of action to ensure the reliability of the electricity supply. With our application’s service, Wolfgang also has an additional action available to mitigate supply bottlenecks: advising charging station owners to suspend service for incoming requests. In other words, the service availability status of a charging station can be updated at the request of the grid operator, which serves as a “proactive” demand response strategy. Activating demand response, whether passive or proactive, incurs a cost to the grid operator and a loss of utility to the vehicle owners whose service requests are denied.

4 Simulation Environment

The system is evaluated with respect to an EV end-user objective, such as optimal cost, and a grid operator objective, such as energy demand or charging station availability, on behalf of the event organizer, i.e., the City Council. The event organizer would like to examine how resource demand and supply can be distributed optimally among the event’s participants.

Revisiting the end-user from the use case example (see Sect. 12.3), Tina’s car has a usable battery capacity of 120 Ah and is compatible with both slow-charging (maximum charging rate of 11 kW) and fast-charging (maximum charging rate of 50 kW) connectors. When Tina decides to look for a charging station, the state of charge of the battery has already dropped to 10%.
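To give a feel for the difference between the two connector types, the following sketch estimates charging times from the figures above. The article states the capacity in Ah only, so the nominal pack voltage of 400 V, the target state of charge of 80%, and the assumption of constant charging power without losses or tapering are all illustrative assumptions, not values from the study.

```python
def battery_energy_kwh(capacity_ah: float, nominal_voltage_v: float) -> float:
    """Convert a capacity in Ah to energy in kWh for a given pack voltage."""
    return capacity_ah * nominal_voltage_v / 1000.0


def charging_time_h(capacity_kwh: float, soc_start: float,
                    soc_target: float, power_kw: float) -> float:
    """Idealised charging time at constant power, ignoring losses."""
    if not 0.0 <= soc_start <= soc_target <= 1.0:
        raise ValueError("states of charge must satisfy 0 <= start <= target <= 1")
    if power_kw <= 0.0:
        raise ValueError("charging power must be positive")
    return capacity_kwh * (soc_target - soc_start) / power_kw


# 120 Ah is from the text; 400 V and the 80% target are assumptions.
capacity = battery_energy_kwh(120.0, 400.0)          # 48 kWh
slow_hours = charging_time_h(capacity, 0.10, 0.80, 11.0)
fast_hours = charging_time_h(capacity, 0.10, 0.80, 50.0)
```

Under these assumptions, the slow connector needs roughly three hours and the fast connector roughly 40 minutes, which shows why charging time can conflict with charging price in the objective.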

Figure 12.4 shows a graphical overview of the use case. The EV drives past four different charging stations, CS1–CS4. Tina may decide to check for a charging location at any point along the route, denoted by P1–P3, and the optimizer yields a different objective value depending on the context at that point. Figure 12.5 shows the optimal objective function value calculated for the EV at different locations on the driving route. The different colors represent the different charging stations. In this example, the optimal objective value is also the minimum cost; this is not always the case, however, when multiple conflicting concerns and end-user priorities are taken into consideration in evaluating the objective function.

Fig. 12.5
(a) Optimal cost calculation graph of the EV. (b) A 30-min electricity demand forecast for one charging station. The uncertainty of the forecast increases for longer prediction horizons

5 Conclusion and Future Work

At present, our simulation environment covers only one scenario with a short time span, involving the EV end-user and the grid operator. Moreover, the use case is simulated in a static environment, which gives an indication of the maturity of the system under multiple-stakeholder participation. The current simulation environment therefore does not include time-varying contexts such as variable pricing and power distribution forecasts. From the software architecture point of view, the service app middleware layer has already been presented in another paper.

In future work, a dynamic coupling between the optimal charging resource distribution and electricity network models would enable us to define the network capacity as a finite resource in the resource distribution algorithm and to observe the state of, and impacts on, the local distribution network during the smart charging process. This extension would enable us to simulate optimal EV charging resource distribution scenarios in combination with other distributed loads and generators in a city. Furthermore, it would be a significant step forward in the field of integrated urban energy system planning.