1 Introduction

1.1 Hybrid Commercial Vehicles

Hybrid commercial vehicles with a hydrogen Fuel Cell System (FCS) are powered by a combination of a fuel cell system and an electric propulsion system. We consider hybrid vehicles in which both power sources can drive the vehicle independently and simultaneously, and in which the battery of the electric motor is charged through recuperation. Most of the power requirement is met by the FCS. Such hybrid systems are called series-parallel hybrid systems. Along with the FCS, electric motor and battery pack, this type of vehicle contains a power split device that manages the distribution of power between the FCS and the electric motor. The power split device runs on a logic that splits the total required power at a given instant between the FCS and the electric motor. It also charges the battery from recuperation when the total required power becomes negative. We use deep reinforcement learning to derive the logic that splits the required power between these two power sources.

1.2 Reinforcement Learning

Reinforcement Learning (RL) is a machine learning technique where an agent learns to make decisions in an environment to maximize returns based on specific objectives. The environment is a bounded system that changes its state in response to the agent’s actions. An agent, an external entity, observes the system’s state and influences it to achieve and maintain a favorable state. Therefore, the agent’s objective is to learn the action-state dynamics and control the system with actions that result in the desired system state.

Rewards are defined to meet the system’s output objectives and are calculated based on the observed state of the system. During the learning phase, the agent observes the system’s state, performs actions to change it, and evaluates the new state by calculating the return. The return consists of the immediate reward and the discounted future reward at the new state. Over time, the agent adjusts its actions based on the returns to transition from the current state to a favorable state.

Artificial neural networks, which can learn complex non-linear relationships between variables, are used to represent the agent in deep reinforcement learning. These networks serve as universal function approximators, enabling the agent to learn and adapt effectively. The objective of the agent is to learn a policy \(\pi(a|s,\theta)\), a mapping from state \(s\) to a probability distribution over actions \(a\), parameterized by the weights \(\theta\) of a neural network, that maximizes the expected return \(J(\theta) = \mathbb{E}_{\pi}[G_t]\). The return is \(G_t = \sum_{k=0}^{\infty}\gamma^k R_{t+k+1}\), where \(\gamma\) is the discount factor for future rewards and \(R_{t+k+1}\) is the reward at time step \(t+k+1\). The policy is improved over iterations by updating \(\theta\) along the gradient of the expected return, \(\nabla_{\theta}J(\theta) = \mathbb{E}_{\pi}\bigl[\nabla_{\theta}\log\pi(a|s,\theta)\,Q^{\pi}(s,a)\bigr]\).
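To make these quantities concrete, the short sketch below (a minimal illustration in PyTorch, not part of the training pipeline described later) computes the discounted return \(G_t\) for one recorded episode and forms the REINFORCE-style loss \(-\log\pi(a|s,\theta)\,G_t\), in which the sampled return stands in for \(Q^{\pi}(s,a)\) in the gradient expression above; the toy network dimensions and the episode data are placeholders.

```python
import torch

def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = sum_k gamma^k * R_{t+k+1} for every step of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    return torch.tensor(returns)

# Toy policy network: maps a 4-dimensional state to logits over 2 discrete actions.
policy = torch.nn.Sequential(torch.nn.Linear(4, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# One recorded episode (random placeholders for states, actions and rewards).
states = torch.randn(5, 4)
actions = torch.randint(0, 2, (5,))
rewards = [0.1, 0.0, -0.2, 0.3, 1.0]

log_probs = torch.distributions.Categorical(logits=policy(states)).log_prob(actions)
loss = -(log_probs * discounted_returns(rewards)).mean()  # REINFORCE objective

optimizer.zero_grad()
loss.backward()   # gradient estimate of -grad_theta J(theta)
optimizer.step()
```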

2 Related Work

Ferrara et al. [2] used quadratic optimization for optimal power split in hybrid commercial vehicles. Liessner et al. [6] addressed the power split problem using RL with the Deep Deterministic Policy Gradient (DDPG) [7] algorithm. DDPG was a breakthrough for off-policy RL in continuous action spaces, but it is sensitive to hyperparameters [3]. Manio et al. [8] used Q-learning to address the problem, explicitly including SoC conservation in the reward function. Hu et al. [4] use deep Q-learning with a reward function discretized on the value of SoC; Q-learning, however, suffers from overestimation bias. We use the Soft Actor Critic (SAC) [3] algorithm to learn the optimal power split. SAC is robust to hyperparameters and converges quickly in high-dimensional control problems; it reduces hyperparameter sensitivity by incorporating a policy entropy term in the policy update, thereby encouraging exploration.

3 Training Architecture

The objective of this work is to develop a model that optimally splits the total required power between the electric motor and the FCS. An optimal power split minimizes hydrogen fuel consumption, prevents battery drain, and charges the battery when the wheels do not require power (e.g. during downhill travel). The model considers look-ahead information on the next downhill segment and the total battery charge during that segment. The architecture consists of two main components: the vehicle model and the reinforcement learning (RL) module. The RL module receives observations from the vehicle model at each time instance and outputs the power split ratio between the FCS and the electric motor. Based on this ratio, the vehicle model calculates the power from the FCS and the electric motor at every time point of the trip (Fig. 1).

Fig. 1. Training architecture. The agent is a deep reinforcement learning algorithm that takes observations from the vehicle model as inputs and outputs the action. The environment sends a step/reset command along with the action to the vehicle model. The vehicle model integrates the next step's power requirement and the current power split observations based on the action and sends them back to the environment. The environment checks for termination/truncation of the episode based on the observations from the vehicle model. The environment then calculates the reward, which is used to update the agent's network. Additionally, the environment normalizes the observations and feeds them back into the network.

3.1 Vehicle

The vehicle model consists of two sub-modules: vehicle configuration and road data. The vehicle configuration comprises vehicle mass, FCS configuration, battery configuration, auxiliary power and vehicle dynamics, including acceleration, deceleration, velocity, traction power and driving resistance. The road data consist of road slope, curvature, speed limits, etc., captured using on-vehicle sensors while driving along 8 different routes in Austria. The driving mode is a variable in the range (0,1) representing economy to aggressive driving. From the road data, vehicle dynamics and driving mode, the required power for the vehicle at each instant of time is calculated. When the RL module sends the power split ratio as an action through the environment module, the vehicle model calculates the power from the FCS as (total power * power split ratio) and the power from the electric motor as (total power * (1 - power split ratio)). The SoC expenditure from the battery for the electric motor power is calculated according to the battery configuration and dynamics. The remaining SoC is then the difference between the current SoC and this SoC expenditure. The FCS fuel consumption and efficiency are then determined from the FCS model. When the total power becomes negative, the battery is charged and the FCS power is kept at zero.
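To illustrate the per-step arithmetic, the following simplified sketch (not the actual vehicle model; the battery capacity, charging efficiency and SoC clamping are assumptions made for this example) computes the FCS and electric motor power from the split ratio and updates the SoC, charging the battery when the total required power is negative.

```python
def power_split_step(p_total_kw, split_ratio, soc, dt_s=1.0,
                     battery_capacity_kwh=100.0, charge_efficiency=0.95):
    """One time step of the simplified power split logic.

    p_total_kw  : total required power at the wheels (negative while coasting downhill)
    split_ratio : action from the RL module, fraction of power taken from the FCS, in [0, 1]
    soc         : current battery state of charge, in [0, 1]
    """
    if p_total_kw >= 0.0:
        p_fcs = p_total_kw * split_ratio
        p_motor = p_total_kw * (1.0 - split_ratio)
        # Energy drawn from the battery over this step, expressed as a fraction of capacity.
        soc = soc - (p_motor * dt_s / 3600.0) / battery_capacity_kwh
    else:
        # Recuperation: the FCS is off and the negative wheel power charges the battery.
        p_fcs = 0.0
        p_motor = p_total_kw
        soc = soc + (-p_total_kw * dt_s / 3600.0) * charge_efficiency / battery_capacity_kwh
    return p_fcs, p_motor, min(max(soc, 0.0), 1.0)

# Example: 120 kW requested at the wheels, split 50/50, starting from SoC 0.5.
print(power_split_step(120.0, 0.5, 0.5))
```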

3.2 Deep RL Module

This is an episodic task that involves finding the optimal power split at every time point of the truck's trip.

Observation Space: The observation space consists of the total required power at the wheels at the current time instant, the current battery SoC, the altitude and curvature of the road, the desired velocity, the number of time steps to the episode end, the number of time steps until the next downhill descent, and the total SoC charge in the next descent. The total required power is a continuous value in kW and is derived from the vehicle and route data. The battery SoC is also a continuous value ranging from 0 to 1.

Action Space: The action space is the power split ratio between the FCS and the electric motor, which is a continuous value in the range [0,1].
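For illustration, the observation and action spaces could be declared with the Gymnasium API roughly as follows; this is a hedged sketch, not the original implementation, and the class name, observation ordering and bounds are assumptions.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class PowerSplitEnv(gym.Env):
    """Hypothetical Gymnasium wrapper around the vehicle model (illustrative only)."""

    def __init__(self):
        # Eight observations: total required power, battery SoC, altitude, curvature,
        # desired velocity, steps to episode end, steps to next descent,
        # total SoC charge in the next descent. Bounds are placeholders, not vehicle data.
        low = np.array([-1000.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0], dtype=np.float32)
        high = np.array([1000.0, 1.0, 5000.0, 1.0, 40.0, 1e5, 1e5, 1.0], dtype=np.float32)
        self.observation_space = spaces.Box(low=low, high=high, dtype=np.float32)
        # Single continuous action: the FCS/motor power split ratio in [0, 1].
        self.action_space = spaces.Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32)

env = PowerSplitEnv()
print(env.observation_space.shape, env.action_space)  # (8,) Box(0.0, 1.0, (1,), float32)
```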

Algorithm: Given the episodic task setting with a continuous action space, the Soft Actor Critic (SAC) [3] algorithm is well suited to the problem. SAC is an off-policy actor-critic deep reinforcement learning algorithm designed for continuous action spaces. Off-policy algorithms are sample efficient because they reuse past experience gathered in a replay buffer for learning. This suits our scenario, where the route and vehicle dynamics are constant and environment exploration is bounded.

Reward: Table 1 shows the observations used for reward calculation. A reward of 100 is given when the agent navigates successfully to the end of the episode without draining the SoC. A small reward is given for every step towards the episode end, which promotes saving SoC, since the episode is terminated when the SoC reaches 0. A negative reward is given for the cumulative H2 consumption during the travel, with the objective of reducing H2 consumption. A penalty is given when the SoC drops below 10%.

Table 1. Reward components, positive/negative (+/-) contribution and their weights.
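A sketch of how these components might be combined into a single reward is given below; the weights, thresholds and the function signature are illustrative assumptions, not the values listed in Table 1.

```python
def compute_reward(soc, cumulative_h2_kg, reached_episode_end, soc_drained,
                   w_step=0.1, w_h2=1.0, w_low_soc=10.0):
    """Illustrative reward combining the components described above.

    The weights w_step, w_h2 and w_low_soc are placeholders, not the values from Table 1.
    """
    if soc_drained:              # SoC hit 0: episode terminates without the terminal bonus
        return 0.0
    reward = w_step              # small per-step reward for progressing towards the episode end
    reward -= w_h2 * cumulative_h2_kg   # penalize hydrogen consumption
    if soc < 0.10:               # penalty for letting the SoC drop below 10%
        reward -= w_low_soc
    if reached_episode_end:      # terminal bonus for finishing the trip without SoC drain
        reward += 100.0
    return reward
```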

4 Experiment

4.1 Setup

Building on the methodology and tool chain created by Bukic et al. [1] around Ray [5], an open-source distributed computing framework, we trained the network with 16 CPUs and 1 GPU. A vehicle model with a weight of 40 t, an initial battery SoC of 0.5 and a driving style of 0.5 was used for training. The driving style affects the acceleration, deceleration and velocity calculation and subsequently the required power at the wheels. A total of 8 routes were used for training. The policy and Q-value networks each used two fully connected hidden layers of 256 units, with a learning rate of 0.003 for both networks. A prioritized replay buffer with a capacity of 1,000,000 and a training batch size of 512 were used. Training was run for 50,000 iterations.
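For orientation, a comparable setup could be expressed with Ray RLlib's SAC implementation roughly as follows. This is a hedged sketch rather than the tool chain of [1]: the environment name PowerSplitEnv is hypothetical, the learning rates are omitted, and the exact keyword names for the model, replay buffer and worker settings differ between Ray versions.

```python
from ray.rllib.algorithms.sac import SACConfig

config = (
    SACConfig()
    .environment("PowerSplitEnv")  # hypothetical registered environment name
    .training(
        train_batch_size=512,
        # Two fully connected hidden layers of 256 units for the policy and Q networks.
        policy_model_config={"fcnet_hiddens": [256, 256]},
        q_model_config={"fcnet_hiddens": [256, 256]},
        # Prioritized replay buffer holding one million transitions.
        replay_buffer_config={
            "type": "MultiAgentPrioritizedReplayBuffer",
            "capacity": 1_000_000,
        },
    )
    .rollouts(num_rollout_workers=16)  # 16 CPU workers; newer Ray versions use .env_runners()
    .resources(num_gpus=1)
)

algo = config.build()
for _ in range(50_000):
    algo.train()
```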

4.2 Results

Figure 2 shows the power split between the FCS and the electric motor using the SAC method on the Brenner route, a route with pronounced uphill and downhill sections. The approach splits the total power almost equally, and the battery charges when the total power goes negative.

Fig. 2. Power split by the Soft Actor Critic (SAC) method on the Brenner route. P_total is the total required power, P_FCS is the power from the FCS and P_Bat is the power from the battery (electric motor).

Figure 3 shows the battery SoC comparison between the SAC and QO approaches on the Brenner route. The red line shows the altitude of the route. Table 2 compares the H2 consumption of the SAC power split method and the quadratic optimization method on the 8 routes. On all routes, the SAC approach shows lower H2 consumption, with a highest improvement of 6% and a lowest improvement of 1.4% compared to the quadratic optimization approach.

Fig. 3. Battery SoC with the Soft Actor Critic (SAC) method and the Quadratic Optimization (QO) method on the Brenner route, plotted against the altitude of the road.

Table 2. Drive cycles, distance covered and H2 consumption with the SAC power split approach and the quadratic optimization (QO) approach from Ferrara et al. [2]

4.3 Conclusion

The RL-based power split strategy has demonstrated its effectiveness in reducing fuel consumption in hybrid vehicles. The above approach relies on offline, predetermined route information and an ideal velocity profile to calculate the power requirement; enhancing the strategy to predict real-time power requirements from sensor data of the moving vehicle would improve its applicability and accuracy in real-world scenarios. Additionally, incorporating battery health parameters into the model would yield more sustainable battery performance. The current experiment is also limited to vehicles of a specific weight.