5.1 Introduction

Due to the promising advancements of Internet of Things technology and wireless communications, smart vehicles empowered with environmental perception, information processing, and automatic control capabilities have emerged. Smart vehicles bring us powerful vehicular applications, such as autonomous driving, voice recognition, and car entertainment, and help to build a smarter, safer, and more sustainable transportation system. These applications usually require intensive computational processing. However, constrained by on-board computing resources, an individual smart vehicle might not provide sufficient computing power, which makes it difficult to ensure that application tasks are completed on time.

Mobile edge computing (MEC) provides a feasible approach to the above problem. By deploying computing servers in vehicular access networks, application tasks can be offloaded to the network edge for efficient execution. The offloading process leverages the wireless links between smart vehicles and roadside units (RSUs) for task data delivery and the acquisition of processing results. Moreover, smart vehicles that have spare computing resources can be exploited as edge computing servers to serve adjacent vehicle task generators in vehicle-to-vehicle (V2V) communication [53]. To specifically describe this edge computing approach that uses vehicular communication in task offloading, we call it vehicular edge computing (VEC).

In a VEC system, the high-speed movement of vehicles and rapid changes in network topology lead to characteristics unlike those of traditional edge computing systems, which are designed for handheld mobile smart terminals. These characteristics pose new challenges and require key techniques in MEC architecture design, computing service scheduling, and resource management, which are investigated and described as follows.

5.2 Challenges in VEC

Identifying the technical challenges of VEC design and management is a prerequisite for optimal edge computing services. According to the characteristics of road traffic environments and vehicular edge networks, we summarize the challenges into four items.

  • A highly dynamic network topology and unstable service relationships: The dynamic change of the network topology due to high-speed vehicle movement is the most important feature of VEC. This topology change can greatly affect transmission rates, interference, energy consumption, and so on. Since communication plays a key role in VEC task offloading, a dynamic topology implies complicated wireless access point switching, power adjustments, and interference suppression for edge service management. Moreover, considering the limited coverage of densely deployed base stations (BSs) in 5G/beyond 5G networks, high-speed moving vehicles can leave the communication range of a BS within a short time. When a high-speed vehicle generates a task with an intensive computing demand, it is difficult for a single VEC server equipped on a BS to complete the calculation process within the time the vehicle remains within the BS’s coverage. Unstable service relations are thus induced between VEC servers and users, which further complicates the VEC management mechanism.

  • Strict low latency constraints and large amounts of task data: Most Internet of Vehicles applications are related to autonomous driving control and traffic safety improvement, which always have strict low latency constraints. For example, a vehicle’s reaction time to a suddenly appearing obstacle needs to be limited to milliseconds. Thus, the fast and efficient processing of obstacle identification and of control instruction generation becomes a necessary prerequisite. This requires edge servers to provide sufficient computing resources. However, on congested roads with a large number of vehicles, adequate serving capacities are often difficult to achieve. Furthermore, as mentioned before, edge computing services rely on task data transmission between user nodes and servers. In autonomous driving applications, vehicular sensors, such as cameras, millimeter wave radar, and lidar, continuously generate large amounts of data, which seriously challenges the communication capabilities of vehicular networks.

  • Heterogeneous and complex communications: Vehicular networks consist of smart vehicles, RSUs, and BSs. These devices and infrastructures form a variety of communication relationships, including V2V, vehicle-to-RSU (V2R), and vehicle-to-infrastructure (V2I), which are collectively referred to as vehicle-to-everything, or V2X. Diverse V2X communications can work in different frequency bands or share the same spectrum resources. In addition, several standards have been created for the deployment and operation of vehicular communication, such as Dedicated Short Range Communication in the United States, Cooperative Intelligent Transport Systems in Europe, and IEEE 802.11bd and 5G New Radio V2X in the 5G era. Large-scale heterogeneous devices following multiple types of standards communicate in parallel in constrained frequency bands, which makes vehicular communication extremely complicated and efficient task offloading a challenge.

  • Decentralized and independently controlled edge service nodes: Empowered with a processor, cache, and communication interface, a smart vehicle can be considered a mobile edge server when it has surplus computing resources and helps other vehicles through V2V task offloading. For application tasks with highly intensive computing requirements, the capability of a single vehicular server might not meet demands. In such a case, aggregating multiple vehicular servers to form a group entity with powerful service capabilities is a promising approach to the problem. However, since the vehicles in the network are mobile and distributed, a centralized control mechanism is spectrum inefficient and time-consuming. Furthermore, each vehicle’s service willingness and driving behavior are independently controlled by its owner. It is impractical to request that all vehicle owners follow scheduling instructions unconditionally.

5.3 Architecture of VEC

To address the aforementioned challenges, we propose a VEC architecture that illustrates the system components and their logical relations and guides VEC service management.

Fig. 5.1
figure 1

Architecture of a VEC system

Figure 5.1 shows the proposed architecture, which is divided into four layers. The bottom layer is the application layer. It consists of the smart vehicles and powerful vehicular applications. These vehicles have varied computing and communication capabilities while driving throughout large areas at different speeds and with different route plans. The applications they run, such as autonomous driving and navigation, can differ in terms of computing resource requirements and delay constraints. Vehicle characteristics and application requirements serve as the input of the upper layers, driving edge service strategy adjustment.

In the edge server layer, there are three types of vehicular serving infrastructures. The first one consists of BSs equipped with MEC servers. The BSs can be macro BSs, micro BSs, or even pico BSs. They use V2I transmission to gather the computation tasks of the vehicles traveling in the coverage area, send the tasks to the MEC server for processing, and finally return the results to the vehicles. Similar to the BS operation, RSUs equipped with MEC servers are deployed along roads, serving the vehicles traveling past them.

The third infrastructure type, worth discussing in detail, is the edge service group formed by multiple smart vehicles with idle computing resources. The group members can be either stationary vehicles in a parking lot or moving vehicles on roads. The offloaded computing task needs to be shared by multiple vehicles, and the execution of each part of the task usually requires the close cooperation of the other parts. The communication among group members should therefore be efficient and reliable. A service group is thus usually formed by vehicles geographically adjacent to each other.

At the resource layer, the edge serving resources provided by BSs, RSUs, MEC servers, and smart vehicles are logically divided into three categories: computing resources, communication resources, and cache resources. All types of resources work cooperatively in task offloading and processing. The computing resource executes the offloaded tasks, the communication resource is responsible for task transmission and the delivery of calculation results, and the cache resource helps store task data in the servers. It is worth noting that, in some cases, cross-type collaboration can be implemented between heterogeneous resources. Tasks generated in an area with poor computing service but sufficient bandwidth can be transmitted to a remote area with powerful computing capabilities. This case can be viewed as using the cost of communication resources in exchange for computing resources.

The control layer is at the top of the architecture, monitoring the service states of the edge system and deriving optimal scheduling strategies. More specifically, the control units gather data on network topology, task attributes, vehicle characteristics, and resource states. The gathered information is then input into an AI module to analyze service supply and demand trends. Based on the analysis results, the units are used to form an effective management plan that determines offloading target servers, decides the multi-vehicle grouping mode, coordinates the interaction between heterogeneous servers, and optimizes various types of edge resources. From the implementation perspective, the control module can be a centralized control entity in charge of an entire network or distributed controllers equipped on BSs or RSUs that are responsible for scheduling service resources within a local area, or even a head vehicle in vehicular groups.

5.4 Key Techniques of VEC

Under the guidance of the architecture, many works have focused on several key technical issues in VEC construction, management, and operation, which are investigated in the following.

5.4.1 Task Offloading

Task offloading is the essential motivation for the proposed edge service, and it is also the core function of the VEC system. Since offloading processes can have diversified optimization goals under different application scenarios, there are a variety of corresponding offloading mechanisms.

To reduce energy bills and create green edge systems, energy efficiency has been considered an optimization goal in many studies. The energy consumption of task offloading is mainly split into two parts: consumption in data communication and task processing. Lower radio power and shorter communication times can reduce transmission energy costs. Based on a signal fading model and Shannon’s theorem, offloading target servers with smaller transmission distances and less wireless interference must be chosen. Regarding the processing energy part, different devices have diverse energy efficiency features. For example, given the use of many different types of silicon chips, the energy cost of a unit calculation performed by an on-board processor is usually higher than that of the dedicated processor in an MEC server. Thus, without significantly increasing the communication energy overhead, offloading tasks from a vehicle to an edge server usually improves the system’s overall energy efficiency.
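The energy trade-off described above can be made concrete with a small numerical sketch. In the following Python snippet, the transmission rate follows Shannon’s theorem; all parameter values (bandwidth, channel gain, per-cycle energy costs) are illustrative assumptions, not figures from the cited studies.

```python
import math

def offload_energy(task_bits, cpu_cycles, tx_power_w, bandwidth_hz,
                   channel_gain, noise_w, server_j_per_cycle, local_j_per_cycle):
    """Compare local-execution energy with offloading energy.

    Offloading energy = transmission energy (power * upload time, with
    the rate given by Shannon's theorem) + server processing energy.
    """
    rate = bandwidth_hz * math.log2(1 + tx_power_w * channel_gain / noise_w)
    tx_energy = tx_power_w * task_bits / rate            # energy to upload the input file
    remote = tx_energy + cpu_cycles * server_j_per_cycle # total offloading energy
    local = cpu_cycles * local_j_per_cycle               # on-board execution energy
    return local, remote

# Hypothetical task: 2 Mb input file, 10^9 CPU cycles.
local, remote = offload_energy(
    task_bits=2e6, cpu_cycles=1e9, tx_power_w=0.1, bandwidth_hz=10e6,
    channel_gain=1e-6, noise_w=1e-10,
    server_j_per_cycle=1e-10, local_j_per_cycle=1e-9)
```

With these values the upload costs only a few millijoules, so offloading cuts the total energy roughly tenfold; a weaker channel or a larger input file can reverse the conclusion.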

Many works have been devoted to optimizing offloading energy efficiency. Pu, Chen, et al. [54] introduced a hybrid computing framework that integrates an edge cloud radio access network to augment vehicle resources for large-scale vehicular applications. Based on this framework, a Lyapunov-theoretic task scheduling algorithm was proposed to minimize system energy consumption. Zhou, Feng, Chang and Shen [55] leveraged vehicular offloading to alleviate the energy constraints of in-vehicle user equipment with energy-hungry workloads. They designed a consensus-based alternating direction driven approach to determine the optimal portion of tasks to be offloaded to edge servers. Li, Dand, et al. [56] modeled the task offloading process at the minimum assignable wireless resource block level and presented a measure of the cost-effectiveness of allocated resources to help reduce the required offloading energy.

Another optimization goal is the quality of experience (QoE), which has drawn great interest in recent years. The QoE reflects users’ satisfaction with the task offloading performance, and it can be quantified in several ways. One way is through the offloading delay. In most cases, users want tasks to be completed as quickly as possible. To meet this demand, both the task data transmission delay and computation delay should be minimized. However, because of vulnerable communications between user vehicles and RSUs, as well as task congestion at edge servers, guaranteeing timely offloading is a challenge. Dividing computing tasks and executing them distributively, which reduces the transmission channel requirements and server loads, has become a promising paradigm for addressing this challenge. Ren, Yu, He and Li [57] leveraged the collaboration of cloud and edge computing to partially process vehicular application tasks. They further proposed a joint communication–computation resource allocation scheme that minimizes the weighted-sum task offloading latency. Lin, Han, et al. [58] took a software-defined networking approach to edge service organization and introduced a distributed delay-sensitive task offloading mechanism over multiple edge server–empowered BSs.
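When a task can be divided and its parts executed in parallel locally and on an edge server, the latency-optimal split admits a simple closed form: offload the fraction that makes both parts finish simultaneously. The sketch below uses illustrative parameters, not values from the cited works.

```python
def optimal_split(cycles, data_bits, f_local, f_server, rate):
    """Fraction x of a divisible task to offload so that the local part
    and the remote part (upload + server execution) finish together,
    which minimizes the overall completion time max(local, remote)."""
    a = cycles / f_local                      # delay if executed fully locally
    b = data_bits / rate + cycles / f_server  # delay if fully offloaded
    x = a / (a + b)                           # equalizing split
    delay = (1 - x) * a                       # = x * b at the optimum
    return x, delay

# Hypothetical task: 10^9 cycles, 1 Mb input, 1 GHz local CPU,
# 10 GHz server, 10 Mb/s uplink.
x, delay = optimal_split(1e9, 1e6, 1e9, 1e10, 1e7)
```

Here most of the task is offloaded, and the completion time drops well below both the all-local and all-remote delays.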

Edge service reliability is also a key concern of QoE. However, the dynamic and uncertain vehicular environments create critical challenges in preserving user satisfaction. Many works have addressed this challenge. Ku, Chiang and Dey [59] focused on providing sustainable computing services for edge infrastructures powered by solar energy. Through offline solar energy scheduling and online user association management, the risks of power deficiency and intermittent edge service are reduced. To ensure the high reliability of completion of vehicular application tasks, Hou, Ren, et al. [60] utilized both partial computation offloading and reliable task allocation in a VEC system, and proposed a fault-tolerant particle swarm optimization algorithm for maximizing computing reliability under delay constraints. To minimize long-term computing quality loss in unpredictable network states, Sun, Zhao, Ma and Li [61] formulated a nonlinear stochastic optimization problem to jointly optimize radio allocation and computing resource scheduling.

In addition to service reliability, cost is an important factor in user QoE. From the different perspectives of VEC operators and users, cost has distinct measurement approaches. Operators mainly consider the cost of the deployment of service facilities. To fully cover an area, a large number of MEC servers could be required, significantly increasing edge construction costs. To address this problem, Zhao, Yang, et al. [62] used unmanned aerial vehicles to act as relay nodes in forwarding computation tasks between smart vehicles and MEC servers. In this way, the service coverage of a single MEC server is improved, and both the number and construction cost of servers are reduced. Users, on the other hand, are concerned about minimizing the costs of using edge services. For instance, Du, Yu, et al. [63] made full use of TV white space bands to supplement the bandwidth for task offloading and introduced a cognitive vehicular edge networking mechanism that minimizes the communication costs of vehicular terminals. Deng, Cai and Liang [64] leveraged multi-hop vehicular ad hoc networks in task offloading and found the multi-hop routing path with the lowest costs through a binary search approach.

With the severe increase in malicious attacks and eavesdropping, security is an increasingly important issue. In VEC systems, due to the various ownership and management strategies of different vehicles and edge servers, it is difficult to ensure that all participating edge service entities are trustworthy. Consequently, security protection mechanisms and privacy preservation measures need to be implemented. Hui, Su, Luan and Li [65] focused on securing vehicle cooperation in relaying service requests and designed a trusted relay selection scheme that identifies relay vehicles by their reputations. Blockchain technology, a tamper-resistant distributed ledger of blocks that keeps data in a secure manner, is attracting growing attention and has been adopted in VEC. For instance, Liu, Zhang, et al. [66] presented a blockchain-empowered group authentication scheme that identifies vehicles distributively based on secret sharing and a dynamic proxy. To protect data privacy in task offloading, Zhang, Zhong, et al. [67] proposed a fuzzy logic mathematical authentication scheme to select edge vehicles, maintaining sensitive communications and verifications only between vehicles.

5.4.2 Heterogeneous Edge Server Cooperation

Although VEC is a promising paradigm for alleviating the heavy computation burden of smart vehicles, an individual edge server is still resource constrained, raising several challenges in the pervasive and efficient deployment of edge services. On the one hand, limited computing power and energy supply make it hard for servers to complete complex tasks under delay constraints. On the other hand, the wireless communication range of the BS or RSU on which the server depends is limited, which further constrains edge service capabilities from the communication perspective. An effective way to resolve these problems is the use of multi-server collaboration. Considering the different types of edge servers in VEC, combining these heterogeneous servers into a joint service leads to a variety of collaboration modes and approaches, which are illustrated as follows.

Cooperation among multiple edge servers equipped on communication infrastructures is the most widely used mode [68]. This mode benefits greatly from the wire connections between infrastructures, such as optical fiber and large capacity twisted pair cable, through which the task data can be transferred and exchanged between multiple servers at high speed and low cost [69]. However, the different computing capabilities of these servers pose challenges in selecting the offloading server and computing resource allocation. Whether to split a large task into multiple subtasks and distribute them in several parallel servers or to merge multiple small tasks into a few large tasks to run on selected servers needs to be carefully designed [70]. In the parallel execution mode, the inefficiency of the entire task processing due to the lag of a server of poor capability is also a key issue to be addressed. In addition, we need to optimize how tasks are offloaded from vehicles to servers. For example, task data can be collected through a BS and then spread to other BSs and servers using the wired connection between BSs. One could also use concurrent wireless delivery between multiple vehicles and the RSUs they can access.
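For the parallel execution mode mentioned above, a simple baseline (an illustration, not a scheme from the cited works) splits the task among servers in proportion to their processing speeds, so that no fast server idles while a slow one lags:

```python
def split_task(total_cycles, server_speeds):
    """Divide a task among heterogeneous parallel servers so that all of
    them finish at the same time: share_i is proportional to speed_i,
    and the makespan equals total_cycles / sum(speeds)."""
    total_speed = sum(server_speeds)
    shares = [total_cycles * s / total_speed for s in server_speeds]
    makespan = total_cycles / total_speed
    return shares, makespan

# Hypothetical 12-Gcycle task split across 1/2/3 GHz servers.
shares, makespan = split_task(12e9, [1e9, 2e9, 3e9])
```

Any other split yields a larger makespan, since the server most loaded relative to its speed finishes last.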

Groups of multiple smart vehicles providing sufficient service capabilities to other user vehicles are another mode of edge service collaboration [71]. This mode makes full use of unoccupied on-board computing resources and is characterized by flexible organization and pervasive availability. However, challenges still exist in the efficient implementation of inter-vehicle service collaboration. The most serious challenge comes from the independence of the different vehicles. The vehicles are owned and controlled by different persons, with various driving route plans and degrees of service willingness. In addition, these vehicles can differ in terms of their idle resource capacity, maximum communication distance, and server energy supply [72]. This independence brings complexity and uncertainty to the vehicular server collaboration. Moreover, the number and geographic distribution of cooperative vehicle servers are also key factors that should be taken into account, since the vehicle distribution density will affect the trade-off between the computing service capacity and spectrum multiplexing efficiency. Thus, vehicular service collaboration schemes with efficient server grouping, resource scheduling, and vehicle owner incentives are required.

Integrating the servers equipped on infrastructures with on-board servers produces a heterogeneous edge service collaboration mode [73]. This mode takes full advantage of the large coverage and strong capabilities of infrastructure servers and leverages on-board servers to make up for the lack of flexibility of the infrastructures. In this mode, taking into account the advantages of V2V communication in terms of small path loss and low transmission delay, tasks with light loads and strict delay constraints are offloaded to on-board servers, while tasks of high computational intensity and loose delay constraints are usually offloaded to infrastructure servers. In case the edge servers cannot meet the vehicular task demands, the number and scope of collaborative servers can be expanded; that is, three-level coordination consisting of cloud servers, infrastructure servers, and on-board servers can be jointly scheduled in matching various types of application tasks.

5.4.3 AI-Empowered VEC

In recent years, we have witnessed unprecedented advancements and interest in artificial intelligence (AI). Machine learning, a key AI technology, provides entities working in complex systems the ability to automatically learn and improve from experience without being previously and explicitly programmed.

A vehicular network is such a complex system that is characterized by unpredictable vehicle movements, a dynamic topology, unstable communication connections, and frequent handover events. Computation task offloading and resource scheduling in vehicular networks are challenging, since an optimal solution should be aware of the network environment, understand the service requirements, and consider numerous other factors. Leveraging a machine learning approach in vehicular edge management is a promising paradigm for addressing the challenges mentioned above.

Various types of machine learning techniques have been applied in VEC, among which reinforcement learning is the most important. Reinforcement learning makes agents gain experience from their interactions with the environment and adjust action strategies along the learning process. This learning mode is suitable for dynamic road traffic states and complex vehicular networks. However, in large-scale networks handling massive amounts of state information, especially states represented by continuous values, the reinforcement learning approach cannot be directly implemented to solve the edge management problem. To address this issue, we can resort to deep Q-learning, which uses a deep neural network as an approximator of the Q-function to capture complex interactions among various states and actions. Moreover, in the context of vehicular networks, some offloading actions could be chosen from a continuous space, such as wireless spectrum allocations and transmission power adjustments. To address the demands of this action space, edge service scheduling utilizes deep deterministic policy gradient learning, a branch of deep reinforcement learning that concurrently learns policy and value functions in a policy gradient learning process.
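As a minimal illustration of this learning mode, the sketch below trains a tabular Q-learning agent on a toy two-action offloading decision; the per-action rewards are assumed negative delays, not values taken from any cited system.

```python
import random

def train_offload_policy(steps=500, alpha=0.1, gamma=0.9, eps=0.3, seed=0):
    """Tabular Q-learning over one state and two actions:
    0 = execute locally, 1 = offload to the edge server.
    Rewards are illustrative negative delays."""
    random.seed(seed)
    rewards = {0: -1.0, 1: -0.3}   # assumed per-action delay costs
    q = [0.0, 0.0]
    for _ in range(steps):
        # epsilon-greedy exploration
        if random.random() < eps:
            a = random.choice([0, 1])
        else:
            a = max((0, 1), key=lambda act: q[act])
        # Q-learning (Bellman) update
        q[a] += alpha * (rewards[a] + gamma * max(q) - q[a])
    return q

q = train_offload_policy()
```

After training, the greedy policy prefers offloading (action 1), matching the lower assumed delay; deep Q-learning replaces the table `q` with a neural network when the state space grows.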

Many studies have applied machine learning to vehicular edge management. Some research focused on the relations of offloading decisions in the time dimension. Since data transmission and task execution are hard to complete instantaneously, previous actions will affect subsequent decisions through an extension of edge service states. The action dependence raises challenges in the optimization of current offloading strategies. To address them, Qi, Wang, et al. [74] designed a vehicular knowledge-driven offloading decision framework that accounts for the future data dependence of subsequently generated tasks and helps obtain the optimal action strategies directly from the environment.

Another interesting research issue of AI-empowered VEC is the adaptability of learning models in the context of complex vehicular networks. Considering potentially multiple optimization goals for offloading service management and that a single learning model can meet only part of the requirements, the incorporation of multiple models in the learning process is a promising approach. Sonmez, Tunca, Ozgovde and Ersoy [75] proposed a two-stage machine learning mechanism that consists of classification models in the first stage to improve the task completion success rate and regression models in the second stage to minimize edge service time costs. In [76], multi-model cooperation is made more flexible. To address the diversity and dynamics of the factors impacting edge service in vehicular networks, Chen, Liu, et al. introduced a meta-learning approach that adaptively selects the appropriate machine learning models and achieves the lowest offloading costs under different scenarios.

Running a learning process always consumes a great deal of computing resources, which further strains edge service capacity. To address this issue, a flexible, efficient, and lightweight learning mechanism is strongly needed. Research that has focused on this issue includes [77], where the authors aimed to reduce the learning complexity and processing costs. In their deep reinforcement learning–based offloading scheme, Zhan, Luo, et al. avoided a large number of inefficient exploration attempts in the training process by deliberately adjusting the state and reward representations. Wang, Ning, et al. [78] presented an imitation learning–enabled online task scheduling scheme. In this scheme, the learning agents find optimal offloading strategies by solving an optimization problem with a few offline samples; near-optimal edge service performance is then achieved at a low learning cost.

Fig. 5.2
figure 2

An MEC-empowered vehicular network

5.5 A Case Study

In this section, we present two case studies to illustrate the vehicular task offloading mechanisms. The first one incorporates vehicle mobility into edge service management and proposes a predictive task offloading strategy [79]. The second one focuses on computation offloading in complex vehicular networks with multiple optional target servers and diverse data transmission modes, and it leverages AI techniques to design optimal offloading schemes [80].

5.5.1 Predictive Task Offloading for Fast-Moving Vehicles

5.5.1.1 A System Model

We consider an MEC-empowered vehicular network, as illustrated in Fig. 5.2. Uninterrupted traffic in a free flow state is running on a unidirectional road. Along the road are RSUs. The distance between two adjacent RSUs is L. The transmission range of each RSU is L/2. The road can be divided into several segments of length L. Through the V2I communication mode, vehicles traveling on a given segment can only access the RSU located in the corresponding segment.

In the scenarios we studied, such as a temporarily deployed vehicular network, the RSUs communicate with each other through wireless backhauls. Each RSU is equipped with an MEC server with limited computational resources. Because the capacity of the wireless backhauls is limited, the large task input file is not transmitted between RSUs; the computation output, however, is small and can be transmitted between RSUs through the wireless backhauls. All the vehicles move at a constant speed. The distribution of the vehicles on the road follows a Poisson distribution with density \(\lambda \).

Each vehicle has a computation task. The task can be either carried out locally by the vehicular terminal or computed remotely on the MEC servers. The computation task is denoted as \(T=\{c,d,t_{\max }\}\), where c is the amount of the required computational resources, d is the size of the computation input file, and \(t_{\max }\) is the delay tolerance of the task. We further classify the tasks into S types and present the tasks as \(T_i=\{c_i,d_i,t_{i,\max }\}\), \(i \in S\). The vehicles can be correspondingly classified according to their computation task types into S types. The proportion of vehicles with a task of type i in the total number of vehicles on the road is \(\rho _i\), where \(i \in S\) and \(\sum \nolimits _{i = 1}^S {{\rho _i}= 1}\).
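The vehicle and task model above can be instantiated directly. The sketch below draws the vehicle count on a road stretch from a Poisson distribution (via Knuth’s method) and assigns each vehicle a task type according to the proportions \(\rho _i\); the density and proportions used are placeholders.

```python
import math
import random

def poisson_sample(rng, lam):
    """Draw a Poisson-distributed count using Knuth's method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def spawn_vehicles(road_len_m, density_per_m, type_probs, seed=1):
    """Place vehicles on the road as a Poisson process with density
    lambda and draw each vehicle's task type from the proportions rho_i."""
    rng = random.Random(seed)
    n = poisson_sample(rng, density_per_m * road_len_m)
    types = list(range(1, len(type_probs) + 1))
    return [(rng.uniform(0, road_len_m), rng.choices(types, weights=type_probs)[0])
            for _ in range(n)]

# Hypothetical 10 km road, 0.01 vehicles/m, five task types.
vehicles = spawn_vehicles(10_000, 0.01, [0.05, 0.15, 0.3, 0.4, 0.1])
```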

5.5.1.2 Offloading with Optimal Predictive Transmission

There are two transmission modes for task offloading. One is through a direct V2I mode. In this mode, a vehicle can only offload its task to the MEC server equipped on the RSU that the vehicle can currently access. Considering that a vehicle travels down an expressway at high speed, if its computation task costs a relatively long time, the vehicle can pass by several RSUs during the task execution period. In this case, the output of the computation to be sent back to the vehicle needs to be transmitted from the MEC server that has accomplished the task to the remote RSU that the vehicle is newly accessing. The time overhead and transmission cost of the multi-hop relay seriously degrade the task transmission’s effectiveness.

Another offloading mode is predictive V2V transmission, whose main framework is illustrated in Fig. 5.3. In this mode, the vehicles send their task input files to the MEC servers ahead of them, in their direction of travel, through multi-hop V2V relays. Based on the accurate prediction of the file transmission time and the task execution time, as well as the time spent by the vehicle traveling down the road, vehicle k can arrive within the communication area of RSU\(_n\) at the exact time its task has been completed. The computation output can be transmitted directly from RSU\(_n\) to the vehicle through V2I transmission without a multi-hop backhaul relay. Transmission costs for task offloading can thus be reduced.
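Choosing the upload destination then reduces to comparing the predicted total service time with the time the vehicle needs to cross one road segment. The simplified sketch below (segment length and speed are hypothetical) returns the number of segments j to send the file ahead:

```python
import math

def predictive_hop(t_service_s, seg_len_m, speed_mps):
    """Number of road segments j ahead to which the input file should be
    sent so the vehicle reaches RSU_j roughly when the result is ready.
    t_service_s aggregates relay, upload, and execution time."""
    seg_time = seg_len_m / speed_mps      # time to traverse one segment
    return max(1, math.ceil(t_service_s / seg_time))

# Hypothetical values: 1 km segments, vehicle moving at 25 m/s.
j_long = predictive_hop(90.0, 1000.0, 25.0)   # long-running task
j_short = predictive_hop(10.0, 1000.0, 25.0)  # short task
```

Longer service times push the upload destination further down the road, while short tasks stay within the next segment.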

Fig. 5.3
figure 3

Vehicle mobility-aware predictive task data transmission

Let \(t_{i,v2v}\) denote the average time delay for the transmission of the input file of a task of type i through a one-hop V2V relay. The total time consumption of completing the task in this predictive mode is

$$\begin{aligned} t_{i,j} = y_j \cdot t_{i,v2v}+t_{i,upload}+t_{i,remote}+t_{i,download} \end{aligned}$$
(5.1)

where j is the number of hops the upload destination RSU is from the vehicle’s current position, and \(j>1\) means the vehicle adopts predictive mode transmission. We define \(y_j\) as the number of V2V relay hops that are required to transmit the input file to an RSU j hops away. Furthermore, the total cost of this type of task offloading is

$$\begin{aligned} f_{i,j} = y_j \cdot f_{i,v2v}+f_{i,upload}+f_{i,remote}+f_{i,download} \end{aligned}$$
(5.2)

where \(1<j \le J_{i,\max }\).
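Equations (5.1) and (5.2) are straightforward to evaluate once the per-stage delays and costs are known; the values in the example call below are placeholders, not measurements:

```python
def offload_metrics(y, t_v2v, t_upload, t_remote, t_download,
                    f_v2v, f_upload, f_remote, f_download):
    """Total delay (Eq. 5.1) and total cost (Eq. 5.2) of predictive
    offloading when the input file is relayed over y V2V hops."""
    t_total = y * t_v2v + t_upload + t_remote + t_download
    f_total = y * f_v2v + f_upload + f_remote + f_download
    return t_total, f_total

# Hypothetical per-stage delays (s) and costs (arbitrary units), y = 3 hops.
t_total, f_total = offload_metrics(3, 0.5, 1.0, 2.0, 0.1,
                                   0.2, 0.5, 1.0, 0.1)
```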

To minimize the offloading cost of both data transmission and task execution while satisfying the latency constraints, the objective function of the optimal offloading schemes is

$$\begin{aligned} \begin{aligned} \begin{array}{l} \mathop {\min }\limits _{\{ {P_{i,j}}\} } \sum \limits _{i = 1}^S {\sum \limits _{j = 0}^{{J_{i,\max }}} {\rho _i {P_{i,j}}{f_{i,j}}} } \\ \mathrm{such\ that} \quad t_{i,j} \le {t_{i,\max }}, \quad i \in \{1,\ldots ,S\}, \; j \in \{0,\ldots ,J_{i,\max }\} \end{array} \end{aligned} \end{aligned}$$
(5.3)

The objective function in (5.3) gives the average offloading costs of all types of vehicles when they choose offloading strategies \(\{P_{i,j}\}\), where \(\{P_{i,j}\}\) is the probability of a vehicle of type i choosing to offload its task to the MEC server j road segments away from its current position. To solve (5.3), we resort to a game approach to find the optimal offloading strategies of each type of vehicle. This game involves S players, where each player is a set of vehicles with the same type of tasks. We denote the vehicle set with tasks of type i as set i. The strategies of vehicle set i \((i=\{1,2,\ldots ,S\})\) are \(\{P_{i,j}\}\). Vehicles in set i can choose to either execute tasks locally or offload them to MEC servers j hops away. The payoff for set i is the sum of the vehicles’ offloading costs. Using a heuristic method in which each vehicle set adopts its best response action given the strategies of the other vehicle sets, we can obtain a Nash equilibrium, which is the solution of (5.3).
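The best-response procedure can be sketched as a toy congestion game: each vehicle set repeatedly picks the server whose base cost plus current load is lowest until no set wants to deviate. The base costs and unit-load congestion model below are illustrative assumptions, much simpler than the cost model in (5.2):

```python
def best_response_equilibrium(base_costs, n_sets=3, rounds=50):
    """Iterated best response among n_sets vehicle sets choosing a server.
    A round with no strategy change is a Nash equilibrium of this toy
    congestion game (cost = base cost + number of other sets on the server)."""
    choice = [0] * n_sets
    for _ in range(rounds):
        changed = False
        for s in range(n_sets):
            load = [sum(1 for t, c in enumerate(choice) if c == j and t != s)
                    for j in range(len(base_costs))]
            best = min(range(len(base_costs)), key=lambda j: base_costs[j] + load[j])
            if best != choice[s]:
                choice[s], changed = best, True
        if not changed:
            break
    return choice

# Three vehicle sets, three servers with hypothetical base costs.
equilibrium = best_response_equilibrium([1.0, 1.2, 3.0])
```

Here the equilibrium spreads the sets across the two cheap servers, and the expensive third server stays unused.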

5.5.1.3 Performance Evaluation

In the simulation scenario, we consider 10 RSUs located along a four-lane one-way road. The vehicles travel at 120 km/h. Their computation tasks are classified into five types, with probabilities \(\{0.05, 0.15, 0.3, 0.4, 0.1\}\), respectively. In addition, we set the computation resource requirements of the five task types at \(\{7, 13, 27, 33, 48\}\) units, respectively.

Fig. 5.4

Task offloading costs versus vehicle density

Figure 5.4 shows the computation offloading costs under different densities of vehicles on the road. We compare the performance of our proposed predictive offloading scheme with that of the direct V2I transmission scheme. It can be seen that the predictive scheme greatly reduces the cost when vehicle density is high. In that case, long task execution times on the MEC servers mean that a vehicle has traveled past more RSUs by the time its results are ready. Due to the transmission cost of the wireless backhaul between RSUs, the total cost of the direct V2I scheme rises quickly as the density \(\lambda \) increases. In the predictive scheme, however, part of the transmission is carried by V2V relays, which cost less than wireless backhaul transmission, so computation offloading costs are reduced.

It is worth noting that the performance improvement brought about by predictive offloading relies on accurate prediction of vehicle mobility. With the development of AI technology, the prediction of vehicle mobility patterns has become much more accurate, especially on highways with stable traffic flows. The proposed predictive scheme is thus promising and effective for practical applications.

5.5.2 Deep Q-Learning for Vehicular Computation Offloading

5.5.2.1 A System Model

In this case study, we consider an MEC-enabled vehicular network in an urban area, as illustrated in Fig. 5.5. Various types of computation tasks are generated in the traveling vehicles. We classify these tasks into G types. A task is described by a four-term tuple \(\kappa _i = \{f_i, g_i, t_i^{\max }, \varsigma _i\}\), \(i \in \mathcal {G}\), where \(f_i\) and \(g_i\) are the size of the task input data and the amount of required computation, respectively, and \(t_i^{\max }\) is the maximum delay tolerance of task \(\kappa _i\). The offloading system receives utility \(\varsigma _i \Delta t\) upon completion of task \(\kappa _i\), where \(\Delta t\) is the time saved in accomplishing \(\kappa _i\) relative to \(t_i^{\max }\). The probability of a task belonging to type i is denoted as \(\beta _i\), with \(\sum \nolimits _{i \in \mathcal {G}} {{\beta _i} = 1}\).
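The task model above can be captured in a short Python sketch. The class and function names are illustrative, not from the chapter.

```python
from dataclasses import dataclass
import random

@dataclass
class TaskType:
    """Task type kappa_i = {f_i, g_i, t_i_max, varsigma_i} of Sect. 5.5.2.1."""
    f: float         # size of the task input data
    g: float         # amount of required computation
    t_max: float     # maximum delay tolerance
    varsigma: float  # utility gained per unit of time saved

def task_utility(task, t_total):
    """Utility varsigma_i * (t_i_max - t_total) received on task completion."""
    return task.varsigma * (task.t_max - t_total)

def sample_task_type(types, betas):
    """Draw a task type according to the probabilities beta_i (summing to 1)."""
    return random.choices(types, weights=betas, k=1)[0]
```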

The urban area is covered by a heterogeneous wireless network that consists of a cellular network BS, M RSUs, and mobile vehicles. Compared to a BS that has seamless coverage and a high data transmission cost, RSUs provide spotty coverage but inexpensive access service. The costs for using a unit of the spectrum of the cellular network and the spectrum belonging to the vehicular network per unit of time are \(c_c\) and \(c_v\), respectively.

The BS is connected to an MEC server, denoted Serv\(_0\), through a wired link. In addition, each RSU hosts an MEC server. These servers are denoted Serv\(_1\), Serv\(_2,\ldots ,\) Serv\(_M\), respectively. Each MEC server receives data directly from the BS or RSU to which it is attached. Let \(\{W_0, W_1, W_2,\ldots , W_M\}\) denote the computing capacities of these servers. Each MEC server is modeled as a queuing network whose input is the offloaded tasks. Arriving tasks are first cached on the MEC server and then served according to a first-come, first-served policy. A server devotes all of its computing resources to the task currently being served. The cost for a task to use a unit of computing resources per unit of time is \(c_x\).
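The first-come, first-served queuing model of an MEC server can be sketched as follows; this is a minimal illustration of the model in this section, not the authors' implementation, and all names are assumptions.

```python
from collections import deque

class MECServer:
    """FCFS queuing model of an MEC server with capacity W (units per
    unit of time). The server devotes its whole capacity to the
    head-of-line task."""
    def __init__(self, W):
        self.W = W
        self.queue = deque()  # pending computation amounts g_i

    def offload(self, g):
        """Cache a newly arrived task requiring g units of computation."""
        self.queue.append(g)

    def backlog(self):
        """Total queued computation: the server state s_m^l."""
        return sum(self.queue)

    def step(self, tau):
        """Advance one time frame of length tau, executing W*tau units."""
        budget = self.W * tau
        while self.queue and budget > 0:
            if self.queue[0] <= budget:
                budget -= self.queue.popleft()
            else:
                self.queue[0] -= budget
                budget = 0
```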

Fig. 5.5

Task offloading in an MEC-enabled vehicular network

In the heterogeneous network formed by the overlapping coverage of the BS and the RSUs, vehicles can offload their tasks to the MEC servers in multiple modes. Task file transmission between a vehicle and the BS is called V2I. When a vehicle instead uses the LTE-Vehicle (LTE-V) network for task offloading, the file can be transmitted to an MEC server either directly in V2R mode or through joint V2V–V2R transmission.

Task offloading scheduling and resource management are considered to operate in a discrete time model with fixed-length time frames. The length of a frame is denoted as \(\tau \). In each time frame, a vehicle generates a computing task with probability \(P_g\). Each generated task can be offloaded using only a single transmission mode. Since the topology can change across time frames due to the mobility of the vehicles, to facilitate the modeling of the dynamic offloading service relations, we split the road into E segments. The position of a vehicle on the road is denoted by the index of its segment e, where \(1 \le e \le E\). All the vehicles have fixed transmission power for a given transmission mode, that is, \(P_{tx,b}\) in V2I mode and \(P_{tx,v}\) in the V2R and V2V modes. When the BS receives a task file from a vehicle in V2I mode, the signal-to-interference-plus-noise ratio (SINR) at the BS is denoted as \(\gamma _{v,b}\). Similarly, when vehicles choose V2R or V2V communication, the SINR at receiver r is \(\gamma _{v,r}\).

5.5.2.2 Optimal Offloading Scheme in a Deep Q-Learning Approach

We next formulate an optimal offloading problem and propose a deep Q-learning–based scheme for joint MEC server selection and offloading mode determination. In a given time frame, for a vehicle located on road segment e and generating task \(\kappa _i\), we use \(x_{i,e}=1\) to indicate that the task is offloaded to Serv\(_0\) through V2I. Similarly, we use \(y_{i,e,m}=1\) and \(z_{i,e,m}=1\) to indicate that the task is offloaded to Serv\(_m\) in the V2R and joint V2V–V2R modes, respectively. Otherwise, these indicators are set to zero. The optimal task offloading problem, which maximizes the utility of the offloading system under task delay constraints, is formulated as follows:

$$\begin{aligned} \begin{aligned} \begin{array}{l} \mathop {\max }\limits _{\{ x,y,z\}} U = \sum \limits _{l = 1}^\infty \sum \limits _{j=1}^n \sum \limits _{i=1}^G \beta _i(\varsigma _i(t_i^{\max }-t_{i,e_j,l}^\mathrm{total}) - x_{i,e_j}^l(q_c c_c f_i \\ \quad \quad /R_{v,b,e_j} + {g_i}{c_x}/{W_0}) - {y_{i,e_j,m}^l}({q_v}{c_v}{f_i}/{R_{v,r,e_j}} \\ \quad \quad + {g_i}{c_x}/{W_m}) - {z_{i,e_j,m}^l}(\sum \limits _{h = 1}^{H_{e_j}} {{q_v}{c_v}{f_i}/{R_{v,j,e_j}} + {g_i}{c_x}/{W_m})} ) \\ \mathrm{such\ that} \quad \mathrm{C1:}\ {x_{i,e_j}^l} \in \{ 0,1\},\ {y_{i,e_j,m}^l} \in \{ 0,1\},\ {z_{i,e_j,m}^l} \in \{ 0,1\}\\ \quad \quad \mathrm{C2:}\ {x_{i,e_j}^l}{y_{i,e_j,m}^l} = {x_{i,e_j}^l}{z_{i,e_j,m}^l} = {y_{i,e_j,m}^l}{z_{i,e_j,m'}^l} = 0\\ \quad \quad \mathrm{C3:}\ {x_{i,e_j}^l} + {y_{i,e_j,m}^l} + {z_{i,e_j,m}^l} = 1 \\ \quad \quad \mathrm{C4:}\ t_{i,e_j}^\mathrm{total} \leqslant t_i^{\max } , \quad i \in \mathcal {G}, \quad m, m' \in \mathcal {M} \end{array} \end{aligned} \end{aligned}$$
(5.4)

where n is the number of tasks generated in a time frame; \(e_j\) is the road segment index of vehicle j’s location; \(H_{e_j}\) is the number of transmission hops; \(q_c\) and \(q_v\) are the amounts of spectrum resources allocated to each task file offloaded through the cellular and vehicular networks, respectively; and \(R_{v,b,e_j}\) is the transmission rate of offloading the task file from the vehicle in road segment \(e_j\) to the BS, which can be written as \(R_{v,b,e_j} = q_c \log (1+\gamma _{v,b})\). The rates \(R_{v,r,e_j}\) and \(R_{v,j,e_j}\) are calculated similarly, based on the allocated spectrum \(q_v\) and the SINR \(\gamma _{v,r}\). Constraint C1 requires each offloading indicator to be binary, C2 and C3 together ensure that each task selects exactly one offloading mode, and C4 enforces each task’s delay tolerance.
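The rate expression \(R = q \log (1+\gamma )\) and the resulting file transmission time can be computed directly; the function names below are illustrative, and we use a base-2 logarithm as is conventional for Shannon-type rates.

```python
import math

def transmission_rate(q, sinr):
    """Shannon-type rate R = q * log2(1 + SINR), as used for R_{v,b,e_j}
    and R_{v,r,e_j} in (5.4); q is the allocated spectrum."""
    return q * math.log2(1.0 + sinr)

def upload_time(f, q, sinr):
    """Time to deliver a task file of size f over one link: f / R."""
    return f / transmission_rate(q, sinr)
```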

Since the current serving state of a server affects the time costs of subsequent tasks, we can formulate (5.4) as a Markov decision process. The state of the offloading system in time frame l is defined as \({S^l} = (s_0^l,s_1^l,\ldots ,s_M^l)\), where \(s_0^l\) is the total computation required by the tasks queued in Serv\(_0\) in frame l. Similarly, \(s_1^l,\ldots ,s_M^l\) denote the required computation of the tasks queued in Serv\(_1\), Serv\(_2, \ldots ,\) Serv\(_M\) in time frame l, respectively. The actions taken by the control center in frame l are denoted as \({a^l} = (X^l,Y^l,Z^l)\), where \(X^l=\{x_{i,e}^l\}\), \(Y^l=\{y_{i,e,m}^l\}\), and \(Z^l=\{z_{i,e,m}^l\}\) are the sets of task offloading strategies, covering the transmission modes and offloading targets of the tasks generated in frame l.

To derive the optimal offloading strategy \(\pi ^*\), we resort to reinforcement learning, recasting the above Markov decision process as a reinforcement learning problem. The optimal value of the Q-function is

$$\begin{aligned} \begin{aligned} {Q^*}(S^l,a^l) = \mathrm{E}_{S^{l+1} }[U^l + \eta \mathop {\max }\limits _{a^{l+1} } {Q^*}(S^{l+1} ,a^{l+1} )|S^l,a^l] \end{aligned} \end{aligned}$$
(5.5)

where \(\eta \) is the discount factor. The maximum utility and the optimal offloading strategies can be derived through value and policy iteration. Q-learning, a classical reinforcement learning algorithm, can be used to perform these iterations. In each iteration, the value of the Q-function in the learning process is updated as

$$\begin{aligned} \begin{aligned} \begin{array}{l} Q({S^l},{a^l}) \leftarrow Q({S^l},{a^l}) + \alpha [{U^l} + \eta \mathop {\max }\limits _{{a^{l + 1}}} {Q^*}({S^{l + 1}},{a^{l + 1}}) - Q({S^l},{a^l})] \end{array} \end{aligned} \end{aligned}$$
(5.6)

where \(\alpha \) is the learning rate.
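The update in (5.6) can be written out as a short tabular sketch; the table is later replaced by a neural approximator in the chapter, and the names below are illustrative.

```python
from collections import defaultdict

def q_learning_update(Q, s, a, utility, s_next, actions, alpha=0.1, eta=0.9):
    """One tabular Q-learning step following Eq. (5.6).

    Q maps (state, action) pairs to values; utility is the immediate
    reward U^l, eta the discount factor, and alpha the learning rate.
    """
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (utility + eta * best_next - Q[(s, a)])
    return Q[(s, a)]
```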

Moreover, the state of the offloading system consists of the amounts of computation queued in the MEC servers, which are continuous values. We thus approximate the Q-function with a function approximator, choosing a multilayer neural network as a nonlinear approximator that can capture complex interactions among the various states and actions. Based on this Q-function estimation, we utilize deep Q-learning to obtain the optimal offloading strategies \(\pi ^*\). With the help of the Q-network, the Q-function is estimated as \({Q}(S^l,a^l) \approx Q'(S^l,a^l;\theta )\), where \(\theta \) is the set of network parameters. The estimated Q values are trained to converge to the real Q values over iterations. Based on these Q values, the optimal offloading strategy in each state is given by the action that leads to maximum utility. The action chosen in frame l can thus be written as \({a^{l*}} = \mathop {\arg \max } \nolimits _{a^l} Q'({S^l},a^l;\theta )\). During Q-learning updates, a batch of stored experiences drawn randomly from the replay memory is used as samples to train the Q-network’s parameters. The goal of the training is to minimize the difference between the target value \(Q_\mathrm{tar}^l\) and \(Q'(S^l,a^l;\theta )\). The loss function is given as
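The replay memory mentioned above can be sketched as a fixed-capacity buffer of \((S^l, a^l, U^l, S^{l+1})\) transitions sampled uniformly at random; this is a generic sketch of experience replay, not the authors' exact implementation.

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-capacity experience replay buffer for Q-network training.

    Uniform random sampling of stored transitions decorrelates
    consecutive experiences used in each training mini-batch.
    """
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest entries evicted first

    def store(self, state, action, utility, next_state):
        self.buffer.append((state, action, utility, next_state))

    def sample(self, batch_size):
        """Draw a random mini-batch for one training step."""
        return random.sample(list(self.buffer),
                             min(batch_size, len(self.buffer)))
```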

$$\begin{aligned} \begin{aligned} Loss({\theta ^l})=\mathrm{E}[\frac{1}{2} {(Q_\mathrm{tar}^l - Q'(S^l,a^l;\theta ^l))^2}] \end{aligned} \end{aligned}$$
(5.7)

We deploy a gradient descent approach to update \(\theta \). The gradient, derived by differentiating \(Loss({\theta ^l})\), is

$$\begin{aligned} \begin{aligned} {\nabla _{\theta ^l}}Loss({\theta ^l})= \mathrm{E} [{\nabla _{{\theta ^l}}}Q'(S^l,a^l;{\theta ^l}) (Q'(S^l,a^l;{\theta ^l}) - Q_\mathrm{tar}^l)] \end{aligned} \end{aligned}$$
(5.8)

Then, \(\theta ^l\) is updated according to \({\theta ^{l}} \leftarrow {\theta ^l} - \varpi {\nabla _{{\theta ^l}}}Loss({\theta ^l})\) in time frame l, where \(\varpi \) is a scalar step size.
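The parameter update of (5.8) can be illustrated with a single-sample gradient step. For brevity we use a linear approximator \(Q'(s,a;\theta ) = \theta \cdot \phi (s,a)\) in place of the chapter's multilayer network; the gradient of the loss (5.7) for one sampled experience is then \(\phi \,(Q' - Q_\mathrm{tar})\), matching the form of (5.8). All names are illustrative.

```python
import numpy as np

def q_network_step(theta, phi, q_target, varpi=0.01):
    """One gradient-descent update theta <- theta - varpi * grad.

    theta: parameter vector; phi: feature vector phi(s, a) of one
    sampled replay experience; q_target: the target value Q_tar^l;
    varpi: the scalar step size of Sect. 5.5.2.2.
    """
    q_est = theta @ phi                  # Q'(s, a; theta) for this sample
    grad = phi * (q_est - q_target)      # single-sample gradient of Eq. (5.7)
    return theta - varpi * grad
```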

5.5.2.3 Performance Evaluation

We evaluate the performance of the proposed task offloading schemes based on real traffic data, consisting of 1.4 billion GPS traces of more than 14,000 taxis recorded over 21 days in a city. We consider a scenario with one BS and five RSUs on each selected road. We set the computing capacity of Serv\(_0\) to \(W_0\) = 1,000 units, while the capacities of the MEC servers on the RSUs are randomly selected from the range [100, 200] units.

Fig. 5.6

Average utilities under different offloading schemes

Figure 5.6 shows the impact of road traffic on the average utility of a task under different offloading schemes. Our proposed deep Q-learning scheme clearly yields higher offloading utility than the other schemes, especially in the non-rush period from 12:00 to 16:00. This is because our scheme jointly considers transmission efficiency and the load states of the MEC servers, whereas the scheme that chooses the target server according to the vehicle’s best transmission path and the scheme that selects the MEC server according to the server state each take only one factor into account. The ignored factor can seriously degrade offloading efficiency.

In the game-theoretic approach, the vehicles traveling on a road segment act as players competing for task offloading services to gain higher utility. Since each vehicle independently determines its offloading strategy out of self-interest and ignores cooperation with other vehicles, system performance suffers. In the greedy algorithm, each vehicle likewise chooses its offloading strategy in a distributed manner. Although the greedy algorithm jointly optimizes the file transmission path and MEC server selection in the current frame, it ignores the coupling between consecutive time frames. In contrast, our proposed learning scheme considers both of these effects in the design of offloading strategies, leading to better performance.