A cooperative heterogeneous vehicular clustering framework for efficiency improvement

Heterogeneous vehicular clustering integrates multiple types of communication networks to work efficiently for various vehicular applications. One popular form of heterogeneous network is the integration of long-term evolution (LTE) and dedicated short-range communication. The heterogeneity of such a network infrastructure and the non-cooperation involved in sharing cost/data are potential problems to solve. A vehicular clustering framework is one solution to these problems, but the framework should be formally verified and validated before being deployed in the real world. To solve these issues, first, we present a heterogeneous framework, named destination and interest-aware clustering, for vehicular clustering that integrates vehicular ad hoc networks with the LTE network for improving road traffic efficiency. Then, we specify a model system of the proposed framework. The model is formally verified to evaluate its performance at the functional level using a model checking technique. To evaluate the performance of the proposed framework at the micro-level, a heterogeneous simulation environment is created by integrating state-of-the-art tools. The comparison of the simulation results with those of other known approaches shows that our proposed framework performs better.


Introduction
The ever-growing number of cars on the road causes traffic congestion, which in turn leads to long driving hours, increased fuel consumption, and environmental pollution. This situation also wastes precious time and money. Academia and the vehicle industry are investing heavily to make road travel easy, efficient, and secure by means of emerging technological infrastructures. Various route planning and navigational applications have been developed to improve road traffic efficiency, leading to a reduction of travel time and congestion on the roads. These applications depend on underlying information communication technologies such as the Internet, Global Positioning System (GPS), and vehicular ad hoc networks (VANETs) (Alam et al., 2016).
www.jzus.zju.edu.cn; engineering.cae.cn; www.springerlink.com ISSN 2095-9184 (print); ISSN 2095-9230 (online) E-mail: jzus@zju.edu.cn
Our first objective is to develop a heterogeneous network by incorporating VANETs and telecom networks such as long-term evolution (LTE) (available today) into a complex network. Our second objective is to handle the non-cooperativeness of the vehicles involved in order to reduce the cost of using the Internet. Route planning and driver assistance applications require access to a remote server over the Internet via an underlying telecom network, and the usage of telecommunication technologies is not free. Dealing with such scenarios requires a proper framework. While developing such a framework, handling the heterogeneity introduced by integrating different types of technologies is a challenging task. Therefore, improving road traffic efficiency while incorporating multiple technologies and accounting for the cost factors involved is a significant challenge.
Some systems already exist (Jia et al., 2020; Wang YP et al., 2020), but each was developed to work in a particular scenario. The traffic efficiency class of applications has distinct requirements in terms of frequency, data rate, and latency.

Frontiers of Information Technology & Electronic Engineering
VANETs play a significant role and provide the basic infrastructure for developing applications for road safety, traffic efficiency, and in-car entertainment. Communication networks in VANETs have distinctive properties: the nodes follow organized routes and move at high speed. VANETs rely on reliable communication among vehicles for the development of various road traffic management applications (Alam et al., 2016; Ahmad et al., 2019). The integration of wireless access in a vehicular environment with LTE forms a heterogeneous network (Yaqoob et al., 2017). One of the use scenarios of the Internet of vehicles (IoV) is a traffic information system (TIS). A TIS consists of online servers connected to vehicles on the roads via the LTE network. Vehicles send data related to the road traffic situation to the servers, and the servers compute traffic trends and send information back to the intended vehicles. This information enables vehicles to regulate and manage their journeys efficiently. When each vehicle on the road accesses the information from the servers individually, data consumption and the incurred cost are high. To handle this, vehicles are grouped to share data and cost.
By integrating VANETs with LTE, we propose a heterogeneous network for vehicular clustering. First, we develop a clustering mechanism based on clustering criteria that suit road traffic efficiency applications. Then, we incorporate a strategic algorithm for cooperation among vehicles, so that vehicles act cooperatively in sharing data and cost while moving on the road. To verify the performance of the proposed destination and interest-aware clustering (DIAC) framework, a formal verification is carried out at the macro-level and a simulation study at the micro-level. A system model of the framework is developed by specifying its states and properties at different levels of the design. The desired properties of the system are checked by a model checking technique to verify the working correctness. Simulation results are compared with those of existing approaches. Consequently, the performance is verified at both the macro- and micro-level to demonstrate the reliability of the framework. The contributions of our work therefore include the development of the DIAC framework, the system model of the framework and its design-level evaluation, and a micro-level performance comparison and evaluation against state-of-the-art algorithms.

Related works
Technology is evolving very rapidly, and so is the development of its applications to various aspects of life. Life on the road no longer involves only traveling to a desired destination, but also aspects of traffic safety and efficiency with associated socio-economic benefits. These aspects, together with ever-evolving technology, gave birth to the concepts of smart cities and intelligent transportation systems. Different technologies are merging for better performance in terms of cost, resource usage, safety, and comfort. Vehicular networks nowadays incorporate emerging telecom networks with dedicated short-range communication (DSRC) to form heterogeneous network infrastructures such as 3G, LTE, and 5G with DSRC (Liu et al., 2016). One example (Salvo et al., 2017) involves floating car data over a typical road segment to help vehicles decide on route planning. One of the main schemes dealing with the critical use of the limited radio resources in the vehicular network scenario was proposed by Garbiso et al. (2021). They presented a self-adaptive clustering system that ensures a suitable trade-off between data aggregation and the communication congestion caused by cluster management. The system is based on a distributive justice approach for selecting cluster heads (CHs) to improve fairness among car drivers. Hui et al. (2020) proposed a collaborative content delivery scheme to improve the use of participants, including a base station, roadside units (RSUs), and vehicles, in a software-defined heterogeneous network. A cooperation mechanism was developed between the base station and the RSUs for cooperative content delivery to serve a group of vehicles with multicast technology. However, its reliance on RSUs makes it difficult and expensive to deploy, and there are more promising solutions, such as clustering at the vehicle-to-vehicle level.
The central problem of a vehicular network is high-speed mobility, which destabilizes links and thus reduces cluster stability. Modified distributed and mobility-adaptive clustering (MDMAC) (Wolny, 2008) is one of the basic algorithms developed for basic VANETs. The vehicular multi-hop algorithm for stable clustering (VMaSC) (Ucar et al., 2016) was designed for heterogeneous networks of a larger scale, but it did not consider the signal strength of LTE. A clustering-based multi-metric adaptive mobile gateway management (CMGM) mechanism (Benslimane et al., 2011) works with a hybrid network and minimizes the number of gateways used to connect to the Internet via a 3G network. The protocol of Morales et al. (2012) was designed for traffic efficiency improvement and took the destination of vehicles as a criterion, but it is entirely dependent on the LTE network.
For cooperativeness in a heterogeneous vehicular network, game theory has been used (Srivastava et al., 2005; Roughgarden, 2010; Ficco et al., 2018). An important solution proposed by Lobato et al. (2018) was used for content downloading from a remote server for vehicles within the cluster. A solution proposed by Gerla et al. (2014) is used in our scenario to achieve cooperative behavior, which is essential when the cost is included in the clustering process. Ahmad et al. (2020) presented a good example of a heterogeneous vehicular network, in which a fully-fledged game-theoretic mechanism was proposed for cooperation among vehicles within the cluster. Wang TY et al. (2019) proposed heterogeneous vehicular networks in which DSRC, LTE, and vehicle-to-everything were integrated. They adopted a clustering approach named self-adopting clustering, based on an iterative self-organizing data analysis algorithm, which targets multiple clusters at the same time in a wide coverage area. Hui et al. (2019) proposed an optimal access control scheme for vehicles in a heterogeneous environment to manage the downloading of data and the related cost. The scheme is based on a coalition formation game for cooperation among vehicles that are interested in contents cached in other vehicles. Vehicles request content cooperatively to minimize costs. Another interest-aware vehicular clustering scheme (Ahmad et al., 2019) was proposed for traffic information systems to improve road traffic efficiency. This scheme considered the cost factor as well as stability. This represents the preliminary stage of the implementation of intelligent transportation system applications, in which data of traffic flows was collected for multiple scenarios, and then a queuing model was adopted to improve road traffic efficiency (Mirabile, 2020). A framework that optimizes the configuration parameters of arbitrary clustering algorithms was presented by Alsuhli et al. (2020). They compared optimization techniques to identify the metaheuristic that gives the best-quality solutions when used to optimize a recent clustering algorithm.
All the above approaches suit their typical network scenarios and application requirements. Simulations are mostly used to check their performance over a defined network. Formal verification is essential to confirm the working and functionality of a system. Before any proposed system is exposed to real-world scenarios, a comprehensive performance evaluation is important in most cases, because deploying a real-world network infrastructure involves expensive hardware, software development, and installation.

Vehicular clustering frameworks
The DIAC framework consists of a clustering of vehicles (destination and interest-aware mechanism for VANETs) and is extended to VANET-LTE based heterogeneous clustering. The clustering criterion consists of speed, location, and the interest of vehicles in obtaining information from online TIS servers for route planning. The basic algorithm of Ahmad et al. (2020) is incorporated in our clustering process. The DIAC framework, which consists of three basic clustering phases, is presented in Fig. 1. The aim of the cluster status check phase is to determine whether there are any existing clusters. In the absence of an existing cluster, the cluster formation phase starts, in which a base vehicle (BV) sets initial values of the average speed (AS), direction (θ), destination heading (D), and received signal strength (RSS). Then, the CH election process is started, as shown in the expanded red rectangular block on the right in Fig. 1.
The CH election process begins with the BV sending Hello packets to adjacent neighbors. In response, each neighbor computes its own parameters (AS, θ, D, and RSS of LTE). If a neighbor's AS, θ, and D are not equivalent to the parameters in the Hello packet, the Hello packet received from the BV is discarded. Otherwise, the negative (N), positive (P), and strategic (S) variables are checked to determine whether they are zero. If they are, the neighbor accepts the request to be part of the cluster and sets P and S to 1. The BV records the RSS value of each packet received from the neighbors and compares its own RSS with those of the cluster members (CMs). The RSS of the BV is denoted RSSbv and the RSS of a CM is denoted RSScm. If RSSbv is greater than RSScm, the BV is elected as CH, its P and S are set to 1, and it announces its selection as CH to all CMs. All other CMs set their N to 1. If RSSbv is not greater than those of its neighbors, the neighbor with the maximum RSS value is elected as CH and all other vehicles are notified. The BV and the other vehicles, except the one elected as CH, then set their N variable to 1. N represents the number of rejected requests, P represents the number of accepted requests, and S represents the strategic value, which cannot be greater than 5. The value of S controls repeated acceptance and rejection of requests by the same vehicle within the same clustering environment. The limit of five controls the logic of our algorithm, which depends on the incremental values of P and S, and restricts the number of times a vehicle can be selected as CH to at most twice under the same clustering environment. This balances the load in the cluster by preventing the same vehicle from being repeatedly selected as CH. Whenever the value of S is 5 in the next iteration, the vehicle rejects the offer of being selected as CH.
This is how we control fair sharing in terms of the participation of vehicles as CH. We assume that all vehicles intend to take part in the clustering process and that none deviates from this rule. We therefore only ensure that the same vehicle is not elected as CH every time, because sharing data accessed over the LTE network is not free. Further details of the parameter setting, definitions, and schemas are presented in Section 4.
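The election round described above can be sketched in a few lines of Python. This is only an illustrative reading of the prose, not the authors' implementation: the `Vehicle` class, the `matches` tolerances, and the choice to let vehicles with non-zero flags still join the cluster are assumptions, and the rule for how S grows between rounds is deliberately left out.

```python
from dataclasses import dataclass

S_MAX = 5  # the text caps the strategic value S at 5


@dataclass
class Vehicle:
    vid: str
    avg_speed: float   # AS in m/s
    direction: float   # theta in degrees
    destination: str   # D, destination heading
    rss: float         # LTE received signal strength in dBm
    n: int = 0         # N: rejected requests
    p: int = 0         # P: accepted requests
    s: int = 0         # S: strategic value


def matches(bv, nb, speed_tol=2.0, angle_tol=10.0):
    """True if the neighbor's AS, theta, and D agree with the Hello packet.

    The tolerances are illustrative assumptions; the text only requires
    the values to be "equivalent".
    """
    return (nb.destination == bv.destination
            and abs(nb.avg_speed - bv.avg_speed) <= speed_tol
            and abs(nb.direction - bv.direction) <= angle_tol)


def elect_cluster_head(bv, neighbors):
    """Run one election round started by the base vehicle bv.

    Returns (cluster_head, members); cluster_head is None if every
    member declines because its S has reached the cap.
    """
    members = []
    for nb in neighbors:
        if not matches(bv, nb):
            continue                      # Hello packet is discarded
        if nb.n == 0 and nb.p == 0 and nb.s == 0:
            nb.p, nb.s = 1, 1             # first-time join: accept the request
        members.append(nb)                # assumption: non-zero flags still join
    members.append(bv)                    # BV last, so a tie goes to a neighbor

    candidates = [v for v in members if v.s < S_MAX]  # S at the cap: decline
    if not candidates:
        return None, members
    head = max(candidates, key=lambda v: v.rss)       # strongest LTE signal
    head.p, head.s = max(head.p, 1), max(head.s, 1)
    for v in members:
        if v is not head:
            v.n = 1                       # non-elected vehicles set N to 1
    return head, members


# Hypothetical vehicles; the RSS values are placeholders, not measurements.
bv = Vehicle("bv", 20.0, 90.0, "exit-12", rss=-95.0)
a = Vehicle("a", 21.0, 92.0, "exit-12", rss=-80.0)
b = Vehicle("b", 20.5, 88.0, "exit-12", rss=-75.0, s=S_MAX)  # declines
c = Vehicle("c", 20.0, 90.0, "exit-7", rss=-70.0)            # other destination
head, members = elect_cluster_head(bv, [a, b, c])
print(head.vid, [v.vid for v in members])  # a ['a', 'b', 'bv']
```

In this run, vehicle `b` has the strongest signal among the matching vehicles but declines because its S is already at the cap, so `a` becomes CH; vehicle `c` never joins because its destination differs.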

Design and evaluation
A system analysis technique is required to check and increase the reliability of the system with mathematical proofs. One important technique is formal verification, in which a mathematical model of a given system is developed to formally verify that the model meets the specified properties of the intended behavior. Formal methods enable verification of the system properties and provide a conceptual understanding of the system or protocols (Qadir and Hasan, 2015). Our formal evaluation has three purposes: (1) to represent clarity through high-level abstraction of the proposed DIAC framework, (2) to capture the system states that involve vehicle connectivity and participation of the distributed vehicular communications, and (3) to evaluate the performance of the model using system performance metrics formulated as formation success, CH selection success, and accept/reject control.
In our work, we perform system analysis through model checking (Clarke et al., 1986), using a transformational model system. This kind of model consists of pre- and post-conditions and a specification language such as Z. Model checking is an application of formal methods in which a set of assertions for the system design is created against a set of conditions that must or must not happen. A model is designed with a defined set of states of the system, and changes in these states are triggered by binary logic (true or false basis) (Baier and Katoen, 2008). The workflow of the formal evaluation process of our system is presented in Fig. 2.

System specifications
System schemas
The Z language consists of Z notations based on set theory and mathematical logic. The language is developed by integrating first-order predicate calculus with set theory. The beauty of the Z language lies in how the mathematics can be constructed. Mathematical objects and their properties are incorporated in schemas with a pattern of declaration and constraints. The schemas describe the state of the system and changes in the states. Z notations are also used to describe system properties and logical reasoning related to refinement.
First, we define the terminologies of our system and develop different schemas. DIAC is the name of the proposed schema, and the schema signatures within the first part of the container are Vi: ℙ VEHICLE, CVsit: VEHICLE⇸SPEED. The second part contains the schema predicates. The schema predicates are always true and refer only to elements in the signature of the schema. The Z notations and symbols used in the definitions and schemas are shown in Table 1. Fig. 3 presents a schema called DIAC. It is a state space schema that shows the contents in its signature and predicates. Fig. 4 presents a schema called DIAC initialization. It presents an initial form of signature, and predicates are defined. Initially, all values are empty.
The schema called SERAC is presented in Fig. 5. It is related to the control procedure within strategy and changes in variables N, P, and S. For simplicity, in elaboration, we denote N=R, P=A, and S=C. So, flag A represents accept, flag R represents reject, and C is the control variable for our strategy to work.
The selection-of-clusterhead schema shows the selection of CH, as presented in Fig. 6. Once a CH is selected, parameter C is increased by 2.
The add-vehicle-to-cluster schema in Fig. 7 represents the addition of a vehicle to the cluster. Its predicates represent the main clustering criteria. This schema specifies the criterion for choosing a suitable vehicle for inclusion in the cluster.

Clustering procedures
Our system model consists of six different states with two properties, namely, ACK and wait, as shown in Fig. 8. The first state is the initial state, in which all defined properties of the system are false. After booting, the system is started, but the remaining properties such as selected, error, ACK, and wait do not hold.
When the system is started, it sends a Hello message and waits for the response. If the system receives ACKs, then it finds the vehicle with the maximum link quality (LQ) value and selects that vehicle as a CH. In this state, only properties of error and wait hold. If the connection is lost at this state, then the system goes to a state at which all properties hold, except wait.
At this stage, the system can go to one of these two states. If the connection to LTE is re-established, then the system goes into a state at which only the properties of started and wait hold; if the connection to LTE is not re-established within a specified time, then the system is started and has an error. From here, the system returns to a state at which only the property start holds. The system will again send a Hello message. If no vehicle is in range, there is no ACK, or all vehicles reject the offer to be selected as a CH, then the system goes to a state at which properties of start and error hold, and the rest are false.

Accept/Reject decision model
We present an accept/reject decision model that is incorporated into the system model (Fig. 9). The decision model consists of five states, and each state reflects the status of our flag variables within the strategic game theory algorithm. These flags control the default behavior of game theory, which motivates vehicles to accept the offer of being selected as a CH. The first state represents the initial statuses of R, A, and C; if the value of C is greater than 5, then the decision system goes to a state at which the property of accept does not hold, and the statuses of C and R change.
From this state, if any other vehicle accepts the offer, then the system moves to a new state at which only the value of C changes, decreasing by 1. From here, the system returns to its initial state, but with new flag values. From the initial state, if the value of C is greater than 0 and less than or equal to 5, then the system goes into a new state in which the property of accept is true and the values of A and C change, while the value of R remains unchanged. At this stage, if a CH is selected, then the system goes to a new state at which only the status of C changes. From this state, the system returns to the initial state.
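The five states and their guards can be written down as a small transition function. The sketch below only encodes what the description states (the thresholds on C and the order of the states); the state names, the event labels `"other_accepts"` and `"ch_selected"`, and the behavior when C is zero or below are assumptions made for illustration, and the arithmetic on R, A, and C is omitted because the text does not fully specify it.

```python
from enum import Enum

C_MAX = 5  # threshold on the control variable C taken from the text


class State(Enum):
    INITIAL = "initial"
    REJECTED = "rejected"              # accept does not hold; C and R change
    OTHER_ACCEPTED = "other_accepted"  # only C changes (decreases by 1)
    ACCEPTED = "accepted"              # accept holds; A and C change
    SELECTED = "selected"              # only C changes


def next_state(state, c, event=None):
    """Return the successor state for control value c and an optional event.

    event is a hypothetical label: "other_accepts" or "ch_selected".
    """
    if state is State.INITIAL:
        if c > C_MAX:
            return State.REJECTED
        if 0 < c <= C_MAX:
            return State.ACCEPTED
        return State.INITIAL           # c <= 0 is not covered by the text
    if state is State.REJECTED and event == "other_accepts":
        return State.OTHER_ACCEPTED
    if state is State.ACCEPTED and event == "ch_selected":
        return State.SELECTED
    if state in (State.OTHER_ACCEPTED, State.SELECTED):
        return State.INITIAL           # both branches return to the start
    return state                       # no enabled transition: stay put


# Walk the accept branch once.
s = next_state(State.INITIAL, c=3)
s = next_state(s, c=3, event="ch_selected")
s = next_state(s, c=5)
print(s)
```

Keeping the guards separate from the flag updates makes it easy to plug in whichever update rule the schemas in Figs. 5 and 6 prescribe.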

Model checking of the temporal logic
In model checking, users produce a system model and some logical formulas that describe the desired properties. If the system is complex, then a model-checking algorithm is used to determine whether the system satisfies the desired properties (logical formulas). If the properties are not satisfied, then a counterexample is produced. Model checking is based on state specifications that describe all the possible behaviors of the system. In the system model (graph), nodes are states and transitions of the system are edges of the graph. For model checking, a model must be closed. Model checking has three steps:
1. Modeling: system → model.
2. Specification: specification in natural language → properties (formal logic).
3. Verification: an algorithm checks whether the desired system properties are met.
Our system is not very complex and has few states. Thus, it can be checked by simply traversing the graph using a branching-time logic called computation tree logic (CTL). In CTL, the model is unwound to show all possible computational paths. Then, we check which computational paths satisfy the desired property.
In our case, CTL is explored to check the temporal logic.
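For reference, the two CTL operators used in the properties that follow have standard textbook semantics, sketched here. The symbols M, s, π, and φ are generic notation introduced for this note and do not come from the paper's schemas.

```latex
% M is a transition system, s a state, and
% \pi = s_0 s_1 s_2 \dots ranges over the paths of M with s_0 = s.
\begin{align*}
M, s \models \mathrm{AF}\,\varphi
  &\iff \text{for every path } \pi \text{ from } s,\ \exists i \ge 0 :\ M, s_i \models \varphi \\
M, s \models \mathrm{EF}\,\varphi
  &\iff \text{for some path } \pi \text{ from } s,\ \exists i \ge 0 :\ M, s_i \models \varphi
\end{align*}
```

AF therefore quantifies over all computation paths, whereas EF only requires one witness path.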
The computational tree is made from the model transition diagram by unwinding the model. We define several desired properties of the system as those intended to be true or false.
1. First desired property: CH will eventually be selected. AF (started→selected) true.
For all computation paths (AF) within the closed graph, the first desired property holds. Fig. 10 shows that one of the computational paths, shown as a dotted red line, has at least one state at which the property of selected is positive.
2. Second desired property: the same vehicle is not selected as CH every time. EF (started→accept) true.
For some computation paths (EF) within the closed graph, the second desired property holds. Fig. 11 shows that one of the computational paths, shown as a dotted red line, has states at which the property of not accepted holds. When one vehicle is not at an accepted state, another is chosen. Thus, the property holds.
3. Third desired property: CH is not selected every time. EF (selected⇸reject) true.
The computational path in Fig. 12 shows that the system starts properly in some cases, but no CH is selected because no cars are around, the cars do not meet the criteria for CH selection, or all cars decline to be selected. The system is thus at the state of not selected.
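To make the AF and EF checks concrete, the sketch below evaluates both operators on a small, hypothetical five-state transition system by fixed-point iteration over state sets. The state names, labels, and transitions are invented for illustration and are not the model of Fig. 8; the fixed-point computation is the standard explicit-state technique and is not necessarily how the authors carried out their check.

```python
# Hypothetical transition system: each state maps to its set of successors.
TRANS = {
    "init": {"started"},
    "started": {"waiting"},              # Hello sent, waiting for ACKs
    "waiting": {"selected", "error"},    # ACK received, or no usable reply
    "error": {"started"},                # retry by sending Hello again
    "selected": {"selected"},            # CH chosen; self-loop keeps paths infinite
}

# Atomic properties that hold in each state.
LABELS = {
    "init": set(),
    "started": {"started"},
    "waiting": {"started", "wait"},
    "error": {"started", "error"},
    "selected": {"started", "ack", "selected"},
}


def sat(prop):
    """States in which the atomic property prop holds."""
    return {s for s, props in LABELS.items() if prop in props}


def ef(trans, good):
    """States from which SOME path eventually reaches a state in good."""
    result = set(good)
    while True:
        # add every state with at least one successor already in result
        new = result | {s for s, succ in trans.items() if succ & result}
        if new == result:
            return result
        result = new


def af(trans, good):
    """States from which ALL paths eventually reach a state in good."""
    result = set(good)
    while True:
        # add every state whose successors are all already in result
        new = result | {s for s, succ in trans.items() if succ and succ <= result}
        if new == result:
            return result
        result = new


print(sorted(ef(TRANS, sat("selected"))))  # every state can reach "selected"
print(sorted(af(TRANS, sat("selected"))))  # ['selected']
print("init" in ef(TRANS, sat("error")))   # True
```

In this toy model, EF selected holds in every state, while AF selected holds only in the selected state itself, because the retry loop through the error state yields a path that never selects a CH. The example is meant to show how the choice between the two operators changes what a property asserts.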
The main desired properties (intended behavior) of our system hold, which supports the correctness of our model. The formal verification indicates that the design and behavior of DIAC satisfy the specified properties. In Section 5, we validate our system by simulation at the micro-level, and the results are benchmarked against popular approaches.

Simulation and results
For simulation, the popular simulators OMNeT++ (Varga, 2005) and SUMO (Krajzewicz et al., 2012) with Veins (Sommer et al., 2010) were integrated. An OMNeT++ model consists of hierarchically nested modules that communicate by passing messages to each other. OMNeT++ models are often referred to as networks. The SimuLTE tool (Virdis et al., 2015) was incorporated for heterogeneous network infrastructure behavior. We compared our DIAC algorithm with heterogeneous VANET-LTE based networks in terms of performance metrics such as CH duration, CM duration, CH changing rate, packet delivery ratio (PDR), and cost. Our simple DIAC was compared with the VANET versions of several approaches, namely, the mobile gateway selection algorithm (MGSA) (Benslimane et al., 2011), MDMAC (Wolny, 2008), CMGM (Benslimane et al., 2011), and VMaSC (Ucar et al., 2016), to check the performance at the VANET level.
The heterogeneous architecture was built by integrating VANETs and LTE to obtain reliable simulation results. The simulation was repeated for various velocities of vehicles ranging from 10 to 35 m/s. To increase the confidence level, the simulation was carried out thrice at each velocity of the vehicles, and the average was taken for comparative analysis in a graphical form. The simulation tools were set up as shown in Fig. 13.
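The averaging step can be sketched as follows. The PDR formula used here (packets received divided by packets sent, in percent) is the commonly used definition and is an assumption, since the text does not spell it out; the packet counts are arbitrary placeholders chosen for easy arithmetic and are not simulation results.

```python
from statistics import mean


def packet_delivery_ratio(received, sent):
    """PDR in percent; returns 0.0 when nothing was sent."""
    return 100.0 * received / sent if sent else 0.0


def average_over_runs(runs_by_velocity):
    """Map each velocity (m/s) to the mean of its per-run metric values."""
    return {v: mean(vals) for v, vals in sorted(runs_by_velocity.items())}


# Placeholder (received, sent) packet counts: three runs at two velocities.
raw = {
    10: [(8, 10), (9, 10), (7, 10)],
    35: [(5, 10), (6, 10), (7, 10)],
}

# Per-run PDR first, then the mean of the three runs at each velocity.
pdr_runs = {
    v: [packet_delivery_ratio(r, s) for r, s in runs]
    for v, runs in raw.items()
}
averaged = average_over_runs(pdr_runs)
print(averaged)  # {10: 80.0, 35: 60.0}
```

The same `average_over_runs` helper applies unchanged to the other metrics (CH duration, CM duration, CH changing rate), since it only expects one value per run.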
The simulation parameters and their values are described in Table 2 for both VANETs and LTE networks.
In Fig. 14, the CH duration of our proposed DIAC was compared against various velocities of vehicles. The CH duration of DIAC was much greater than those of the known approaches VMaSC, CMGM, MDMAC, and MGSA. This result shows the stability of DIAC due to its criteria of the same destination and the vehicle with stronger RSS being chosen as CH. The performances of all approaches decreased as the speed of the vehicle increased, but DIAC showed a minimum change in the performance compared to the other approaches.
The CM duration within the cluster of DIAC was also greater than those of the other approaches at various velocities of the vehicles (Fig. 15). DIAC maintained its performance as the velocity of the vehicles increased. VMaSC, which is the most recent approach, showed a comparatively good performance. An increase in the CM duration indicates a more stable cluster because it increases the overall cluster life.
The CH changing rate of DIAC was the lowest among all approaches (Fig. 16). A higher CH changing rate means that the cluster is more unstable. The CH changing rate decreased at lower velocities, but increased at higher velocities. The CH changed at a higher rate for MGSA, CMGM, and MDMAC than for the proposed DIAC.
Cluster breakup is lower in DIAC because it considers the strongest connection to LTE while selecting the CH. The clustering overhead of DIAC was lower than those of the other approaches (Fig. 17), and it increased at a much lower rate with the increase in the velocity of vehicles than in the other schemes.
The PDR of DIAC was greater than those of the other approaches when compared at different vehicle velocities (Fig. 18). At the lower velocity of 10 m/s, the PDR of DIAC was almost 97%. This is because only the vehicles that are interested in obtaining information from the server form the cluster, and because traveling towards the same destination keeps vehicles in the cluster up to a certain destination ahead. This factor greatly improves the performance of the DIAC algorithm.
Fig. 19 shows the average CH duration over the simulation time. It shows how the responsibility of being a CH is fairly distributed among the vehicles participating in the clustering process, which results in a fair sharing of cost among the CMs. No single vehicle acted as CH all the time or paid the cost all the time. When there was only one CH, the averaged CH durations of all the schemes were almost the same, but for two CHs, the averaged CH duration of DIAC was less than those of the other schemes (Fig. 19). This is because in the other schemes, once one vehicle is selected as CH in the first run, the same vehicle is selected as CH again in the second run. Even if another vehicle is selected, its CH duration is reduced, so the average remains high. When there were five overall CHs over the simulation time, the average CH duration of DIAC was around 20 s, showing a fair allocation of responsibility among the participating CMs.
Thus, in all metrics, DIAC showed better performance than the other popular approaches.

Conclusions
The integration of multiple information communication technologies helps us develop a heterogeneous vehicular clustering framework. The heterogeneity of the network and the cost of using the Internet are handled in our DIAC framework by keeping road traffic efficiency in mind. The proposed framework, design analysis, and simulation study were described in detail. The performances of the framework at the design level and implementation level were studied. A comparison of performance showed that the proposed framework works well compared with well-known approaches in this network scenario. This comprehensive evaluation of DIAC will promote the development of applications in related fields of research in the future.

Contributors
Iftikhar AHMAD and Rafidah Md NOOR designed the research. Rafidah Md NOOR and Zaheed AHMED designed the methodology and processed the data. Iftikhar AHMAD drafted the manuscript. Zaheed AHMED and Umm-e-HABIBA conducted the mathematical modeling and the simulations. Naveed AKRAM and Fausto Pedro GARCÍA MÁRQUEZ helped organize the manuscript. Iftikhar AHMAD, Naveed AKRAM, and Fausto Pedro GARCÍA MÁRQUEZ revised and finalized the paper.

Fig. 18 PDR against various velocities of vehicles (vertical axis: packet delivery ratio, %)

Fig. 19 Averaged CH duration with different numbers of CHs
GARCÍA MÁRQUEZ declare that they have no conflict of interest.

Open access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.